Hex-Rays Blog: IDA Pro Tutorials & Reverse Engineering Tips

Disassemblers Aplenty: ARM64 Extensions (SVE, MTE, CSSC), AndeStar™ (V3 and V5), ARC, TriCore & More

Written by Alexandru Petenchea | Jan 30, 2026

These upcoming updates primarily help users working close to the metal, on kernels, firmware, and embedded targets, where stack behavior and control flow need to be trustworthy.

By improving instruction decoding, idiom recognition, and cross-reference generation, these changes turn previously noisy or ambiguous regions of code into something that can be analyzed directly.

ARM64 (Apple extensions and analysis improvements)

SVE Decoding (Apple kernels)

When released, IDA 9.3 improves decoding of Scalable Vector Extension (SVE) and Scalable Matrix Extension (SME) instructions as used in recent Apple kernels.

STR     Z0, [X2]

STR     Z1, [X2,#1,MUL VL]

STR     Z2, [X2,#2,MUL VL]

STR     Z3, [X2,#3,MUL VL]

UZP1    V0.4S, V0.4S, V1.4S

ZIP1    V2.4S, V2.4S, V2.4S


Concretely:

  • Z and P registers. These are SVE’s vector (Z0, Z1, …) and predicate (P0, P1, …) registers.
  • ZA arrays and ZT0 tables (SME-specific). SME introduces new register-like storage (e.g. ZA) that behaves more like a tiled array than a normal register. IDA now prints accesses like: ZA[Wv,#offset], which tells you which tile, which index, and which offset is being accessed.
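  • MUL VL scaling. SVE's vector length (VL) is not fixed at compile time, so vector-sized offsets are expressed as multiples of it: in [X2,#1,MUL VL] the immediate is multiplied by the vector length in bytes, and the access lands one full vector past X2.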
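
To make the MUL VL operand concrete, here is a minimal sketch of how the effective address of the stores listed earlier would be computed. The function name, the base address, and the two vector lengths are illustrative assumptions and not part of IDA.

```python
def sve_mul_vl_address(base: int, imm: int, vl_bytes: int) -> int:
    """Effective address of an SVE access of the form [Xn, #imm, MUL VL]."""
    # The immediate counts whole vectors, so it is scaled by the
    # vector length in bytes before being added to the base register.
    return (base + imm * vl_bytes) & 0xFFFFFFFFFFFFFFFF


# STR Z3, [X2,#3,MUL VL] with a hypothetical X2 = 0x1000:
addr_128 = sve_mul_vl_address(0x1000, 3, 16)  # 128-bit vectors
addr_512 = sve_mul_vl_address(0x1000, 3, 64)  # 512-bit vectors
```

The same instruction therefore touches different addresses on hardware with different vector lengths, which is why the disassembly keeps the MUL VL annotation rather than printing a byte offset.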

Memory Tagging Extension (MTE) intrinsics

The ARM64 decompiler now emits explicit intrinsics for Memory Tagging Extension instructions instead of inline assembly blocks. Instructions such as IRG, ADDG, GMI, LDG, STG, and SUBP are translated into corresponding __arm_mte_* intrinsics in the decompiler output.

 

Before:

__int64 __fastcall create_tag(__int64 _X0, unsigned int a2)

{

  __int64 result; // x0

  _X8 = a2;

  __asm { IRG             X0, X0, X8 }

  return result;

}

 

After:

void *__fastcall create_tag(void *a1, unsigned int a2)

{

  return __arm_mte_create_random_tag(a1, a2);

}

 

See the release notes for a full list of supported MTE intrinsics: https://docs.hex-rays.com/release-notes/9_3beta#memory-tagging-extension-mte-intrinsics 
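
For readers less familiar with MTE, the following is a simplified Python model of what IRG (and therefore __arm_mte_create_random_tag) does to a pointer. It is a sketch under stated assumptions: the helper names are made up for illustration, Python's random module stands in for the hardware tag generator, and system-register exclusions are ignored.

```python
import random

TAG_SHIFT = 56               # MTE keeps the logical tag in address bits [59:56]
TAG_MASK = 0xF << TAG_SHIFT


def create_random_tag(ptr: int, exclude_mask: int, rng=random) -> int:
    """Simplified model of IRG: insert a random allowed tag into ptr."""
    # Bit t of exclude_mask set means tag value t must not be chosen.
    allowed = [t for t in range(16) if not (exclude_mask >> t) & 1]
    # If every tag is excluded, the result carries tag 0.
    tag = rng.choice(allowed) if allowed else 0
    return (ptr & ~TAG_MASK) | (tag << TAG_SHIFT)


def get_tag(ptr: int) -> int:
    """Extract the 4-bit logical tag from a tagged pointer."""
    return (ptr >> TAG_SHIFT) & 0xF
```

Reading the second argument as an exclusion mask explains why the decompiled call takes two parameters: the pointer to tag and the set of tags to avoid.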

Common Short Sequence Compression (CSSC) decoding & intrinsics

We added decoding and decompiler support for ARMv8 CSSC instructions, including SMAX, SMIN, UMAX, UMIN, ABS, CNT, and CTZ. For scalar general-purpose register variants, the decompiler also emits corresponding intrinsics, improving readability and reducing noise in arithmetic-heavy code paths.

 

ABS             W0, W0

CNT             W21, W10

CTZ             W23, W13

SMAX            WZR, WZR, #0xFFFFFFFF

SMIN            X0, X0, #0

UMAX            XZR, XZR, #0xFFFFFFFFFFFFFFFF

UMIN            X23, X13, X8
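
As a reading aid, the scalar semantics of a few of these instructions can be modeled in a handful of Python functions. This is a hedged sketch: the function names are illustrative and are not the intrinsic names emitted by the decompiler, and immediate-operand encoding rules are not modeled.

```python
def _signed(value: int, bits: int) -> int:
    """Interpret the low `bits` of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def abs_(x: int, bits: int = 32) -> int:
    # ABS does not saturate: the most negative value maps to itself.
    return abs(_signed(x, bits)) & ((1 << bits) - 1)


def cnt(x: int, bits: int = 32) -> int:
    # CNT counts the set bits in the register.
    return bin(x & ((1 << bits) - 1)).count("1")


def ctz(x: int, bits: int = 32) -> int:
    # CTZ counts trailing zeros and returns the register width for zero.
    x &= (1 << bits) - 1
    return bits if x == 0 else (x & -x).bit_length() - 1


def smax(a: int, b: int, bits: int = 32) -> int:
    # SMAX compares its operands as signed integers.
    return max(_signed(a, bits), _signed(b, bits)) & ((1 << bits) - 1)


def umin(a: int, b: int, bits: int = 32) -> int:
    # UMIN compares its operands as unsigned integers.
    mask = (1 << bits) - 1
    return min(a & mask, b & mask)
```

Without CSSC, each of these typically compiles to a compare-and-select or multi-instruction bit trick, which is the noise the intrinsics remove from the pseudocode.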

Improved address construction from MOV/MOVK/MOVW/MOVT sequences

Address construction using MOV/MOVK (AArch64) and MOVW/MOVT (ARM/Thumb) sequences is now handled more robustly, including cases where instructions are not strictly adjacent.

 

Before

MOV             X8, #0

MOV             W9, #0x64AE

MOVK            X8, #0,LSL#16

MOVK            W9, #0x299E,LSL#16

MOVK            X8, #0,LSL#32

MOVK            X8, #0,LSL#48

 

After

MOV             X8, #(WORD0(unk_16120))

MOV             W9, #(WORD0(0x299E64AE))

MOVK            X8, #(WORD1(unk_16120)),LSL#16

MOVK            W9, #(WORD1(0x299E64AE)),LSL#16

MOVK            X8, #(WORD2(unk_16120)),LSL#32

MOVK            X8, #(WORD3(unk_16120)),LSL#48

 

At the same time we added support for AArch64 MOVW_UABS_G[0-3] relocations, allowing ELF binaries that build addresses via MOVZ/MOVK sequences to produce correct references for each 16-bit chunk and, where applicable, merged full-width addresses.
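
The folding involved can be sketched in a few lines of Python. This is an illustration of the idea, not IDA's implementation: the tuple format and function names are assumptions, and MOV is treated as the MOVZ alias only.

```python
def fold_mov_movk(steps, bits=64):
    """Fold a MOV/MOVK chain for one register into the constant it builds.

    steps is a list of (mnemonic, imm16, shift) tuples in program order.
    """
    mask = (1 << bits) - 1
    value = 0
    for mnemonic, imm16, shift in steps:
        if mnemonic == "MOV":      # MOVZ alias: set one chunk, zero the rest
            value = (imm16 << shift) & mask
        elif mnemonic == "MOVK":   # keep other chunks, replace one 16-bit chunk
            value = ((value & ~(0xFFFF << shift)) | (imm16 << shift)) & mask
        else:
            raise ValueError(f"unexpected mnemonic: {mnemonic}")
    return value


def word(value: int, n: int) -> int:
    """Return 16-bit chunk n of value, as in the WORD0..WORD3 operand notation."""
    return (value >> (16 * n)) & 0xFFFF


# The W9 chain from the listing: MOV W9, #0x64AE / MOVK W9, #0x299E,LSL#16
w9 = fold_mov_movk([("MOV", 0x64AE, 0), ("MOVK", 0x299E, 16)], bits=32)
```

Because each chain is folded per register, the interleaved X8 and W9 instructions in the listing do not interfere with each other, which is the non-adjacent case mentioned above.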

On Apple platforms, recognition of __auth_stubs sections has improved, and BTI-enabled PLT stubs in ELF binaries are now correctly detected, named, and marked as thunks.

NDS32 (Andes AndeStar™ V3)

We’re introducing a new processor module for the Andes AndeStar V3 NDS32 architecture, along with ELF loader integration. NDS32 ELF binaries are now recognized automatically, with IDA selecting the correct processor and displaying architecture and ABI details directly in the load banner.

The module supports NDS32’s mixed 16-bit and 32-bit instruction encodings, covering the base ISA along with commonly used extensions such as DSP, SIMD, ZOL, FPU, and coprocessor instructions. For better readability, IDA by default hides the encoding suffixes of instruction mnemonics (e.g. movi rather than movi55, add rather than add333) and the dollar prefixes of registers. If needed, these details can be enabled again in the processor module options.

Concretely:

  • Disassembly and instruction coverage. Mixed 16-bit and 32-bit instruction encodings are decoded, covering the base ISA along with commonly used extensions such as DSP, SIMD, FPU, and coprocessor instructions.
  • Aggressive resolution of indirections. Memory accesses via gp-relative addressing and direct calls assembled from constants at runtime are both tracked, with resolved targets displayed as comments next to the corresponding instructions.
  • ex9.it instruction resolution. The procmod supports resolving ex9.it instructions, which are table-indexed 16-bit opcodes that depend on the ITB (Instruction Table Base). When the ITB value is known, either from symbols or from register initialization, IDA can resolve these instructions into their corresponding 32-bit forms. This behavior is optional and controlled via a processor setting.
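
The ex9.it lookup described above can be sketched as a plain table read. This is a minimal illustration under assumptions: the function and parameter names are hypothetical, the table is assumed to hold 4-byte entries indexed from the ITB, and the byte order of the entries is left as a parameter rather than asserted.

```python
import struct


def resolve_ex9(memory: bytes, memory_base: int, itb: int, index: int,
                big_endian: bool = True) -> int:
    """Return the 32-bit instruction word that an ex9.it with `index` selects.

    memory holds the bytes mapped at memory_base; itb is the known
    Instruction Table Base address (an assumed input, e.g. from a symbol).
    """
    entry_ea = itb + index * 4          # assumed layout: 4-byte table entries
    offset = entry_ea - memory_base
    if offset < 0 or offset + 4 > len(memory):
        raise ValueError("instruction table entry is outside the loaded bytes")
    fmt = ">I" if big_endian else "<I"
    return struct.unpack_from(fmt, memory, offset)[0]
```

The resulting word can then be decoded like any other 32-bit instruction, which is why the resolution only works once the ITB value is known.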

ARC (Push/Pop idiom recognition)

Clearer prologues and epilogues

IDA 9.3 improves ARC disassembly by recognizing common stack save and restore idioms and presenting them as explicit push and pop instructions. Store and load instructions that update the stack pointer by one word are now rewritten when they semantically represent a push or pop.

 

Before

st.a    r16, [sp,0xC+var_10]

st.a    r17, [sp,0x10+var_14]

st.a    r18, [sp,0x14+var_18]

st.a    r19, [sp,0x18+var_1C]

st.a    r20, [sp,0x1C+var_20]

st.a    r21, [sp,0x20+var_24]

st.a    fp, [sp,0x24+var_28]

 

After

push    r16

push    r17

push    r18

push    r19

push    r20

push    r21

push    fp

Improved stack analysis

Because pushes and pops are now explicit, IDA’s stack tracking logic can reason about ARC functions more reliably. Stack pointer deltas are derived directly from push/pop instructions, which leads to more consistent stack point placement and frame sizing.
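
The idiom rewrite and the stack delta it enables can be sketched as follows. This is not IDA's implementation: the tuple representation of an instruction is a hypothetical simplification, and only the one-word forms (st.a with an offset of -4 and ld.ab with an offset of +4, both on sp) are assumed to qualify.

```python
def simplify(insn):
    """Rewrite one-word stack stores/loads as push/pop.

    insn is a (mnemonic, register, base, offset) tuple (hypothetical format).
    """
    mnemonic, reg, base, offset = insn
    if base == "sp":
        # Pre-decrementing store with write-back: a push.
        if mnemonic == "st.a" and offset == -4:
            return ("push", reg)
        # Load with post-increment: a pop.
        if mnemonic == "ld.ab" and offset == 4:
            return ("pop", reg)
    return insn


def sp_delta(insns, word_size=4):
    """Sum the stack pointer change contributed by push/pop instructions."""
    delta = 0
    for insn in map(simplify, insns):
        if insn[0] == "push":
            delta -= word_size
        elif insn[0] == "pop":
            delta += word_size
    return delta
```

Once every save is an explicit push, the frame contribution of a prologue is simply the number of pushes times the word size, with no operand arithmetic to interpret.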

This also preserves compatibility with existing ARC analysis features. In particular, ARCompact millicode patterns that rely on recognizing chains of push instructions continue to be detected correctly, even with instruction simplification enabled.

These changes are also reflected in the decompiler.

Andes RISC-V extensions (configurable decoding)

Both the RISC-V disassembler and the decompiler now support the Andes XAndesPerf extension (29 new instructions). We’ve also added processor-specific options for vendor extensions (currently Andes XAndesPerf and T-Head XThead). You can toggle those extensions in the RISC-V-specific options dialog.

When only Andes decoding is enabled, vendor-specific instruction words disassemble as nds.* instructions; with only T-Head enabled, they disassemble as th.*. If you’d rather not have these prefixes, you can enable the “Hide extension prefixes” option.

If both vendor extensions are enabled and an encoding collides, the procmod doesn’t silently choose one interpretation: it keeps one as the primary decode and emits the other as an auto-comment (for example: nds.lbgp        a0, 11BC5h ; alt XThead: th.lrb). That makes ambiguity explicit during analysis, which is especially helpful when you’re dealing with incomplete core documentation or binaries built for slightly different configurations.
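
The primary-plus-alternate behavior can be illustrated with a small Python sketch. Everything here is hypothetical: the lookup tables, the placeholder instruction word, and the function name are invented for illustration, operands are omitted, and a real decoder matches bit fields rather than whole words.

```python
def decode(word, decoders, hide_prefixes=False):
    """Decode `word` with every enabled vendor decoder, in priority order.

    decoders is a list of (extension_name, prefix, table) tuples, where
    table maps an instruction word to a mnemonic (hypothetical format).
    """
    hits = []
    for ext, prefix, table in decoders:
        mnemonic = table.get(word)
        if mnemonic is not None:
            hits.append((ext, mnemonic if hide_prefixes else prefix + mnemonic))
    if not hits:
        return None
    # The first match is the primary decode; others become a comment.
    text = hits[0][1]
    for ext, alt in hits[1:]:
        text += f" ; alt {ext}: {alt}"
    return text


WORD = 0x0000000B  # placeholder value, not a real encoding
ANDES = ("XAndesPerf", "nds.", {WORD: "lbgp"})
THEAD = ("XThead", "th.", {WORD: "lrb"})

both = decode(WORD, [ANDES, THEAD])
thead_only = decode(WORD, [THEAD])
```

Keeping the decoders in an ordered list makes the choice of primary interpretation an explicit configuration decision instead of an accident of decoder order.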

 

TriCore

IDA 9.3 includes a set of focused analysis improvements for TriCore that mainly affect how values are tracked and how control flow is reconstructed. You can now use the regfinder API programmatically:

 

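import ida_idp
import ida_kernwin
import ida_regfinder

# Example inputs: the address under the cursor and a register name ("d4" is an arbitrary choice).
ea = ida_kernwin.get_screen_ea()
reg = ida_idp.str2reg("d4")

# Ask the register tracker for the value of the register at ea.
res = ida_regfinder.find_reg_value(ea, reg)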

 

Switch statement detection has also been refined, allowing more jump tables to be recognized and rendered as structured control flow instead of opaque indirect branches.

Wrapping it Up

Taken together, these changes make disassembly more trustworthy and analysis workflows more predictable. As always, feedback from real binaries is invaluable, particularly for edge cases that still don’t decode or analyze as expected.