Disassemblers Aplenty: ARM64 Extensions (SVE, MTE, CSSC), AndeStar™ (V3 and V5), ARC, TriCore & More
These upcoming updates primarily help users working close to the metal, on kernels, firmware, and embedded targets, where stack behavior and control flow need to be trustworthy.
By improving instruction decoding, idiom recognition, and cross-reference generation, these changes turn previously noisy or ambiguous regions of code into something that can be analyzed directly.
ARM64 (Apple extensions and analysis improvements)
SVE Decoding (Apple kernels)
The upcoming IDA 9.3 release improves decoding of Scalable Vector Extension (SVE) and Scalable Matrix Extension (SME) instructions as used in recent Apple kernels.
STR Z0, [X2]
STR Z1, [X2,#1,MUL VL]
STR Z2, [X2,#2,MUL VL]
STR Z3, [X2,#3,MUL VL]
UZP1 V0.4S, V0.4S, V1.4S
ZIP1 V2.4S, V2.4S, V2.4S
Concretely:
- Z and P registers. These are SVE’s vector (Z0, Z1, …) and predicate (P0, P1, …) registers.
- ZA arrays and ZT0 tables (SME-specific). SME introduces new register-like storage (e.g. ZA) that behaves more like a tiled array than a normal register. IDA now prints such accesses in the form ZA[Wv,#offset], which shows which tile, which index, and which offset is being accessed.
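- MUL VL scaling. SVE uses the vector length (VL), which is not fixed at compile time. An operand such as #1,MUL VL therefore denotes an immediate that is multiplied by the vector length at run time, rather than a fixed byte offset.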
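To make the MUL VL scaling concrete, here is a minimal Python sketch of how the byte offsets in the STR Z listing above would be computed. It assumes the immediate is multiplied by the vector length in bytes, as in the SVE vector load/store addressing form; the helper name and the sample vector lengths are illustrative and not part of IDA.

```python
def mul_vl_offset(imm: int, vl_bits: int) -> int:
    """Byte offset for an operand like [Xn,#imm,MUL VL] (illustrative helper)."""
    # SVE vector lengths are multiples of 128 bits, up to 2048 bits.
    if vl_bits % 128 != 0 or not 128 <= vl_bits <= 2048:
        raise ValueError("unsupported SVE vector length")
    return imm * (vl_bits // 8)


# The same encoding addresses different offsets depending on the hardware
# vector length, which is why the disassembly keeps the MUL VL notation.
for vl in (128, 256, 512):
    print(vl, [mul_vl_offset(i, vl) for i in range(4)])
```

With a 128-bit vector length the four stores land 16 bytes apart; with 512 bits they land 64 bytes apart, even though the instruction bytes are identical.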
Memory Tagging Extension (MTE) intrinsics
The ARM64 decompiler now emits explicit intrinsics for Memory Tagging Extension instructions instead of inline assembly blocks. Instructions such as IRG, ADDG, GMI, LDG, STG, and SUBP are translated into corresponding __arm_mte_* intrinsics in the decompiler output.
Before:
__int64 __fastcall create_tag(__int64 _X0, unsigned int a2)
{
__int64 result; // x0
_X8 = a2;
__asm { IRG X0, X0, X8 }
return result;
}
After:
void *__fastcall create_tag(void *a1, unsigned int a2)
{
return __arm_mte_create_random_tag(a1, a2);
}
See the release notes for a full list of supported MTE intrinsics: https://docs.hex-rays.com/release-notes/9_3beta#memory-tagging-extension-mte-intrinsics
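For readers unfamiliar with what the random-tag intrinsic does to a pointer, the following Python sketch models it in a simplified way. It assumes the MTE logical tag occupies address bits 59:56 and that the second argument is a 16-bit mask of excluded tags. It ignores system-register exclusion settings and the hardware random source, and it treats an all-excluded mask as yielding tag 0. The helper names are hypothetical and only mirror the intrinsic's intent.

```python
import random

TAG_SHIFT = 56                  # assumed position of the 4-bit logical tag
TAG_MASK = 0xF << TAG_SHIFT
U64 = 0xFFFFFFFFFFFFFFFF


def get_tag(ptr: int) -> int:
    """Extract the 4-bit logical tag from a 64-bit pointer value."""
    return (ptr >> TAG_SHIFT) & 0xF


def set_tag(ptr: int, tag: int) -> int:
    """Return ptr with its logical tag replaced by tag."""
    return ((ptr & ~TAG_MASK) | ((tag & 0xF) << TAG_SHIFT)) & U64


def create_random_tag(ptr: int, exclude_mask: int, rng=random) -> int:
    """Simplified model of an IRG-style random tag insertion."""
    allowed = [t for t in range(16) if not (exclude_mask >> t) & 1]
    tag = rng.choice(allowed) if allowed else 0   # simplification, see lead-in
    return set_tag(ptr, tag)
```

The address bits are untouched; only the tag nibble changes, which is why the decompiled function above can be typed as taking and returning a pointer.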
Common Short Sequence Compression (CSSC) decoding & intrinsics
We added decoding and decompiler support for ARMv8 CSSC instructions, including SMAX, SMIN, UMAX, UMIN, ABS, CNT, and CTZ. For scalar general-purpose register variants, the decompiler also emits corresponding intrinsics, improving readability and reducing noise in arithmetic-heavy code paths.
ABS W0, W0
CNT W21, W10
CTZ W23, W13
SMAX WZR, WZR, #0xFFFFFFFF
SMIN X0, X0, #0
UMAX XZR, XZR, #0xFFFFFFFFFFFFFFFF
UMIN X23, X13, X8
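As a rough reference for reading such code, the sketch below models the scalar semantics of these instructions in Python. It is an assumption-labeled model rather than IDA's implementation: the function names are illustrative, values are treated as fixed-width registers, CTZ of zero is taken to return the register width, and ABS of the most negative value wraps to itself.

```python
def _mask(bits: int) -> int:
    return (1 << bits) - 1


def _signed(x: int, bits: int) -> int:
    """Interpret the low `bits` bits of x as a two's-complement value."""
    x &= _mask(bits)
    return x - (1 << bits) if x >> (bits - 1) else x


def abs_(x: int, bits: int = 32) -> int:
    return abs(_signed(x, bits)) & _mask(bits)


def cnt(x: int, bits: int = 32) -> int:
    """Population count of the register value."""
    return bin(x & _mask(bits)).count("1")


def ctz(x: int, bits: int = 32) -> int:
    """Count trailing zero bits; assumed to return `bits` for zero input."""
    x &= _mask(bits)
    return bits if x == 0 else (x & -x).bit_length() - 1


def smax(a: int, b: int, bits: int = 32) -> int:
    return max(_signed(a, bits), _signed(b, bits)) & _mask(bits)


def smin(a: int, b: int, bits: int = 32) -> int:
    return min(_signed(a, bits), _signed(b, bits)) & _mask(bits)


def umax(a: int, b: int, bits: int = 32) -> int:
    return max(a & _mask(bits), b & _mask(bits))


def umin(a: int, b: int, bits: int = 32) -> int:
    return min(a & _mask(bits), b & _mask(bits))
```

The signed and unsigned variants differ only in how the same bit pattern is compared, which is exactly the distinction the intrinsics make visible in decompiler output.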
Improved address construction from MOV/MOVK/MOVW/MOVT sequences
Address construction using MOV/MOVK (AArch64) and MOVW/MOVT (ARM/Thumb) sequences is now handled more robustly, including cases where instructions are not strictly adjacent.
Before
MOV X8, #0
MOV W9, #0x64AE
MOVK X8, #0,LSL#16
MOVK W9, #0x299E,LSL#16
MOVK X8, #0,LSL#32
MOVK X8, #0,LSL#48
After
MOV X8, #(WORD0(unk_16120))
MOV W9, #(WORD0(0x299E64AE))
MOVK X8, #(WORD1(unk_16120)),LSL#16
MOVK W9, #(WORD1(0x299E64AE)),LSL#16
MOVK X8, #(WORD2(unk_16120)),LSL#32
MOVK X8, #(WORD3(unk_16120)),LSL#48
At the same time, we added support for AArch64 MOVW_UABS_G[0-3] relocations, allowing ELF binaries that build addresses via MOVZ/MOVK sequences to produce correct references for each 16-bit chunk and, where applicable, merged full-width addresses.
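The idea behind merging MOV/MOVK chunks can be sketched in a few lines of Python. This is a simplified illustration, not IDA's algorithm: it assumes the steps for a single register have already been collected in program order (IDA additionally has to cope with interleaved and non-adjacent instructions), and the function names are hypothetical.

```python
def fold_mov_movk(steps, bits: int = 64) -> int:
    """Fold (mnemonic, imm16, shift) steps for one register into a value."""
    mask = (1 << bits) - 1
    value = 0
    for mnem, imm16, shift in steps:
        chunk = (imm16 & 0xFFFF) << shift
        if mnem == "MOV":       # MOVZ alias: clear the register, set one chunk
            value = chunk
        elif mnem == "MOVK":    # keep the other bits, replace one 16-bit chunk
            value = (value & ~(0xFFFF << shift)) | chunk
        else:
            raise ValueError(f"unexpected mnemonic: {mnem}")
        value &= mask
    return value


def split_words(value: int, count: int = 4):
    """Split a value into 16-bit chunks, lowest first (WORD0, WORD1, ...)."""
    return [(value >> (16 * i)) & 0xFFFF for i in range(count)]


# The W9 sequence from the listing above.
w9 = fold_mov_movk([("MOV", 0x64AE, 0), ("MOVK", 0x299E, 16)], bits=32)
print(hex(w9))  # 0x299e64ae
```

The WORD0 to WORD3 operands in the improved listing correspond to the chunks that `split_words` returns for the merged value.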
On Apple platforms, recognition of __auth_stubs sections has improved, and BTI-enabled PLT stubs in ELF binaries are now correctly detected, named, and marked as thunks.
NDS32 (Andes AndeStar™ V3)
We’re introducing a new processor module for the Andes AndeStar™ V3 NDS32 architecture, along with ELF loader integration. NDS32 ELF binaries are now recognized automatically, with IDA selecting the correct processor and displaying architecture and ABI details directly in the load banner.
It supports NDS32’s mixed 16-bit and 32-bit instruction encodings, covering the base ISA along with commonly used extensions such as DSP, SIMD, ZOL, FPU, and coprocessor instructions. For better readability, by default IDA hides the encoding suffixes of instruction mnemonics (i.e. movi rather than movi55, add rather than add333, etc.) and the dollar prefixes of registers. If needed, these details can be enabled again in the processor module options.

Concretely:
- Disassembly and instruction coverage. Mixed 16-bit and 32-bit instruction encodings, covering the base ISA along with commonly used extensions such as DSP, SIMD, FPU, and coprocessor instructions.
- Aggressive resolution of indirections. Memory accesses via gp-relative addressing and direct calls assembled from constants at runtime are both tracked, with resolved targets displayed as comments next to the corresponding instructions.
- ex9.it instruction resolution. The procmod supports resolving ex9.it instructions, which are table-indexed 16-bit opcodes that depend on the ITB (Instruction Table Base). When the ITB value is known (either from symbols or from register initialization), IDA can resolve these instructions into their corresponding 32-bit forms. This behavior is optional and controlled via a processor setting.
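The table lookup behind ex9.it resolution can be illustrated with a small Python sketch. It rests on assumptions that are not stated in this post: each table entry is taken to be one 32-bit word located at ITB + 4 * index, and the byte order of the entry is left to the caller. The function and parameter names are hypothetical and do not correspond to an IDA API.

```python
def resolve_ex9(image: bytes, image_base: int, itb: int, index: int,
                byteorder: str) -> int:
    """Return the 32-bit word that `ex9.it index` would refer to.

    image      -- raw bytes of the loaded segment
    image_base -- address at which image[0] is mapped
    itb        -- known Instruction Table Base value
    byteorder  -- "big" or "little"; deliberately not defaulted
    """
    entry_ea = itb + 4 * index          # assumed table layout
    off = entry_ea - image_base
    if off < 0 or off + 4 > len(image):
        raise ValueError("table entry lies outside the image")
    return int.from_bytes(image[off:off + 4], byteorder)
```

Once the 32-bit word is known, it can be handed to the regular decoder, which is what allows the 16-bit ex9.it form to be shown as its full-size equivalent.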

ARC (Push/Pop idiom recognition)
Clearer prologues and epilogues
IDA 9.3 improves ARC disassembly by recognizing common stack save and restore idioms and presenting them as explicit push and pop instructions. Store and load instructions that update the stack pointer by one word are now rewritten when they semantically represent a push or pop.
Before
st.a r16, [sp,0xC+var_10]
st.a r17, [sp,0x10+var_14]
st.a r18, [sp,0x14+var_18]
st.a r19, [sp,0x18+var_1C]
st.a r20, [sp,0x1C+var_20]
st.a r21, [sp,0x20+var_24]
st.a fp, [sp,0x24+var_28]
After
push r16
push r17
push r18
push r19
push r20
push r21
push fp
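A rough sketch of this kind of idiom rewriting, in Python, is shown below. It works on plain text with raw displacements rather than on IDA's decoded instructions or the stack-variable operands shown in the listing, and it assumes that a push is a pre-decrementing store of one word (st.a reg, [sp,-4]) and a pop is a post-incrementing load (ld.ab reg, [sp,4]). The function name is hypothetical.

```python
import re

# One-word store with write-back that decrements sp: treated as a push.
_PUSH = re.compile(r"^\s*st\.aw?\s+(\w+)\s*,\s*\[sp\s*,\s*-4\]\s*$", re.I)
# One-word load with post-increment of sp: treated as a pop (assumed form).
_POP = re.compile(r"^\s*ld\.ab\s+(\w+)\s*,\s*\[sp\s*,\s*4\]\s*$", re.I)


def simplify(line: str) -> str:
    """Rewrite a stack save/restore idiom as push/pop, else return it as is."""
    m = _PUSH.match(line)
    if m:
        return f"push {m.group(1)}"
    m = _POP.match(line)
    if m:
        return f"pop {m.group(1)}"
    return line


for text in ("st.a r16, [sp,-4]", "ld.ab fp, [sp,4]", "st r0, [sp,8]"):
    print(simplify(text))
```

Only accesses that move the stack pointer by exactly one word are rewritten; anything else, such as an ordinary store to a stack slot, is left untouched.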
Improved stack analysis
Because pushes and pops are now explicit, IDA’s stack tracking logic can reason about ARC functions more reliably. Stack pointer deltas are derived directly from push/pop instructions, which leads to more consistent stack point placement and frame sizing.
This also preserves compatibility with existing ARC analysis features. In particular, ARCompact millicode patterns that rely on recognizing chains of push instructions continue to be detected correctly, even with instruction simplification enabled.
These changes are also reflected in the decompiler.
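To illustrate why explicit pushes and pops simplify stack tracking, here is a minimal Python sketch that derives a running stack pointer delta from mnemonics alone. It assumes a 4-byte word and is not how IDA implements its stack analysis; the names are illustrative.

```python
WORD = 4  # assumed size of one pushed register, in bytes


def sp_deltas(lines):
    """Return (line, cumulative sp delta) pairs for a list of instructions."""
    delta = 0
    out = []
    for line in lines:
        parts = line.split()
        op = parts[0].lower() if parts else ""
        if op == "push":
            delta -= WORD
        elif op == "pop":
            delta += WORD
        out.append((line, delta))
    return out


prologue = [f"push {r}" for r in
            ("r16", "r17", "r18", "r19", "r20", "r21", "fp")]
for line, delta in sp_deltas(prologue):
    print(f"{line:10} sp delta = {delta}")
```

For the seven pushes in the listing above, the delta after the last instruction is -28, and a matching sequence of pops brings it back to zero.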
Andes RISC-V extensions (configurable decoding)
Both the RISC-V disassembler and the decompiler now support the Andes XAndesPerf extension (29 new instructions). We’ve also added processor-specific options for vendor extensions (currently Andes XAndesPerf and T-Head XThead). You can toggle those extensions in the RISC-V-specific options dialog.

When only Andes decoding is enabled, vendor-extension instruction words disassemble as nds.* instructions; with only T-Head enabled, they disassemble as th.*. If you’d rather not have these prefixes, you can enable the “Hide extension prefixes” option.
If both vendor extensions are enabled and an encoding collides, the procmod doesn’t silently choose one interpretation: it keeps one as the primary decode and emits the other as an auto-comment (for example: nds.lbgp a0, 11BC5h ; alt XThead: th.lrb). That makes ambiguity explicit during analysis, which is especially helpful when you’re dealing with incomplete core documentation or binaries built for slightly different configurations.
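The collision handling described above can be sketched as follows. This is a deliberately simplified Python model: real decoders match opcode fields rather than whole words, the encodings in the two tables are made-up placeholders, and the choice of Andes as the primary decode merely follows the example comment shown above.

```python
# Hypothetical encodings, for illustration only.
ANDES = {0x0000100B: "nds.lbgp", 0x0000200B: "nds.bfoz"}
THEAD = {0x0000100B: "th.lrb", 0x0000300B: "th.ext"}


def decode(word: int, andes: bool = True, thead: bool = True):
    """Return disassembly text for a vendor-extension word, or None."""
    candidates = []
    if andes and word in ANDES:
        candidates.append(("XAndesPerf", ANDES[word]))
    if thead and word in THEAD:
        candidates.append(("XThead", THEAD[word]))
    if not candidates:
        return None
    # First candidate is the primary decode; the rest become comments.
    text = candidates[0][1]
    for ext, alt in candidates[1:]:
        text += f" ; alt {ext}: {alt}"
    return text


print(decode(0x0000100B))               # both enabled, colliding encoding
print(decode(0x0000100B, thead=False))  # Andes only
print(decode(0x0000100B, andes=False))  # T-Head only
```

Keeping the alternative as a comment rather than discarding it means that switching the option later does not come as a surprise: the competing interpretation was visible all along.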

TriCore
IDA 9.3 includes a set of focused analysis improvements for TriCore that mainly affect how values are tracked and how control flow is reconstructed. You can now use the regfinder API programmatically:
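import ida_idp
import ida_kernwin
import ida_regfinder
ea = ida_kernwin.get_screen_ea()  # address at which to query the register value
reg = ida_idp.str2reg("d4")  # register number for a given name ("d4" is only an example)
res = ida_regfinder.find_reg_value(ea, reg)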
Switch statement detection has also been refined, allowing more jump tables to be recognized and rendered as structured control flow instead of opaque indirect branches.
Wrapping it Up
Taken together, these changes make disassembly more trustworthy and analysis workflows more predictable. As always, feedback from real binaries is invaluable, particularly for edge cases that still don’t decode or analyze as expected.