Commit Graph

126 Commits

Author SHA1 Message Date
gingerBill
e94d57f650 Remove dead parameter 2026-06-18 10:58:32 +01:00
gingerBill
e404dafaf0 Merge branch 'bill/rexcode' of https://github.com/odin-lang/Odin into bill/rexcode 2026-06-18 10:49:34 +01:00
Brendan Punsky
4cc6977321 Merge origin/bill/rexcode: struct repack (#raw_union #packed), wasm arch
Merge gingerBill's latest into bill/rexcode. His changes: minimize the
Instruction/Operand structs across ISAs with packed raw-unions (+ the
compiler support for #raw_union #packed), the new core:rexcode/wasm arch
and wasm/module, encode() now returns (byte_count, ok) instead of a Result
struct, decode_one made public, and assorted formatting/inlining.

Conflict: arm64/tests/pipeline_smoke.odin CSEL test -- kept the generated
4-arg inst_csel(dst,src,src2,cond) (mnemonic_builders.odin is generated,
not from Bill's branch) and adopted Bill's (byte_count, success) encode
signature.

Required rebuilding ./odin from the merged source for the packed-union
syntax. Re-validated after the repack: regenerated all artifacts
(idempotent -- no spurious churn), all 10 arches gen/builders/check/test
green, and byte-compared the new arm32 BF + mips PS/MMI/DSP/R6 forms to
confirm no field truncation. arm64/arm32/mips still 100%.
2026-06-18 05:44:48 -04:00
Brendan Punsky
83bdd501a3 rexcode: remove dead BFCSEL else-target scaffolding; tidy mips COPY specgen
BFCSEL's else-target turned out to be the implicit fall-through, so the
BF_BELSE operand encoding, the BFCSEL_ELSE_T32 relocation, and their
encoder/decoder cases were never referenced by any table entry. Remove
them. Also restructure the MSA COPY specgen loop so COPY_U only iterates
.B/.H (COPY_U.W is mips64-only and emitted in the mips64 section), which
drops the spurious 'skipped COPY_U_W' message. No functional change to any
generated encode form; arm64/arm32/mips all still 100%, 461/600/281 tests
green.
2026-06-18 05:29:20 -04:00
Brendan Punsky
c8851c546d rexcode/arm32: BFCSEL -> Branch Future complete, arm32 100%
BFCSEL = bf-point + true-target (hw1, like BF) + 4-bit condition at
hw0[5:2], base 0xF002E001 (hw0[1] is a static marker). The else-target is
the architectural fall-through, so it is not a separate operand -- BFCSEL
is modelled as three operands and reproduces llvm-mc's bytes exactly
(f082e003 / f102e803 / f086e003 across boff/true/cond variations).

Every encodable arm32 Mnemonic now has an encode form (gap = 0). 600
tests green.
2026-06-18 04:55:26 -04:00
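
The BFCSEL layout described in c8851c546d can be sketched in Python. This is only an illustration of the field placement named in the message, not rexcode's Odin encoder: `pack_bfcsel` and its parameter names are hypothetical, and the mapping from label offsets to the J/imm10 pair is left to the caller.

```python
BFCSEL_BASE = 0xF002E001  # base word from the commit message; hw0[1] is a static marker


def pack_bfcsel(boff: int, j: int, imm10: int, cond: int) -> int:
    """Pack BFCSEL operand fields into the 32-bit T32 word (hw0 in the high half)."""
    assert 0 <= boff < 16 and j in (0, 1)
    assert 0 <= imm10 < 1024 and 0 <= cond < 16
    hw0 = (boff << 7) | (cond << 2)   # bf-point at hw0[10:7], condition at hw0[5:2]
    hw1 = (j << 11) | (imm10 << 1)    # J at hw1[11], imm10 at hw1[10:1]
    return BFCSEL_BASE | (hw0 << 16) | hw1
```

Under this layout the three llvm-mc words quoted in the message read back as (boff=1, J=0, imm10=1, cond=0), (boff=2, J=1, imm10=1, cond=0) and (boff=1, J=0, imm10=1, cond=1).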
Brendan Punsky
808716517e rexcode/arm32: Branch Future BF/BFL/BFLX/BFI_BR encode forms
Reverse-engineered the ARMv8.1-M Branch Future T32 encoding from llvm-mc:
bf-point imm4 = (label-(PC+4))/2 at hw0[10:7]; branch target val =
(label-(PC+4))/2 with J at hw1[11] and imm10 at hw1[10:1]; BFLX/BFX target
is Rm at hw0[3:0]. New REL_BF operand + BF_BOFF/BF_BLOC/BF_RM encodings +
BF_BOFF_T32/BF_BLOC_T32 relocations with resolver. BF=0xF040E001,
BFL=0xF000C001, BFLX=0xF070E001, BFI_BR=0xF060E001.

Tightened the WLSTP/DLSTP masks to mark hw0[6] static (it is always 0 for
valid B/H/W/D sizes) so they no longer shadow the BF register forms.
Byte-exact vs llvm-mc with resolved bf-point/target offsets; 600 tests
green. (BFCSEL still pending -- it adds an else-target + condition.)
2026-06-18 04:47:03 -04:00
Brendan Punsky
c6edd6d5cd rexcode/mips: R5900 MMI MADD/MSUB, RDPGPR/WRPGPR; drop BPOSGE64 -> 100%
PS2 R5900 MMI: MSUB1/MSUBU1 (second-MAC, SPECIAL2 func +0x20 exactly like
the implemented MADD1/MADDU1) and the three-operand MADD_EE/MADDU_EE/
MSUB_EE/MSUBU_EE (write Rd as well as HI/LO; the Rd!=0 form selected by a
less-specific mask after the two-operand MADD/MSUB and PLZCW match).
RDPGPR/WRPGPR (COP0 shadow-GPR move, hand-encoded from the MIPS32r2 manual
since llvm-mc gates them). Drop BPOSGE64: not a real ISA instruction
(DSPControl.pos is 6-bit, only BPOSGE32 exists; llvm rejects it).

Every encodable mips Mnemonic now has an encode form (gap = 0). All
self-consistent and decode-clean; 281 tests green.
2026-06-18 04:17:50 -04:00
Brendan Punsky
61a62185b8 rexcode/mips: R6 compact branches (BEQC/BNEC/BLTC/BGEC/.../BLTZC)
All ten two-/one-register R6 compact branches, byte-exact vs llvm-mc. The
signed forms share POP26/POP27 (opcodes 22/23) with the pre-R6 BLEZL/BGTZL
and with each other; the decode-entry mask sort tries the more-specific
rt=0 / rs=0 forms first, and a small operand-aware hook in
decode_one_inline recovers BGEZC/BLTZC (rs==rt) from the general BGEC/BLTC.

Where R6 reuses a pre-R6/PSP major opcode (BEQC vs ADDI at opcode 8, etc.)
decode is inherently ISA-mode-dependent and resolves to the legacy form;
the R6 encode side is exact. 281 tests green.
2026-06-18 04:12:20 -04:00
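
The rs/rt disambiguation described in 61a62185b8 can be sketched as a small classifier. This is not the decode_one_inline hook itself: `classify_pop` is a hypothetical name, and the rs/rt constraints (rt=0 is the legacy likely-branch, rs=0 the compare-with-zero form, rs==rt the BGEZC/BLTZC form) are an assumption based on the MIPS R6 manual rather than something the message spells out in full.

```python
# Major opcodes 22 (POP26) and 23 (POP27); each tuple lists the mnemonic for
# (rt == 0, rs == 0, rs == rt, everything else), most specific first.
POP_FORMS = {
    22: ("BLEZL", "BLEZC", "BGEZC", "BGEC"),
    23: ("BGTZL", "BGTZC", "BLTZC", "BLTC"),
}


def classify_pop(word: int) -> str:
    """Pick a mnemonic for a POP26/POP27 word from its rs/rt relationship."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    if opcode not in POP_FORMS:
        raise ValueError("not a POP26/POP27 word")
    legacy, zero_rs, same_reg, general = POP_FORMS[opcode]
    if rt == 0:
        return legacy      # pre-R6 BLEZL/BGTZL shape
    if rs == 0:
        return zero_rs     # BLEZC/BGTZC
    if rs == rt:
        return same_reg    # the operand-aware case: BGEZC/BLTZC
    return general         # BGEC/BLTC
```

Ordering the checks from most to least specific mirrors the mask sort described in the message; only the rs==rt case needs operand-aware logic.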
Brendan Punsky
ff2bf13121 rexcode/mips: R6 PC-relative loads LWPC/LWUPC/LDPC
New REL19/REL18 operand types + BRANCH_19/BRANCH_18 encodings + REL_PC19/
REL_PC18 relocations (R6 PC-relative semantics: offset is relative to the
instruction's own address, no delay-slot adjustment; LDPC aligns the PC
down to 8 and scales by 8). LWPC (mips32r6), LWUPC/LDPC (mips64r6).
Byte-exact vs llvm-mc and decode-clean; 281 tests green.
2026-06-18 04:05:32 -04:00
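
A minimal sketch of the R6 PC-relative resolution described in ff2bf13121. The helper names are hypothetical; the 19-bit and 18-bit widths and LDPC's align-to-8 and scale-by-8 come from the message, while the scale of 4 for LWPC is an assumption (word-aligned target) that the message does not state.

```python
def rel_field(pc: int, target: int, bits: int, scale: int, align: int = 1) -> int:
    """Return the two's-complement offset field for a PC-relative load."""
    base = pc & ~(align - 1)          # relative to the instruction's own address
    delta = target - base
    assert delta % scale == 0, "target is not aligned to the scale"
    value = delta // scale
    assert -(1 << (bits - 1)) <= value < (1 << (bits - 1)), "offset out of range"
    return value & ((1 << bits) - 1)  # no delay-slot adjustment is applied


def lwpc_offset(pc: int, target: int) -> int:
    return rel_field(pc, target, bits=19, scale=4)            # scale 4 is assumed


def ldpc_offset(pc: int, target: int) -> int:
    return rel_field(pc, target, bits=18, scale=8, align=8)   # PC aligned down to 8
```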
Brendan Punsky
eab483a527 rexcode/mips: paired-single FMA + conditional-move forms (spec-derived)
MADD/MSUB/NMADD/NMSUB.PS and MOVN/MOVZ/MOVF/MOVT.PS. This llvm-mc build only
knows the .S/.D variants, so these are derived from the llvm-verified
single forms by switching the data-format field to PS (COP1X FMA fmt is
bits 2:0, S=0 -> PS=6; COP1 conditional-move fmt is bits 25:21, S=16 ->
PS=22), per the MIPS64 manual. Same operand slots/masks. Decode-clean and
281 tests green.
2026-06-18 03:59:04 -04:00
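
The format-field switch described in eab483a527 amounts to rewriting one field of an already-verified single-precision word. A minimal sketch, with hypothetical helper names; the sample words in the usage note are assumed zero-register madd.s and movn.s patterns, not values quoted from the commit.

```python
FMT_MASK_COP1 = 0x1F << 21   # COP1 conditional-move fmt lives at bits 25:21


def fma_s_to_ps(word: int) -> int:
    """COP1X FMA: fmt is bits 2:0, S = 0 -> PS = 6."""
    assert (word & 0x7) == 0, "expected the .S form (fmt 0)"
    return (word & ~0x7) | 6


def cmov_s_to_ps(word: int) -> int:
    """COP1 conditional move: fmt is bits 25:21, S = 16 -> PS = 22."""
    assert ((word >> 21) & 0x1F) == 16, "expected the .S form (fmt 16)"
    return (word & ~FMT_MASK_COP1) | (22 << 21)
```

For example, `fma_s_to_ps(0x4C000020)` gives `0x4C000026` and `cmov_s_to_ps(0x46000013)` gives `0x46C00013`; operand slots and masks are untouched, which is why the PS forms can share them.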
Brendan Punsky
09c1d5ba0f rexcode/mips: paired-single FP + mips64 MSA element forms
Parameterize the specgen oracle with a per-family llvm-mc command so
64-bit-FPU and mips64 forms can be assembled. Paired-single CVT_PS_S,
CVT_S_PL/PU, PLL/PLU/PUL/PUU.PS (via -mcpu=mips64r2). mips64-only MSA
INSERT_D and COPY_U_W (via the mips64 triple). Byte-exact vs llvm-mc and
decode-clean; 281 tests green.
2026-06-18 03:55:30 -04:00
Brendan Punsky
f290347c24 rexcode/mips: DSP ASE replicate-immediate forms (REPL.PH/QB)
REPL.PH (signed 10-bit broadcast, reuses MSA_S10) and REPL.QB (8-bit,
reuses MSA_I8). Byte-exact vs llvm-mc including a negative .PH immediate;
281 tests green.
2026-06-18 03:46:50 -04:00
Brendan Punsky
5b91624cd3 rexcode/mips: DSP ASE extract-from-accumulator forms
New EXT_SIZE encoding (5-bit extract size at 25:21). EXTPDP (immediate
size), and the variable forms EXTPDPV / EXTRV_R.W / EXTRV_RS.W / EXTRV_S.H
(extract via a GPR-specified position). Byte-exact vs llvm-mc and decode-
clean; 281 tests green.
2026-06-18 03:45:23 -04:00
Brendan Punsky
82f62ce9a9 rexcode/mips: DSP ASE accumulator multiply-add / shift forms
New AC_NUM (accumulator ac0..ac3 at bits 12:11) and SHILO_IMM (signed
6-bit at 25:20) encodings. DPA/DPAX/DPS/DPSX.W.PH and MAQ_S/MAQ_SA.W.PHL/
PHR (multiply-accumulate into a DSP accumulator), plus MTHLIP, SHILOV and
SHILO (accumulator shift). Spot-checked byte-exact vs llvm-mc and decode-
clean, including a negative SHILO immediate; 281 tests green.
2026-06-18 03:43:05 -04:00
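
The two operand encodings introduced in 82f62ce9a9 can be illustrated with a field pack/unpack pair. This sketch covers only the AC_NUM and SHILO_IMM fields named in the message (opcode bits are omitted), and the function names are hypothetical.

```python
def pack_shilo_fields(ac: int, shift: int) -> int:
    """Place the accumulator number at bits 12:11 and a signed 6-bit shift at 25:20."""
    assert 0 <= ac <= 3, "accumulator must be ac0..ac3"
    assert -32 <= shift <= 31, "shift must fit in a signed 6-bit field"
    return ((shift & 0x3F) << 20) | (ac << 11)


def unpack_shilo_fields(word: int) -> tuple[int, int]:
    """Recover (ac, shift), sign-extending the 6-bit shift field."""
    ac = (word >> 11) & 0x3
    raw = (word >> 20) & 0x3F
    shift = raw - 64 if raw & 0x20 else raw
    return ac, shift
```

The negative case is the one worth checking, as the message notes: a shift of -1 has to come back as -1 rather than 63.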
Brendan Punsky
8fed538afc rexcode/mips: MSA branch-on-zero/non-zero forms (BZ/BNZ)
BZ/BNZ .B/.H/.W/.D/.V (branch if any/all elements zero/non-zero): a
specgen branch emitter that derives the opcode+Wt bits then marks the
16-bit PC-relative offset variable, reusing the existing REL16/BRANCH_16
relocation machinery. The offset is emitted as a relocation (label
target). 10 forms, opcode+Wt byte-exact vs llvm-mc and decode-clean.

The R6 two-/one-register compact branches (BEQC/BNEC/BLTC/BGEC/.../BLTZC)
are deferred: they share POP major opcodes disambiguated only by the
rs/rt relationship, which the opcode+mask decode model can't express
without operand-aware logic. 281 tests green.
2026-06-18 03:39:55 -04:00
Brendan Punsky
56cfbc675a rexcode/mips: DSP ASE shift-by-immediate forms
New DSP_SA encoding (shift amount at bits 24:21). SHRA.QB/SHRA_R.QB
(.QB 3-bit), SHRA_R.PH/SHRL.PH (.PH 4-bit). Byte-exact vs llvm-mc;
281 tests green.
2026-06-18 03:33:47 -04:00
Brendan Punsky
c2de507bb0 rexcode/mips: FPU FMA, MSA COPY/INSERT, DSP 2-register, DI/EI/RDHWR
New FR (FP reg at 25:21) encoding for the COP1X 4-register fused
multiply-adds MADD/MSUB/NMADD/NMSUB.S/.D. New GPR_AT_6 / GPR_AT_11
encodings (GPR in a vector-register slot, with correct GPR decode) for
MSA COPY_S/U (lane->GPR) and INSERT (GPR->lane). DSP two-register
PRECEQU/PRECEU (.PH.QBLA/QBRA) and REPLV (.PH/.QB). Control ops DI/EI and
RDHWR. 25 forms; spot-checked byte-exact vs llvm-mc and decode-clean; 281
tests green.
2026-06-18 03:31:40 -04:00
Brendan Punsky
930b988ebf rexcode/mips: FPU conditional-move + convert-to-FP forms
MOVN/MOVZ.S/.D (FP move on GPR nonzero/zero, enc {FD,FS,RT}), MOVF/MOVT.
S/.D (FP move on FP condition code, enc {FD,FS,FCC_BC}), and the
convert-to-FP forms FCVT_D_W/S_D/S_W (cvt.d.w/cvt.s.d/cvt.s.w). 11 forms.
Spot-checked byte-exact vs llvm-mc and decode-clean; 281 tests green.
2026-06-18 03:27:23 -04:00
Brendan Punsky
5b47f0ca29 rexcode/mips: MSA INSVE + DSP ASE 3-register/compare/shift forms
MSA INSVE (.B/.H/.W/.D element insert). DSP ASE three-register ops
(ADDU/SUBU/MULEQ/MULEU/MULQ/PRECRQ*/PICK/CMPGU, enc {RD,RS,RT}), the
variable shifts SHLLV/SHRAV/SHRLV (enc {RD,RT,RS} -- value is Rt, shift is
Rs), and the compares CMP/CMPU (.PH/.QB, {RS,RT}). 38 forms reusing the
existing GPR R-type slots. Spot-checked byte-exact vs llvm-mc; 281 tests
green.
2026-06-18 03:24:20 -04:00
Brendan Punsky
4ab24007b7 rexcode/mips: MSA BIT-shift, element-index, GPR-index, I8 forms
New MSA_BIT_SHIFT / MSA_ELM_IDX / MSA_I8 encodings (the data-format marker
is fixed in the entry bits; the operand drives the low bits; decode infers
df from the marker). SLLI/SRAI/SRLI (.B/.H/.W/.D shift), SPLATI/SLDI
(element index), SPLAT/SLD (GPR index), VSHF (.B/.H/.W/.D shuffle), and
the I8 forms ANDI/ORI/XORI/NORI/BMNZI/BMZI/BSELI.B + SHF.B/H/W. 42 forms.
Spot-checked byte-exact vs llvm-mc and decode-clean across all formats;
281 tests green.
2026-06-18 03:17:39 -04:00
Brendan Punsky
307aa2a9dd rexcode/mips: MSA 3RF/3R/2R/2RF/VEC encode forms (specgen)
New mips specgen (llvm-mc --triple=mips --mattr=+msa as the bits oracle,
big-endian words, empirical masks): vector FP arithmetic/compare FADD/
FSUB/FMUL/FDIV/FMAX/FMIN/FCEQ/FCLE/FCLT/FCNE (.W/.D), dot product DOTP_S/U
(.H/.W/.D), count/popcount NLOC/NLZC/PCNT (.B/.H/.W/.D), one-source FP
FSQRT/FRSQRT/FRCP/FRINT/FTRUNC_S/U/FFINT_S/U (.W/.D), and bit-select
BMNZ/BMZ/BSEL.V. 57 forms reusing the existing WD/WS/WT slots. Spot-
checked byte-exact vs llvm-mc and decode-clean; 281 tests green.
2026-06-18 03:11:41 -04:00
Brendan Punsky
e4cff78a70 rexcode/arm32: document BF family as intentionally unimplemented
The 5 Branch Future mnemonics (BF/BFI_BR/BFL/BFLX/BFCSEL) are left
enum-only on purpose: deprecated ARMv8.1-M, not disassemblable by
llvm-objdump (so unverifiable), and a correct encoder needs dual-offset
PC-relative relocation infrastructure that doesn't exist. Noted in the
enum for future readers.
2026-06-18 03:05:25 -04:00
Brendan Punsky
a63fb51fdd rexcode/arm32: MVE VMLSV/VMLSVA (correct 3-bit Q regs); drop placeholders
Implement VMLSV/VMLSVA (MVE multiply-subtract reduce) properly: new
VN_Q_MVE (Qn at 19:17) and VM_Q_MVE (Qm at 3:1) encodings -- the actual
3-bit MVE Q fields -- with Rd at 15:12 (RDLO_A32). The earlier collision
was from reusing the 4-bit VN_Q (19:16) and RD_T32 (11:8), which place
the fields wrong; byte-exact vs llvm-mc now with distinct Qn/Qm/Rd.

Drop three placeholder/redundant enum entries: VRINT and VPRINT (not real
instructions -- llvm rejects bare 'vrint'; VPRINT is a printf-like debug
pseudo-op), and VRSHL_MVE (the author's own comment marks it a
placeholder; 'vrshl q,q,q' already decodes via VRSHL's MVE form). 600
tests green, verify matches llvm-mc.
2026-06-18 01:58:19 -04:00
Brendan Punsky
239dea4f55 rexcode/arm32: MVE VHCADD (saturating halving complex add) + VCMLA
New MVE_ROT_HCADD (#90/#270 at bit12) and MVE_ROT_CMLA (#0/90/180/270 at
bits 24:23) rotation encodings -- the rotation degrees round-trip
properly (unlike the existing FCMA VCMLA which leaves it unencoded). One
form each with the element-size bits left variable (MVE convention).
Verify round-trips; all rotations byte-exact vs llvm-mc; 600 tests green.

(VMLSV/VMLSVA reduce ops deferred: their format decode-collides with
other MVE encodings given the 4-bit VN_Q vs MVE's 3-bit Qn.)
2026-06-18 01:47:44 -04:00
Brendan Punsky
55463b6719 rexcode/arm32: VMOV (ARM core register to scalar) Dd[lane], Rt
New VMOV_LANE_8/16/32 encodings: Dd at bits 19:16+bit7, lane bits per
element size (.8 = bit21:bit6:bit5 with bit22 size marker; .16 =
bit21:bit6 with bit5 marker; .32 = bit21). Verify round-trips all three
sizes; spot-checked .8 byte-exact incl. max lane; 600 tests green.
2026-06-18 01:34:48 -04:00
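
The lane layout described in 55463b6719 can be sketched as follows. Only the Dd and lane/marker bits named in the message are produced (no opcode bits, no Rt); `pack_vmov_lane` is a hypothetical name, and treating bit 7 as the top bit of Dd is an assumption based on the usual D:Vd register split.

```python
def pack_vmov_lane(size: int, d: int, lane: int) -> int:
    """Return the Dd and lane bits for VMOV.<size> Dd[lane], Rt."""
    assert 0 <= d < 32
    word = ((d & 0xF) << 16) | ((d >> 4) << 7)   # Dd: bits 19:16 plus bit 7
    if size == 8:
        assert 0 <= lane < 8
        word |= 1 << 22                           # size marker for .8
        word |= (((lane >> 2) & 1) << 21) | (((lane >> 1) & 1) << 6) | ((lane & 1) << 5)
    elif size == 16:
        assert 0 <= lane < 4
        word |= 1 << 5                            # size marker for .16
        word |= (((lane >> 1) & 1) << 21) | ((lane & 1) << 6)
    elif size == 32:
        assert 0 <= lane < 2
        word |= lane << 21                        # .32 uses bit 21 only
    else:
        raise ValueError("size must be 8, 16 or 32")
    return word
```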
Brendan Punsky
5df81b5117 rexcode/arm32: VQDMULH/VQRDMULH by-scalar-lane
New NEON_VM_SCALAR16/32 encodings for the Dm[lane] scalar operand: .16
places Dm in D0..D7 (bits 2:0) with the lane split bit5:bit3, .32 places
Dm in D0..D15 (bits 3:0) with the lane at bit5. VQDMULH_LANE and
VQRDMULH_LANE across .s16/.s32, D and Q destinations (8 forms). Verify
round-trips; spot-checked byte-exact incl. max register/lane and
decode-clean; 600 tests green.
2026-06-18 01:29:19 -04:00
Brendan Punsky
acc14864f3 rexcode/arm32: DCPS1/DCPS2/DCPS3 (debug change PE state)
Fixed T32 encodings (0xF78F8001/2/3), no operands. Verify round-trips;
600 tests green.
2026-06-18 01:25:51 -04:00
Brendan Punsky
b2b14998f7 rexcode/arm32: VRSRA, VRECPE_F/VRSQRTE_F, VPADD_F, VCVTR
VRSRA (NEON rounding shift-right-accumulate, D/Q, mirrors VSRA's raw
imm6 convention), VRECPE_F/VRSQRTE_F (FP reciprocal/rsqrt estimate, D/Q),
VPADD_F (FP pairwise add, f32/f16), and VCVTR (VFP convert-to-integer
using the FPSCR rounding mode; s32/u32 from f32 and f64). Hand-written
mirroring the existing VSRA/VRECPE/VPADD/VCVT forms. Built-in llvm
round-trip verify passes; spot-checked byte-exact; 600 tests green.
2026-06-18 01:22:12 -04:00
Brendan Punsky
59750926d9 rexcode/arm32: unprivileged (translate) post-indexed loads/stores
LDRT/LDRBT/STRT/STRBT (imm12) and LDRHT/STRHT/LDRSBT/LDRSHT (imm8 split):
each is the corresponding post-indexed load/store with the W bit (21)
set. Hand-written, reusing the existing MEM_POST_INDEX encoding. All 8
byte-exact vs llvm-mc and decode-clean; 600 tests green.
2026-06-18 01:17:34 -04:00
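
The relationship stated in 59750926d9 (unprivileged form = post-indexed form with W set) is a one-bit transform. A minimal sketch, with a hypothetical helper name; the example word is my own A32 encoding of `ldr r0, [r1], #4`, not one taken from the commit.

```python
P_BIT = 1 << 24   # pre/post-index selector; clear for post-indexed forms
W_BIT = 1 << 21   # set to turn a post-indexed access into its unprivileged form


def unprivileged_from_post_indexed(word: int) -> int:
    """Derive LDRT/STRT-style words from the matching post-indexed load/store."""
    assert (word & P_BIT) == 0, "expected a post-indexed form"
    assert (word & W_BIT) == 0, "W is already set"
    return word | W_BIT
```

For example, `unprivileged_from_post_indexed(0xE4910004)` yields `0xE4B10004`, which should correspond to `ldrt r0, [r1], #4`.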
Brendan Punsky
6fd233f041 rexcode/arm32: NEON long/wide/compare/shift encode forms (specgen)
New arm32 specgen (llvm-mc --triple=armv8a --mattr=+neon as the bits
oracle, empirical masks): VADDL/VSUBL/VABAL/VABDL (Qd,Dn,Dm) and
VADDW/VSUBW (Qd,Qn,Dm) across s/u 8/16/32; the compare aliases
VCLE/VCLT (= VCGE/VCGT with Vn/Vm swapped) and VACLE/VACLT (= VACGE/VACGT
swapped, f32); and VQRSHL shift-by-vector. 84 forms over 11 mnemonics.
Built-in llvm round-trip verify passes; spot-checked byte-exact with
distinct Q/D registers; 600 tests green.
2026-06-18 01:15:22 -04:00
Brendan Punsky
fe7b81d64f rexcode/arm64: drop vestigial/redundant mnemonics; alias redundant SME names
Remove from the Mnemonic enum: LDARB_X/LDARH_X/STLRB_X/STLRH_X (no
distinct byte/half acquire-release 'X' encoding exists -- LDARB/LDARH/
STLRB/STLRH already cover them), and the 12 redundant SME names
SME_LD1{B,H,W,D,Q}_ZA / SME_ST1{...}_ZA / SME_MOVA_TO_Z / SME_MOVA_TO_ZA
(same instructions as the canonical *_TILE / MOVA_*_FROM_* forms).

The builder generator now emits delegating aliases for the redundant SME
names (inst_sme_ld1b_za :: inst_sme_ld1b_tile, ...), so the convenient
names keep working and resolve to the canonical, decode-unambiguous
encodings. With XAR_Z landed, the arm64 Mnemonic enum is now 100%
covered: every entry has an encode form. 461 tests green.
2026-06-18 00:42:37 -04:00
Brendan Punsky
303fa9e509 rexcode/arm64: SVE2 XAR (exclusive-or and rotate) encode form
XAR Zdn.T, Zdn.T, Zm.T, #rotate across .B/.H/.S/.D. New SVE_XAR_SHIFT
encoding: the rotate amount is V = 2*esize - amount, split across
tszh(23:22):tszl(20:19):imm3(18:16); the element size is selected by the
Z register type on encode and recovered from the highest set bit of
tszh:tszl on decode (so the amount round-trips for every esize).
vec_esize now also handles Z_REG_B/H/S/D. All six representative forms
byte-exact vs llvm-mc and decode-clean; 461 tests green.
2026-06-18 00:39:48 -04:00
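
The SVE_XAR_SHIFT scheme described in 303fa9e509 can be sketched as an encode/decode pair. This is an illustration of the arithmetic in the message (V = 2*esize - amount, split across tszh:tszl:imm3), not the Odin implementation; the function names are hypothetical.

```python
def encode_xar_shift(esize: int, amount: int) -> int:
    """Return the tszh/tszl/imm3 bits for a rotate amount at a given element size."""
    assert esize in (8, 16, 32, 64) and 1 <= amount <= esize
    v = 2 * esize - amount            # 7-bit value tsz(4):imm3(3)
    tsz, imm3 = v >> 3, v & 0x7
    tszh, tszl = tsz >> 2, tsz & 0x3
    return (tszh << 22) | (tszl << 19) | (imm3 << 16)


def decode_xar_shift(word: int) -> tuple[int, int]:
    """Recover (esize, amount); esize comes from the highest set bit of tszh:tszl."""
    tszh = (word >> 22) & 0x3
    tszl = (word >> 19) & 0x3
    imm3 = (word >> 16) & 0x7
    tsz = (tszh << 2) | tszl
    assert tsz != 0, "tszh:tszl of zero does not select an element size"
    esize = 8 << (tsz.bit_length() - 1)
    v = (tsz << 3) | imm3
    return esize, 2 * esize - v
```

Because V always lies in [esize, 2*esize - 1], its top set bit identifies the element size, which is why the amount round-trips for every esize.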
Brendan Punsky
33e5202f05 rexcode/arm64: single-structure lane load/store (LD1-4_LANE / ST1-4_LANE)
All eight LD#_LANE / ST#_LANE mnemonics across .B/.H/.S/.D (32 forms).
New NEON_LANE_B/H/S/D encodings split the lane index across Q (bit 30),
S (bit 12) and size (bits 11:10) per element size; the list length and
load/store bit are fixed in the entry bits. All 11 representative forms
(every element size, structure count, and lane extremes) byte-exact vs
llvm-mc and decode-clean; 461 tests green.
2026-06-18 00:21:43 -04:00
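
The lane-index split described in 33e5202f05 can be sketched per element size. The bit positions (Q at 30, S at 12, size at 11:10) come from the message; the exact per-size split below (B uses Q:S:size, H uses Q:S:size<1>, S uses Q:S, D uses Q with size=01) is an assumption based on the A64 single-structure encoding, and `pack_lane_index` is a hypothetical name.

```python
def pack_lane_index(esize: int, lane: int) -> int:
    """Return the Q/S/size bits for a single-structure lane load/store."""
    assert esize in (8, 16, 32, 64) and 0 <= lane < 128 // esize
    if esize == 8:        # index = Q:S:size (4 bits)
        q, s, size = lane >> 3, (lane >> 2) & 1, lane & 0x3
    elif esize == 16:     # index = Q:S:size<1>, size<0> fixed at 0
        q, s, size = lane >> 2, (lane >> 1) & 1, (lane & 1) << 1
    elif esize == 32:     # index = Q:S, size fixed at 00
        q, s, size = lane >> 1, lane & 1, 0
    else:                 # index = Q, S fixed at 0, size fixed at 01
        q, s, size = lane, 0, 1
    return (q << 30) | (s << 12) | (size << 10)
```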
Brendan Punsky
2c8768b39a rexcode/arm64: TBL/TBX + structured LD2-4/ST2-4 + LD1R-4R encode forms
Table lookup TBL/TBX (.8b/.16b, single-register table) and the multi-
register structured load/store LD2/LD3/LD4, ST2/ST3/ST4 plus load-and-
replicate LD1R/LD2R/LD3R/LD4R (.16b). Following the existing LD1/ST1
convention: the register list is encoded by its first register, with the
list length + arrangement fixed in the bits. All 13 representative forms
byte-exact vs llvm-mc and decode-clean; 461 tests green.

(The single-lane _LANE variants need the Q:S:size lane-index split and
are left for a follow-up.)
2026-06-18 00:17:12 -04:00
Brendan Punsky
69157b7ec5 rexcode/arm64: SME ADDHA/ADDVA (ZA outer-sum accumulate)
ADDHA/ADDVA ZAda.S, Pn/m, Pm/m, Zn.S via a new ZA_TILE_LOW encoding
(accumulator tile at bits 2:0; Pn at 12:10, Pm at 15:13, Zn at 9:5).
Byte-exact vs llvm-mc and decode-clean across tile/predicate/Zn fields.

The other 11 missing SME enum names (SME_LD1*/ST1*_ZA, SME_MOVA_TO_Z/ZA)
are redundant aliases of the already-implemented SME_LD1*/ST1*_TILE and
SME_MOVA_*_FROM_* forms -- adding duplicate encodings collides in the
decode table (broke a roundtrip test), so they are intentionally left to
the existing canonical forms. 461 tests green.
2026-06-18 00:14:21 -04:00
Brendan Punsky
68aac263d0 rexcode/arm64: SVE FFR/BRKN/CPY/EXT/MOV aliases (10 more, SVE 47/48)
FFR ops (SETFFR/RDFFR/WRFFR) and BRKN (destructive, Pdm re-packs Pd) via
specgen; CPY (predicated from GPR), EXT (destructive, imm8 split via new
SVE_EXT_IMM), MOV-predicated (=SEL with Zm=Zd, via ZD_ZM_DUP), and the
predicate aliases NOT/MOVS/MOV (EOR/ORR/AND with a duplicated predicate
field, via PG4_PM_DUP/PN_PM_DUP/PN_PG_PM_DUP). All byte-exact vs llvm-mc;
the predicate aliases decode to their canonical base op (identical bytes,
as expected). 461 tests green.

(SVE_XAR_Z deferred: its tsz:imm3 shift field does not follow the NEON
immh:immb scheme and needs a bespoke esize-from-Z encoder.)
2026-06-18 00:09:21 -04:00
Brendan Punsky
cd8703acd4 rexcode/arm64: SVE predicated/compare/predicate-logical/SVE2 encode forms (37)
Predicated FP round (FRINTN/P/M/Z/A/X/I, FRECPX), reversed predicated
shifts (ASRR/LSLR/LSRR) and FP (FSUBR/FDIVR), FP compare (FCMEQ/GE/GT/
NE/UO + vs-zero FCMLE/FCMLT), integer compare aliases (CMPLE/LO/LS/LT),
predicate logical (NANDS/NORS/ORNS), predicate break (BRKPA/BRKPB,
BRKA/BRKB + flag-setting BRKAS/BRKBS), SVE2 EOR3/BCAX, INSR, COMPACT.

New specgen SVE section: a generic emitter assembles each form all-zero
then one variant per field at its max (Z 31, 3-bit Pg 7, 4-bit Pd/Pg/Pn/
Pm 15, GPR wzr/xzr) and derives mask = ~union. Operand placements
verified vs llvm-mc: the reversed/destructive ops put Zm at VN (5-9); the
CMPLE/LO/LS/LT aliases swap operands (VM/VN); EOR3/BCAX place the 3rd src
at VM and 4th at VN. All 22 representative forms byte-exact and
decode-clean; 461 tests green. (BRKN + CPY/EXT/MOV/NOT_P/FFR/XAR
stragglers next.)
2026-06-17 23:59:23 -04:00
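
The mask derivation described in cd8703acd4 (assemble all-zero, then one variant per field at its max, mask = ~union) can be sketched as below. The function name is hypothetical, and the words in the usage note are made-up stand-ins for llvm-mc output, chosen only to show the arithmetic.

```python
def derive_entry(base: int, variants: list[int], width: int = 32) -> tuple[int, int]:
    """Return (bits, mask): the static bit pattern and the mask of static bits.

    base     -- encoding with every operand field at zero
    variants -- encodings with one operand field at its maximum each
    """
    full = (1 << width) - 1
    union = 0
    for word in variants:
        union |= word ^ base          # bits that an operand can change
    mask = ~union & full              # everything else is static
    return base & mask, mask
```

With `base = 0x04200000` and variants that set a 5-bit field at 4:0, a 5-bit field at 9:5 and a 3-bit field at 12:10, the result is bits `0x04200000` and mask `0xFFFFE000`. Driving each field to its maximum matters because it toggles every bit of the field in a single probe.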
Brendan Punsky
8006b5f7e2 rexcode/arm64: NEON MOVI/MVNI + FMOV scalar/vector immediate forms
MOVI (8B/16B/4H/8H/2S/4S/2D) and MVNI (4H/8H/2S/4S) via specgen (imm8 in
abc:defgh, cmode/op/Q static per arrangement; .2D probed with all-ones
since its asm immediate is the replicated 64-bit value). FMOV_IMM (scalar
Sd/Dd/Hd, 8-bit float at 20:13 via new FMOV_SCALAR_IMM encoding) and
FMOV_V_IMM (Vd.<2S|4S|2D|4H|8H>, fimm8 in abc:defgh, cmode=1111) hand-
written -- canonical bits with the imm8 fields zeroed (the live float
example would otherwise bake operand bits into the static pattern). All
14 representative forms byte-exact vs llvm-mc and decode-clean; 461 tests
green. (LSL/MSL-shifted MOVI/MVNI variants share the operand signature
and are omitted.)
2026-06-17 23:47:45 -04:00
Brendan Punsky
ab7f20a129 rexcode/arm64: byte/half/signed loads-stores + vector LDP/STP/LDUR/STUR
LDRB/LDRH/STRB/STRH (post-index, pre-index, register-offset),
LDRSB/LDRSH (register-offset, W and X) and LDRSW (register-offset), plus
the vector pair/unscaled forms LDP_V/STP_V (S/D/Q) and LDUR_V/STUR_V
(S/D/Q). Hand-written, reusing the existing OFFSET_BASE_POST/PRE/REG/S9
addressing encodings; canonical bits taken from llvm-mc (operand fields
zeroed). All 23 representative forms byte-exact vs llvm-mc and
decode-clean; 461 tests green.

(LDARB_X/LDARH_X/STLRB_X/STLRH_X left unimplemented: LDARB/LDARH/STLRB/
STLRH are byte/half acquire-release into a W register with no distinct
64-bit 'X' encoding -- these enum entries are vestigial.)
2026-06-17 23:39:01 -04:00
Brendan Punsky
aabcdd41b6 rexcode/arm64: CCMP/CCMN-imm, HINT, MSR-imm, USDOT encode forms
Conditional compare immediate (CCMP_IMM/CCMN_IMM: imm5 at 20:16 via a new
IMM5_HI encoding, bit 11 set), HINT #imm7, MSR <pstatefield>,#imm (new
MSR_PSTATE encoding placing op1 at 18:16 / op2 at 7:5, CRm via the shared
BARRIER_FIELD), and USDOT (I8MM unsigned-by-signed dot product, .2S/.4S).
Hand-written into the core (outside the specgen region). All forms
byte-exact vs llvm-mc and decode-clean; 461 tests green.
2026-06-17 23:34:10 -04:00
Brendan Punsky
c506e6c13b rexcode/arm64: scalar FP round/reciprocal + FP-to-GPR convert forms
Scalar FRINTN/P/M/Z/A/X/I and FRECPX (Sd,Sn / Dd,Dn / Hd,Hn), and the
FP-to-GPR converts FCVTAS/AU/MS/MU/NS/NU/PS/PU (Wd/Xd, Hn/Sn/Dn). All
register-only forms: element type and W/X are selected by static bits, so
specgen derives each from llvm-mc with the convert forms varying Rd and
Rn independently (zero register for the 31 case). H variants tagged FP16.
RD/RN encodings so decode reconstructs the scalar/GPR register class. All
19 representative forms byte-exact vs llvm-mc and decode-clean; 461 tests
green.
2026-06-17 23:27:26 -04:00
Brendan Punsky
06eb3de6a2 rexcode/arm64: NEON copy/permute (MOV/MVN/DUP/INS/EXT) encode forms
MOV_V (ORR alias: source feeds both Vn and Vm via a new VN_VM_DUP
encoding), MVN_V (NOT alias, plain 2-register), DUP_V (element form
Vd.T,Vn.Ts[i] and general form Vd.T,Wn/Xn), INS (element-to-element and
from-GPR), EXT_V (imm4 byte index). Adds a VEC_INDEX operand type plus
NEON_IDX5/NEON_IDX4/NEON_EXT_IDX encodings: the element-size marker rides
in the entry bits, the lane index drives the bits above it, and the
decoder recovers the element size from imm5's marker.

Element size now rides in op.size (B=1/H=2/S=4/D=8) via op_v_elem_b/h/s/d
so the matcher can disambiguate DUP/INS element forms; the builder
generator maps V_ELEM_* to those constructors. specgen derives the mask
by varying registers and each index field to its max -- the GPR-source
forms vary Vd and Rn independently (Rn 31 = wzr/xzr) so the low bit of
each field toggles. All 19 representative forms byte-exact vs llvm-mc and
decode-clean; 461 tests green. (TBL/TBX register-list forms deferred.)
2026-06-17 23:23:44 -04:00
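
The marker-plus-index scheme described in 06eb3de6a2 can be sketched for the imm5 case. The size values (B=1/H=2/S=4/D=8) follow the message; placing imm5 at bits 20:16 is an assumption based on the A64 DUP/INS layout, and the function names are hypothetical.

```python
LOG2_SIZE = {1: 0, 2: 1, 4: 2, 8: 3}   # element size in bytes -> marker bit position


def pack_imm5(size_bytes: int, lane: int) -> int:
    """Build imm5: a one-hot size marker with the lane index in the bits above it."""
    k = LOG2_SIZE[size_bytes]
    assert 0 <= lane < (16 >> k), "lane out of range for this element size"
    imm5 = (lane << (k + 1)) | (1 << k)
    return imm5 << 16                   # assumed position: bits 20:16


def unpack_imm5(word: int) -> tuple[int, int]:
    """Recover (size_bytes, lane) from the lowest set bit of imm5."""
    imm5 = (word >> 16) & 0x1F
    assert imm5 & 0xF, "no size marker present"
    k = (imm5 & -imm5).bit_length() - 1  # position of the lowest set bit
    return 1 << k, imm5 >> (k + 1)
```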
Brendan Punsky
5761c23ba4 rexcode/arm64: NEON narrowing-shift (SHRN/SQSHRN/...) encode forms
SHRN/2, RSHRN/2, SQSHRN/2, UQSHRN/2, SQRSHRN/2, UQRSHRN/2, SQSHRUN/2,
SQRSHRUN/2 (16 forms). Verified against llvm-mc that immh keys on the
NARROW destination element (not the wide source), so the existing
NEON_SHR_IMM encoder/decoder (esize from ops[0]) is already correct --
this is a specgen-only change: right-shift esize now uses ESIZE[dst]
(equal to ESIZE[src] for same-arrangement shifts) plus narrowing
{narrow-dst, wide-src} arrangement tuples. All forms byte-exact and
decode-roundtrip verified across B/H/S element sizes; 461 tests green.
2026-06-17 23:07:38 -04:00
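
The point made in 5761c23ba4, that immh keys on the narrow destination element, can be shown with the usual right-shift arithmetic. This sketch assumes the standard A64 relation immh:immb = 2*esize - shift; the `ESIZE` table name echoes the message, but its contents and the function names here are illustrative.

```python
ESIZE = {"8b": 8, "16b": 8, "4h": 16, "8h": 16, "2s": 32, "4s": 32, "2d": 64}


def shr_immh_immb(dst_arrangement: str, shift: int) -> int:
    """Return the 7-bit immh:immb value, keyed on the destination element size."""
    esize = ESIZE[dst_arrangement]
    assert 1 <= shift <= esize
    return 2 * esize - shift


def decode_shr(immh_immb: int) -> tuple[int, int]:
    """Recover (esize, shift) from immh:immb via the highest set bit of immh."""
    immh = immh_immb >> 3
    assert immh != 0, "immh of zero is not a shift encoding"
    esize = 8 << (immh.bit_length() - 1)
    return esize, 2 * esize - immh_immb
```

For `shrn v0.8b, v1.8h, #3` the destination arrangement is `8b`, so the field is 16 - 3 = 13 (immh=0001, immb=101); keying on the wide source would give 29 instead.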
Brendan Punsky
a1e359b64a rexcode/arm64: NEON permute, compare-zero, SXTL/UXTL encode forms
ZIP1/2, UZP1/2, TRN1/2 (three-same permute); CMLE/CMLT and FP
FCMLE/FCMLT (compare against zero, with the literal #0 / #0.0 operand);
SXTL/SXTL2/UXTL/UXTL2 (= SSHLL/USHLL #0, plain 2-register widen, shift
implicit in the static bits). All reuse the VD/VN/VM register slots, so
no encoder change. specgen gains an emit_cmp0 shape plus permute and
widen families. All forms byte-exact vs llvm-mc; 461 tests green.
2026-06-17 23:03:53 -04:00
gingerBill
ad12645423 Mock out more custom sections 2026-06-17 14:04:19 +01:00
gingerBill
b6fdabc874 Parse custom sections target_features and name 2026-06-17 13:42:26 +01:00
gingerBill
53fe193868 Use log2 for the alignment, remove unneeded code 2026-06-17 12:39:45 +01:00
gingerBill
5670aa7604 Minor changes 2026-06-17 12:24:25 +01:00
gingerBill
b5810fea69 Remove dead parameters 2026-06-17 12:17:04 +01:00
gingerBill
45b1bfe96a Minor reorganization 2026-06-17 12:14:06 +01:00