Commit Graph

17684 Commits

Author SHA1 Message Date
Brendan Punsky
e4cff78a70 rexcode/arm32: document BF family as intentionally unimplemented
The 5 Branch Future mnemonics (BF/BFI_BR/BFL/BFLX/BFCSEL) are left
enum-only on purpose: deprecated ARMv8.1-M, not disassemblable by
llvm-objdump (so unverifiable), and a correct encoder needs dual-offset
PC-relative relocation infrastructure that doesn't exist. Noted in the
enum for future readers.
2026-06-18 03:05:25 -04:00
Brendan Punsky
a63fb51fdd rexcode/arm32: MVE VMLSV/VMLSVA (correct 3-bit Q regs); drop placeholders
Implement VMLSV/VMLSVA (MVE multiply-subtract reduce) properly: new
VN_Q_MVE (Qn at 19:17) and VM_Q_MVE (Qm at 3:1) encodings -- the actual
3-bit MVE Q fields -- with Rd at 15:12 (RDLO_A32). The earlier collision
was from reusing the 4-bit VN_Q (19:16) and RD_T32 (11:8), which place
the fields wrong; byte-exact vs llvm-mc now with distinct Qn/Qm/Rd.

Drop three placeholder/redundant enum entries: VRINT and VPRINT (not real
instructions -- llvm rejects bare 'vrint'; VPRINT is a printf-like debug
pseudo-op), and VRSHL_MVE (the author's own comment marks it a
placeholder; 'vrshl q,q,q' already decodes via VRSHL's MVE form). 600
tests green, verify matches llvm-mc.
2026-06-18 01:58:19 -04:00
Brendan Punsky
239dea4f55 rexcode/arm32: MVE VHCADD (saturating halving complex add) + VCMLA
New MVE_ROT_HCADD (#90/#270 at bit12) and MVE_ROT_CMLA (#0/90/180/270 at
bits 24:23) rotation encodings -- the rotation degrees round-trip
properly (unlike the existing FCMA VCMLA which leaves it unencoded). One
form each with the element-size bits left variable (MVE convention).
Verify round-trips; all rotations byte-exact vs llvm-mc; 600 tests green.

(VMLSV/VMLSVA reduce ops deferred: their format decode-collides with
other MVE encodings given the 4-bit VN_Q vs MVE's 3-bit Qn.)
2026-06-18 01:47:44 -04:00
Brendan Punsky
55463b6719 rexcode/arm32: VMOV (ARM core register to scalar) Dd[lane], Rt
New VMOV_LANE_8/16/32 encodings: Dd at bits 19:16+bit7, lane bits per
element size (.8 = bit21:bit6:bit5 with bit22 size marker; .16 =
bit21:bit6 with bit5 marker; .32 = bit21). Verify round-trips all three
sizes; spot-checked .8 byte-exact incl. max lane; 600 tests green.
2026-06-18 01:34:48 -04:00
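The lane placement described in the VMOV commit above can be sketched as follows. This is a minimal illustration of the stated bit positions only; the function name `vmov_lane_bits` is hypothetical and is not taken from rexcode.

```python
def vmov_lane_bits(esize: int, lane: int) -> int:
    """Return the size-marker and lane-index bits for VMOV Dd[lane], Rt.

    Hypothetical helper: it only models the positions named in the commit
    (bit22/bit5 markers, lane bits at 21, 6 and 5). Dd, Rt and the fixed
    opcode bits are assumed to be supplied elsewhere.
    """
    if esize == 8:
        if not 0 <= lane < 8:
            raise ValueError("lane out of range for .8")
        # bit22 marks .8; lane = bit21:bit6:bit5
        return ((1 << 22)
                | (((lane >> 2) & 1) << 21)
                | (((lane >> 1) & 1) << 6)
                | ((lane & 1) << 5))
    if esize == 16:
        if not 0 <= lane < 4:
            raise ValueError("lane out of range for .16")
        # bit5 marks .16; lane = bit21:bit6
        return (1 << 5) | (((lane >> 1) & 1) << 21) | ((lane & 1) << 6)
    if esize == 32:
        if not 0 <= lane < 2:
            raise ValueError("lane out of range for .32")
        # no marker bit; lane = bit21
        return lane << 21
    raise ValueError("unsupported element size")
```

The result would be OR-ed into the entry's static bits together with the register fields.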
Brendan Punsky
5df81b5117 rexcode/arm32: VQDMULH/VQRDMULH by-scalar-lane
New NEON_VM_SCALAR16/32 encodings for the Dm[lane] scalar operand: .16
places Dm in D0..D7 (bits 2:0) with the lane split bit5:bit3, .32 places
Dm in D0..D15 (bits 3:0) with the lane at bit5. VQDMULH_LANE and
VQRDMULH_LANE across .s16/.s32, D and Q destinations (8 forms). Verify
round-trips; spot-checked byte-exact incl. max register/lane and
decode-clean; 600 tests green.
2026-06-18 01:29:19 -04:00
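A small sketch of the Dm[lane] packing described in the by-scalar-lane commit above. The helper names `scalar16_bits` and `scalar32_bits` are hypothetical and only model the stated field positions.

```python
def scalar16_bits(dm: int, lane: int) -> int:
    """.16 scalar: Dm in D0..D7 at bits 2:0, lane split as bit5:bit3."""
    if not 0 <= dm < 8 or not 0 <= lane < 4:
        raise ValueError("operand out of range for .16 scalar")
    return dm | ((lane & 1) << 3) | (((lane >> 1) & 1) << 5)


def scalar32_bits(dm: int, lane: int) -> int:
    """.32 scalar: Dm in D0..D15 at bits 3:0, lane at bit5."""
    if not 0 <= dm < 16 or not 0 <= lane < 2:
        raise ValueError("operand out of range for .32 scalar")
    return dm | (lane << 5)
```

Note that `scalar16_bits(7, 3)` and `scalar32_bits(15, 1)` produce the same bit pattern, so under this layout the element size has to come from elsewhere in the instruction word.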
Brendan Punsky
acc14864f3 rexcode/arm32: DCPS1/DCPS2/DCPS3 (debug change PE state)
Fixed T32 encodings (0xF78F8001/2/3), no operands. Verify round-trips;
600 tests green.
2026-06-18 01:25:51 -04:00
Brendan Punsky
b2b14998f7 rexcode/arm32: VRSRA, VRECPE_F/VRSQRTE_F, VPADD_F, VCVTR
VRSRA (NEON rounding shift-right-accumulate, D/Q, mirrors VSRA's raw
imm6 convention), VRECPE_F/VRSQRTE_F (FP reciprocal/rsqrt estimate, D/Q),
VPADD_F (FP pairwise add, f32/f16), and VCVTR (VFP convert-to-integer
using the FPSCR rounding mode; s32/u32 from f32 and f64). Hand-written
mirroring the existing VSRA/VRECPE/VPADD/VCVT forms. Built-in llvm
round-trip verify passes; spot-checked byte-exact; 600 tests green.
2026-06-18 01:22:12 -04:00
Brendan Punsky
59750926d9 rexcode/arm32: unprivileged (translate) post-indexed loads/stores
LDRT/LDRBT/STRT/STRBT (imm12) and LDRHT/STRHT/LDRSBT/LDRSHT (imm8 split):
each is the corresponding post-indexed load/store with the W bit (21)
set. Hand-written, reusing the existing MEM_POST_INDEX encoding. All 8
byte-exact vs llvm-mc and decode-clean; 600 tests green.
2026-06-18 01:17:34 -04:00
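The relationship stated in the commit above (the unprivileged form is the post-indexed form with W, bit 21, set) can be illustrated with a short sketch. The function name is hypothetical, and the sample word is the standard A32 encoding of LDR r0, [r1], #4 rather than a value quoted from the commit.

```python
W_BIT = 1 << 21  # W bit, as named in the commit
P_BIT = 1 << 24  # P bit; clear for post-indexed addressing in the A32 layout


def to_unprivileged(post_indexed_word: int) -> int:
    """Turn a post-indexed A32 load/store word into its LDRT/STRT-style form."""
    if post_indexed_word & P_BIT:
        raise ValueError("expected a post-indexed encoding (P bit clear)")
    return post_indexed_word | W_BIT


ldr_post = 0xE4910004                 # LDR r0, [r1], #4 (illustrative sample)
ldrt = to_unprivileged(ldr_post)      # same fields with bit 21 set
```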
Brendan Punsky
6fd233f041 rexcode/arm32: NEON long/wide/compare/shift encode forms (specgen)
New arm32 specgen (llvm-mc --triple=armv8a --mattr=+neon as the bits
oracle, empirical masks): VADDL/VSUBL/VABAL/VABDL (Qd,Dn,Dm) and
VADDW/VSUBW (Qd,Qn,Dm) across s/u 8/16/32; the compare aliases
VCLE/VCLT (= VCGE/VCGT with Vn/Vm swapped) and VACLE/VACLT (= VACGE/VACGT
swapped, f32); and VQRSHL shift-by-vector. 84 forms over 11 mnemonics.
Built-in llvm round-trip verify passes; spot-checked byte-exact with
distinct Q/D registers; 600 tests green.
2026-06-18 01:15:22 -04:00
Brendan Punsky
fe7b81d64f rexcode/arm64: drop vestigial/redundant mnemonics; alias redundant SME names
Remove from the Mnemonic enum: LDARB_X/LDARH_X/STLRB_X/STLRH_X (no
distinct byte/half acquire-release 'X' encoding exists -- LDARB/LDARH/
STLRB/STLRH already cover them), and the 12 redundant SME names
SME_LD1{B,H,W,D,Q}_ZA / SME_ST1{...}_ZA / SME_MOVA_TO_Z / SME_MOVA_TO_ZA
(same instructions as the canonical *_TILE / MOVA_*_FROM_* forms).

The builder generator now emits delegating aliases for the redundant SME
names (inst_sme_ld1b_za :: inst_sme_ld1b_tile, ...), so the convenient
names keep working and resolve to the canonical, decode-unambiguous
encodings. With XAR_Z landed, the arm64 Mnemonic enum is now 100%
covered: every entry has an encode form. 461 tests green.
2026-06-18 00:42:37 -04:00
Brendan Punsky
303fa9e509 rexcode/arm64: SVE2 XAR (exclusive-or and rotate) encode form
XAR Zdn.T, Zdn.T, Zm.T, #rotate across .B/.H/.S/.D. New SVE_XAR_SHIFT
encoding: the rotate amount is V = 2*esize - amount, split across
tszh(23:22):tszl(20:19):imm3(18:16); the element size is selected by the
Z register type on encode and recovered from the highest set bit of
tszh:tszl on decode (so the amount round-trips for every esize).
vec_esize now also handles Z_REG_B/H/S/D. All six representative forms
byte-exact vs llvm-mc and decode-clean; 461 tests green.
2026-06-18 00:39:48 -04:00
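The XAR rotate-field scheme described in the commit above can be sketched as a pack/unpack pair. This is an illustration under the stated layout (V = 2*esize - amount, split across tszh 23:22, tszl 20:19, imm3 18:16); the function names are hypothetical and do not come from rexcode.

```python
def xar_pack(esize: int, amount: int) -> int:
    """Place the rotate amount into tszh:tszl:imm3 for a given element size."""
    if esize not in (8, 16, 32, 64) or not 1 <= amount <= esize:
        raise ValueError("bad element size or rotate amount")
    v = 2 * esize - amount            # 7-bit value, esize <= v <= 2*esize - 1
    tszh = (v >> 5) & 0b11
    tszl = (v >> 3) & 0b11
    imm3 = v & 0b111
    return (tszh << 22) | (tszl << 19) | (imm3 << 16)


def xar_unpack(word: int) -> tuple[int, int]:
    """Recover (esize, amount) from the highest set bit of tszh:tszl."""
    tszh = (word >> 22) & 0b11
    tszl = (word >> 19) & 0b11
    imm3 = (word >> 16) & 0b111
    tsz = (tszh << 2) | tszl
    if tsz == 0:
        raise ValueError("tszh:tszl is zero, no element size encoded")
    esize = 8 << (tsz.bit_length() - 1)
    v = (tsz << 3) | imm3
    return esize, 2 * esize - v
```

Because V always lies in [esize, 2*esize - 1], its top set bit identifies the element size, which is why every amount round-trips.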
Brendan Punsky
33e5202f05 rexcode/arm64: single-structure lane load/store (LD1-4_LANE / ST1-4_LANE)
All eight LD#_LANE / ST#_LANE mnemonics across .B/.H/.S/.D (32 forms).
New NEON_LANE_B/H/S/D encodings split the lane index across Q (bit 30),
S (bit 12) and size (bits 11:10) per element size; the list length and
load/store bit are fixed in the entry bits. All 11 representative forms
(every element size, structure count, and lane extremes) byte-exact vs
llvm-mc and decode-clean; 461 tests green.
2026-06-18 00:21:43 -04:00
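A minimal sketch of the lane-index split mentioned in the commit above. It assumes the usual A64 convention that the index is left-aligned in the 4-bit Q:S:size field, with any remaining low bits acting as size markers held in the static entry bits; the function name `lane_index_bits` is hypothetical.

```python
# Number of low field bits not used by the index, per element size in bits.
_SHIFT = {8: 0, 16: 1, 32: 2, 64: 3}


def lane_index_bits(esize: int, index: int) -> int:
    """Return the Q (bit 30), S (bit 12) and size (bits 11:10) index bits."""
    if esize not in _SHIFT:
        raise ValueError("unsupported element size")
    if not 0 <= index < 128 // esize:
        raise ValueError("lane index out of range")
    field = index << _SHIFT[esize]    # left-align inside Q:S:size
    q = (field >> 3) & 1
    s = (field >> 2) & 1
    size = field & 0b11
    return (q << 30) | (s << 12) | (size << 10)
```

Only the index bits are returned here; marker bits such as the one distinguishing .D from .S are assumed to be part of the fixed pattern.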
Brendan Punsky
2c8768b39a rexcode/arm64: TBL/TBX + structured LD2-4/ST2-4 + LD1R-4R encode forms
Table lookup TBL/TBX (.8b/.16b, single-register table) and the multi-
register structured load/store LD2/LD3/LD4, ST2/ST3/ST4 plus load-and-
replicate LD1R/LD2R/LD3R/LD4R (.16b). Following the existing LD1/ST1
convention: the register list is encoded by its first register, with the
list length + arrangement fixed in the bits. All 13 representative forms
byte-exact vs llvm-mc and decode-clean; 461 tests green.

(The single-lane _LANE variants need the Q:S:size lane-index split and
are left for a follow-up.)
2026-06-18 00:17:12 -04:00
Brendan Punsky
69157b7ec5 rexcode/arm64: SME ADDHA/ADDVA (ZA outer-sum accumulate)
ADDHA/ADDVA ZAda.S, Pn/m, Pm/m, Zn.S via a new ZA_TILE_LOW encoding
(accumulator tile at bits 2:0; Pn at 12:10, Pm at 15:13, Zn at 9:5).
Byte-exact vs llvm-mc and decode-clean across tile/predicate/Zn fields.

The other 11 missing SME enum names (SME_LD1*/ST1*_ZA, SME_MOVA_TO_Z/ZA)
are redundant aliases of the already-implemented SME_LD1*/ST1*_TILE and
SME_MOVA_*_FROM_* forms -- adding duplicate encodings collides in the
decode table (broke a roundtrip test), so they are intentionally left to
the existing canonical forms. 461 tests green.
2026-06-18 00:14:21 -04:00
Brendan Punsky
68aac263d0 rexcode/arm64: SVE FFR/BRKN/CPY/EXT/MOV aliases (10 more, SVE 47/48)
FFR ops (SETFFR/RDFFR/WRFFR) and BRKN (destructive, Pdm re-packs Pd) via
specgen; CPY (predicated from GPR), EXT (destructive, imm8 split via new
SVE_EXT_IMM), MOV-predicated (=SEL with Zm=Zd, via ZD_ZM_DUP), and the
predicate aliases NOT/MOVS/MOV (EOR/ORR/AND with a duplicated predicate
field, via PG4_PM_DUP/PN_PM_DUP/PN_PG_PM_DUP). All byte-exact vs llvm-mc;
the predicate aliases decode to their canonical base op (identical bytes,
as expected). 461 tests green.

(SVE_XAR_Z deferred: its tsz:imm3 shift field does not follow the NEON
immh:immb scheme and needs a bespoke esize-from-Z encoder.)
2026-06-18 00:09:21 -04:00
Brendan Punsky
cd8703acd4 rexcode/arm64: SVE predicated/compare/predicate-logical/SVE2 encode forms (37)
Predicated FP round (FRINTN/P/M/Z/A/X/I, FRECPX), reversed predicated
shifts (ASRR/LSLR/LSRR) and FP (FSUBR/FDIVR), FP compare (FCMEQ/GE/GT/
NE/UO + vs-zero FCMLE/FCMLT), integer compare aliases (CMPLE/LO/LS/LT),
predicate logical (NANDS/NORS/ORNS), predicate break (BRKPA/BRKPB,
BRKA/BRKB + flag-setting BRKAS/BRKBS), SVE2 EOR3/BCAX, INSR, COMPACT.

New specgen SVE section: a generic emitter assembles each form all-zero
then one variant per field at its max (Z 31, 3-bit Pg 7, 4-bit Pd/Pg/Pn/
Pm 15, GPR wzr/xzr) and derives mask = ~union. Operand placements
verified vs llvm-mc: the reversed/destructive ops put Zm at VN (5-9); the
CMPLE/LO/LS/LT aliases swap operands (VM/VN); EOR3/BCAX place the 3rd src
at VM and 4th at VN. All 22 representative forms byte-exact and
decode-clean; 461 tests green. (BRKN + CPY/EXT/MOV/NOT_P/FFR/XAR
stragglers next.)
2026-06-17 23:59:23 -04:00
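The "mask = ~union" derivation described in the commit above can be sketched independently of llvm-mc. In this illustration the assembled words are passed in directly; the function name and the sample words are assumptions chosen for the example, not output captured from the tool.

```python
def derive_bits_and_mask(canon: int, variants: list[int]) -> tuple[int, int]:
    """Derive (bits, mask) for one form.

    canon    -- the word assembled with every operand field at zero
    variants -- one word per operand field, with that field at its maximum
    Any bit that differs between canon and a variant belongs to an operand;
    everything else is treated as fixed opcode bits.
    """
    union = 0
    for word in variants:
        union |= canon ^ word
    mask = ~union & 0xFFFFFFFF
    return canon & mask, mask


# Illustrative three-register form with 5-bit fields at 4:0, 9:5 and 20:16.
canon = 0x0E200400
variants = [0x0E20041F, 0x0E2007E0, 0x0E3F0400]
bits, mask = derive_bits_and_mask(canon, variants)
```

A decoder can then match a word `w` against the form with `(w & mask) == bits`.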
Brendan Punsky
8006b5f7e2 rexcode/arm64: NEON MOVI/MVNI + FMOV scalar/vector immediate forms
MOVI (8B/16B/4H/8H/2S/4S/2D) and MVNI (4H/8H/2S/4S) via specgen (imm8 in
abc:defgh, cmode/op/Q static per arrangement; .2D probed with all-ones
since its asm immediate is the replicated 64-bit value). FMOV_IMM (scalar
Sd/Dd/Hd, 8-bit float at 20:13 via new FMOV_SCALAR_IMM encoding) and
FMOV_V_IMM (Vd.<2S|4S|2D|4H|8H>, fimm8 in abc:defgh, cmode=1111) hand-
written -- canonical bits with the imm8 fields zeroed (the live float
example would otherwise bake operand bits into the static pattern). All
14 representative forms byte-exact vs llvm-mc and decode-clean; 461 tests
green. (LSL/MSL-shifted MOVI/MVNI variants share the operand signature
and are omitted.)
2026-06-17 23:47:45 -04:00
Brendan Punsky
ab7f20a129 rexcode/arm64: byte/half/signed loads-stores + vector LDP/STP/LDUR/STUR
LDRB/LDRH/STRB/STRH (post-index, pre-index, register-offset),
LDRSB/LDRSH (register-offset, W and X) and LDRSW (register-offset), plus
the vector pair/unscaled forms LDP_V/STP_V (S/D/Q) and LDUR_V/STUR_V
(S/D/Q). Hand-written, reusing the existing OFFSET_BASE_POST/PRE/REG/S9
addressing encodings; canonical bits taken from llvm-mc (operand fields
zeroed). All 23 representative forms byte-exact vs llvm-mc and
decode-clean; 461 tests green.

(LDARB_X/LDARH_X/STLRB_X/STLRH_X left unimplemented: LDARB/LDARH/STLRB/
STLRH are byte/half acquire-release into a W register with no distinct
64-bit 'X' encoding -- these enum entries are vestigial.)
2026-06-17 23:39:01 -04:00
Brendan Punsky
aabcdd41b6 rexcode/arm64: CCMP/CCMN-imm, HINT, MSR-imm, USDOT encode forms
Conditional compare immediate (CCMP_IMM/CCMN_IMM: imm5 at 20:16 via a new
IMM5_HI encoding, bit 11 set), HINT #imm7, MSR <pstatefield>,#imm (new
MSR_PSTATE encoding placing op1 at 18:16 / op2 at 7:5, CRm via the shared
BARRIER_FIELD), and USDOT (I8MM unsigned-by-signed dot product, .2S/.4S).
Hand-written into the core (outside the specgen region). All forms
byte-exact vs llvm-mc and decode-clean; 461 tests green.
2026-06-17 23:34:10 -04:00
Brendan Punsky
c506e6c13b rexcode/arm64: scalar FP round/reciprocal + FP-to-GPR convert forms
Scalar FRINTN/P/M/Z/A/X/I and FRECPX (Sd,Sn / Dd,Dn / Hd,Hn), and the
FP-to-GPR converts FCVTAS/AU/MS/MU/NS/NU/PS/PU (Wd/Xd, Hn/Sn/Dn). All
register-only forms: element type and W/X are selected by static bits, so
specgen derives each from llvm-mc with the convert forms varying Rd and
Rn independently (zero register for the 31 case). H variants tagged FP16.
RD/RN encodings so decode reconstructs the scalar/GPR register class. All
19 representative forms byte-exact vs llvm-mc and decode-clean; 461 tests
green.
2026-06-17 23:27:26 -04:00
Brendan Punsky
06eb3de6a2 rexcode/arm64: NEON copy/permute (MOV/MVN/DUP/INS/EXT) encode forms
MOV_V (ORR alias: source feeds both Vn and Vm via a new VN_VM_DUP
encoding), MVN_V (NOT alias, plain 2-register), DUP_V (element form
Vd.T,Vn.Ts[i] and general form Vd.T,Wn/Xn), INS (element-to-element and
from-GPR), EXT_V (imm4 byte index). Adds a VEC_INDEX operand type plus
NEON_IDX5/NEON_IDX4/NEON_EXT_IDX encodings: the element-size marker rides
in the entry bits, the lane index drives the bits above it, and the
decoder recovers the element size from imm5's marker.

Element size now rides in op.size (B=1/H=2/S=4/D=8) via op_v_elem_b/h/s/d
so the matcher can disambiguate DUP/INS element forms; the builder
generator maps V_ELEM_* to those constructors. specgen derives the mask
by varying registers and each index field to its max -- the GPR-source
forms vary Vd and Rn independently (Rn 31 = wzr/xzr) so the low bit of
each field toggles. All 19 representative forms byte-exact vs llvm-mc and
decode-clean; 461 tests green. (TBL/TBX register-list forms deferred.)
2026-06-17 23:23:44 -04:00
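The MOV_V alias described in the commit above (one source register feeding both Vn and Vm of ORR) can be sketched as below. The function names are hypothetical, and the base word is the standard A64 pattern for ORR Vd.16B, Vn.16B, Vm.16B with register fields zeroed, not a value quoted from the commit.

```python
ORR_V_16B = 0x4EA01C00  # ORR Vd.16B, Vn.16B, Vm.16B, register fields zeroed


def encode_mov_v_16b(vd: int, vn: int) -> int:
    """Encode MOV Vd.16B, Vn.16B by duplicating Vn into the Vm slot."""
    if not (0 <= vd < 32 and 0 <= vn < 32):
        raise ValueError("register out of range")
    return ORR_V_16B | (vn << 16) | (vn << 5) | vd


def is_mov_alias(word: int) -> bool:
    """A decoded ORR reads as MOV when its Vn and Vm fields match."""
    same_opcode = (word & 0xFFE0FC00) == ORR_V_16B
    return same_opcode and ((word >> 16) & 0x1F) == ((word >> 5) & 0x1F)
```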
Brendan Punsky
5761c23ba4 rexcode/arm64: NEON narrowing-shift (SHRN/SQSHRN/...) encode forms
SHRN/2, RSHRN/2, SQSHRN/2, UQSHRN/2, SQRSHRN/2, UQRSHRN/2, SQSHRUN/2,
SQRSHRUN/2 (16 forms). Verified against llvm-mc that immh keys on the
NARROW destination element (not the wide source), so the existing
NEON_SHR_IMM encoder/decoder (esize from ops[0]) is already correct --
this is a specgen-only change: right-shift esize now uses ESIZE[dst]
(equal to ESIZE[src] for same-arrangement shifts) plus narrowing
{narrow-dst, wide-src} arrangement tuples. All forms byte-exact and
decode-roundtrip verified across B/H/S element sizes; 461 tests green.
2026-06-17 23:07:38 -04:00
Brendan Punsky
a1e359b64a rexcode/arm64: NEON permute, compare-zero, SXTL/UXTL encode forms
ZIP1/2, UZP1/2, TRN1/2 (three-same permute); CMLE/CMLT and FP
FCMLE/FCMLT (compare against zero, with the literal #0 / #0.0 operand);
SXTL/SXTL2/UXTL/UXTL2 (= SSHLL/USHLL #0, plain 2-register widen, shift
implicit in the static bits). All reuse the VD/VN/VM register slots, so
no encoder change. specgen gains an emit_cmp0 shape plus permute and
widen families. All forms byte-exact vs llvm-mc; 461 tests green.
2026-06-17 23:03:53 -04:00
Brendan Punsky
55a141be4f rexcode/arm64: NEON widening left-shift (SSHLL/USHLL) encode forms
Adds SSHLL/SSHLL2/USHLL/USHLL2 (12 forms) via specgen, reusing NEON_SHL_IMM (left shifts need no esize; the size marker is in bits). specgen's shift shape generalized to arrangement pairs {dst, src} with the shift element size taken from the source.

Verified: encode matches llvm-mc + decode recovers mnemonic + amount (sshll/sshll2/ushll across widths); arm64 check + 461 tests pass.
2026-06-16 02:22:37 -04:00
Brendan Punsky
e52953c7ff rexcode/arm64: NEON shift-by-immediate encode forms + encoder extension
First encoder-extension family. Adds Operand_Type VEC_SHIFT and Operand_Encoding NEON_SHL_IMM/NEON_SHR_IMM: the element-size marker bit sits in the entry's bits, the encoder packs the amount into the low immh:immb bits (left = shift; right = esize - shift, esize from the vector operand via vec_esize/form.ops[0]), and the decoder recovers esize from immh to compute the amount.

Adds 13 mnemonics (91 forms) via specgen: left SHL/SLI/SQSHLU/SQSHL, right SSHR/USHR/SRSHR/URSHR/SSRA/USRA/SRSRA/URSRA/SRI. specgen derives bits/mask empirically by varying registers AND the shift (canon = operand bits zero; other extreme sets all shift bits), so per-arrangement immh discrimination + the growing shift-field width fall out automatically.

Verified end-to-end: encode matches llvm-mc byte-for-byte AND decode recovers mnemonic + amount (sshr/shl/sli/ushr/srsra across B/H/S/D); arm64 check + 461 tests pass.

First of the encoder-extension phase ([[rexcode-encode-coverage]]); CCMP_IMM imm5@20:16 pattern generalizes here.
2026-06-16 02:19:03 -04:00
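A sketch of the shift-amount scheme described in the commit above: the element-size marker bit plus the low bits (left = shift, right = esize - shift), with the decoder recovering esize from the marker. It assumes the 7-bit immh:immb field sits at bits 22:16, as in the standard A64 layout; the function names are hypothetical.

```python
def encode_shift_field(esize: int, amount: int, right: bool) -> int:
    """Return immh:immb placed at bits 22:16 for a vector shift by immediate."""
    if esize not in (8, 16, 32, 64):
        raise ValueError("unsupported element size")
    if right:
        if not 1 <= amount <= esize:
            raise ValueError("right shift must be in 1..esize")
        low = esize - amount
    else:
        if not 0 <= amount < esize:
            raise ValueError("left shift must be in 0..esize-1")
        low = amount
    # esize itself is the marker bit (8 -> 0001xxx, ..., 64 -> 1xxxxxx).
    return (esize | low) << 16


def decode_shift_field(word: int, right: bool) -> tuple[int, int]:
    """Recover (esize, amount) from the highest set bit of immh:immb."""
    field = (word >> 16) & 0x7F
    if field < 8:
        raise ValueError("immh is zero, not a shift-by-immediate form")
    esize = 1 << (field.bit_length() - 1)
    low = field - esize
    return esize, (esize - low) if right else low
```

For example, a right shift by 3 on 32-bit elements yields the field value 0b0111101, matching the usual 2*esize - shift formulation.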
Brendan Punsky
ff3a1acdc7 rexcode/arm64: NEON FP widen/narrow (FCVTL/FCVTN/FCVTXN) encode forms
Adds 6 mnemonics (10 forms) via specgen: FCVTL/FCVTL2 (widen half->single, single->double), FCVTN/FCVTN2 and FCVTXN/FCVTXN2 (narrow), with FP16 (V_4H_FP16/V_8H_FP16) on the half side. Completes the register-only NEON phase. Verified: decode round-trips, arm64 check + 461 tests pass.
2026-06-15 21:39:29 -04:00
Brendan Punsky
77c0265df9 rexcode/arm64: NEON ABS/NEG + FP vector-convert encode forms
Adds 14 mnemonics (74 forms) via specgen: integer two-register ABS/NEG, and the FP vector-convert (register form) family FCVTAS/AU/MS/MU/NS/NU/PS/PU/ZS/ZU + SCVTF/UCVTF. SP/DP .NEON, half-precision .FP16; the fixed-point (#fbits) convert forms come later with the immediate phase.

Verified: decode round-trips incl. FP16 (abs/neg/fcvtzs.8h/scvtf), arm64 check + 461 tests pass.
2026-06-15 21:37:31 -04:00
Brendan Punsky
7cd39f1d0d rexcode/arm64: NEON FP two-register + FP across-lanes encode forms
Adds 16 mnemonics (72 forms) via specgen: FP two-register FABS/FNEG/FSQRT, FRINTA/I/M/N/P/X/Z, FRECPE/FRSQRTE; and FP across-lanes FMAXV/FMINV/FMAXNMV/FMINNMV (scalar S/H dst). SP/DP are .NEON, half-precision .FP16.

specgen per-form feature now scans all operands for an FP16 arrangement (handles scalar-dst across-lanes, where FP16 lives on the source). Verified: decode round-trips incl. FP16 (fabs/fsqrt.8h/frecpe/fmaxv), arm64 check + 461 tests pass.
2026-06-15 21:35:41 -04:00
Brendan Punsky
57fbe873d8 rexcode/arm64: NEON floating-point three-same encode forms (incl. FP16)
Adds 17 FP three-same mnemonics (85 forms) via specgen: FMAX/FMIN/FMAXNM/FMINNM, FMULX/FRECPS/FRSQRTS, FACGE/FACGT, FCMEQ/FCMGE/FCMGT (register form), FADDP/FMAXP/FMINP/FMAXNMP/FMINNMP. Single/double forms (2S/4S/2D) are .NEON; half-precision (4H/8H) use the distinct V_4H_FP16/V_8H_FP16 operand types and are tagged .FP16.

specgen gains: --mattr=+fullfp16, FP16 arrangement tokens, per-form feature tagging (derived from the operand arrangement), and per-family arrangement lists. Verified: decode round-trips incl. an FP16 form (fmax/fcmeq.4h/fmulx/faddp), arm64 check + 461 tests pass.
2026-06-15 21:33:25 -04:00
Brendan Punsky
824421853f rexcode/arm64: NEON across-lanes + pairwise-long encode forms
Adds 11 mnemonics (59 forms) via specgen: across-lanes reductions ADDV/SMAXV/SMINV/UMAXV/UMINV (scalar B/H/S dst) and SADDLV/UADDLV (widened scalar dst), plus pairwise-long SADDLP/UADDLP/SADALP/UADALP. The scalar destination packs into the same VD field, so still no encoder change.

specgen.emit generalized to accept scalar-register operand tokens (B/H/S/D) alongside vector arrangements. Verified: decode round-trips (ADDV/SADDLV/UMINV/SADDLP), arm64 check + 461 tests pass.
2026-06-15 21:27:40 -04:00
Brendan Punsky
7ebe042277 rexcode/arm64: NEON wide / narrow / XTN encode forms
Adds 24 more mixed-arrangement mnemonics (72 forms) via specgen: three-different wide (SADDW/UADDW/SSUBW/USUBW), narrowing-halving (ADDHN/SUBHN/RADDHN/RSUBHN), and two-register narrowing (XTN/SQXTN/UQXTN/SQXTUN), plus their high-half '2' variants. All register-only (VD/VN/VM or VD/VN), no encoder change.

specgen refactored to a general arrangement-tuple mechanism (uniform + long/wide/narrow/XTN share one emit path). Verified: decode round-trips (SADDW/ADDHN/XTN/SQXTUN2), arm64 check + 461 tests pass.
2026-06-15 21:22:07 -04:00
Brendan Punsky
f78a3a5573 rexcode/arm64: NEON three-different (long) encode forms
Adds 26 widening long mnemonics (72 forms) via specgen: SADDL/UADDL/SSUBL/USUBL, SMULL/UMULL, SMLAL/UMLAL/SMLSL/UMLSL, SQDMULL/SQDMLAL/SQDMLSL and their high-half '2' variants. Destination arrangement is wider than the sources (Vd.8H, Vn.8B, Vm.8B; the '2' forms read the high half). Encoding stays VD/VN/VM, so no encoder change.

specgen gains a mixed-arrangement THREE_DIFF shape (low/high source-half pairs). Verified: decode round-trips (SMULL/SADDL2/SQDMULL/UMLAL), arm64 check + 461 tests pass.
2026-06-15 21:16:56 -04:00
Brendan Punsky
00b666bbc0 rexcode/arm64: NEON pairwise + variable-shift encode forms
Adds 9 register-only three-same mnemonics (59 forms) via specgen: ADDP/SMAXP/SMINP/UMAXP/UMINP (pairwise) and SSHL/USHL/SRSHL/URSHL (per-lane variable shift). Verified: decode round-trips (ADDP/SSHL/SMAXP/URSHL), arm64 check + 461 tests pass.

Skipped the already-implemented logical/compare/mul forms (AND_V/ORR_V/EOR_V/BIC_V/ORN_V/BSL/BIT/BIF/CMEQ/CMGT/CMHI/MUL_V) to avoid duplicate keys.
2026-06-15 13:01:26 -04:00
Brendan Punsky
d83065e3b8 rexcode/arm64: NEON two-register-misc encode forms
Adds 10 Advanced-SIMD two-register-misc mnemonics (34 forms across valid arrangements) via specgen: NOT/RBIT/REV16/REV32/REV64/CLS/CLZ/CNT/URECPE/URSQRTE. llvm-mc filters illegal arrangements (CNT/NOT only 8B/16B, URECPE/URSQRTE only 2S/4S, ...).

specgen.lua generalized to a SHAPE table (THREE_SAME / TWO_SAME), so adding a family is one row. Verified: decode round-trips (NOT/REV64/CNT/URECPE), arm64 check + 461 tests pass.
2026-06-15 12:57:37 -04:00
Brendan Punsky
e21fa59733 rexcode/arm64: NEON three-same (integer) encode forms + specgen tool
Adds 25 Advanced-SIMD three-same integer mnemonics (153 forms across arrangements) to ENCODING_TABLE: SHADD/UHADD/SHSUB/UHSUB/SRHADD/URHADD, SQADD/UQADD/SQSUB/UQSUB, SMAX/UMAX/SMIN/UMIN, SABD/UABD/SABA/UABA, MLA/MLS, CMGE/CMHS/CMTST, SQDMULH/SQRDMULH.

Introduces tablegen/specgen.lua: compact specs (mnemonic + llvm name + arrangements) -> ENCODING_TABLE blocks, with bits taken from llvm-mc (the oracle) and mask derived empirically (vary registers 0..31). Invalid arrangements are auto-detected via llvm-mc and skipped. Output fills the SPECGEN:BEGIN..END region of encoding_table.odin in place; the hand-written core is untouched.

Verified: decode round-trips for SHADD/SQADD/CMGE/SQRDMULH; arm64 check + 461 tests pass; builders auto-generate (780 -> 805). Caveat: NEON builders currently collapse arrangements (one Register param per V operand) so inst_<mnem> exposes only the first arrangement -- an arrangement-aware builder-gen pass follows.

Author: Brendan Punsky (machine git config user.name is the login 'Flāvius').
2026-06-15 12:52:10 -04:00
Brendan Punsky
7b588d0818 rexcode/arm64: implement CCMP/CCMN register-form encode forms
First entries of the encode-coverage effort. Adds CCMP_REG/CCMN_REG (W/X) to ENCODING_TABLE with llvm-mc-verified bit patterns; the table metaprogram regenerates the encode/decode blobs and the typed builders auto-generate (inst_ccmp_reg/inst_ccmn_reg).

Verified: encode matches llvm-mc (CCMP X1,X2,#3,EQ=0xFA420023; CCMN W5,W6,#7,NE=0x3A4610A7), decode round-trips, arm64 check+tests pass. Needs no encoder change (reuses RN/RM/NZCV_FIELD/COND_HI). The imm5 forms (immediate at bits 20:16) need a new Operand_Encoding and follow separately.

Workflow proven: llvm-mc as the encoding oracle -> SoT entry -> regen -> builder auto-generates -> verify.
2026-06-15 12:52:10 -04:00
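The two llvm-mc-verified words quoted in the CCMP/CCMN commit above can be reproduced with a short sketch of the register-form layout (Rm at 20:16, cond at 15:12, Rn at 9:5, nzcv at 3:0). The function name and the condition table are hypothetical helpers for illustration, not rexcode's API.

```python
COND = {"EQ": 0, "NE": 1}  # only the two conditions used in the example


def encode_ccmp_reg(is_ccmn: bool, is_64bit: bool,
                    rn: int, rm: int, nzcv: int, cond: str) -> int:
    """Encode CCMP/CCMN (register) from its operand fields."""
    if not (0 <= rn < 32 and 0 <= rm < 32 and 0 <= nzcv < 16):
        raise ValueError("operand out of range")
    word = 0x3A400000                  # CCMN Wn, Wm, #0, EQ with fields zeroed
    if not is_ccmn:
        word |= 1 << 30                # op bit selects CCMP
    if is_64bit:
        word |= 1 << 31                # sf bit selects X registers
    return word | (rm << 16) | (COND[cond] << 12) | (rn << 5) | nzcv


ccmp = encode_ccmp_reg(False, True, 1, 2, 3, "EQ")   # CCMP X1, X2, #3, EQ
ccmn = encode_ccmp_reg(True, False, 5, 6, 7, "NE")   # CCMN W5, W6, #7, NE
```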
Brendan Punsky
47fc72e0ba rexcode: 100% generated mnemonic-builder coverage; drop hand-written collisions
Every mnemonic with an encode form now has a generated inst_<mnem>/emit_<mnem> overload group. The per-arch generators map ALL operand types — nothing is skipped: arm64 gains shifted/extended registers (multi-param via op_shifted/op_extended), SVE Z-regs + predicates, SME tile/slice, NEON arrangements/lanes, bitmask/sysreg/pattern immediates and condition codes (427 -> 777 mnemonics); arm32 gains shifted/register-shifted regs, register lists, NEON lanes and all encoded-immediate subclasses (479 -> 592); x86 gains m80 and descriptor-table memory operands — FBLD/FBSTP, LGDT/SGDT/LIDT/SIDT, FLD/FSTP, far-indirect JMP/CALL, BOUND (1167 -> 1175).

Mnemonic-specific builders are now fully generated, not hand-written: deleted the hand-written helpers the generated groups collided with — riscv inst_jal/inst_jalr, arm64 inst_b_cond/inst_cbz/inst_tbz/inst_csel, mos6502 inst_tst — and let the generators own those names (arm64 also gains inst_cbnz/tbnz/csinc/csinv/csneg). Updated the affected test call-sites. The generic operand-shape helpers (inst_r_r, inst_r_r_i, inst_ldst, ...) remain as delegation targets.

Decode-only mnemonics with no encode form are correctly left without builders. ppc/ppc_vle/rsp/mos65816 were already complete.

All 10 ISAs: structure + compile + tests pass; generators idempotent.
2026-06-15 12:52:10 -04:00
Brendan Punsky
1b72d425d4 rexcode: add typed per-mnemonic builders for all arches; CWD-independent regen
Add generated mnemonic_builders.odin (inst_<mnem>/emit_<mnem> typed overload sets) for arm32, arm64, mips, riscv, ppc, ppc_vle, rsp, mos6502 and mos65816, matching the existing x86 builders. Each is produced by a per-arch tools/gen_mnemonic_builders.odin that walks ENCODE_FORMS and maps operand types to typed params + op_* constructors.

Anchor every generator's output via #directory so regeneration is CWD-independent; previously the bare "mnemonic_builders.odin" path wrote to the current directory and misfired when run from the repo root.

Wire a --builders task into build.lua (folded into 'all', covered by --idempotent, enforced by the structural invariants) and document it in the README.
2026-06-15 12:52:10 -04:00
gingerBill
693fc1ec18 Allow for struct #raw_union #packed 2026-06-15 14:42:38 +01:00
gingerBill
182f234ed2 Minimize rsp Instruction and Operand 2026-06-15 14:37:25 +01:00
gingerBill
4f96105520 Minimize riscv Instruction and Operand 2026-06-15 14:36:10 +01:00
gingerBill
a839f5e833 Minimize ppc_vle Instruction and Operand 2026-06-15 14:35:34 +01:00
gingerBill
7a17144aa1 Minimize ppc Instruction and Operand 2026-06-15 14:34:45 +01:00
gingerBill
b006a8853e Minimize mos65816 Instruction and Operand 2026-06-15 14:32:42 +01:00
gingerBill
59c4292224 Minimize mos6502 Instruction and Operand 2026-06-15 14:30:32 +01:00
gingerBill
406dfbe86d Minimize mips Instruction and Operand 2026-06-15 14:29:14 +01:00
gingerBill
6527f90181 Minimize arm64 Instruction and Operand 2026-06-15 14:27:49 +01:00
gingerBill
7aaef31bb3 Correct sizes of arm32 Instruction and Operand 2026-06-15 14:24:05 +01:00
gingerBill
f895e96bde Add benchmark flag for x86 tests to just test that 2026-06-15 14:17:11 +01:00
gingerBill
2dd262ea10 x86: improve benchmark test; do not run the code on Windows since it relies on SysV 2026-06-15 14:14:38 +01:00