Commit Graph

8 Commits

Author SHA1 Message Date
Brendan Punsky
69157b7ec5 rexcode/arm64: SME ADDHA/ADDVA (ZA outer-sum accumulate)
ADDHA/ADDVA ZAda.S, Pn/m, Pm/m, Zn.S via a new ZA_TILE_LOW encoding
(accumulator tile at bits 2:0; Pn at 12:10, Pm at 15:13, Zn at 9:5).
Byte-exact vs llvm-mc and decode-clean across tile/predicate/Zn fields.

The other 11 missing SME enum names (SME_LD1*/ST1*_ZA, SME_MOVA_TO_Z/ZA)
are redundant aliases of the already-implemented SME_LD1*/ST1*_TILE and
SME_MOVA_*_FROM_* forms -- adding duplicate encodings collides in the
decode table (broke a roundtrip test), so they are intentionally left to
the existing canonical forms. 461 tests green.
2026-06-18 00:14:21 -04:00
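
The field layout described in the ADDHA/ADDVA commit above can be sketched as a small pack/unpack pair. This is an illustration in Python, not the rexcode implementation: the bit positions come from the commit message, while the function names and the `base` argument (standing in for the static opcode bits) are hypothetical.

```python
# Field positions as stated in the commit message: name -> (shift, width).
FIELDS = {
    "za": (0, 3),    # accumulator tile, bits 2:0
    "zn": (5, 5),    # Zn, bits 9:5
    "pn": (10, 3),   # Pn, bits 12:10
    "pm": (13, 3),   # Pm, bits 15:13
}


def pack_za_tile_low(base, za, pn, pm, zn):
    """OR the four operand fields into `base` (hypothetical static opcode bits)."""
    values = {"za": za, "zn": zn, "pn": pn, "pm": pm}
    word = base
    for name, (shift, width) in FIELDS.items():
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
    return word


def unpack_za_tile_low(word):
    """Recover the operand fields from an encoded word."""
    return {
        name: (word >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }
```

Packing and then unpacking over the full range of each field is one way to reproduce the "decode-clean across tile/predicate/Zn fields" check in miniature.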
Brendan Punsky
68aac263d0 rexcode/arm64: SVE FFR/BRKN/CPY/EXT/MOV aliases (10 more, SVE 47/48)
FFR ops (SETFFR/RDFFR/WRFFR) and BRKN (destructive, Pdm re-packs Pd) via
specgen; CPY (predicated from GPR), EXT (destructive, imm8 split via new
SVE_EXT_IMM), MOV-predicated (=SEL with Zm=Zd, via ZD_ZM_DUP), and the
predicate aliases NOT/MOVS/MOV (EOR/ORR/AND with a duplicated predicate
field, via PG4_PM_DUP/PN_PM_DUP/PN_PG_PM_DUP). All byte-exact vs llvm-mc;
the predicate aliases decode to their canonical base op (identical bytes,
as expected). 461 tests green.

(SVE_XAR_Z deferred: its tsz:imm3 shift field does not follow the NEON
immh:immb scheme and needs a bespoke esize-from-Z encoder.)
2026-06-18 00:09:21 -04:00
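
Two ideas from the SVE commit above, a split immediate and a duplicated register field, can be sketched as follows. This is a rough Python illustration only. The bit positions (imm8 high part at 20:16, low part at 12:10, Zd at 4:0, Zm at 20:16) are assumptions based on my reading of the Arm architecture manual and are not stated in the commit; the function names are hypothetical.

```python
def split_imm8(imm8, hi_shift=16, lo_shift=10):
    """Place the top 5 bits of imm8 at hi_shift and the low 3 bits at lo_shift."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 out of range")
    return ((imm8 >> 3) << hi_shift) | ((imm8 & 0x7) << lo_shift)


def join_imm8(word, hi_shift=16, lo_shift=10):
    """Inverse of split_imm8: reassemble imm8 from its two fields."""
    return (((word >> hi_shift) & 0x1F) << 3) | ((word >> lo_shift) & 0x7)


def dup_register(reg, shifts=(0, 16)):
    """Write one 5-bit register number into every field position in `shifts`.

    This mirrors how an alias such as MOV (SEL with Zm=Zd) takes a single
    operand but fills two encoding fields with it.
    """
    if not 0 <= reg <= 31:
        raise ValueError("register out of range")
    word = 0
    for shift in shifts:
        word |= reg << shift
    return word
```

Because a duplicated-field alias produces the same bytes as its base instruction, a decoder that sees only the bytes has no way to prefer the alias, which matches the commit's note that the predicate aliases decode to their canonical base op.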
Brendan Punsky
8006b5f7e2 rexcode/arm64: NEON MOVI/MVNI + FMOV scalar/vector immediate forms
MOVI (8B/16B/4H/8H/2S/4S/2D) and MVNI (4H/8H/2S/4S) via specgen (imm8 in
abc:defgh, cmode/op/Q static per arrangement; .2D probed with all-ones
since its asm immediate is the replicated 64-bit value). FMOV_IMM (scalar
Sd/Dd/Hd, 8-bit float at 20:13 via new FMOV_SCALAR_IMM encoding) and
FMOV_V_IMM (Vd.<2S|4S|2D|4H|8H>, fimm8 in abc:defgh, cmode=1111) hand-
written -- canonical bits with the imm8 fields zeroed (the live float
example would otherwise bake operand bits into the static pattern). All
14 representative forms byte-exact vs llvm-mc and decode-clean; 461 tests
green. (LSL/MSL-shifted MOVI/MVNI variants share the operand signature
and are omitted.)
2026-06-17 23:47:45 -04:00
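
The immediate handling in the MOVI/FMOV commit above can be illustrated with a short Python sketch. The 20:13 position for the scalar FMOV imm8 is from the commit message. The abc field at 18:16, the defgh field at 9:5, and the 8-bit float expansion rule are assumptions taken from my understanding of the Arm architecture manual; all function names are hypothetical.

```python
def pack_abc_defgh(imm8):
    """Split imm8 into a:b:c (assumed bits 18:16) and d:e:f:g:h (assumed bits 9:5)."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 out of range")
    abc = (imm8 >> 5) & 0x7
    defgh = imm8 & 0x1F
    return (abc << 16) | (defgh << 5)


def pack_fmov_scalar_imm(imm8):
    """Place imm8 contiguously at bits 20:13 (position stated in the commit)."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 out of range")
    return imm8 << 13


def expand_fp_imm8(imm8):
    """Expand the 8-bit float immediate to a Python float.

    Layout assumed: bit 7 = sign, bits 6:4 = exponent (bit 6 inverted,
    biased by 3), bits 3:0 = fraction with an implicit leading 1.
    """
    sign = -1.0 if imm8 & 0x80 else 1.0
    b = (imm8 >> 6) & 1
    cd = (imm8 >> 4) & 0x3
    exponent = (((b ^ 1) << 2) | cd) - 3
    fraction = imm8 & 0xF
    return sign * (16 + fraction) / 16 * 2.0 ** exponent
```

Under these assumptions, imm8 = 0x70 expands to 1.0 and imm8 = 0x00 expands to 2.0. Every imm8 value sets operand bits, which is why the commit zeroes the imm8 fields in the static pattern rather than taking them from a live float example.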
Brendan Punsky
aabcdd41b6 rexcode/arm64: CCMP/CCMN-imm, HINT, MSR-imm, USDOT encode forms
Conditional compare immediate (CCMP_IMM/CCMN_IMM: imm5 at 20:16 via a new
IMM5_HI encoding, bit 11 set), HINT #imm7, MSR <pstatefield>,#imm (new
MSR_PSTATE encoding placing op1 at 18:16 / op2 at 7:5, CRm via the shared
BARRIER_FIELD), and USDOT (I8MM unsigned-by-signed dot product, .2S/.4S).
Hand-written into the core (outside the specgen region). All forms
byte-exact vs llvm-mc and decode-clean; 461 tests green.
2026-06-17 23:34:10 -04:00
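
A minimal Python sketch of the two new encodings in the commit above follows. The commit states that imm5 sits at 20:16 with bit 11 set, and that MSR places op1 at 18:16 and op2 at 7:5. Everything else is an assumption from my reading of the Arm architecture manual: the static base value, cond at 15:12, Rn at 9:5, nzcv at 3:0, and CRm at 11:8. The function names are hypothetical.

```python
# Assumed static bits for the 32-bit CCMP (immediate) form, including bit 11.
CCMP_W_IMM_BASE = 0x7A400800


def encode_ccmp_imm(rn, imm5, nzcv, cond, base=CCMP_W_IMM_BASE):
    """Pack a conditional-compare-immediate word from its operand fields."""
    for name, value, limit in (("rn", rn, 31), ("imm5", imm5, 31),
                               ("nzcv", nzcv, 15), ("cond", cond, 15)):
        if not 0 <= value <= limit:
            raise ValueError(f"{name} out of range")
    return base | (imm5 << 16) | (cond << 12) | (rn << 5) | nzcv


def pack_msr_pstate_fields(op1, op2, crm):
    """Pack only the operand fields of MSR <pstatefield>, #imm (no static bits)."""
    if not (0 <= op1 <= 7 and 0 <= op2 <= 7 and 0 <= crm <= 15):
        raise ValueError("field out of range")
    return (op1 << 16) | (crm << 8) | (op2 << 5)
```

Any value produced this way would still need checking against llvm-mc, as the commit does, before being trusted as byte-exact.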
Brendan Punsky
06eb3de6a2 rexcode/arm64: NEON copy/permute (MOV/MVN/DUP/INS/EXT) encode forms
MOV_V (ORR alias: source feeds both Vn and Vm via a new VN_VM_DUP
encoding), MVN_V (NOT alias, plain 2-register), DUP_V (element form
Vd.T,Vn.Ts[i] and general form Vd.T,Wn/Xn), INS (element-to-element and
from-GPR), EXT_V (imm4 byte index). Adds a VEC_INDEX operand type plus
NEON_IDX5/NEON_IDX4/NEON_EXT_IDX encodings: the element-size marker rides
in the entry bits, the lane index drives the bits above it, and the
decoder recovers the element size from imm5's marker.

Element size now rides in op.size (B=1/H=2/S=4/D=8) via op_v_elem_b/h/s/d
so the matcher can disambiguate DUP/INS element forms; the builder
generator maps V_ELEM_* to those constructors. specgen derives the mask
by varying registers and each index field to its max -- the GPR-source
forms vary Vd and Rn independently (Rn 31 = wzr/xzr) so the low bit of
each field toggles. All 19 representative forms byte-exact vs llvm-mc and
decode-clean; 461 tests green. (TBL/TBX register-list forms deferred.)
2026-06-17 23:23:44 -04:00
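
The imm5 scheme described in the commit above (a size marker bit with the lane index in the bits above it) can be sketched in Python as below. The element sizes use the same B=1/H=2/S=4/D=8 byte convention the commit gives for op.size. The function names are hypothetical, and a 128-bit vector is assumed when bounding the lane index.

```python
# Element size in bytes -> position of the marker bit inside imm5.
ELEM_LOG2 = {1: 0, 2: 1, 4: 2, 8: 3}


def encode_imm5(elem_bytes, index):
    """Build imm5: marker bit for the element size, lane index above it."""
    log2 = ELEM_LOG2[elem_bytes]
    max_index = 16 // elem_bytes - 1      # lanes in an assumed 128-bit vector
    if not 0 <= index <= max_index:
        raise ValueError("lane index out of range")
    return (index << (log2 + 1)) | (1 << log2)


def decode_imm5(imm5):
    """Recover (elem_bytes, index) from the lowest set marker bit of imm5."""
    for elem_bytes, log2 in ELEM_LOG2.items():   # checks bit 0 first, then 1, 2, 3
        if imm5 & (1 << log2):
            return elem_bytes, imm5 >> (log2 + 1)
    raise ValueError("no element-size marker in imm5")
```

For example, an S element at lane 3 gives imm5 = 0b11100, and decoding finds the marker at bit 2 and reads the index from the two bits above it.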
Brendan Punsky
e52953c7ff rexcode/arm64: NEON shift-by-immediate encode forms + encoder extension
First encoder-extension family. Adds Operand_Type VEC_SHIFT and Operand_Encoding NEON_SHL_IMM/NEON_SHR_IMM: the element-size marker bit sits in the entry's bits, the encoder packs the amount into the low immh:immb bits (left = shift; right = esize - shift, esize from the vector operand via vec_esize/form.ops[0]), and the decoder recovers esize from immh to compute the amount.
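
The immh:immb packing described above can be sketched in Python, treating immh:immb as one 7-bit field. This follows the rule in the commit message (marker bit for the element size, low bits hold shift for left shifts and esize - shift for right shifts); the function names are hypothetical and this is not the rexcode code.

```python
def encode_shift(esize, amount, right):
    """Return the 7-bit immh:immb value for a shift-by-immediate."""
    if esize not in (8, 16, 32, 64):
        raise ValueError("unsupported element size")
    if right:
        if not 1 <= amount <= esize:
            raise ValueError("right shift must be in 1..esize")
        low = esize - amount
    else:
        if not 0 <= amount < esize:
            raise ValueError("left shift must be in 0..esize-1")
        low = amount
    return esize | low          # marker bit ORed with the low amount bits


def decode_shift(immh_immb, right):
    """Recover (esize, amount): the highest set bit of immh selects esize."""
    immh = immh_immb >> 3
    if immh == 0:
        raise ValueError("immh == 0 is not a shift-by-immediate form")
    esize = 8 << (immh.bit_length() - 1)
    low = immh_immb - esize
    return esize, (esize - low) if right else low
```

Note how the number of bits available for the amount grows with the element size, which is the "growing shift-field width" the following paragraph mentions.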

Adds 13 mnemonics (91 forms) via specgen: left SHL/SLI/SQSHLU/SQSHL, right SSHR/USHR/SRSHR/URSHR/SSRA/USRA/SRSRA/URSRA/SRI. specgen derives bits/mask empirically by varying registers AND the shift (canon = operand bits zero; other extreme sets all shift bits), so per-arrangement immh discrimination + the growing shift-field width fall out automatically.
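
The empirical bits/mask derivation described above can be illustrated with a small Python sketch. The idea is to assemble the same form several times with operand fields at different extremes, then treat every bit that ever changes as an operand bit. Here `toy_assemble` is a hypothetical stand-in for an external assembler such as llvm-mc, and its base value and field positions are made up for the example.

```python
TOY_BASE = 0x0F005400   # hypothetical static bits, for illustration only


def toy_assemble(rd, rn, esize, shift):
    """Stand-in for the real assembler: packs rd, rn and a left-shift field."""
    return TOY_BASE | ((esize | shift) << 16) | (rn << 5) | rd


def derive_bits_and_mask(probes, width=32):
    """Derive (bits, mask) from probe encodings of one form.

    probes[0] is the canonical encoding (all operand bits zero); the other
    probes each push one operand field to its maximum. Any bit that differs
    from the canonical word is treated as an operand bit and masked out.
    """
    canon = probes[0]
    varying = 0
    for word in probes[1:]:
        varying |= word ^ canon
    mask = ~varying & ((1 << width) - 1)
    return canon & mask, mask


probes_8bit = [
    toy_assemble(0, 0, 8, 0),    # canonical: operand bits zero
    toy_assemble(31, 0, 8, 0),   # vary the destination register
    toy_assemble(0, 31, 8, 0),   # vary the source register
    toy_assemble(0, 0, 8, 7),    # vary the shift to its maximum for esize 8
]
bits, mask = derive_bits_and_mask(probes_8bit)
```

In this toy setup the element-size marker bit never changes across the probes, so it lands in the static `bits` while only the three low shift bits are masked out.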

Verified end-to-end: encode matches llvm-mc byte-for-byte AND decode recovers mnemonic + amount (sshr/shl/sli/ushr/srsra across B/H/S/D); arm64 check + 461 tests pass.

First of the encoder-extension phase ([[rexcode-encode-coverage]]); CCMP_IMM imm5@20:16 pattern generalizes here.
2026-06-16 02:19:03 -04:00
Flāvius
a4f08f8307 Load rexcode encode/decode tables from committed binary blobs
Each ISA's hand-written ENCODING_TABLE (the single source of truth) now lives
in a per-arch tablegen/ metaprogram that flattens it and serializes committed
binary blobs; the library #loads those into @(rodata) at compile time rather
than compiling a table body. No arch keeps encoding_table.odin or
decoding_tables.odin -- only a generated tables.odin loader and tables/*.bin.

* Two-stage, type-checked pipeline: tablegen Stage A emits human-readable
  generated Odin, which compiles and serializes the blobs in Stage B.
* encode() goes through encoding_forms(m); decoders are unchanged apart from
  x86's flattened 2-D index. Decode tables are byte-identical to the old ones.
* build.lua: a LuaJIT driver for the metaprograms, validations, and tests,
  with cross-platform gating and a clear report.
* Docs refreshed; the obsolete forward-looking plan in cross_arch_design.md
  trimmed to what was actually built.
* Attribution headers added to all rexcode source files; the generators emit
  them so generated files keep them.
2026-06-15 07:43:29 -04:00
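
The flatten, serialize, and load flow described in the commit above can be sketched in Python. This is only an analogy for the pipeline's shape: the real project uses Odin metaprograms and compile-time `#load`, and the blob layout below (a count header followed by fixed-size bits/mask pairs) is a hypothetical format chosen for the example.

```python
import struct

HEADER = struct.Struct("<I")    # little-endian entry count
ENTRY = struct.Struct("<II")    # little-endian (bits, mask) pair


def serialize_table(entries):
    """Flatten a list of (bits, mask) pairs into one binary blob."""
    body = b"".join(ENTRY.pack(bits, mask) for bits, mask in entries)
    return HEADER.pack(len(entries)) + body


def load_table(blob):
    """Parse a blob produced by serialize_table back into (bits, mask) pairs."""
    (count,) = HEADER.unpack_from(blob, 0)
    expected = HEADER.size + count * ENTRY.size
    if len(blob) != expected:
        raise ValueError(f"blob is {len(blob)} bytes, expected {expected}")
    return [
        ENTRY.unpack_from(blob, HEADER.size + i * ENTRY.size)
        for i in range(count)
    ]
```

A fixed-size record layout like this makes it straightforward to compare a regenerated blob byte for byte against the committed one, similar in spirit to the commit's check that the decode tables are byte-identical to the old ones.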
gingerBill
d6ae77b67e core:rexcode
2026-06-14 16:30:18 +01:00