Commit Graph

9 Commits

Author SHA1 Message Date
Brendan Punsky
a63fb51fdd rexcode/arm32: MVE VMLSV/VMLSVA (correct 3-bit Q regs); drop placeholders
Implement VMLSV/VMLSVA (MVE multiply-subtract reduce) properly: new
VN_Q_MVE (Qn at 19:17) and VM_Q_MVE (Qm at 3:1) encodings -- the actual
3-bit MVE Q fields -- with Rd at 15:12 (RDLO_A32). The earlier collision
was from reusing the 4-bit VN_Q (19:16) and RD_T32 (11:8), which place
the fields wrong; byte-exact vs llvm-mc now with distinct Qn/Qm/Rd.

Drop three placeholder/redundant enum entries: VRINT and VPRINT (not real
instructions -- llvm rejects bare 'vrint'; VPRINT is a printf-like debug
pseudo-op), and VRSHL_MVE (the author's own comment marks it a
placeholder; 'vrshl q,q,q' already decodes via VRSHL's MVE form). 600
tests green, verify matches llvm-mc.
2026-06-18 01:58:19 -04:00
Brendan Punsky
239dea4f55 rexcode/arm32: MVE VHCADD (saturating halving complex add) + VCMLA
New MVE_ROT_HCADD (#90/#270 at bit12) and MVE_ROT_CMLA (#0/90/180/270 at
bits 24:23) rotation encodings -- the rotation degrees round-trip
properly (unlike the existing FCMA VCMLA which leaves it unencoded). One
form each with the element-size bits left variable (MVE convention).
Verify round-trips; all rotations byte-exact vs llvm-mc; 600 tests green.

(VMLSV/VMLSVA reduce ops deferred: their format decode-collides with
other MVE encodings given the 4-bit VN_Q vs MVE's 3-bit Qn.)
2026-06-18 01:47:44 -04:00
Brendan Punsky
55463b6719 rexcode/arm32: VMOV (ARM core register to scalar) Dd[lane], Rt
New VMOV_LANE_8/16/32 encodings: Dd at bits 19:16+bit7, lane bits per
element size (.8 = bit21:bit6:bit5 with bit22 size marker; .16 =
bit21:bit6 with bit5 marker; .32 = bit21). Verify round-trips all three
sizes; spot-checked .8 byte-exact incl. max lane; 600 tests green.
2026-06-18 01:34:48 -04:00
Brendan Punsky
5df81b5117 rexcode/arm32: VQDMULH/VQRDMULH by-scalar-lane
New NEON_VM_SCALAR16/32 encodings for the Dm[lane] scalar operand: .16
places Dm in D0..D7 (bits 2:0) with the lane split bit5:bit3, .32 places
Dm in D0..D15 (bits 3:0) with the lane at bit5. VQDMULH_LANE and
VQRDMULH_LANE across .s16/.s32, D and Q destinations (8 forms). Verify
round-trips; spot-checked byte-exact incl. max register/lane and
decode-clean; 600 tests green.
2026-06-18 01:29:19 -04:00
Brendan Punsky
acc14864f3 rexcode/arm32: DCPS1/DCPS2/DCPS3 (debug change PE state)
Fixed T32 encodings (0xF78F8001/2/3), no operands. Verify round-trips;
600 tests green.
2026-06-18 01:25:51 -04:00
Brendan Punsky
b2b14998f7 rexcode/arm32: VRSRA, VRECPE_F/VRSQRTE_F, VPADD_F, VCVTR
VRSRA (NEON rounding shift-right-accumulate, D/Q, mirrors VSRA's raw
imm6 convention), VRECPE_F/VRSQRTE_F (FP reciprocal/rsqrt estimate, D/Q),
VPADD_F (FP pairwise add, f32/f16), and VCVTR (VFP convert-to-integer
using the FPSCR rounding mode; s32/u32 from f32 and f64). Hand-written
mirroring the existing VSRA/VRECPE/VPADD/VCVT forms. Built-in llvm
round-trip verify passes; spot-checked byte-exact; 600 tests green.
2026-06-18 01:22:12 -04:00
Brendan Punsky
59750926d9 rexcode/arm32: unprivileged (translate) post-indexed loads/stores
LDRT/LDRBT/STRT/STRBT (imm12) and LDRHT/STRHT/LDRSBT/LDRSHT (imm8 split):
each is the corresponding post-indexed load/store with the W bit (21)
set. Hand-written, reusing the existing MEM_POST_INDEX encoding. All 8
byte-exact vs llvm-mc and decode-clean; 600 tests green.
2026-06-18 01:17:34 -04:00
Brendan Punsky
6fd233f041 rexcode/arm32: NEON long/wide/compare/shift encode forms (specgen)
New arm32 specgen (llvm-mc --triple=armv8a --mattr=+neon as the bits
oracle, empirical masks): VADDL/VSUBL/VABAL/VABDL (Qd,Dn,Dm) and
VADDW/VSUBW (Qd,Qn,Dm) across s/u 8/16/32; the compare aliases
VCLE/VCLT (= VCGE/VCGT with Vn/Vm swapped) and VACLE/VACLT (= VACGE/VACGT
swapped, f32); and VQRSHL shift-by-vector. 84 forms over 11 mnemonics.
Built-in llvm round-trip verify passes; spot-checked byte-exact with
distinct Q/D registers; 600 tests green.
2026-06-18 01:15:22 -04:00
Flāvius
a4f08f8307 Load rexcode encode/decode tables from committed binary blobs
Each ISA's hand-written ENCODING_TABLE (the single source of truth) now lives
in a per-arch tablegen/ metaprogram that flattens it and serializes committed
binary blobs; the library #loads those into @(rodata) at compile time rather
than compiling a table body. No arch keeps encoding_table.odin or
decoding_tables.odin -- only a generated tables.odin loader and tables/*.bin.

* Two-stage, type-checked pipeline: tablegen Stage A emits human-readable
  generated Odin, which compiles and serializes the blobs in Stage B.
* encode() goes through encoding_forms(m); decoders are unchanged apart from
  x86's flattened 2-D index. Decode tables are byte-identical to the old ones.
* build.lua: a LuaJIT driver for the metaprograms, validations, and tests,
  with cross-platform gating and a clear report.
* Docs refreshed; the obsolete forward-looking plan in cross_arch_design.md
  trimmed to what was actually built.
* Attribution headers added to all rexcode source files; the generators emit
  them so generated files keep them.
2026-06-15 07:43:29 -04:00