Commit Graph

60 Commits

Author SHA1 Message Date
Brendan Punsky
55a141be4f rexcode/arm64: NEON widening left-shift (SSHLL/USHLL) encode forms
Adds SSHLL/SSHLL2/USHLL/USHLL2 (12 forms) via specgen, reusing NEON_SHL_IMM (left shifts need no esize; the size marker is in bits). specgen's shift shape generalized to arrangement pairs {dst, src} with the shift element size taken from the source.

Verified: encode matches llvm-mc + decode recovers mnemonic + amount (sshll/sshll2/ushll across widths); arm64 check + 461 tests pass.
2026-06-16 02:22:37 -04:00
Brendan Punsky
e52953c7ff rexcode/arm64: NEON shift-by-immediate encode forms + encoder extension
First encoder-extension family. Adds Operand_Type VEC_SHIFT and Operand_Encoding NEON_SHL_IMM/NEON_SHR_IMM: the element-size marker bit sits in the entry's bits, the encoder packs the amount into the low immh:immb bits (left = shift; right = esize - shift, esize from the vector operand via vec_esize/form.ops[0]), and the decoder recovers esize from immh to compute the amount.

Adds 13 mnemonics (91 forms) via specgen: left SHL/SLI/SQSHLU/SQSHL, right SSHR/USHR/SRSHR/URSHR/SSRA/USRA/SRSRA/URSRA/SRI. specgen derives bits/mask empirically by varying registers AND the shift (canon = operand bits zero; other extreme sets all shift bits), so per-arrangement immh discrimination + the growing shift-field width fall out automatically.

Verified end-to-end: encode matches llvm-mc byte-for-byte AND decode recovers mnemonic + amount (sshr/shl/sli/ushr/srsra across B/H/S/D); arm64 check + 461 tests pass.

First of the encoder-extension phase ([[rexcode-encode-coverage]]); CCMP_IMM imm5@20:16 pattern generalizes here.
2026-06-16 02:19:03 -04:00
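
The immh:immb packing described in e52953c7ff can be sketched as below. This is a minimal Python illustration under stated assumptions, not the rexcode encoder: `pack_shift` and `unpack_shift` are hypothetical names, and the field is modeled as the 7-bit immh:immb value in which the element-size marker bit is already set and only the low bits carry the amount.

```python
def pack_shift(esize: int, shift: int, right: bool) -> int:
    """Return the 7-bit immh:immb field (hypothetical helper, not rexcode API)."""
    if esize not in (8, 16, 32, 64):
        raise ValueError("esize must be 8, 16, 32 or 64")
    if right:
        if not 1 <= shift <= esize:
            raise ValueError("right shift must be in 1..esize")
        low = esize - shift          # right = esize - shift
    else:
        if not 0 <= shift < esize:
            raise ValueError("left shift must be in 0..esize-1")
        low = shift                  # left = shift
    # The marker bit equals esize itself; the amount occupies the bits below it.
    return esize | low


def unpack_shift(field: int, right: bool) -> tuple[int, int]:
    """Recover (esize, shift) from the field, as the decoder side would."""
    if not 8 <= field <= 127:
        raise ValueError("field has no element-size marker bit")
    esize = 1 << (field.bit_length() - 1)   # highest set bit is the marker
    low = field - esize
    shift = esize - low if right else low
    return esize, shift


if __name__ == "__main__":
    # A right shift by 3 on 32-bit elements, e.g. sshr on a .4S arrangement.
    field = pack_shift(32, 3, right=True)
    print(bin(field), unpack_shift(field, right=True))
```

Because the marker is the highest set bit, the shift field widens with the element size (3 bits for B, up to 6 bits for D), which is the "growing shift-field width" the commit mentions.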
Brendan Punsky
ff3a1acdc7 rexcode/arm64: NEON FP widen/narrow (FCVTL/FCVTN/FCVTXN) encode forms
Adds 6 mnemonics (10 forms) via specgen: FCVTL/FCVTL2 (widen half->single, single->double), FCVTN/FCVTN2 and FCVTXN/FCVTXN2 (narrow), with FP16 (V_4H_FP16/V_8H_FP16) on the half side. Completes the register-only NEON phase. Verified: decode round-trips, arm64 check + 461 tests pass.
2026-06-15 21:39:29 -04:00
Brendan Punsky
77c0265df9 rexcode/arm64: NEON ABS/NEG + FP vector-convert encode forms
Adds 14 mnemonics (74 forms) via specgen: integer two-register ABS/NEG, and the FP vector-convert (register form) family FCVTAS/AU/MS/MU/NS/NU/PS/PU/ZS/ZU + SCVTF/UCVTF. SP/DP .NEON, half-precision .FP16; the fixed-point (#fbits) convert forms come later with the immediate phase.

Verified: decode round-trips incl. FP16 (abs/neg/fcvtzs.8h/scvtf), arm64 check + 461 tests pass.
2026-06-15 21:37:31 -04:00
Brendan Punsky
7cd39f1d0d rexcode/arm64: NEON FP two-register + FP across-lanes encode forms
Adds 16 mnemonics (72 forms) via specgen: FP two-register FABS/FNEG/FSQRT, FRINTA/I/M/N/P/X/Z, FRECPE/FRSQRTE; and FP across-lanes FMAXV/FMINV/FMAXNMV/FMINNMV (scalar S/H dst). SP/DP are .NEON, half-precision .FP16.

specgen per-form feature now scans all operands for an FP16 arrangement (handles scalar-dst across-lanes, where FP16 lives on the source). Verified: decode round-trips incl. FP16 (fabs/fsqrt.8h/frecpe/fmaxv), arm64 check + 461 tests pass.
2026-06-15 21:35:41 -04:00
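
The per-form feature scan from 7cd39f1d0d can be illustrated with a small sketch. Only `V_4H_FP16` and `V_8H_FP16` come from the commit messages; the function name `form_feature` and the other operand-type strings (`SCALAR_H`, `SCALAR_S`, `V_4S`) are assumptions made for the example, and the real specgen tool is written in Lua.

```python
# Operand types that mark a half-precision arrangement (names from the log).
FP16_ARRANGEMENTS = {"V_4H_FP16", "V_8H_FP16"}


def form_feature(operand_types: list[str]) -> str:
    """Tag a form FP16 if ANY operand uses an FP16 arrangement, else NEON."""
    if any(t in FP16_ARRANGEMENTS for t in operand_types):
        return "FP16"
    return "NEON"


if __name__ == "__main__":
    # Across-lanes form with a scalar destination: the FP16 arrangement is
    # on the source, so inspecting only operand 0 would miss it.
    print(form_feature(["SCALAR_H", "V_8H_FP16"]))  # hypothetical FMAXV form
    print(form_feature(["SCALAR_S", "V_4S"]))       # hypothetical FMAXV form
```

Scanning every operand rather than only the destination is what lets the scalar-destination across-lanes forms pick up the FP16 tag.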
Brendan Punsky
57fbe873d8 rexcode/arm64: NEON floating-point three-same encode forms (incl. FP16)
Adds 17 FP three-same mnemonics (85 forms) via specgen: FMAX/FMIN/FMAXNM/FMINNM, FMULX/FRECPS/FRSQRTS, FACGE/FACGT, FCMEQ/FCMGE/FCMGT (register form), FADDP/FMAXP/FMINP/FMAXNMP/FMINNMP. Single/double forms (2S/4S/2D) are .NEON; half-precision (4H/8H) use the distinct V_4H_FP16/V_8H_FP16 operand types and are tagged .FP16.

specgen gains: --mattr=+fullfp16, FP16 arrangement tokens, per-form feature tagging (derived from the operand arrangement), and per-family arrangement lists. Verified: decode round-trips incl. an FP16 form (fmax/fcmeq.4h/fmulx/faddp), arm64 check + 461 tests pass.
2026-06-15 21:33:25 -04:00
Brendan Punsky
824421853f rexcode/arm64: NEON across-lanes + pairwise-long encode forms
Adds 11 mnemonics (59 forms) via specgen: across-lanes reductions ADDV/SMAXV/SMINV/UMAXV/UMINV (scalar B/H/S dst) and SADDLV/UADDLV (widened scalar dst), plus pairwise-long SADDLP/UADDLP/SADALP/UADALP. The scalar destination packs into the same VD field, so still no encoder change.

specgen.emit generalized to accept scalar-register operand tokens (B/H/S/D) alongside vector arrangements. Verified: decode round-trips (ADDV/SADDLV/UMINV/SADDLP), arm64 check + 461 tests pass.
2026-06-15 21:27:40 -04:00
Brendan Punsky
7ebe042277 rexcode/arm64: NEON wide / narrow / XTN encode forms
Adds 24 more mixed-arrangement mnemonics (72 forms) via specgen: three-different wide (SADDW/UADDW/SSUBW/USUBW), narrowing-halving (ADDHN/SUBHN/RADDHN/RSUBHN), and two-register narrowing (XTN/SQXTN/UQXTN/SQXTUN), plus their high-half '2' variants. All register-only (VD/VN/VM or VD/VN), no encoder change.

specgen refactored to a general arrangement-tuple mechanism (uniform + long/wide/narrow/XTN share one emit path). Verified: decode round-trips (SADDW/ADDHN/XTN/SQXTUN2), arm64 check + 461 tests pass.
2026-06-15 21:22:07 -04:00
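
One way to picture the arrangement-tuple mechanism from 7ebe042277 is a table that maps each shape to a rule for building operand arrangements from a narrow/wide pair. The sketch below is an assumption-laden Python illustration (the real tool is specgen.lua); `PAIRS`, `SHAPES`, and `arrangement_tuples` are hypothetical names, while the arrangements themselves follow the A64 operand forms named in the commits.

```python
# (low-half narrow, high-half narrow, wide) arrangement triples.
PAIRS = [("8B", "16B", "8H"), ("4H", "8H", "4S"), ("2S", "4S", "2D")]

# Each shape turns a (narrow, wide) pair into an operand-arrangement tuple.
SHAPES = {
    "LONG":   lambda n, w: (w, n, n),  # e.g. SADDL Vd.8H, Vn.8B, Vm.8B
    "WIDE":   lambda n, w: (w, w, n),  # e.g. SADDW Vd.8H, Vn.8H, Vm.8B
    "NARROW": lambda n, w: (n, w, w),  # e.g. ADDHN Vd.8B, Vn.8H, Vm.8H
    "XTN":    lambda n, w: (n, w),     # e.g. XTN   Vd.8B, Vn.8H
}


def arrangement_tuples(shape: str, high: bool = False) -> list[tuple[str, ...]]:
    """List the arrangement tuples for a shape; high=True gives the '2' forms."""
    make = SHAPES[shape]
    return [make(hi if high else lo, wide) for lo, hi, wide in PAIRS]


if __name__ == "__main__":
    for shape in SHAPES:
        print(shape, arrangement_tuples(shape), arrangement_tuples(shape, high=True))
```

Not every mnemonic accepts every tuple, which is why the log relies on llvm-mc to reject invalid arrangements instead of encoding those exceptions in the table.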
Brendan Punsky
f78a3a5573 rexcode/arm64: NEON three-different (long) encode forms
Adds 26 widening long mnemonics (72 forms) via specgen: SADDL/UADDL/SSUBL/USUBL, SMULL/UMULL, SMLAL/UMLAL/SMLSL/UMLSL, SQDMULL/SQDMLAL/SQDMLSL and their high-half '2' variants. Destination arrangement is wider than the sources (Vd.8H, Vn.8B, Vm.8B; the '2' forms read the high half). Encoding stays VD/VN/VM, so no encoder change.

specgen gains a mixed-arrangement THREE_DIFF shape (low/high source-half pairs). Verified: decode round-trips (SMULL/SADDL2/SQDMULL/UMLAL), arm64 check + 461 tests pass.
2026-06-15 21:16:56 -04:00
Brendan Punsky
00b666bbc0 rexcode/arm64: NEON pairwise + variable-shift encode forms
Adds 9 register-only three-same mnemonics (59 forms) via specgen: ADDP/SMAXP/SMINP/UMAXP/UMINP (pairwise) and SSHL/USHL/SRSHL/URSHL (per-lane variable shift). Verified: decode round-trips (ADDP/SSHL/SMAXP/URSHL), arm64 check + 461 tests pass.

Skipped the already-implemented logical/compare/mul forms (AND_V/ORR_V/EOR_V/BIC_V/ORN_V/BSL/BIT/BIF/CMEQ/CMGT/CMHI/MUL_V) to avoid duplicate keys.
2026-06-15 13:01:26 -04:00
Brendan Punsky
d83065e3b8 rexcode/arm64: NEON two-register-misc encode forms
Adds 10 Advanced-SIMD two-register-misc mnemonics (34 forms across valid arrangements) via specgen: NOT/RBIT/REV16/REV32/REV64/CLS/CLZ/CNT/URECPE/URSQRTE. llvm-mc filters illegal arrangements (CNT/NOT only 8B/16B, URECPE/URSQRTE only 2S/4S, ...).

specgen.lua generalized to a SHAPE table (THREE_SAME / TWO_SAME), so adding a family is one row. Verified: decode round-trips (NOT/REV64/CNT/URECPE), arm64 check + 461 tests pass.
2026-06-15 12:57:37 -04:00
Brendan Punsky
e21fa59733 rexcode/arm64: NEON three-same (integer) encode forms + specgen tool
Adds 25 Advanced-SIMD three-same integer mnemonics (153 forms across arrangements) to ENCODING_TABLE: SHADD/UHADD/SHSUB/UHSUB/SRHADD/URHADD, SQADD/UQADD/SQSUB/UQSUB, SMAX/UMAX/SMIN/UMIN, SABD/UABD/SABA/UABA, MLA/MLS, CMGE/CMHS/CMTST, SQDMULH/SQRDMULH.

Introduces tablegen/specgen.lua: compact specs (mnemonic + llvm name + arrangements) -> ENCODING_TABLE blocks, with bits taken from llvm-mc (the oracle) and mask derived empirically (vary registers 0..31). Invalid arrangements are auto-detected via llvm-mc and skipped. Output fills the SPECGEN:BEGIN..END region of encoding_table.odin in place; the hand-written core is untouched.

Verified: decode round-trips for SHADD/SQADD/CMGE/SQRDMULH; arm64 check + 461 tests pass; builders auto-generate (780 -> 805). Caveat: NEON builders currently collapse arrangements (one Register param per V operand) so inst_<mnem> exposes only the first arrangement -- an arrangement-aware builder-gen pass follows.

Author: Brendan Punsky (machine git config user.name is the login 'Flāvius').
2026-06-15 12:52:10 -04:00
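
The "bits from the oracle, mask derived empirically" idea in e21fa59733 can be sketched as follows. This is a Python illustration only: `toy_oracle` is a stand-in for invoking llvm-mc, modeled on the A64 three-same layout (Rm at bits 20:16, Rn at 9:5, Rd at 4:0), and `derive_bits_and_mask` is a hypothetical name, not a function from specgen.lua.

```python
def toy_oracle(rd: int, rn: int, rm: int) -> int:
    """Stand-in for llvm-mc: a SHADD-like three-same encoding (assumed layout)."""
    return 0x0E200400 | (rm << 16) | (rn << 5) | rd


def derive_bits_and_mask(oracle, n_operands: int, reg_count: int = 32):
    """Vary each register over 0..reg_count-1 and see which bits ever change."""
    zero = [0] * n_operands
    bits = oracle(*zero)              # canonical encoding: all operand bits zero
    varying = 0
    for i in range(n_operands):
        for r in range(reg_count):
            regs = list(zero)
            regs[i] = r
            varying |= oracle(*regs) ^ bits
    mask = ~varying & 0xFFFFFFFF      # fixed bits are those that never changed
    return bits, mask


if __name__ == "__main__":
    bits, mask = derive_bits_and_mask(toy_oracle, 3)
    print(f"bits={bits:#010x} mask={mask:#010x}")
```

The appeal of this approach is that the generator needs no per-family knowledge of where operand fields sit; any bit that never moves under operand variation is treated as part of the opcode.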
Brendan Punsky
7b588d0818 rexcode/arm64: implement CCMP/CCMN register-form encode forms
First entries of the encode-coverage effort. Adds CCMP_REG/CCMN_REG (W/X) to ENCODING_TABLE with llvm-mc-verified bit patterns; the table metaprogram regenerates the encode/decode blobs and the typed builders auto-generate (inst_ccmp_reg/inst_ccmn_reg).

Verified: encode matches llvm-mc (CCMP X1,X2,#3,EQ=0xFA420023; CCMN W5,W6,#7,NE=0x3A4610A7), decode round-trips, arm64 check+tests pass. Needs no encoder change (reuses RN/RM/NZCV_FIELD/COND_HI). The imm5 forms (immediate at bits 20:16) need a new Operand_Encoding and follow separately.

Workflow proven: llvm-mc as the encoding oracle -> SoT entry -> regen -> builder auto-generates -> verify.
2026-06-15 12:52:10 -04:00
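
The two llvm-mc values quoted in 7b588d0818 can be reproduced by packing the fields by hand. The sketch below assumes the standard A64 conditional-compare (register) layout: sf at bit 31, op at bit 30 (1 for CCMP, 0 for CCMN), S at bit 29, Rm at 20:16, cond at 15:12, Rn at 9:5, and nzcv at 3:0. `encode_ccmp_reg` is a hypothetical helper, not the generated `inst_ccmp_reg` builder.

```python
# Condition-code values used in the example (A64 encoding).
COND = {"EQ": 0b0000, "NE": 0b0001}


def encode_ccmp_reg(is_64: bool, is_ccmp: bool, rn: int, rm: int,
                    nzcv: int, cond: int) -> int:
    """Pack a CCMP/CCMN register-form instruction word (illustrative helper)."""
    sf = 1 if is_64 else 0
    op = 1 if is_ccmp else 0
    word = (sf << 31) | (op << 30) | (1 << 29) | (0b11010010 << 21)
    word |= (rm << 16) | (cond << 12) | (rn << 5) | nzcv
    return word


if __name__ == "__main__":
    # CCMP X1, X2, #3, EQ and CCMN W5, W6, #7, NE from the commit message.
    print(hex(encode_ccmp_reg(True, True, 1, 2, 3, COND["EQ"])))
    print(hex(encode_ccmp_reg(False, False, 5, 6, 7, COND["NE"])))
```

In the register form, bits 20:16 hold Rm; the imm5 variant mentioned in the commit places an immediate in that same position, which is why it needs its own operand encoding.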
Brendan Punsky
47fc72e0ba rexcode: 100% generated mnemonic-builder coverage; drop hand-written collisions
Every mnemonic with an encode form now has a generated inst_<mnem>/emit_<mnem> overload group. The per-arch generators map ALL operand types — nothing is skipped: arm64 gains shifted/extended registers (multi-param via op_shifted/op_extended), SVE Z-regs + predicates, SME tile/slice, NEON arrangements/lanes, bitmask/sysreg/pattern immediates and condition codes (427 -> 777 mnemonics); arm32 gains shifted/register-shifted regs, register lists, NEON lanes and all encoded-immediate subclasses (479 -> 592); x86 gains m80 and descriptor-table memory operands — FBLD/FBSTP, LGDT/SGDT/LIDT/SIDT, FLD/FSTP, far-indirect JMP/CALL, BOUND (1167 -> 1175).

Mnemonic-specific builders are now fully generated, not hand-written: deleted the hand-written helpers the generated groups collided with — riscv inst_jal/inst_jalr, arm64 inst_b_cond/inst_cbz/inst_tbz/inst_csel, mos6502 inst_tst — and let the generators own those names (arm64 also gains inst_cbnz/tbnz/csinc/csinv/csneg). Updated the affected test call-sites. The generic operand-shape helpers (inst_r_r, inst_r_r_i, inst_ldst, ...) remain as delegation targets.

Decode-only mnemonics with no encode form are correctly left without builders. ppc/ppc_vle/rsp/mos65816 were already complete.

All 10 ISAs: structure + compile + tests pass; generators idempotent.
2026-06-15 12:52:10 -04:00
Brendan Punsky
1b72d425d4 rexcode: add typed per-mnemonic builders for all arches; CWD-independent regen
Add generated mnemonic_builders.odin (inst_<mnem>/emit_<mnem> typed overload sets) for arm32, arm64, mips, riscv, ppc, ppc_vle, rsp, mos6502 and mos65816, matching the existing x86 builders. Each is produced by a per-arch tools/gen_mnemonic_builders.odin that walks ENCODE_FORMS and maps operand types to typed params + op_* constructors.

Anchor every generator's output via #directory so regeneration is CWD-independent; previously the bare "mnemonic_builders.odin" path wrote to the current directory and misfired when run from the repo root.

Wire a --builders task into build.lua (folded into 'all', covered by --idempotent, enforced by the structural invariants) and document it in the README.
2026-06-15 12:52:10 -04:00
gingerBill
182f234ed2 Minimize rsp Instruction and Operand 2026-06-15 14:37:25 +01:00
gingerBill
4f96105520 Minimize riscv Instruction and Operand 2026-06-15 14:36:10 +01:00
gingerBill
a839f5e833 Minimize ppc_vle Instruction and Operand 2026-06-15 14:35:34 +01:00
gingerBill
7a17144aa1 Minimize ppc Instruction and Operand 2026-06-15 14:34:45 +01:00
gingerBill
b006a8853e Minimize mos65816 Instruction and Operand 2026-06-15 14:32:42 +01:00
gingerBill
59c4292224 Minimize mos6502 Instruction and Operand 2026-06-15 14:30:32 +01:00
gingerBill
406dfbe86d Minimize mips Instruction and Operand 2026-06-15 14:29:14 +01:00
gingerBill
6527f90181 Minimize arm64 Instruction and Operand 2026-06-15 14:27:49 +01:00
gingerBill
7aaef31bb3 Correct sizes of arm32 Instruction and Operand 2026-06-15 14:24:05 +01:00
gingerBill
f895e96bde Add benchmark flag for x86 tests to just test that 2026-06-15 14:17:11 +01:00
gingerBill
2dd262ea10 x86: improve benchmark test; do not run the code on Windows since it relies on SysV 2026-06-15 14:14:38 +01:00
gingerBill
61c869833e Minimize x86.Instruction size to be 64-bytes rather than 72-bytes 2026-06-15 14:13:56 +01:00
gingerBill
b733f7d7a4 Use @(rodata) where appropriate for the table generation 2026-06-15 13:56:24 +01:00
gingerBill
5400b0f610 rexcode/x86: Add more @(rodata) usage 2026-06-15 13:49:57 +01:00
gingerBill
b7f585f448 Merge branch 'bill/rexcode' of https://github.com/odin-lang/Odin into bill/rexcode 2026-06-15 13:35:28 +01:00
Flāvius
a4f08f8307 Load rexcode encode/decode tables from committed binary blobs
Each ISA's hand-written ENCODING_TABLE (the single source of truth) now lives
in a per-arch tablegen/ metaprogram that flattens it and serializes committed
binary blobs; the library #loads those into @(rodata) at compile time rather
than compiling a table body. No arch keeps encoding_table.odin or
decoding_tables.odin -- only a generated tables.odin loader and tables/*.bin.

* Two-stage, type-checked pipeline: tablegen Stage A emits human-readable
  generated Odin, which compiles and serializes the blobs in Stage B.
* encode() goes through encoding_forms(m); decoders are unchanged apart from
  x86's flattened 2-D index. Decode tables are byte-identical to the old ones.
* build.lua: a LuaJIT driver for the metaprograms, validations, and tests,
  with cross-platform gating and a clear report.
* Docs refreshed; the obsolete forward-looking plan in cross_arch_design.md
  trimmed to what was actually built.
* Attribution headers added to all rexcode source files; the generators emit
  them so generated files keep them.
2026-06-15 07:43:29 -04:00
gingerBill
75a8639426 Make @(rodata) 2026-06-15 12:22:37 +01:00
gingerBill
ecf9a305ee Add @(require_results) to register procedures 2026-06-14 22:00:37 +01:00
gingerBill
5c9cd0146d Add @(require_results) to operand procedures 2026-06-14 21:57:27 +01:00
gingerBill
611cc807cd Add @(require_results) to instruction procedures 2026-06-14 21:54:24 +01:00
gingerBill
ced500fc94 Add fmt formatting to the Instruction.operands 2026-06-14 21:52:14 +01:00
gingerBill
c8ed0d89ed Alignment fields in the decode entry type for ppc 2026-06-14 21:23:07 +01:00
gingerBill
695dd62b58 Align instruction helpers 2026-06-14 21:01:51 +01:00
gingerBill
15a426c6b3 Minor code style changes 2026-06-14 21:00:38 +01:00
gingerBill
2f0c1457e5 Make x86 decoding tables very uniform and orderly 2026-06-14 20:46:27 +01:00
gingerBill
19bc584a0d Manually format the x86 encoding table 2026-06-14 20:05:10 +01:00
gingerBill
ce3ff285b6 Minor style improvement 2026-06-14 19:50:32 +01:00
gingerBill
efa535eec2 Minor clean up of the mnemonics code 2026-06-14 19:48:02 +01:00
gingerBill
d75624ccbd Add @(require_results) where appropriate to isa 2026-06-14 19:41:05 +01:00
gingerBill
340fb4f697 Clean up x86 decoding tables 2026-06-14 19:39:34 +01:00
gingerBill
2f5d548471 Minimize rsp decoding tables 2026-06-14 19:35:30 +01:00
gingerBill
253c1570d8 Minimize riscv decoding tables 2026-06-14 19:34:16 +01:00
gingerBill
d74693eb0f Minimize ppc_vle decoding tables 2026-06-14 19:32:53 +01:00
gingerBill
dacb5e3a17 Minimize PPC decoding tables 2026-06-14 19:30:42 +01:00
gingerBill
cd4b4e1f36 Minimize mos65816 decoding tables 2026-06-14 19:24:16 +01:00