mirrors/Odin - Odin - Kyren's Code

mirror of https://github.com/odin-lang/Odin.git synced 2026-06-19 16:42:33 +00:00

Author	SHA1	Message	Date
Brendan Punsky	e4cff78a70	rexcode/arm32: document BF family as intentionally unimplemented The 5 Branch Future mnemonics (BF/BFI_BR/BFL/BFLX/BFCSEL) are left enum-only on purpose: deprecated ARMv8.1-M, not disassemblable by llvm-objdump (so unverifiable), and a correct encoder needs dual-offset PC-relative relocation infrastructure that doesn't exist. Noted in the enum for future readers.	2026-06-18 03:05:25 -04:00
Brendan Punsky	a63fb51fdd	rexcode/arm32: MVE VMLSV/VMLSVA (correct 3-bit Q regs); drop placeholders Implement VMLSV/VMLSVA (MVE multiply-subtract reduce) properly: new VN_Q_MVE (Qn at 19:17) and VM_Q_MVE (Qm at 3:1) encodings -- the actual 3-bit MVE Q fields -- with Rd at 15:12 (RDLO_A32). The earlier collision was from reusing the 4-bit VN_Q (19:16) and RD_T32 (11:8), which place the fields wrong; byte-exact vs llvm-mc now with distinct Qn/Qm/Rd. Drop three placeholder/redundant enum entries: VRINT and VPRINT (not real instructions -- llvm rejects bare 'vrint'; VPRINT is a printf-like debug pseudo-op), and VRSHL_MVE (the author's own comment marks it a placeholder; 'vrshl q,q,q' already decodes via VRSHL's MVE form). 600 tests green, verify matches llvm-mc.	2026-06-18 01:58:19 -04:00
Brendan Punsky	239dea4f55	rexcode/arm32: MVE VHCADD (saturating halving complex add) + VCMLA New MVE_ROT_HCADD (#90/#270 at bit12) and MVE_ROT_CMLA (#0/90/180/270 at bits 24:23) rotation encodings -- the rotation degrees round-trip properly (unlike the existing FCMA VCMLA which leaves it unencoded). One form each with the element-size bits left variable (MVE convention). Verify round-trips; all rotations byte-exact vs llvm-mc; 600 tests green. (VMLSV/VMLSVA reduce ops deferred: their format decode-collides with other MVE encodings given the 4-bit VN_Q vs MVE's 3-bit Qn.)	2026-06-18 01:47:44 -04:00
Brendan Punsky	55463b6719	rexcode/arm32: VMOV (ARM core register to scalar) Dd[lane], Rt New VMOV_LANE_8/16/32 encodings: Dd at bits 19:16+bit7, lane bits per element size (.8 = bit21:bit6:bit5 with bit22 size marker; .16 = bit21:bit6 with bit5 marker; .32 = bit21). Verify round-trips all three sizes; spot-checked .8 byte-exact incl. max lane; 600 tests green.	2026-06-18 01:34:48 -04:00
Brendan Punsky	5df81b5117	rexcode/arm32: VQDMULH/VQRDMULH by-scalar-lane New NEON_VM_SCALAR16/32 encodings for the Dm[lane] scalar operand: .16 places Dm in D0..D7 (bits 2:0) with the lane split bit5:bit3, .32 places Dm in D0..D15 (bits 3:0) with the lane at bit5. VQDMULH_LANE and VQRDMULH_LANE across .s16/.s32, D and Q destinations (8 forms). Verify round-trips; spot-checked byte-exact incl. max register/lane and decode-clean; 600 tests green.	2026-06-18 01:29:19 -04:00
Brendan Punsky	acc14864f3	rexcode/arm32: DCPS1/DCPS2/DCPS3 (debug change PE state) Fixed T32 encodings (0xF78F8001/2/3), no operands. Verify round-trips; 600 tests green.	2026-06-18 01:25:51 -04:00
Brendan Punsky	b2b14998f7	rexcode/arm32: VRSRA, VRECPE_F/VRSQRTE_F, VPADD_F, VCVTR VRSRA (NEON rounding shift-right-accumulate, D/Q, mirrors VSRA's raw imm6 convention), VRECPE_F/VRSQRTE_F (FP reciprocal/rsqrt estimate, D/Q), VPADD_F (FP pairwise add, f32/f16), and VCVTR (VFP convert-to-integer using the FPSCR rounding mode; s32/u32 from f32 and f64). Hand-written mirroring the existing VSRA/VRECPE/VPADD/VCVT forms. Built-in llvm round-trip verify passes; spot-checked byte-exact; 600 tests green.	2026-06-18 01:22:12 -04:00
Brendan Punsky	59750926d9	rexcode/arm32: unprivileged (translate) post-indexed loads/stores LDRT/LDRBT/STRT/STRBT (imm12) and LDRHT/STRHT/LDRSBT/LDRSHT (imm8 split): each is the corresponding post-indexed load/store with the W bit (21) set. Hand-written, reusing the existing MEM_POST_INDEX encoding. All 8 byte-exact vs llvm-mc and decode-clean; 600 tests green.	2026-06-18 01:17:34 -04:00
Brendan Punsky	6fd233f041	rexcode/arm32: NEON long/wide/compare/shift encode forms (specgen) New arm32 specgen (llvm-mc --triple=armv8a --mattr=+neon as the bits oracle, empirical masks): VADDL/VSUBL/VABAL/VABDL (Qd,Dn,Dm) and VADDW/VSUBW (Qd,Qn,Dm) across s/u 8/16/32; the compare aliases VCLE/VCLT (= VCGE/VCGT with Vn/Vm swapped) and VACLE/VACLT (= VACGE/VACGT swapped, f32); and VQRSHL shift-by-vector. 84 forms over 11 mnemonics. Built-in llvm round-trip verify passes; spot-checked byte-exact with distinct Q/D registers; 600 tests green.	2026-06-18 01:15:22 -04:00
Brendan Punsky	fe7b81d64f	rexcode/arm64: drop vestigial/redundant mnemonics; alias redundant SME names Remove from the Mnemonic enum: LDARB_X/LDARH_X/STLRB_X/STLRH_X (no distinct byte/half acquire-release 'X' encoding exists -- LDARB/LDARH/ STLRB/STLRH already cover them), and the 12 redundant SME names SME_LD1{B,H,W,D,Q}_ZA / SME_ST1{...}_ZA / SME_MOVA_TO_Z / SME_MOVA_TO_ZA (same instructions as the canonical _TILE / MOVA__FROM_* forms). The builder generator now emits delegating aliases for the redundant SME names (inst_sme_ld1b_za :: inst_sme_ld1b_tile, ...), so the convenient names keep working and resolve to the canonical, decode-unambiguous encodings. With XAR_Z landed, the arm64 Mnemonic enum is now 100% covered: every entry has an encode form. 461 tests green.	2026-06-18 00:42:37 -04:00
Brendan Punsky	303fa9e509	rexcode/arm64: SVE2 XAR (exclusive-or and rotate) encode form XAR Zdn.T, Zdn.T, Zm.T, #rotate across .B/.H/.S/.D. New SVE_XAR_SHIFT encoding: the rotate amount is V = 2*esize - amount, split across tszh(23:22):tszl(20:19):imm3(18:16); the element size is selected by the Z register type on encode and recovered from the highest set bit of tszh:tszl on decode (so the amount round-trips for every esize). vec_esize now also handles Z_REG_B/H/S/D. All six representative forms byte-exact vs llvm-mc and decode-clean; 461 tests green.	2026-06-18 00:39:48 -04:00
Brendan Punsky	33e5202f05	rexcode/arm64: single-structure lane load/store (LD1-4_LANE / ST1-4_LANE) All eight LD#_LANE / ST#_LANE mnemonics across .B/.H/.S/.D (32 forms). New NEON_LANE_B/H/S/D encodings split the lane index across Q (bit 30), S (bit 12) and size (bits 11:10) per element size; the list length and load/store bit are fixed in the entry bits. All 11 representative forms (every element size, structure count, and lane extremes) byte-exact vs llvm-mc and decode-clean; 461 tests green.	2026-06-18 00:21:43 -04:00
Brendan Punsky	2c8768b39a	rexcode/arm64: TBL/TBX + structured LD2-4/ST2-4 + LD1R-4R encode forms Table lookup TBL/TBX (.8b/.16b, single-register table) and the multi- register structured load/store LD2/LD3/LD4, ST2/ST3/ST4 plus load-and- replicate LD1R/LD2R/LD3R/LD4R (.16b). Following the existing LD1/ST1 convention: the register list is encoded by its first register, with the list length + arrangement fixed in the bits. All 13 representative forms byte-exact vs llvm-mc and decode-clean; 461 tests green. (The single-lane _LANE variants need the Q:S:size lane-index split and are left for a follow-up.)	2026-06-18 00:17:12 -04:00
Brendan Punsky	69157b7ec5	rexcode/arm64: SME ADDHA/ADDVA (ZA outer-sum accumulate) ADDHA/ADDVA ZAda.S, Pn/m, Pm/m, Zn.S via a new ZA_TILE_LOW encoding (accumulator tile at bits 2:0; Pn at 12:10, Pm at 15:13, Zn at 9:5). Byte-exact vs llvm-mc and decode-clean across tile/predicate/Zn fields. The other 11 missing SME enum names (SME_LD1/ST1_ZA, SME_MOVA_TO_Z/ZA) are redundant aliases of the already-implemented SME_LD1/ST1_TILE and SME_MOVA__FROM_ forms -- adding duplicate encodings collides in the decode table (broke a roundtrip test), so they are intentionally left to the existing canonical forms. 461 tests green.	2026-06-18 00:14:21 -04:00
Brendan Punsky	68aac263d0	rexcode/arm64: SVE FFR/BRKN/CPY/EXT/MOV aliases (10 more, SVE 47/48) FFR ops (SETFFR/RDFFR/WRFFR) and BRKN (destructive, Pdm re-packs Pd) via specgen; CPY (predicated from GPR), EXT (destructive, imm8 split via new SVE_EXT_IMM), MOV-predicated (=SEL with Zm=Zd, via ZD_ZM_DUP), and the predicate aliases NOT/MOVS/MOV (EOR/ORR/AND with a duplicated predicate field, via PG4_PM_DUP/PN_PM_DUP/PN_PG_PM_DUP). All byte-exact vs llvm-mc; the predicate aliases decode to their canonical base op (identical bytes, as expected). 461 tests green. (SVE_XAR_Z deferred: its tsz:imm3 shift field does not follow the NEON immh:immb scheme and needs a bespoke esize-from-Z encoder.)	2026-06-18 00:09:21 -04:00
Brendan Punsky	cd8703acd4	rexcode/arm64: SVE predicated/compare/predicate-logical/SVE2 encode forms (37) Predicated FP round (FRINTN/P/M/Z/A/X/I, FRECPX), reversed predicated shifts (ASRR/LSLR/LSRR) and FP (FSUBR/FDIVR), FP compare (FCMEQ/GE/GT/ NE/UO + vs-zero FCMLE/FCMLT), integer compare aliases (CMPLE/LO/LS/LT), predicate logical (NANDS/NORS/ORNS), predicate break (BRKPA/BRKPB, BRKA/BRKB + flag-setting BRKAS/BRKBS), SVE2 EOR3/BCAX, INSR, COMPACT. New specgen SVE section: a generic emitter assembles each form all-zero then one variant per field at its max (Z 31, 3-bit Pg 7, 4-bit Pd/Pg/Pn/ Pm 15, GPR wzr/xzr) and derives mask = ~union. Operand placements verified vs llvm-mc: the reversed/destructive ops put Zm at VN (5-9); the CMPLE/LO/LS/LT aliases swap operands (VM/VN); EOR3/BCAX place the 3rd src at VM and 4th at VN. All 22 representative forms byte-exact and decode-clean; 461 tests green. (BRKN + CPY/EXT/MOV/NOT_P/FFR/XAR stragglers next.)	2026-06-17 23:59:23 -04:00
Brendan Punsky	8006b5f7e2	rexcode/arm64: NEON MOVI/MVNI + FMOV scalar/vector immediate forms MOVI (8B/16B/4H/8H/2S/4S/2D) and MVNI (4H/8H/2S/4S) via specgen (imm8 in abc:defgh, cmode/op/Q static per arrangement; .2D probed with all-ones since its asm immediate is the replicated 64-bit value). FMOV_IMM (scalar Sd/Dd/Hd, 8-bit float at 20:13 via new FMOV_SCALAR_IMM encoding) and FMOV_V_IMM (Vd.<2S\|4S\|2D\|4H\|8H>, fimm8 in abc:defgh, cmode=1111) hand- written -- canonical bits with the imm8 fields zeroed (the live float example would otherwise bake operand bits into the static pattern). All 14 representative forms byte-exact vs llvm-mc and decode-clean; 461 tests green. (LSL/MSL-shifted MOVI/MVNI variants share the operand signature and are omitted.)	2026-06-17 23:47:45 -04:00
Brendan Punsky	ab7f20a129	rexcode/arm64: byte/half/signed loads-stores + vector LDP/STP/LDUR/STUR LDRB/LDRH/STRB/STRH (post-index, pre-index, register-offset), LDRSB/LDRSH (register-offset, W and X) and LDRSW (register-offset), plus the vector pair/unscaled forms LDP_V/STP_V (S/D/Q) and LDUR_V/STUR_V (S/D/Q). Hand-written, reusing the existing OFFSET_BASE_POST/PRE/REG/S9 addressing encodings; canonical bits taken from llvm-mc (operand fields zeroed). All 23 representative forms byte-exact vs llvm-mc and decode-clean; 461 tests green. (LDARB_X/LDARH_X/STLRB_X/STLRH_X left unimplemented: LDARB/LDARH/STLRB/ STLRH are byte/half acquire-release into a W register with no distinct 64-bit 'X' encoding -- these enum entries are vestigial.)	2026-06-17 23:39:01 -04:00
Brendan Punsky	aabcdd41b6	rexcode/arm64: CCMP/CCMN-imm, HINT, MSR-imm, USDOT encode forms Conditional compare immediate (CCMP_IMM/CCMN_IMM: imm5 at 20:16 via a new IMM5_HI encoding, bit 11 set), HINT #imm7, MSR <pstatefield>,#imm (new MSR_PSTATE encoding placing op1 at 18:16 / op2 at 7:5, CRm via the shared BARRIER_FIELD), and USDOT (I8MM unsigned-by-signed dot product, .2S/.4S). Hand-written into the core (outside the specgen region). All forms byte-exact vs llvm-mc and decode-clean; 461 tests green.	2026-06-17 23:34:10 -04:00
Brendan Punsky	c506e6c13b	rexcode/arm64: scalar FP round/reciprocal + FP-to-GPR convert forms Scalar FRINTN/P/M/Z/A/X/I and FRECPX (Sd,Sn / Dd,Dn / Hd,Hn), and the FP-to-GPR converts FCVTAS/AU/MS/MU/NS/NU/PS/PU (Wd/Xd, Hn/Sn/Dn). All register-only forms: element type and W/X are selected by static bits, so specgen derives each from llvm-mc with the convert forms varying Rd and Rn independently (zero register for the 31 case). H variants tagged FP16. RD/RN encodings so decode reconstructs the scalar/GPR register class. All 19 representative forms byte-exact vs llvm-mc and decode-clean; 461 tests green.	2026-06-17 23:27:26 -04:00
Brendan Punsky	06eb3de6a2	rexcode/arm64: NEON copy/permute (MOV/MVN/DUP/INS/EXT) encode forms MOV_V (ORR alias: source feeds both Vn and Vm via a new VN_VM_DUP encoding), MVN_V (NOT alias, plain 2-register), DUP_V (element form Vd.T,Vn.Ts[i] and general form Vd.T,Wn/Xn), INS (element-to-element and from-GPR), EXT_V (imm4 byte index). Adds a VEC_INDEX operand type plus NEON_IDX5/NEON_IDX4/NEON_EXT_IDX encodings: the element-size marker rides in the entry bits, the lane index drives the bits above it, and the decoder recovers the element size from imm5's marker. Element size now rides in op.size (B=1/H=2/S=4/D=8) via op_v_elem_b/h/s/d so the matcher can disambiguate DUP/INS element forms; the builder generator maps V_ELEM_* to those constructors. specgen derives the mask by varying registers and each index field to its max -- the GPR-source forms vary Vd and Rn independently (Rn 31 = wzr/xzr) so the low bit of each field toggles. All 19 representative forms byte-exact vs llvm-mc and decode-clean; 461 tests green. (TBL/TBX register-list forms deferred.)	2026-06-17 23:23:44 -04:00
Brendan Punsky	5761c23ba4	rexcode/arm64: NEON narrowing-shift (SHRN/SQSHRN/...) encode forms SHRN/2, RSHRN/2, SQSHRN/2, UQSHRN/2, SQRSHRN/2, UQRSHRN/2, SQSHRUN/2, SQRSHRUN/2 (16 forms). Verified against llvm-mc that immh keys on the NARROW destination element (not the wide source), so the existing NEON_SHR_IMM encoder/decoder (esize from ops[0]) is already correct -- this is a specgen-only change: right-shift esize now uses ESIZE[dst] (equal to ESIZE[src] for same-arrangement shifts) plus narrowing {narrow-dst, wide-src} arrangement tuples. All forms byte-exact and decode-roundtrip verified across B/H/S element sizes; 461 tests green.	2026-06-17 23:07:38 -04:00
Brendan Punsky	a1e359b64a	rexcode/arm64: NEON permute, compare-zero, SXTL/UXTL encode forms ZIP1/2, UZP1/2, TRN1/2 (three-same permute); CMLE/CMLT and FP FCMLE/FCMLT (compare against zero, with the literal #0 / #0.0 operand); SXTL/SXTL2/UXTL/UXTL2 (= SSHLL/USHLL #0, plain 2-register widen, shift implicit in the static bits). All reuse the VD/VN/VM register slots, so no encoder change. specgen gains an emit_cmp0 shape plus permute and widen families. All forms byte-exact vs llvm-mc; 461 tests green.	2026-06-17 23:03:53 -04:00
Brendan Punsky	55a141be4f	rexcode/arm64: NEON widening left-shift (SSHLL/USHLL) encode forms Adds SSHLL/SSHLL2/USHLL/USHLL2 (12 forms) via specgen, reusing NEON_SHL_IMM (left shifts need no esize; the size marker is in bits). specgen's shift shape generalized to arrangement pairs {dst, src} with the shift element size taken from the source. Verified: encode matches llvm-mc + decode recovers mnemonic + amount (sshll/sshll2/ushll across widths); arm64 check + 461 tests pass.	2026-06-16 02:22:37 -04:00
Brendan Punsky	e52953c7ff	rexcode/arm64: NEON shift-by-immediate encode forms + encoder extension First encoder-extension family. Adds Operand_Type VEC_SHIFT and Operand_Encoding NEON_SHL_IMM/NEON_SHR_IMM: the element-size marker bit sits in the entry's bits, the encoder packs the amount into the low immh:immb bits (left = shift; right = esize - shift, esize from the vector operand via vec_esize/form.ops[0]), and the decoder recovers esize from immh to compute the amount. Adds 13 mnemonics (91 forms) via specgen: left SHL/SLI/SQSHLU/SQSHL, right SSHR/USHR/SRSHR/URSHR/SSRA/USRA/SRSRA/URSRA/SRI. specgen derives bits/mask empirically by varying registers AND the shift (canon = operand bits zero; other extreme sets all shift bits), so per-arrangement immh discrimination + the growing shift-field width fall out automatically. Verified end-to-end: encode matches llvm-mc byte-for-byte AND decode recovers mnemonic + amount (sshr/shl/sli/ushr/srsra across B/H/S/D); arm64 check + 461 tests pass. First of the encoder-extension phase ([[rexcode-encode-coverage]]); CCMP_IMM imm5@20:16 pattern generalizes here.	2026-06-16 02:19:03 -04:00
Brendan Punsky	ff3a1acdc7	rexcode/arm64: NEON FP widen/narrow (FCVTL/FCVTN/FCVTXN) encode forms Adds 6 mnemonics (10 forms) via specgen: FCVTL/FCVTL2 (widen half->single, single->double), FCVTN/FCVTN2 and FCVTXN/FCVTXN2 (narrow), with FP16 (V_4H_FP16/V_8H_FP16) on the half side. Completes the register-only NEON phase. Verified: decode round-trips, arm64 check + 461 tests pass.	2026-06-15 21:39:29 -04:00
Brendan Punsky	77c0265df9	rexcode/arm64: NEON ABS/NEG + FP vector-convert encode forms Adds 14 mnemonics (74 forms) via specgen: integer two-register ABS/NEG, and the FP vector-convert (register form) family FCVTAS/AU/MS/MU/NS/NU/PS/PU/ZS/ZU + SCVTF/UCVTF. SP/DP .NEON, half-precision .FP16; the fixed-point (#fbits) convert forms come later with the immediate phase. Verified: decode round-trips incl. FP16 (abs/neg/fcvtzs.8h/scvtf), arm64 check + 461 tests pass.	2026-06-15 21:37:31 -04:00
Brendan Punsky	7cd39f1d0d	rexcode/arm64: NEON FP two-register + FP across-lanes encode forms Adds 16 mnemonics (72 forms) via specgen: FP two-register FABS/FNEG/FSQRT, FRINTA/I/M/N/P/X/Z, FRECPE/FRSQRTE; and FP across-lanes FMAXV/FMINV/FMAXNMV/FMINNMV (scalar S/H dst). SP/DP are .NEON, half-precision .FP16. specgen per-form feature now scans all operands for an FP16 arrangement (handles scalar-dst across-lanes, where FP16 lives on the source). Verified: decode round-trips incl. FP16 (fabs/fsqrt.8h/frecpe/fmaxv), arm64 check + 461 tests pass.	2026-06-15 21:35:41 -04:00
Brendan Punsky	57fbe873d8	rexcode/arm64: NEON floating-point three-same encode forms (incl. FP16) Adds 17 FP three-same mnemonics (85 forms) via specgen: FMAX/FMIN/FMAXNM/FMINNM, FMULX/FRECPS/FRSQRTS, FACGE/FACGT, FCMEQ/FCMGE/FCMGT (register form), FADDP/FMAXP/FMINP/FMAXNMP/FMINNMP. Single/double forms (2S/4S/2D) are .NEON; half-precision (4H/8H) use the distinct V_4H_FP16/V_8H_FP16 operand types and are tagged .FP16. specgen gains: --mattr=+fullfp16, FP16 arrangement tokens, per-form feature tagging (derived from the operand arrangement), and per-family arrangement lists. Verified: decode round-trips incl. an FP16 form (fmax/fcmeq.4h/fmulx/faddp), arm64 check + 461 tests pass.	2026-06-15 21:33:25 -04:00
Brendan Punsky	824421853f	rexcode/arm64: NEON across-lanes + pairwise-long encode forms Adds 11 mnemonics (59 forms) via specgen: across-lanes reductions ADDV/SMAXV/SMINV/UMAXV/UMINV (scalar B/H/S dst) and SADDLV/UADDLV (widened scalar dst), plus pairwise-long SADDLP/UADDLP/SADALP/UADALP. The scalar destination packs into the same VD field, so still no encoder change. specgen.emit generalized to accept scalar-register operand tokens (B/H/S/D) alongside vector arrangements. Verified: decode round-trips (ADDV/SADDLV/UMINV/SADDLP), arm64 check + 461 tests pass.	2026-06-15 21:27:40 -04:00
Brendan Punsky	7ebe042277	rexcode/arm64: NEON wide / narrow / XTN encode forms Adds 24 more mixed-arrangement mnemonics (72 forms) via specgen: three-different wide (SADDW/UADDW/SSUBW/USUBW), narrowing-halving (ADDHN/SUBHN/RADDHN/RSUBHN), and two-register narrowing (XTN/SQXTN/UQXTN/SQXTUN), plus their high-half '2' variants. All register-only (VD/VN/VM or VD/VN), no encoder change. specgen refactored to a general arrangement-tuple mechanism (uniform + long/wide/narrow/XTN share one emit path). Verified: decode round-trips (SADDW/ADDHN/XTN/SQXTUN2), arm64 check + 461 tests pass.	2026-06-15 21:22:07 -04:00
Brendan Punsky	f78a3a5573	rexcode/arm64: NEON three-different (long) encode forms Adds 26 widening long mnemonics (72 forms) via specgen: SADDL/UADDL/SSUBL/USUBL, SMULL/UMULL, SMLAL/UMLAL/SMLSL/UMLSL, SQDMULL/SQDMLAL/SQDMLSL and their high-half '2' variants. Destination arrangement is wider than the sources (Vd.8H, Vn.8B, Vm.8B; the '2' forms read the high half). Encoding stays VD/VN/VM, so no encoder change. specgen gains a mixed-arrangement THREE_DIFF shape (low/high source-half pairs). Verified: decode round-trips (SMULL/SADDL2/SQDMULL/UMLAL), arm64 check + 461 tests pass.	2026-06-15 21:16:56 -04:00
Brendan Punsky	00b666bbc0	rexcode/arm64: NEON pairwise + variable-shift encode forms Adds 9 register-only three-same mnemonics (59 forms) via specgen: ADDP/SMAXP/SMINP/UMAXP/UMINP (pairwise) and SSHL/USHL/SRSHL/URSHL (per-lane variable shift). Verified: decode round-trips (ADDP/SSHL/SMAXP/URSHL), arm64 check + 461 tests pass. Skipped the already-implemented logical/compare/mul forms (AND_V/ORR_V/EOR_V/BIC_V/ORN_V/BSL/BIT/BIF/CMEQ/CMGT/CMHI/MUL_V) to avoid duplicate keys.	2026-06-15 13:01:26 -04:00
Brendan Punsky	d83065e3b8	rexcode/arm64: NEON two-register-misc encode forms Adds 10 Advanced-SIMD two-register-misc mnemonics (34 forms across valid arrangements) via specgen: NOT/RBIT/REV16/REV32/REV64/CLS/CLZ/CNT/URECPE/URSQRTE. llvm-mc filters illegal arrangements (CNT/NOT only 8B/16B, URECPE/URSQRTE only 2S/4S, ...). specgen.lua generalized to a SHAPE table (THREE_SAME / TWO_SAME), so adding a family is one row. Verified: decode round-trips (NOT/REV64/CNT/URECPE), arm64 check + 461 tests pass.	2026-06-15 12:57:37 -04:00
Brendan Punsky	e21fa59733	rexcode/arm64: NEON three-same (integer) encode forms + specgen tool Adds 25 Advanced-SIMD three-same integer mnemonics (153 forms across arrangements) to ENCODING_TABLE: SHADD/UHADD/SHSUB/UHSUB/SRHADD/URHADD, SQADD/UQADD/SQSUB/UQSUB, SMAX/UMAX/SMIN/UMIN, SABD/UABD/SABA/UABA, MLA/MLS, CMGE/CMHS/CMTST, SQDMULH/SQRDMULH. Introduces tablegen/specgen.lua: compact specs (mnemonic + llvm name + arrangements) -> ENCODING_TABLE blocks, with bits taken from llvm-mc (the oracle) and mask derived empirically (vary registers 0..31). Invalid arrangements are auto-detected via llvm-mc and skipped. Output fills the SPECGEN:BEGIN..END region of encoding_table.odin in place; the hand-written core is untouched. Verified: decode round-trips for SHADD/SQADD/CMGE/SQRDMULH; arm64 check + 461 tests pass; builders auto-generate (780 -> 805). Caveat: NEON builders currently collapse arrangements (one Register param per V operand) so inst_<mnem> exposes only the first arrangement -- an arrangement-aware builder-gen pass follows. Author: Brendan Punsky (machine git config user.name is the login 'Flāvius').	2026-06-15 12:52:10 -04:00
Brendan Punsky	7b588d0818	rexcode/arm64: implement CCMP/CCMN register-form encode forms First entries of the encode-coverage effort. Adds CCMP_REG/CCMN_REG (W/X) to ENCODING_TABLE with llvm-mc-verified bit patterns; the table metaprogram regenerates the encode/decode blobs and the typed builders auto-generate (inst_ccmp_reg/inst_ccmn_reg). Verified: encode matches llvm-mc (CCMP X1,X2,#3,EQ=0xFA420023; CCMN W5,W6,#7,NE=0x3A4610A7), decode round-trips, arm64 check+tests pass. Needs no encoder change (reuses RN/RM/NZCV_FIELD/COND_HI). The imm5 forms (immediate at bits 20:16) need a new Operand_Encoding and follow separately. Workflow proven: llvm-mc as the encoding oracle -> SoT entry -> regen -> builder auto-generates -> verify.	2026-06-15 12:52:10 -04:00
Brendan Punsky	47fc72e0ba	rexcode: 100% generated mnemonic-builder coverage; drop hand-written collisions Every mnemonic with an encode form now has a generated inst_<mnem>/emit_<mnem> overload group. The per-arch generators map ALL operand types — nothing is skipped: arm64 gains shifted/extended registers (multi-param via op_shifted/op_extended), SVE Z-regs + predicates, SME tile/slice, NEON arrangements/lanes, bitmask/sysreg/pattern immediates and condition codes (427 -> 777 mnemonics); arm32 gains shifted/register-shifted regs, register lists, NEON lanes and all encoded-immediate subclasses (479 -> 592); x86 gains m80 and descriptor-table memory operands — FBLD/FBSTP, LGDT/SGDT/LIDT/SIDT, FLD/FSTP, far-indirect JMP/CALL, BOUND (1167 -> 1175). Mnemonic-specific builders are now fully generated, not hand-written: deleted the hand-written helpers the generated groups collided with — riscv inst_jal/inst_jalr, arm64 inst_b_cond/inst_cbz/inst_tbz/inst_csel, mos6502 inst_tst — and let the generators own those names (arm64 also gains inst_cbnz/tbnz/csinc/csinv/csneg). Updated the affected test call-sites. The generic operand-shape helpers (inst_r_r, inst_r_r_i, inst_ldst, ...) remain as delegation targets. Decode-only mnemonics with no encode form are correctly left without builders. ppc/ppc_vle/rsp/mos65816 were already complete. All 10 ISAs: structure + compile + tests pass; generators idempotent.	2026-06-15 12:52:10 -04:00
Brendan Punsky	1b72d425d4	rexcode: add typed per-mnemonic builders for all arches; CWD-independent regen Add generated mnemonic_builders.odin (inst_<mnem>/emit_<mnem> typed overload sets) for arm32, arm64, mips, riscv, ppc, ppc_vle, rsp, mos6502 and mos65816, matching the existing x86 builders. Each is produced by a per-arch tools/gen_mnemonic_builders.odin that walks ENCODE_FORMS and maps operand types to typed params + op_* constructors. Anchor every generator's output via #directory so regeneration is CWD-independent; previously the bare "mnemonic_builders.odin" path wrote to the current directory and misfired when run from the repo root. Wire a --builders task into build.lua (folded into 'all', covered by --idempotent, enforced by the structural invariants) and document it in the README.	2026-06-15 12:52:10 -04:00
gingerBill	693fc1ec18	Allow for `struct #raw_union #packed`	2026-06-15 14:42:38 +01:00
gingerBill	182f234ed2	Minimize rsp Instruction and Operand	2026-06-15 14:37:25 +01:00
gingerBill	4f96105520	Minimize riscv Instruction and Operand	2026-06-15 14:36:10 +01:00
gingerBill	a839f5e833	Minimize ppc_vle Instruction and Operand	2026-06-15 14:35:34 +01:00
gingerBill	7a17144aa1	Minimize ppc Instruction and Operand	2026-06-15 14:34:45 +01:00
gingerBill	b006a8853e	Minimize mos65816 Instruction and Operand	2026-06-15 14:32:42 +01:00
gingerBill	59c4292224	Minimize mos6502 Instruction and Operand	2026-06-15 14:30:32 +01:00
gingerBill	406dfbe86d	Minimize mips Instruction and Operand	2026-06-15 14:29:14 +01:00
gingerBill	6527f90181	Minimize arm64 Instruction and Operand	2026-06-15 14:27:49 +01:00
gingerBill	7aaef31bb3	Correct sizes of arm32 Instruction and Operand	2026-06-15 14:24:05 +01:00
gingerBill	f895e96bde	Add `benchmark` flag for x86 tests to just test that	2026-06-15 14:17:11 +01:00
gingerBill	2dd262ea10	x86: improve benchmark test; do not run the code on Windows since it relies on SysV	2026-06-15 14:14:38 +01:00
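
Commit 7b588d0818 quotes two llvm-mc-verified words for the CCMP/CCMN register forms, and cd8703acd4 describes deriving each form's mask as `~union` of the bits that change when one field at a time is pushed to its maximum. The sketch below reproduces both ideas in Python. It is illustrative only: the field positions follow the architectural A64 "conditional compare (register)" layout, the function names are hypothetical and are not rexcode's API, and the encoder function stands in for llvm-mc as the source of variant words.

```python
# Illustrative sketch; names are hypothetical, not taken from rexcode.
# Field layout assumed: sf(31) op(30) S(29) 11010010(28:21) Rm(20:16)
# cond(15:12) 00(11:10) Rn(9:5) 0(4) nzcv(3:0).
COND = {"EQ": 0, "NE": 1, "NV": 15}

def ccmp_reg(rn, rm, nzcv, cond, is64=True, negate=False):
    """Encode CCMP (or CCMN when negate=True) Rn, Rm, #nzcv, cond."""
    word = 0x3A400000              # S=1 plus the fixed opcode bits
    if is64:
        word |= 1 << 31            # sf: X registers instead of W
    if not negate:
        word |= 1 << 30            # op: 1 = CCMP, 0 = CCMN
    return word | (rm << 16) | (COND[cond] << 12) | (rn << 5) | nzcv

def derive_mask(canon, variants):
    """Return ~union, where union holds every bit an operand can flip."""
    union = 0
    for v in variants:
        union |= v ^ canon
    return ~union & 0xFFFFFFFF

# The two examples quoted in the commit message.
ccmp_word = ccmp_reg(1, 2, 3, "EQ")                            # CCMP X1,X2,#3,EQ
ccmn_word = ccmp_reg(5, 6, 7, "NE", is64=False, negate=True)   # CCMN W5,W6,#7,NE

# Canonical form (all operand fields zero) and one variant per field at max.
canon = ccmp_reg(0, 0, 0, "EQ")
mask = derive_mask(canon, [
    ccmp_reg(31, 0, 0, "EQ"),
    ccmp_reg(0, 31, 0, "EQ"),
    ccmp_reg(0, 0, 15, "EQ"),
    ccmp_reg(0, 0, 0, "NV"),
])
```

Here `ccmp_word` and `ccmn_word` come out as 0xFA420023 and 0x3A4610A7, matching the commit message, and `mask` is 0xFFE00C10: every bit outside the Rm, cond, Rn and nzcv fields is treated as fixed.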

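Commit e52953c7ff describes the NEON shift-by-immediate scheme: an element-size marker bit, the amount packed into the low immh:immb bits (left = shift, right = esize - shift), and a decoder that recovers esize from immh. A minimal Python sketch of that arithmetic follows, assuming the architectural 7-bit immh:immb field with the marker folded into the returned value. The helper names are hypothetical and this is not rexcode's encoder.

```python
# Illustrative sketch of the immh:immb arithmetic; names are hypothetical.
# The 7-bit field is immh (upper 4 bits) : immb (lower 3 bits).

def encode_shl(esize, shift):
    """Left shift: marker bit (value esize) plus the amount."""
    assert 0 <= shift < esize
    return esize + shift

def encode_shr(esize, shift):
    """Right shift: marker bit plus (esize - shift), i.e. 2*esize - shift."""
    assert 1 <= shift <= esize
    return 2 * esize - shift

def decode_esize(field):
    """Element size in bits, from the highest set bit of immh."""
    immh = field >> 3
    return 8 << (immh.bit_length() - 1)

def decode_shl(field):
    return field - decode_esize(field)

def decode_shr(field):
    return 2 * decode_esize(field) - field

# SSHR Vd.4S, Vn.4S, #3: esize = 32, so the field is 64 - 3 = 61 (0b0111101).
sshr_field = encode_shr(32, 3)
```

Commit 303fa9e509 reports the same `2*esize - amount` value for SVE2 XAR, split across tszh:tszl:imm3 instead of immh:immb, with the element size again recovered from the highest set size bit on decode.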

17684 Commits