Commit Graph

8 Commits

Author SHA1 Message Date
Brendan Punsky
078015bc34 rexcode/x86: pre-matched encode hint + repair the typed builders
Targeted branchless revert + the pre-matched form fast path, and a fix
for a pre-existing bug the latter surfaced.

(a) Revert the two speculative-write spots from the prior branchless pass
    (legacy-prefix emission, widened displacement store, ENCODE_TAIL_SLACK)
    back to predicted branches. In real streams a legacy prefix is almost
    always absent and disp size is stable, so those branches are ~free and
    the unconditional stores only added work. Every class got faster
    (RET 19->17.5, MOV r,r 52->46.6, VADDPS 42.8->39.3 ns).

(b) Pre-matched form hint. Instruction.enc_hint (in the existing 11-byte
    padding, idx+1 biased; 0 = matcher path) lets a typed builder that maps
    to a single value-independent form bake the global form index, so
    encode() skips the O(forms) match scan -- and, in a varied stream, its
    unpredictable branches. Generated for non-immediate forms only (value-
    dependent imm8/imm32 selection stays on the matcher). On a 100k mixed
    typed-builder stream: 47.3 -> 30.2 ns/inst (-36%), byte-identical to the
    matcher path -- ~2x the original baseline for codegen.

Repair the typed inst_/emit_ builders. They were non-functional: the
generator cast the hw-only typed enum straight to Register
(Register(GPR64.RAX) -> class 0), so every typed-builder operand was
rejected by the matcher (encode returned empty). Untested because the
suite builds via the generic constructors. Now they build through the
class-correct op_gpr64/op_xmm/... path (op_* already used by 3+ operand
builders), emit_ reuses inst_, and a new 30-case consistency suite
asserts typed == generic (llvm-verified) and hint == matcher.

gen/builders/check/test/idempotent all green; 2276 cases.
2026-06-18 21:04:18 -04:00
Brendan Punsky
8387731357 rexcode/x86: branchless hot paths + single-pass operand resolution
Three layers on the x86 encode/decode hot paths, all byte-exact (2246
LLVM-verified cases) and roundtrip-clean:

1. Branchless: legacy-prefix emission (speculative write + conditional
   advance), REX/VEX/EVEX extension-bit accumulation (gate-and-mask),
   ModRM mod/disp-size selection (cmov selects), displacement emission
   (widened store + ENCODE_TAIL_SLACK); decoder REX/VEX/EVEX register
   extensions (arithmetic instead of if/+=8).

2. Resolve-operands-once: the previous code re-derived each user operand
   ~5-10x per instruction (a fresh O(n) scan of enc.ops per emission
   pass). Now resolved into a [4]^Operand map a single time.

3. Single-pass gather: fold the opcode-+rb and ModR/M slot-detection
   scans into that one resolve pass (3 enc.enc passes -> 1).

Net on a 100k mixed-instruction benchmark: encode ~58 -> ~54 ns/inst
(best 52). Branchless alone was a ~7% encode regression (predicted
branches, nothing to recover); the algorithmic passes recovered it and
beat baseline.
2026-06-18 20:16:26 -04:00
Brendan Punsky
95df04fbe1 rexcode: re-house ISA packages under core:rexcode/isa/<arch>
Move all ten ISA packages (x86, arm32, arm64, mips, riscv, ppc, ppc_vle,
rsp, mos6502, mos65816) from core/rexcode/<arch> to core/rexcode/isa/<arch>,
so the import pattern is now `import "core:rexcode/isa/x86"`. The shared
core stays at core:rexcode/isa.

Mechanical: relative `import "../isa"` / "../../isa" -> absolute
"core:rexcode/isa" (the only path that survives the move; the "../" and
"../.." self/generated imports move with their packages). build.lua now
builds paths as <root>/isa/<name>; stale `cd <arch>` hints in the verify
tools and the doc.odin paths updated.

WASM stays at core/rexcode/wasm for now -- it is an IR, not an ISA, and
will move under the forthcoming core:rexcode/ir once that layer lands.

All 10 arches gen/builders/check/test green; import core:rexcode/isa/x86
verified working; wasm still compiles.
2026-06-18 19:03:27 -04:00
gingerBill
c9ce8794c7 Replace -> isa.Result with -> (byte_code: u32, ok: bool) 2026-06-15 21:43:58 +01:00
Flāvius
a4f08f8307 Load rexcode encode/decode tables from committed binary blobs
Each ISA's hand-written ENCODING_TABLE (the single source of truth) now lives
in a per-arch tablegen/ metaprogram that flattens it and serializes committed
binary blobs; the library #loads those into @(rodata) at compile time rather
than compiling a table body. No arch keeps encoding_table.odin or
decoding_tables.odin -- only a generated tables.odin loader and tables/*.bin.

* Two-stage, type-checked pipeline: tablegen Stage A emits human-readable
  generated Odin, which compiles and serializes the blobs in Stage B.
* encode() goes through encoding_forms(m); decoders are unchanged apart from
  x86's flattened 2-D index. Decode tables are byte-identical to the old ones.
* build.lua: a LuaJIT driver for the metaprograms, validations, and tests,
  with cross-platform gating and a clear report.
* Docs refreshed; the obsolete forward-looking plan in cross_arch_design.md
  trimmed to what was actually built.
* Attribution headers added to all rexcode source files; the generators emit
  them so generated files keep them.
2026-06-15 07:43:29 -04:00
gingerBill
d75624ccbd Add @(require_results) where appropriate to isa 2026-06-14 19:41:05 +01:00
gingerBill
2e58cc51d9 Improve mnemonic_builders for x86 2026-06-14 17:03:19 +01:00
gingerBill
d6ae77b67e core:rexcode 2026-06-14 16:30:18 +01:00