Commit Graph

3 Commits

Author SHA1 Message Date
Brendan Punsky
49787b7de4 rexcode/x86: buffer-sizing helpers for encode and decode
Give callers a clean way to pre-size their own buffers so the encode/decode
hot paths never allocate or resize. This replaces decode() silently reserving
the caller's arrays itself (that behavior is removed). The library allocates
nothing: these helpers only grow the caller's own dynamic arrays, and only when
they are not already big enough (Odin's reserve no-ops when capacity already
suffices).

Size-only helpers (caller manages its own memory), keyed off the input slice:
  encode_max_code_size(instructions)            - exact code bytes
  encode_max_relocation_count(instructions)     - exact reloc upper bound
  decode_max_instruction_count(data)            - exact ceiling (1 byte/inst)
  decode_estimate_instruction_count(data)       - typical estimate (~3 B/inst)
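
The two decode bounds reduce to simple arithmetic on the input length. A minimal C sketch of that arithmetic, for illustration only: the function names are hypothetical stand-ins, the real helpers are Odin procedures that take the slice itself, and rounding the estimate up is an assumption.

```c
#include <stddef.h>

/* Ceiling: the shortest x86 instruction is 1 byte, so n bytes of input
 * can never decode to more than n instructions. */
static size_t max_instruction_count(size_t data_len) {
    return data_len;
}

/* Typical estimate at ~3 bytes per instruction.
 * Rounding up is an assumption of this sketch, not taken from the commit. */
static size_t estimate_instruction_count(size_t data_len) {
    return (data_len + 2) / 3;
}
```

The estimate may undershoot on dense code, in which case the caller's array would still have to grow during decode; the ceiling never undershoots but can over-reserve by roughly 3x on typical input.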

Reserve helpers (pre-size the caller's dynamic arrays; nil to skip an array):
  encode_reserve(code, relocs, instructions)
      code is a [dynamic]u8 grown by LENGTH (so code[:] is a valid emit
      target); relocs reserved by capacity on top of existing elements.
  decode_reserve(instructions, inst_info, label_defs, data, exact=false)
      reserves capacity on top of existing; exact=true for the ceiling.
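
The difference between growing by length and reserving by capacity can be sketched in C with a stand-in for `[dynamic]u8`. Everything here is hypothetical (the `ByteArray` type, the function names, growing on top of the existing length); it only illustrates the two behaviors and the no-op when capacity already suffices.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a caller-owned [dynamic]u8. */
typedef struct {
    uint8_t *data;
    size_t   len;
    size_t   cap;
} ByteArray;

/* Ensure cap >= want. Does nothing when capacity already suffices.
 * Returns 0 on success, -1 on allocation failure. */
static int bytes_reserve(ByteArray *a, size_t want) {
    if (want <= a->cap) return 0;
    uint8_t *p = realloc(a->data, want);
    if (p == NULL) return -1;
    a->data = p;
    a->cap  = want;
    return 0;
}

/* Capacity-style: make room for `extra` more elements on top of the
 * existing ones; len is left unchanged. */
static int reserve_extra(ByteArray *a, size_t extra) {
    return bytes_reserve(a, a->len + extra);
}

/* Length-style: extend len by `extra` zeroed bytes, so the whole array
 * viewed as a slice is a valid write target. */
static int grow_len(ByteArray *a, size_t extra) {
    if (bytes_reserve(a, a->len + extra) != 0) return -1;
    if (extra > 0) memset(a->data + a->len, 0, extra);
    a->len += extra;
    return 0;
}
```

A second reserve with a smaller request leaves the allocation untouched, which is the property that makes it safe to call the reserve helpers unconditionally before each batch.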

Error arrays grow only on the failure path, so they are intentionally not
covered. check/test green; 2282 cases; exercised end-to-end (the [dynamic]u8
code pattern, factor-in-existing, nil args, exact ceiling, reserve no-op).
2026-06-19 03:48:36 -04:00
Brendan Punsky
8387731357 rexcode/x86: branchless hot paths + single-pass operand resolution
Three layers on the x86 encode/decode hot paths, all byte-exact (2246
LLVM-verified cases) and roundtrip-clean:

1. Branchless: legacy-prefix emission (speculative write + conditional
   advance), REX/VEX/EVEX extension-bit accumulation (gate-and-mask),
   ModRM mod/disp-size selection (cmov selects), displacement emission
   (widened store + ENCODE_TAIL_SLACK); decoder REX/VEX/EVEX register
   extensions (arithmetic instead of if/+=8).

2. Resolve-operands-once: the previous code re-derived each user operand
   ~5-10x per instruction (a fresh O(n) scan of enc.ops per emission
   pass). Now resolved into a [4]^Operand map a single time.

3. Single-pass gather: fold the opcode+rb and ModR/M slot-detection
   scans into that one resolve pass (3 enc.enc passes -> 1).
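
A rough C sketch of three of the branchless patterns named in item 1, for illustration only. None of this is the library's code: the function names and the `TAIL_SLACK` constant are hypothetical, and the only x86 fact assumed is the REX layout 0100WRXB (B = bit 0, X = bit 1, R = bit 2).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: bytes the caller keeps writable past the last emitted byte,
 * so the unconditional stores below never run off the buffer. */
enum { TAIL_SLACK = 4 };

/* Speculative write + conditional advance: the prefix byte is always
 * stored, but the cursor only moves when the prefix is needed. */
static size_t emit_prefix(uint8_t *out, size_t pos, uint8_t prefix, int needed) {
    out[pos] = prefix;
    return pos + (size_t)(needed != 0);
}

/* Widened store: always write 4 little-endian displacement bytes, then
 * advance by the real size (0, 1 or 4). Extra bytes land in the slack. */
static size_t emit_disp(uint8_t *out, size_t pos, int32_t disp, size_t disp_size) {
    uint32_t u = (uint32_t)disp;
    uint8_t bytes[4] = {
        (uint8_t)(u),
        (uint8_t)(u >> 8),
        (uint8_t)(u >> 16),
        (uint8_t)(u >> 24),
    };
    memcpy(out + pos, bytes, sizeof bytes);
    return pos + disp_size;
}

/* Register extension by arithmetic instead of `if (bit) reg += 8`:
 * shift the selected REX bit into bit 3 of the register number. */
static unsigned extend_reg(unsigned reg3, unsigned rex, unsigned bit_index) {
    return (reg3 & 7u) | (((rex >> bit_index) & 1u) << 3);
}
```

For example, `extend_reg(1, 0x44, 2)` combines ModRM.reg = 1 with REX.R set and yields register 9. The speculative stores only pay off when the branches they replace mispredict, which is consistent with the benchmark note below.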

Net on a 100k mixed-instruction benchmark: encode ~58 -> ~54 ns/inst
(best 52). Branchless alone was a ~7% encode regression (the branches were
already well predicted, so there was no misprediction cost to recover); the
algorithmic passes (2 and 3) recovered it and beat baseline.
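
The resolve-once idea from item 2 can be sketched in C as a slot-indexed pointer table filled in one pass. The `Operand` layout, the slot numbering, and both function names are hypothetical; the real code resolves into a `[4]^Operand` in Odin.

```c
#include <stddef.h>

enum { MAX_SLOTS = 4 };

/* Hypothetical operand: which encoding slot it fills, plus a payload. */
typedef struct {
    int slot;
    int reg;
} Operand;

/* Old shape: every emission pass re-scans the operand list, O(n) per lookup. */
static const Operand *find_operand(const Operand *ops, size_t n, int slot) {
    for (size_t i = 0; i < n; i++) {
        if (ops[i].slot == slot) return &ops[i];
    }
    return NULL;
}

/* New shape: one pass fills a slot-indexed table; later passes index it
 * in O(1). The first operand seen for a slot wins, matching find_operand. */
static void resolve_operands(const Operand *ops, size_t n,
                             const Operand *map[MAX_SLOTS]) {
    for (int s = 0; s < MAX_SLOTS; s++) map[s] = NULL;
    for (size_t i = 0; i < n; i++) {
        int s = ops[i].slot;
        if (s >= 0 && s < MAX_SLOTS && map[s] == NULL) map[s] = &ops[i];
    }
}
```

Because the table is built in the same loop that already walks the operands, other per-operand detection (item 3) can be folded into that loop without an extra pass.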
2026-06-18 20:16:26 -04:00
Brendan Punsky
95df04fbe1 rexcode: re-house ISA packages under core:rexcode/isa/<arch>
Move all ten ISA packages (x86, arm32, arm64, mips, riscv, ppc, ppc_vle,
rsp, mos6502, mos65816) from core/rexcode/<arch> to core/rexcode/isa/<arch>,
so the import pattern is now `import "core:rexcode/isa/x86"`. The shared
core stays at core:rexcode/isa.

Mechanical: relative `import "../isa"` / `import "../../isa"` -> absolute
`import "core:rexcode/isa"` (the only path that survives the move; the "../" and
"../.." self/generated imports move with their packages). build.lua now
builds paths as <root>/isa/<name>; the stale `cd <arch>` hints in the verify
tools and the doc.odin paths are updated.

WASM stays at core/rexcode/wasm for now -- it is an IR, not an ISA, and
will move under the forthcoming core:rexcode/ir once that layer lands.

All 10 arches gen/builders/check/test green; import core:rexcode/isa/x86
verified working; wasm still compiles.
2026-06-18 19:03:27 -04:00