Odin/core/rexcode
Brendan Punsky 3341898437 rexcode/x86: flatten per-instruction loops + hint immediates (~1.5x encode)
Data-oriented pass on the encode hot path. Profiling showed bounds checks
already elided by -o:speed; the cost was per-instruction loop/scan machinery
and immediates falling off the hint path.

- Gather the immediate slot in the single resolve pass and emit it straight-
  line (no scan over enc.enc); likewise drive the legacy REX prefix from the
  precomputed reg/mr/opr slots instead of a per-form scan.
- Fold the separate needs_66 (GPR16) and SPL/BPL/SIL/DIL operand loops into
  the resolve pass, so user operands are visited exactly once. This was the
  big one: mov r,r 27 -> 21 ns.
- Gate the whole legacy-prefix block on a single flags!=0 test (a legacy
  prefix is almost always absent) instead of four branches per instruction.
- Make immediate forms hintable. A typed immediate builder names its width
  (inst_add_r32_imm32) and the matcher already keys off the operand's
  declared size, so baking the form is byte-identical AND drops immediates
  from the full match scan: mov r32,imm32 55.7 -> 17.8 ns (3.1x).
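
As a rough illustration of the shape described above (not the rexcode source), here is a minimal C sketch of a single resolve pass that visits each operand once, accumulates the prefix flags, and records the immediate slot, followed by a prefix block gated on one flags != 0 test. The Operand struct, the F_* flags and encode_mov are hypothetical names invented for this sketch; only the x86-64 byte encodings themselves are standard.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operand model for illustration; not the rexcode types. */
enum { OP_REG, OP_IMM };

typedef struct {
    uint8_t  kind;  /* OP_REG or OP_IMM */
    uint8_t  size;  /* operand width in bytes: 1, 2, 4 or 8 */
    uint8_t  reg;   /* register number 0..15 when kind == OP_REG */
    uint32_t imm;   /* value when kind == OP_IMM */
} Operand;

enum {
    F_OP16  = 1 << 0,  /* 0x66 operand-size override (16-bit GPR) */
    F_REX   = 1 << 1,  /* some REX prefix is required */
    F_REX_W = 1 << 2,  /* 64-bit operand size */
    F_REX_R = 1 << 3,  /* extends the ModRM.reg field (source register) */
    F_REX_B = 1 << 4   /* extends ModRM.rm or the opcode register (destination) */
};

/* Encode `mov dst, src`: dst is a register, src is a register of the same
 * width or an immediate. The immediate path assumes a 32-bit destination
 * (B8+rd id). Returns the number of bytes written to out. */
size_t encode_mov(uint8_t *out, Operand dst, Operand src)
{
    const Operand ops[2] = { dst, src };
    unsigned flags = 0;
    int imm_slot = -1;

    /* Single resolve pass: every operand is visited exactly once. */
    for (int i = 0; i < 2; i++) {
        const Operand *o = &ops[i];
        if (o->kind == OP_IMM) { imm_slot = i; continue; }
        if (o->size == 2) flags |= F_OP16;
        if (o->size == 8) flags |= F_REX | F_REX_W;
        /* SPL/BPL/SIL/DIL are only reachable with a REX prefix present. */
        if (o->size == 1 && o->reg >= 4 && o->reg <= 7) flags |= F_REX;
        if (o->reg >= 8) flags |= F_REX | (i == 0 ? F_REX_B : F_REX_R);
    }

    uint8_t *p = out;

    /* One test guards the whole prefix block; the common case skips it. */
    if (flags != 0) {
        if (flags & F_OP16) *p++ = 0x66;
        if (flags & F_REX) {
            *p++ = (uint8_t)(0x40
                 | ((flags & F_REX_W) ? 0x08 : 0)
                 | ((flags & F_REX_R) ? 0x04 : 0)
                 | ((flags & F_REX_B) ? 0x01 : 0));
        }
    }

    if (imm_slot < 0) {
        /* mov r/m, r: opcode 88 (8-bit) or 89, then a register-direct ModRM. */
        *p++ = (uint8_t)(dst.size == 1 ? 0x88 : 0x89);
        *p++ = (uint8_t)(0xC0 | ((src.reg & 7) << 3) | (dst.reg & 7));
    } else {
        /* mov r32, imm32: the slot found above is emitted straight-line. */
        uint32_t v = ops[imm_slot].imm;
        *p++ = (uint8_t)(0xB8 | (dst.reg & 7));
        for (int b = 0; b < 4; b++) *p++ = (uint8_t)(v >> (8 * b));
    }
    return (size_t)(p - out);
}
```

In this sketch a plain 32-bit register move never enters the prefix block at all, which is the case the single flags test is meant to make cheap.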

Floor (no-op) 14.55 -> 10.3 ns; realistic immediate-heavy typed mix
30 -> 20.5 ns/inst (~49 M inst/s). gen/builders/check/test/idempotent green;
2282 cases (typed==generic byte-identical, incl. the new immediate cases).
2026-06-19 01:56:05 -04:00