mirror of
https://github.com/odin-lang/Odin.git
synced 2026-06-19 16:42:33 +00:00
Three layers on the x86 encode/decode hot paths, all byte-exact (2246 LLVM-verified cases) and roundtrip-clean:

1. Branchless: legacy-prefix emission (speculative write + conditional advance), REX/VEX/EVEX extension-bit accumulation (gate-and-mask), ModRM mod/disp-size selection (cmov selects), displacement emission (widened store + ENCODE_TAIL_SLACK); decoder REX/VEX/EVEX register extensions (arithmetic instead of if/+=8).

2. Resolve-operands-once: the previous code re-derived each user operand ~5-10x per instruction (a fresh O(n) scan of enc.ops per emission pass). Operands are now resolved into a [4]^Operand map a single time.

3. Single-pass gather: fold the opcode-+rb and ModR/M slot-detection scans into that one resolve pass (3 enc.enc passes -> 1).

Net on a 100k mixed-instruction benchmark: encode ~58 -> ~54 ns/inst (best 52). Branchless alone was a ~7% encode regression (the branches were already well predicted, so there was nothing to recover); the algorithmic passes recovered it and beat baseline.
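
For readers unfamiliar with the branchless idioms named in layer 1, here is a minimal sketch in C (not Odin, and not the code from this commit). Every function name, the TAIL_SLACK value, and the force_disp parameter are hypothetical, chosen only for illustration. It covers the REX case only; VEX/EVEX are omitted, and the ModRM selection ignores SIB and RIP-relative forms.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slack size. The commit names ENCODE_TAIL_SLACK but does not
   state its value; callers must keep this many writable bytes past the cursor. */
#define TAIL_SLACK 8

/* Speculative write + conditional advance: always store the prefix byte,
   then move the cursor only if the prefix is needed. An unneeded byte is
   simply overwritten by the next emission. */
static size_t emit_prefix(uint8_t *buf, size_t pos, uint8_t prefix, int needed) {
    buf[pos] = prefix;
    return pos + (size_t)(needed != 0);
}

/* Gate-and-mask: compute the low nibble of a REX prefix (W R X B) from
   register numbers 0..15. Each extension bit is ANDed with a 0/1 gate
   instead of being set under an if. */
static uint8_t rex_accumulate(int w, unsigned reg, int has_reg,
                              unsigned index, int has_index,
                              unsigned base, int has_base) {
    unsigned r = ((reg   >> 3) & 1u) & (unsigned)(has_reg   != 0);
    unsigned x = ((index >> 3) & 1u) & (unsigned)(has_index != 0);
    unsigned b = ((base  >> 3) & 1u) & (unsigned)(has_base  != 0);
    return (uint8_t)(((unsigned)(w != 0) << 3) | (r << 2) | (x << 1) | b);
}

/* ModRM mod / displacement-size selection plus widened store.
   mod is 0 (no disp), 1 (disp8) or 2 (disp32); force_disp models bases
   that cannot use mod=0 (assumption: the caller detects that case).
   All four bytes are always written; only `size` of them are kept. */
static size_t emit_disp(uint8_t *buf, size_t pos, int32_t disp,
                        int force_disp, unsigned *mod_out) {
    unsigned nonzero = (unsigned)((disp != 0) | (force_disp != 0));
    unsigned fits8   = (unsigned)((disp >= -128) & (disp <= 127));
    unsigned mod     = nonzero * (2u - fits8);       /* 0, 1 or 2 */
    unsigned size    = nonzero * (4u - 3u * fits8);  /* 0, 1 or 4 */
    uint32_t u = (uint32_t)disp;
    buf[pos + 0] = (uint8_t)(u);                     /* little-endian, as x86 requires */
    buf[pos + 1] = (uint8_t)(u >> 8);
    buf[pos + 2] = (uint8_t)(u >> 16);
    buf[pos + 3] = (uint8_t)(u >> 24);
    *mod_out = mod;
    return pos + size;
}

/* Decoder side: fold REX.R into the ModRM reg field with shift/OR
   rather than `if (rex_r) reg += 8`. */
static unsigned decode_reg(uint8_t modrm, uint8_t rex) {
    unsigned field = ((unsigned)modrm >> 3) & 7u;
    unsigned ext   = ((unsigned)rex >> 2) & 1u;
    return field | (ext << 3);
}
```

The unconditional stores are what make the tail slack necessary: emit_disp touches four bytes even when it keeps none of them. As the benchmark note above says, removing branches is not automatically a win when the branches were already well predicted.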