Extend emit_recipe to cover every ModR/M + SIB + displacement addressing form
(register direct, RIP-relative, absolute [disp32], and base/index/scale/disp),
mirroring the interpreter byte-for-byte. Drop the caller's reg-direct guard so
memory operands take the fast path too. Only a label/relative immediate (which
needs a relocation) still falls back.
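
For reference, the addressing forms listed above can be sketched as below. This
is an illustrative Python model of the standard x86-64 encoding rules, not the
code in this change: `Mem` and `encode_modrm` are hypothetical names, registers
are plain numbers 0..15, and REX bits are returned separately as 0bRXB.

```python
import struct
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass
class Mem:
    """Hypothetical memory-operand description (not the project's type)."""
    base: Optional[int] = None    # register number 0..15, or None
    index: Optional[int] = None   # register number 0..15 except 4 (rsp)
    scale: int = 1                # 1, 2, 4 or 8
    disp: int = 0                 # signed displacement
    rip: bool = False             # RIP-relative: [rip + disp32]


def encode_modrm(reg: int, rm: Union[int, Mem]) -> Tuple[int, bytes]:
    """Return (REX bits as 0bRXB, ModR/M [+ SIB] [+ disp] bytes)."""
    rex = (reg >> 3) << 2                      # REX.R extends the reg field
    r = (reg & 7) << 3

    if isinstance(rm, int):                    # register direct: mod=11
        return rex | (rm >> 3), bytes([0xC0 | r | (rm & 7)])

    if rm.rip:                                 # mod=00, rm=101, disp32
        return rex, bytes([r | 5]) + struct.pack("<i", rm.disp)

    if rm.index == 4:
        raise ValueError("rsp cannot be used as an index register")
    if rm.scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    ss = {1: 0, 2: 1, 4: 2, 8: 3}[rm.scale] << 6
    if rm.index is None:
        idx = 4 << 3                           # SIB index=100 means "no index"
    else:
        idx = (rm.index & 7) << 3
        rex |= (rm.index >> 3) << 1            # REX.X extends the index field

    if rm.base is None:
        # No base: mod=00 with SIB base=101 always carries a disp32.
        # With no index either, this is the absolute [disp32] form.
        return rex, bytes([r | 4, ss | idx | 5]) + struct.pack("<i", rm.disp)

    rex |= rm.base >> 3                        # REX.B extends the base field
    b = rm.base & 7
    if rm.disp == 0 and b != 5:                # rbp/r13 need an explicit disp
        mod, disp = 0x00, b""
    elif -128 <= rm.disp <= 127:
        mod, disp = 0x40, struct.pack("<b", rm.disp)
    else:
        mod, disp = 0x80, struct.pack("<i", rm.disp)

    if rm.index is not None or b == 4:         # rsp/r12 as base force a SIB
        return rex, bytes([mod | r | 4, ss | idx | b]) + disp
    return rex, bytes([mod | r | b]) + disp
```

In this sketch, encode_modrm(2, Mem(base=0, index=1, scale=4, disp=8)) yields
the bytes 54 88 08 with no REX bits set, which is the operand part of
mov edx, [rax+rcx*4+8] (8B 54 88 08).
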
Realistic immediate-heavy mix: ~20.1 -> ~12.9 ns/inst vs the pre-recipe
baseline (~1.55x, 50 -> 77 M inst/s). Output is byte-exact across 2282, and
idempotent.