New NEON_VM_SCALAR16/32 encodings for the Dm[lane] scalar operand: .16
places Dm in D0..D7 (bits 2:0) with the lane index split across bit 5
(high) and bit 3 (low); .32 places Dm in D0..D15 (bits 3:0) with the lane
index at bit 5. Adds VQDMULH_LANE and VQRDMULH_LANE for .s16/.s32 with D
and Q destinations (8 forms). Verify round-trips; spot checks are
byte-exact, including the max register/lane, and decode cleanly; 600 tests
green.
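
The operand layout described above can be sketched as a pair of pack/unpack
helpers. This is an illustration only, not rexcode's code: the function
names are hypothetical, and only the M (bit 5) and Vm (bits 3:0) fields of
the instruction word are modelled.

```python
def encode_scalar_operand(esize: int, reg: int, lane: int) -> int:
    """Pack Dm[lane] into the M (bit 5) and Vm (bits 3:0) fields.

    Hypothetical helper for illustration; returns only those operand bits.
    """
    if esize == 16:
        # .16: register in bits 2:0 (D0..D7), lane index = bit5:bit3.
        if not (0 <= reg <= 7 and 0 <= lane <= 3):
            raise ValueError("'.16' scalar needs D0..D7 and lane 0..3")
        return reg | ((lane & 1) << 3) | ((lane >> 1) << 5)
    if esize == 32:
        # .32: register in bits 3:0 (D0..D15), lane index = bit 5.
        if not (0 <= reg <= 15 and 0 <= lane <= 1):
            raise ValueError("'.32' scalar needs D0..D15 and lane 0..1")
        return reg | (lane << 5)
    raise ValueError("esize must be 16 or 32")


def decode_scalar_operand(esize: int, bits: int) -> tuple[int, int]:
    """Inverse of encode_scalar_operand: returns (reg, lane)."""
    m = (bits >> 5) & 1
    vm = bits & 0xF
    if esize == 16:
        return vm & 0x7, (m << 1) | (vm >> 3)
    if esize == 32:
        return vm, m
    raise ValueError("esize must be 16 or 32")
```

For example, D7[3] in the .16 form and D15[1] in the .32 form both set every
modelled bit, which is why the max register/lane case is a useful spot check.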
Each ISA's hand-written ENCODING_TABLE (the single source of truth) now lives
in a per-arch tablegen/ metaprogram that flattens it and serializes committed
binary blobs; the library #loads those into @(rodata) at compile time rather
than compiling a table body. No arch keeps encoding_table.odin or
decoding_tables.odin -- only a generated tables.odin loader and tables/*.bin.
* Two-stage, type-checked pipeline: tablegen Stage A emits human-readable
generated Odin; Stage B compiles that output and serializes the blobs.
* encode() goes through encoding_forms(m); decoders are unchanged apart from
x86's flattened 2-D index. Decode tables are byte-identical to the old ones.
* build.lua: a LuaJIT driver for the metaprograms, validations, and tests,
with cross-platform gating and a clear report.
* Docs refreshed; the obsolete forward-looking plan in cross_arch_design.md
trimmed to what was actually built.
* Attribution headers added to all rexcode source files; the generators emit
them so generated files keep them.
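
The flatten, serialize, and load idea can be sketched as follows. This is a
minimal illustration under assumed names and an assumed blob layout, not the
format tablegen actually writes: the table contents are placeholders, the
(bits, mask) form shape is hypothetical, and the real library is Odin, not
Python.

```python
import struct

# Placeholder hand-written table: mnemonic -> list of (bits, mask) forms.
TABLE = {
    "OP_A": [(0x10, 0xF0), (0x11, 0xF1)],
    "OP_B": [(0x20, 0xF0)],
}


def flatten(table):
    """Turn the nested table into one flat form list plus (start, count) ranges."""
    names = sorted(table)
    index, forms = [], []
    for name in names:
        index.append((len(forms), len(table[name])))
        forms.extend(table[name])
    return names, index, forms


def serialize(index, forms):
    """Assumed layout: two u32 counts, then index pairs, then form pairs."""
    blob = struct.pack("<II", len(index), len(forms))
    for start, count in index:
        blob += struct.pack("<II", start, count)
    for bits, mask in forms:
        blob += struct.pack("<II", bits, mask)
    return blob


def load(blob):
    """Read the blob back into the flat index and form arrays."""
    n_index, n_forms = struct.unpack_from("<II", blob, 0)
    off = 8
    index = [struct.unpack_from("<II", blob, off + 8 * i) for i in range(n_index)]
    off += 8 * n_index
    forms = [struct.unpack_from("<II", blob, off + 8 * i) for i in range(n_forms)]
    return index, forms


def encoding_forms(index, forms, m):
    """Forms for mnemonic number m: a slice of the flat array."""
    start, count = index[m]
    return forms[start:start + count]
```

Looking up forms through a (start, count) range keeps the loaded data as two
flat arrays, which is the property that lets a table be stored as a blob
instead of compiled as a nested literal.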