core:rexcode

This commit is contained in:
gingerBill
2026-06-14 16:30:18 +01:00
parent 4b482366c1
commit d6ae77b67e
194 changed files with 107075 additions and 0 deletions


@@ -0,0 +1,469 @@
# rexcode — Cross-Architecture API Design
> How to grow rexcode from an x86-only encoder/decoder into a multi-target
> library (x86, RISC-V, ARM64, MIPS, …) **without** flattening every
> architecture to a lowest common denominator and **without** adding
> runtime overhead to the single-target hot path.
>
> Companion to [x86_api.md](x86_api.md). Written ahead of the RISC-V
> subpackage.
---
## 0. The guiding principle
> **Share the bookkeeping, specialize the bytes.**

An encoder/decoder is two things stitched together:
1. **Orchestration & bookkeeping** — labels, relocations, the two-pass
encode/decode loops, error/result reporting, the print framework,
buffer management, the table-gen tooling pattern. This is *the same
problem on every ISA* and should be written once.
2. **The instruction model & the bytes** — what a register/memory/operand
*is*, what the encoding tables look like, and the actual
bit/byte-twiddling of `encode_one`/`decode_one`. This is *irreducibly
per-architecture* and must stay native and zero-cost.
Every decision below follows from drawing the line in exactly that place.
We do **not** try to invent one `Instruction` type that fits all ISAs —
that path forces x86's `segment`/SIB and ARM's writeback and RISC-V's
split immediates into one bloated struct, and it is precisely the
"compromise performance/effectiveness" outcome to avoid. Instead, each
arch owns its concrete types, and uniformity comes from a **naming
contract** (§6) plus a small **shared core** (§4) plus **opt-in**
generic glue (§5, §7).
---
## 1. The universal shape
Strip away the x86 specifics and every target needs the same nine things:
| # | Concept | Example in x86 |
|---|---|---|
| 1 | A **register** = (class, hw number, size) | `Register` distinct u16 |
| 2 | **Operands** tagged reg / mem / imm / relative | `Operand` + `Operand_Kind` |
| 3 | An **instruction** = mnemonic + operands + flags | `Instruction` |
| 4 | A **mnemonic** enum | `Mnemonic` (u16, INVALID=0) |
| 5 | **Labels** + forward refs + named labels | `Label_Definition`, `Label_Map` |
| 6 | **Relocations** left over after local resolution | `Relocation` |
| 7 | `encode([]Inst) -> bytes (+relocs +errors)` | `encode()` |
| 8 | `decode(bytes) -> []Inst (+info +labels +errors)` | `decode()` |
| 9 | `print([]Inst) -> text (+tokens)` | `print()`/`tprint()`/… |
Plus two cross-cutting concerns: **errors/result** reporting and a
**table-driven core** fed by **codegen tooling**.
The *shape* of items 5–9 (their signatures and the types they pass around)
is architecture-independent. That is the surface we standardize.
---
## 2. Where architectures actually diverge
This is the heart of the analysis. Ranked from "diverges hardest" to
"barely diverges."
### 2.1 Encoding mechanics — **maximal divergence**
| ISA | Width | Mechanism |
|---|---|---|
| x86 | 1–15 B, variable | legacy prefixes → REX/VEX/EVEX → escape → opcode → ModRM → SIB → disp → imm |
| RISC-V | 4 B (2 B for "C") | pack fixed bitfields; ~6 formats (R/I/S/B/U/J) |
| ARM64 | 4 B fixed | pack per-class bitfields; many classes; bitmask-imm encoder |
| MIPS | 4 B fixed | 3 formats (R/I/J), very regular |
`encode()`'s ~500-line body and the whole `Encoding`/`Encoding_Flags`
schema (esc/prefix/vex_*) are **x86-only**. RISC-V's `encode_one` is a
dozen lines of shifts. **Conclusion: the `encode_one`/`decode_one` core
and the `Encoding` struct do not generalize — but the loop that drives
them does (§7).**
### 2.2 Memory addressing — **high divergence**
| ISA | Addressing modes |
|---|---|
| x86 | `[base + index*scale + disp32]`, RIP-relative, segment override, addr-size override |
| RISC-V | `disp12(base)` only — no index, no scale |
| MIPS | `imm16(base)` only |
| ARM64 | `[base]`, `[base,#imm]`, `[base,Xm{,LSL#n}]`, `[base,Wm,SXTW]`, pre/post-index `[base,#imm]!` / `[base],#imm`, PC-rel literal |
The x86 `Memory` bit_field (with `segment`, `addr_size_override`,
index+scale) is deeply x86-flavored. RISC-V's memory is `{base, i32 disp}`.
ARM adds **writeback** (a mode x86 cannot express) and extend/shift on the
index. **Conclusion: `Memory` is per-arch.** What generalizes is only the
*role*: a `MEMORY`-kind operand carrying an arch-defined payload.
### 2.3 Immediates & operand size — **moderate divergence**
- The *value* (an `i64`) generalizes perfectly.
- The *encoding* does not: RISC-V scatters immediate bits across fields
(B-type, J-type) and shifts them; ARM has bitmask-immediate and shifted
forms. All of that lives inside `encode_one`; the `Operand` just holds
the clean value.
- **Size association differs:** x86 carries an explicit `size: u8` and
uses it to select an encoding; RISC-V/ARM bake width into the mnemonic
(`LW` vs `LD`, `W0` vs `X0`). Keep `size` in the shared operand shape as
a *carrier*; let each arch decide how much it matters.
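As a minimal sketch of "the operand holds the clean value, `encode_one` scatters the bits", the block below packs a signed branch offset into the RISC-V B-type immediate fields. The name `scatter_b_imm` is hypothetical and not part of rexcode; the bit positions follow the standard B-type format (imm[12] at bit 31, imm[10:5] at bits 30:25, imm[4:1] at bits 11:8, imm[11] at bit 7).

```odin
package main

import "core:fmt"

// Hypothetical helper: scatter a signed, even branch offset into the
// B-type immediate bit positions. ok is false if the offset is odd or
// outside the 13-bit signed range [-4096, 4094].
scatter_b_imm :: proc(offset: i32) -> (bits: u32, ok: bool) {
    if (offset & 1) != 0 || offset < -4096 || offset > 4094 {
        return 0, false
    }
    u := u32(offset)
    bits |= ((u >> 12) & 0x01) << 31 // imm[12]
    bits |= ((u >> 5) & 0x3F) << 25  // imm[10:5]
    bits |= ((u >> 1) & 0x0F) << 8   // imm[4:1]
    bits |= ((u >> 11) & 0x01) << 7  // imm[11]
    return bits, true
}

main :: proc() {
    BEQ_OPCODE :: 0x63 // BRANCH major opcode, funct3 = 0 for BEQ

    imm, ok := scatter_b_imm(8)
    assert(ok)

    // beq x1, x2, +8: rs2 in bits 24:20, rs1 in bits 19:15
    word := imm | (2 << 20) | (1 << 15) | BEQ_OPCODE
    assert(word == 0x00208463)
    fmt.printf("%x\n", word)
}
```

The caller only ever sees the offset `8`; the scatter stays private to the arch package, which is the property this section relies on.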
### 2.4 Relocations — **moderate divergence (structurally aligned)**
The `Relocation` *struct* (offset, symbol/label, addend, type, size)
mirrors ELF `rela` and is universal. The *type enum* is per-arch and much
larger on RISC-V (paired `PCREL_HI20`/`PCREL_LO12`, `CALL`, `BRANCH`,
`JAL`, `HI20`, `LO12_I/S`, …) because PC-relative addressing needs
instruction *pairs* (AUIPC+ADDI). **Conclusion: share the struct shape,
make the type enum a per-arch parameter.**
### 2.5 Registers — **low/structural divergence**
The `(class, hw_number)`-packed `distinct u16` scheme generalizes well.
What differs:
- x86: REX/EVEX extension bits, AH↔SPL aliasing, RIP pseudo-reg.
- RISC-V: clean 5-bit fields, `x0`=hardwired zero, ABI names
(`zero/ra/sp/gp/tp/t0../s0../a0..`), separate `f`/`v` files.
- ARM64: reg #31 means **SP or XZR depending on instruction** (a
decode/print-time disambiguation x86 never needs); `w`/`x` and
`b/h/s/d/q` views.
**Conclusion: share the *layout convention* + `reg_hw`/`reg_class`
accessors; per-arch owns classes, enums, names, and extension semantics.**
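A minimal sketch of the shared layout convention follows, assuming the class sits in the high byte and the hardware number in the low 5 bits (as in the x86 reference). `reg_make` and the `REG_X` value are hypothetical, and arch-specific extension bits are deliberately left out.

```odin
package main

import "core:fmt"

// Assumed layout for this sketch: class in the high byte, hardware number
// in the low 5 bits. x86-style extension bits are not modelled here.
Register :: distinct u16

REG_X :: 1 // hypothetical class id for RISC-V integer registers

// Hypothetical constructor; not part of the documented API.
reg_make :: proc "contextless" (class: u16, hw: u8) -> Register {
    return Register((class << 8) | u16(hw & 0x1F))
}

reg_hw :: proc "contextless" (r: Register) -> u8 {
    return u8(u16(r) & 0x1F)
}

reg_class :: proc "contextless" (r: Register) -> u16 {
    return u16(r) >> 8
}

main :: proc() {
    sp := reg_make(REG_X, 2) // x2 is the RISC-V stack pointer
    assert(reg_hw(sp) == 2)
    assert(reg_class(sp) == REG_X)
    fmt.println(reg_hw(sp), reg_class(sp))
}
```

Both accessors are a mask or a shift, so sharing them across arches costs nothing at runtime.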
### 2.6 Mnemonics — **content differs, shape identical**
Per-arch `enum u16`, `INVALID=0`. Nothing to share but the convention.
### 2.7 Labels — **no divergence**
`labels.odin` is pure bookkeeping. The array-index model
(`Label_Definition`, `label`, `label_forward`, `label_set_at`,
`Label_Map`, `label_named`, `label_reserve`, `label_set`) lives in
`isa/labels.odin` and is parametric over the Instruction type. **Fully
shared.** Each arch's `encode()` rewrites label_defs from instruction
indices to byte offsets between pass 1 and pass 2.
### 2.8 Errors / Result — **low divergence**
`Result` is universal. `Error` is universal in shape. `Error_Code` splits
into a **shared core** (`NONE, BUFFER_OVERFLOW, INVALID_MNEMONIC,
NO_MATCHING_ENCODING, BUFFER_TOO_SHORT, INVALID_OPCODE, LABEL_OUT_OF_RANGE,
…`) and **arch-specific** extras (`INVALID_MODRM/SIB/VEX/EVEX,
TOO_MANY_PREFIXES` on x86; RISC-V would add `MISALIGNED_IMMEDIATE`,
`INVALID_ROUNDING_MODE`, …).
### 2.9 Printer — **framework universal, formatting per-arch**
Shareable: `Token`, `Token_Kind` (the kinds are generic), `Print_Options`,
the builder/number-formatting helpers, and the whole family of output
sinks (`sbprint/print/aprint/tprint/bprint/fprint/wprint` + `ln`). Per-arch:
`register_name`, `print_memory` (syntax differs wildly),
`mnemonic_to_string`, and the size-suffix convention (x86's `.b/.w/.d` is
x86-only; RISC-V puts width in the mnemonic).
### Divergence summary
| Component | Verdict | What's shared | What's per-arch |
|---|---|---|---|
| Labels | ✅ shared | everything | — |
| Result / Error struct | ✅ shared | struct shapes | error-code extras |
| Relocation struct | ✅ shared | struct shape | type enum |
| Printer framework | ◑ split | tokens, options, sinks, num-fmt | reg/mem/mnemonic formatting |
| Register scheme | ◑ split | layout + `reg_hw`/`reg_class` | classes, enums, names, ext bits |
| Operand model | ◑ split | kind tag + union discipline + `size` carrier | `Memory`, `flags` payloads |
| Encode/decode **driver** | ◑ shared via generics | two-pass loops, label/reloc resolution | the per-instruction hook |
| `Instruction` | ✗ per-arch | shape convention only | concrete struct |
| `Mnemonic` | ✗ per-arch | convention (u16, INVALID=0) | the enum |
| `Encoding` + tables | ✗ per-arch | codegen *pattern* | schema + data |
| `encode_one`/`decode_one` | ✗ per-arch | nothing | all of it |
| Memory addressing | ✗ per-arch | operand *role* | the model |
---
## 3. Why not the "obvious" unifications
Three tempting designs that **violate** the no-compromise rule:
1. **One universal `Operand`/`Memory` for all ISAs.** Forces the union of
x86 SIB+segment, ARM writeback+extend, and RISC-V's nothing into a
single struct. Bloats every operand, leaks `segment` into RISC-V, and
still can't represent ARM writeback cleanly. ✗
2. **A runtime `interface`/vtable the encoder calls per instruction.**
Adds an indirect call to the hottest loop (x86 does ~17 M inst/s — a
per-instruction `proc` pointer is a measurable tax) and defeats
inlining. ✗ on the default path.
3. **`any`/tagged-union `Instruction` passed through a generic `encode`.**
Same monomorphization loss + runtime type checks in the hot loop. ✗
The design instead gets uniformity from **compile-time** mechanisms
(naming contract + parametric polymorphism), and reserves runtime dispatch
for an **opt-in** facade (§5.3) that only multi-target *tools* pay for.
---
## 4. Proposed package layout
```
rexcode/
  isa/                  # shared, architecture-independent core
    labels.odin         # Label, Label_Definition, Label_Map, resolution
    reloc.odin          # Relocation (type field is generic/u8)
    status.odin         # Result, Error, shared Error_Code core
    print.odin          # Token, Token_Kind, Print_Options, sinks, num-fmt
    register.odin       # distinct-u16 layout convention + reg_hw/reg_class
    pipeline.odin       # parametric encode_stream/decode_stream (§7)
    target.odin         # optional runtime Target vtable (§5.3)
  x86/                  # exists today; refactor to import isa
    registers.odin  operands.odin  instructions.odin  mnemonics.odin
    encoding_types.odin  encoder.odin  decoder.odin  printer.odin
    encoding_table.odin  decoding_tables.odin  mnemonic_builders.odin
    tests/  tools/
  riscv/                # next: same shape as x86/
    registers.odin  operands.odin  instructions.odin  mnemonics.odin
    encoding_types.odin  encoder.odin  decoder.odin  printer.odin
    encoding_table.odin  decoding_tables.odin  mnemonic_builders.odin
    tests/  tools/
  arm64/  mips/  …      # future, same template
```
- **`isa` depends on nothing.** Each arch package depends on `isa` and
**re-exports** the shared types (e.g. `x86.Result`, `x86.Label_Map`)
so a consumer of `x86` sees one coherent namespace and never imports
`isa` directly unless writing arch-generic tooling.
- Each arch package is **self-contained** (its own tests/tools), matching
the move already done for x86.
---
## 5. Three layers of generality (pick per use case)
### 5.1 Layer A — direct single-arch use (default, zero overhead)
```odin
import "rexcode/x86"
code: [4096]u8
res := x86.encode(insts[:], labels[:], code[:], &relocs, &errors)
```
Fully static, fully inlined, exactly as fast as today. **99% of consumers
live here.**
### 5.2 Layer B — source-portable code via the naming contract
Because every arch package exposes the *same names with the same
signatures* (§6), code that only touches the shared vocabulary
(`Label_Map`, `encode`, `tprint`, `Result`, `Relocation`) can be written
against `import arch "rexcode/x86"` and re-pointed at `rexcode/riscv` by
changing one import — as long as the arch-specific operand construction is
isolated (e.g. behind your own per-arch helper). Still 100% compile-time,
zero overhead.
### 5.3 Layer C — runtime multi-target facade (opt-in, for tools)
For a disassembler or JIT that selects the arch *at runtime*, `isa`
provides a vtable populated by each arch:
```odin
// isa/target.odin
Target :: struct {
    name:       string,
    decode:     proc(data: []u8, out: ^Decoded) -> Result, // bytes → generic Decoded
    print:      proc(d: ^Decoded, opts: ^Print_Options) -> string,
    inst_align: u32, // 1 for x86, 4 for riscv/arm64/mips (2 for riscv with the C extension)
    max_inst:   u32, // 15 for x86, 4 for riscv (8 for C-pairs), 4 for arm64
}
// each arch: x86.TARGET: isa.Target = { … }
```
This boundary trades in **bytes and a generic `Decoded` view**, not the
concrete `Instruction`, so it never forces a unified instruction struct.
It carries a proc-pointer indirection — acceptable for a tool that has
already paid a `switch arch` somewhere, and never on Layer A's path.
---
## 6. The naming contract (the most important artifact)
Every architecture package **MUST** expose these names with these
signatures. This is what makes the family feel like one library and what
the RISC-V implementation is built against as a checklist.
### Types (concrete per arch, identical names)
```
Register Memory Operand Operand_Kind
Instruction Mnemonic Encoding Instruction_Info
```
### Re-exported shared types (from `isa`)
```
Label Label_Definition Label_Map LABEL_UNDEFINED
Relocation Relocation_Type Error Error_Code Result
Token Token_Kind Print_Options DEFAULT_PRINT_OPTIONS
```
### Operand constructors
```
op_reg(r) op_mem(m, size) op_imm(v, size) op_label(id, size)
mem_*(…) # arch-specific set; at minimum mem_base_disp
# (mem_base in x86 is an accessor, not a constructor;
# use mem_base_only for the no-displacement case)
op_<class>(typed) # typed safe constructors where the arch has classes
```
### Instruction builders & emitters
Builder names spell out each operand kind separated by underscores
(matches x86's existing convention):
```
inst_none / inst_r / inst_r_r / inst_r_i / inst_r_m / inst_m_r / …
emit_none / emit_r / emit_rr / emit_ri / emit_rm / emit_mr / …
# NB: emit_* uses concatenated suffixes (legacy x86 spelling)
inst_<mnemonic>(…) / emit_<mnemonic>(…) # generated typed overloads
```
### Entry points (identical signatures across arches)
```odin
encode(instructions: []Instruction, label_defs: []Label_Definition,
       code: []u8, relocs: ^[dynamic]Relocation, errors: ^[dynamic]Error,
       resolve := true, base_address: u64 = 0) -> Result

decode(data: []u8, relocs: []Relocation,
       instructions: ^[dynamic]Instruction, inst_info: ^[dynamic]Instruction_Info,
       label_defs: ^[dynamic]Label_Definition, errors: ^[dynamic]Error) -> Result

print/println/aprint/tprint/bprint/fprint/wprint(+ln)(
       instructions: []Instruction, inst_info: []Instruction_Info,
       label_defs: []Label_Definition, tokens=nil, options=nil, label_names=nil)
```
### Register/label/print helpers
```
reg_hw reg_class reg_size register_name mnemonic_to_string
label label_forward label_named label_reserve label_set
```
> Anything an arch genuinely lacks (e.g. RISC-V has no `mem_base_index`)
> is simply **absent**, not stubbed. Portable (Layer B) code stays within
> the intersection; arch-aware code uses the extras.
---
## 7. Zero-cost code reuse via parametric polymorphism
The encode/decode **drivers** are arch-independent control flow. Factor
them into `isa` as procedures generic over the instruction type `$I`,
parameterized by an arch-provided per-instruction hook. Odin monomorphizes
these at compile time → **no runtime cost, real code sharing.**
```odin
// isa/pipeline.odin (sketch)
encode_stream :: proc(
    instructions: []$I,
    label_defs:   []Label_Definition,
    code:         []u8,
    relocs:       ^[dynamic]Relocation,
    errors:       ^[dynamic]Error,
    encode_one:   proc(inst: ^I, out: []u8, code_pos: u32,
                       relocs: ^[dynamic]Relocation, errors: ^[dynamic]Error) -> (n: u32, ok: bool),
    resolve := true, base_address: u64 = 0,
) -> Result {
    // PASS 1:   for each inst → record offset, call encode_one, advance
    // PASS 1.5: rewrite label_defs inst-index → byte-offset (identical on every arch)
    // PASS 2:   resolve relocations / patch / spill unresolved (identical on every arch)
}
```
x86's current `encode()` becomes a thin wrapper that passes its
`encode_one` (the prefix/ModRM/SIB body); RISC-V's wrapper passes its
12-line bitfield packer. The label/relocation machinery — the part that's
easy to get subtly wrong — is written and tested **once**.
Caveats (arch-specific passes that stay out of the shared driver):
- **RISC-V pseudo-ops** (`li`, `call`, `la`, `j`) expand to 1–2 real
instructions; they need an arch pre-lowering pass.
- **Branch relaxation** (short↔long form) is arch-specific.
- **ARM literal pools / constant islands** are an extra emission phase.
These plug in *around* the shared driver, not inside it.
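The monomorphized-driver idea can be tried in isolation. The block below is a stripped-down sketch, not the proposed `isa` API: labels, relocations and error lists are omitted, and `Toy_Inst` and `toy_encode_one` are hypothetical stand-ins for an arch package's instruction type and per-instruction hook.

```odin
package main

import "core:fmt"

// Stripped-down driver, generic over the instruction type I. It only walks
// the stream and advances the write cursor; the hook writes the bytes.
encode_stream :: proc(
    instructions: []$I,
    code:         []u8,
    encode_one:   proc(inst: ^I, out: []u8) -> (n: u32, ok: bool),
) -> (written: int, ok: bool) {
    for &inst in instructions {
        n, one_ok := encode_one(&inst, code[written:])
        if !one_ok {
            return written, false
        }
        written += int(n)
    }
    return written, true
}

// Hypothetical fixed-width arch: an instruction is already a 32-bit word.
Toy_Inst :: struct {
    word: u32,
}

toy_encode_one :: proc(inst: ^Toy_Inst, out: []u8) -> (n: u32, ok: bool) {
    if len(out) < 4 {
        return 0, false
    }
    // Little-endian store of the instruction word.
    out[0] = u8(inst.word)
    out[1] = u8(inst.word >> 8)
    out[2] = u8(inst.word >> 16)
    out[3] = u8(inst.word >> 24)
    return 4, true
}

main :: proc() {
    insts := [2]Toy_Inst{{0x00208463}, {0x00000013}}
    code: [16]u8
    written, ok := encode_stream(insts[:], code[:], toy_encode_one)
    assert(ok && written == 8)
    fmt.println(code[:written])
}
```

A second arch would reuse `encode_stream` unchanged and pass its own hook; the pre-lowering and relaxation passes listed above would run before or after this call rather than inside it.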
---
## 8. Concrete RISC-V mapping (RV64GC as the first target)
What each contract item becomes, to validate the design before coding:
| Contract item | RISC-V realization |
|---|---|
| `Register` | `distinct u16`, classes `REG_X` (x0–31), `REG_F` (f0–31), `REG_V` (v0–31). No REX/EVEX bits. `x0` semantic = zero. |
| typed enums | `XREG{ZERO,RA,SP,GP,TP,T0,T1,T2,S0,S1,A0..A7,S2..S11,T3..T6}`, `FREG`, `VREG` |
| `Memory` | `struct { base: Register, disp: i32 }` — no index/scale/segment |
| `mem_*` | `mem_base_only(base)`, `mem_base_disp(base, disp)` only (per §6, `mem_base` is reserved for the accessor) |
| `Operand` | same kind-tagged shape; `size` mostly informational (width is in the mnemonic) |
| `Mnemonic` | `enum u16` — RV32I/64I + M,A,F,D,C,V (`ADDI, LW, LD, BEQ, JAL, AUIPC, FADD_D, …`) |
| `Encoding` | `struct { format: Format, opcode, funct3, funct7: u8, … }`, `Format{R,I,S,B,U,J,R4,…}` |
| `encode_one` | switch on `format`, pack fields, scatter immediate bits |
| `Encoding_Flags` | tiny (e.g. `is_compressible`, `rounding_ok`) vs x86's 11 fields |
| `Relocation_Type` | `R_RISCV_BRANCH, JAL, CALL, PCREL_HI20, PCREL_LO12_I/S, HI20, LO12_I/S, RVC_BRANCH/JUMP, …` |
| `Instruction_Info` | `offset`, `is_compressed: bool`, rounding mode — no prefix/VEX fields |
| printer | `register_name` uses ABI names; `print_memory` emits `disp(base)`; width lives in the mnemonic (no `.b/.w` suffix) |
| tables | `gen_decode_tables` becomes near-trivial: a fixed-field instruction decodes by `(opcode, funct3, funct7)` keys |
| `MAX_INST_SIZE` | `4` (or `8` to cover a compressed pair); `inst_align` = 2 |
Notable RISC-V-only concerns the design already accommodates:
- **Split immediates** → hidden in `encode_one`; operand stays a clean value.
- **Paired PC-relative relocs** (AUIPC+ADDI) → expressed via the shared
`Relocation` struct with RISC-V's type enum; resolution of the *pair* is
a RISC-V detail layered on the shared reloc list.
- **Compressed (C) extension** → variable 2/4-byte width handled by
`decode_one` returning a length, exactly like x86's variable length —
the shared decode driver already threads instruction length.
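The length a RISC-V `decode_one` would report can be derived from the first byte alone. The sketch below uses the standard rule that the two low bits differ from `0b11` for 16-bit compressed encodings; `rv_inst_length` is a hypothetical name, and encodings longer than 32 bits are ignored here.

```odin
package main

import "core:fmt"

// Hypothetical helper: byte length of the instruction at the start of data.
// Encodings wider than 32 bits are not handled in this sketch.
rv_inst_length :: proc(data: []u8) -> (n: u32, ok: bool) {
    if len(data) < 2 {
        return 0, false
    }
    if (data[0] & 0b11) != 0b11 {
        return 2, true // compressed (C) encoding
    }
    if len(data) < 4 {
        return 0, false
    }
    return 4, true
}

main :: proc() {
    c_nop := [?]u8{0x01, 0x00}             // c.nop
    nop   := [?]u8{0x13, 0x00, 0x00, 0x00} // addi x0, x0, 0

    n1, ok1 := rv_inst_length(c_nop[:])
    n2, ok2 := rv_inst_length(nop[:])
    assert(ok1 && n1 == 2)
    assert(ok2 && n2 == 4)
    fmt.println(n1, n2)
}
```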
If RISC-V slots cleanly into the contract (it does above), the contract is
sound for the regular fixed-width ISAs (ARM64, MIPS) too.
---
## 9. Recommended next steps
1. **Stabilize x86 first.** Resolve the constructor-rename drift noted in
[x86_api.md](x86_api.md#known-drift) (tests/README vs `operands.odin`)
so x86 is the clean reference the contract is extracted from.
2. **Extract `isa`** by lifting the *already-arch-independent* files:
`labels.odin`, the `Relocation`/`Error`/`Result` types, and the printer
framework (tokens/options/sinks/number-formatting). Make `x86`
re-export them. This is a low-risk refactor that proves the split.
3. **Add the parametric `encode_stream`/`decode_stream`** to `isa` and
reduce x86's `encode`/`decode` to wrappers. Validate against the
existing test suite (same bytes out).
4. **Write the RISC-V package against the contract** (§6) and the mapping
(§8), reusing `isa` wholesale. Build its `encoding_table.odin` by hand,
then port the two generators.
5. **Only if a runtime-multi-target tool appears**, add the `Target`
vtable (§5.3). Don't build it speculatively.
The deliverable order matters: every step is independently shippable, and
x86 keeps working (and keeps its performance) throughout.
---
## 10. One-paragraph summary
Make `isa` own the parts that are the same on every ISA — labels,
relocations, errors/result, the print framework, and (via Odin
parametric polymorphism) the encode/decode driver loops. Make each arch
package own its registers, memory model, operands, mnemonics, encoding
tables, and the actual `encode_one`/`decode_one` bytes. Bind the family
together with a strict **naming contract** so packages are drop-in
swappable at source level with zero runtime cost, and reserve a single
opt-in runtime `Target` vtable for the rare tool that needs to choose an
architecture dynamically. x86 keeps every cycle of its current
performance; RISC-V (and later ARM/MIPS) gets the boring 60% for free and
writes only the 40% that is genuinely its own.


@@ -0,0 +1,79 @@
# MIPS targets and extensions — platform catalog
> What's worth supporting in `rexcode/mips/` (or a sibling subpackage) and
> what isn't, framed around the actual hardware that runs MIPS.
## Mainline consoles (MIPS-family CPUs)
| Platform | CPU | Base ISA | Custom extension | Status |
|---|---|---|---|---|
| **PS1 / PSX** | Sony R3000A | MIPS I (no MMU) | **GTE** (COP2) — geometry transformation engine | ✅ done |
| **PSX IOP / PS3 IOP** | LSI CW33300 / "IOP" | MIPS I | (none — same as PS1 CPU) | ✅ covered by MIPS I |
| **N64** | NEC VR4300i | MIPS III + partial MIPS IV FPU | none on main CPU | ✅ covered by MIPS III + IV + FPU |
| **N64 RSP** | RCP "Reality Signal Processor" | custom MIPS R4000 subset | **VU** (128-bit vector unit, 32 vec regs); also drops mult/div/FPU/TLB | ⚠ **needs its own subpackage** — different ISA |
| **N64 RDP** | (display processor) | not a CPU, command-stream — not in scope | | |
| **PS2 EE** | Sony R5900 (Toshiba) | MIPS III + MIPS IV (MOVN/MOVZ) | **MMI** (128-bit packed SIMD via MMI0-3), **LQ/SQ**, second HI/LO, VU0-macro | ✅ done (MMI; VU0-macro forms TBD) |
| **PS2 VU0 / VU1** | "Vector Unit" | not MIPS — VLIW pair (upper + lower microcode) | — | 🚧 **separate ISA** — sibling `vu/` subpackage if needed |
| **PS2 IOP** | (R3000A reused) | MIPS I | — | ✅ covered |
| **PSP** | Sony "Allegrex" | MIPS32 R2 (+ R2 bitfield + rotates + SEB/SEH + BITREV) | **VFPU** (vector FPU, 128 32-bit regs in 8×4×4 matrices), Allegrex-specific BITREV/etc. | ⚠ Mnemonics enumerated, encodings TBD |
| **PSP Media Engine** | (second Allegrex) | same as Allegrex | same VFPU | (covered when PSP CPU is) |
| **PSV / Vita PS1-mode** | Cortex-A9 emulating R3000 | — (host is ARM) | — | |
## Arcade and other
| Platform | CPU | Base ISA | Extension | Status |
|---|---|---|---|---|
| **SNK Hyper Neo Geo 64** | NEC VR4300 | MIPS III | none | ✅ covered |
| **Konami Hornet** (arcade) | various | MIPS-family | none | ✅ covered |
| **Sega Model 3** step 1.x | MIPS — IDT R5000 | MIPS IV | none | ✅ covered |
## Modern / embedded MIPS with vendor extensions
| Platform | CPU | Base | Extension | Status |
|---|---|---|---|---|
| **Ingenic XBurst** (Jz47xx) — old MP3/Android handhelds | XBurst | MIPS32 R2 | **MXU** (Multimedia Unit, custom SIMD), DSP ASE | 🚧 DSP enumerated, **MXU is XBurst-only** — defer |
| **Broadcom MIPS** (older routers) | bcm473x / bcm63xx | MIPS32 R2/R5 | DSP ASE common | DSP enumerated; encodings TBD |
| **Atheros / Qualcomm** (router SoC) | MIPS32 R2 | MIPS32 R2 | DSP common | as above |
| **MediaTek MIPS** (older routers) | MIPS32 R2 | MIPS32 R2 | DSP | as above |
| **Loongson 2/3** (China desktop) | Loongson | MIPS64 + custom | **Loongson MMI** (note: different from PS2 MMI!), **LSX** (128-bit), **LASX** (256-bit). Modern Loongson uses LoongArch instead. | 🚧 niche, defer |
| **Microchip PIC32** | MIPS M4K / microAptiv | MIPS32 R1/R2 + microMIPS | none | ✅ covered (microMIPS not in scope) |
| **Cavium Octeon** (server) | OCTEON | MIPS64 R2 | **OCTEON specific** (crypto, packet) | defer |
## Workstations (historical)
| Vendor | CPU | ISA | Notes |
|---|---|---|---|
| SGI Indy/Indigo/Octane/Origin | R4000/R5000/R8000/R10000/R12000/R14000 | MIPS III–IV | stock MIPS; ✅ covered |
| DECstation | R3000 / R4000 | MIPS I–III | ✅ covered |
| Various Unix workstations | MIPS family | various | ✅ covered |
## **NOT** MIPS (mentioned because users sometimes ask)
- **GBA / DS / 3DS / Switch** — ARM. Out of scope for `mips/`.
- **Sega Saturn** — dual SH-2. **Dreamcast** — SH-4. Not MIPS.
- **3DO** — ARM60. Not MIPS.
- **Atari Jaguar** — 68k + custom Tom/Jerry RISCs. Not MIPS.
- **Apple PowerBook / Macintosh** — PowerPC / Motorola 68k. Not MIPS.
- **Sega Genesis / Mega Drive** — 68000. **Sega 32X** — SH-2. **Sega CD** — 68k. Not MIPS.
## Recommended priority for `rexcode`
Given typical demand (emulation, decompiling old console games, romhacking, RE):
1. **What's done is the bulk of console value:** PS1, PS2, N64 main CPU, FPU, COP0.
2. **N64 RSP** — high value for N64 emulation/microcode work. Should be `rexcode/rsp/` (separate ISA — see below).
3. **PSP VFPU encodings** — high value for PSP emulation, completes the Allegrex story. Stays inside `mips/`.
4. **DSP ASE encodings** — useful for modern router/embedded reversing. Stays inside `mips/`.
5. **PS2 VU microcode** — distinct from MIPS (VLIW). Worth `rexcode/vu/` only if a real consumer appears.
6. **MSA encodings** — modern MIPS only; some Linux distros for MIPS workstations. Lower priority.
7. **Loongson / Octeon / MXU** — defer until someone needs them.
## Why N64 RSP wants its own subpackage
The RSP is a **subset** of MIPS (no MULT/DIV/FPU/TLB; no doubleword ops) **plus** a heavily custom COP2 vector unit. Trying to share `mips/` with it would mean:
- The shared Mnemonic enum picks up ~60 RSP-only vector ops (VMULF/VMACF/VADDC/VCH/VCL/VCR/VRCP/VRCPL/VRSQ/VRSQL/VRNDP/VRNDN/...) plus vector load/store variants (LBV/LSV/LDV/LQV/LRV/LPV/LUV/LHV/LFV/LWV/LTV + their store equivalents). Polluting the MIPS namespace.
- The RSP's COP2 encoding *collides* with PS1 GTE bit patterns (both use op=0x12 with the CO bit) so a single decode table can't disambiguate without an ISA gate.
- The RSP's vector loads encode element offset + size in the cofun bits in ways that have no MIPS analogue.
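The collision in the second bullet can be illustrated with a short sketch. It assumes only what that bullet states (primary opcode `0x12` with the CO bit, bit 25, set); the `ISA` enum and `classify` are hypothetical names for whatever gate a decoder would use.

```odin
package main

import "core:fmt"

// Hypothetical ISA gate; not part of rexcode.
ISA :: enum {
    PS1_GTE,
    N64_RSP,
}

// True when the word is a COP2 instruction with the CO bit set.
is_cop2_co :: proc(word: u32) -> bool {
    return (word >> 26) == 0x12 && ((word >> 25) & 1) == 1
}

// The bit pattern alone cannot say which coprocessor is meant,
// so the caller has to supply the ISA.
classify :: proc(word: u32, isa: ISA) -> string {
    if !is_cop2_co(word) {
        return "not COP2/CO"
    }
    switch isa {
    case .PS1_GTE:
        return "GTE command"
    case .N64_RSP:
        return "RSP vector op"
    }
    return "unknown"
}

main :: proc() {
    word: u32 = 0x4A000000 // op = 0x12, CO = 1, remaining bits zero
    assert(is_cop2_co(word))
    fmt.println(classify(word, .PS1_GTE))
    fmt.println(classify(word, .N64_RSP))
}
```

The same word yields two different answers, which is why a single shared decode table would need the gate threaded through every COP2 lookup.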
Cleaner: `rexcode/rsp/` as a sibling subpackage. It will reuse `isa/` (labels, relocs, errors, print framework) and parallel `mips/`'s shape (registers / operands / instructions / mnemonics / encoding_table / encoder / decoder / printer). Users targeting N64 import either `mips` (for the R4300 main CPU) or `rsp` (for RSP microcode) — or both, side-by-side.


@@ -0,0 +1,518 @@
# rexcode `x86` — Complete API Extraction
> Snapshot of the entire public surface of the `x86` subpackage
> (`rexcode/x86/`), grouped by module. This is the reference the
> cross-architecture design ([cross_arch_design.md](cross_arch_design.md))
> is built against.

The package is **table-driven**: a hand-written master encoding table
(`ENCODING_TABLE`) is the single source of truth, from which the decode
tables and the typed builder procedures are *generated*. The runtime is
zero-allocation (caller owns every buffer) and the hot paths are fully
inlined.
```
                 ENCODING_TABLE  (hand-written, source of truth)
        ┌───────────────┼────────────────┐
 gen_decode_tables               gen_mnemonic_builders
        │                                │
 decoding_tables.odin            mnemonic_builders.odin
 (decode() reads these)          (typed inst_*/emit_* helpers)
```
Pipeline at a glance:
```
[]Instruction ──encode()──▶ []u8 (+ []Relocation, []Error)
      ▲                       │
      │                       ▼
   builders                decode()
      │                       │
 inst_*/emit_*                ▼
              []Instruction + []Instruction_Info + []Label_Definition

print()/tprint()/… ──▶ text (+ []Token)
```
---
## 1. Registers (`registers.odin`)
### Core type
```odin
Register :: distinct u16  // bit layout: 0b_0000_CCCC_EEEN_NNNN
                          //   NNNNN = hardware register number (0–31)
                          //   E     = needs REX/VEX .B/.R/.X extension (hw >= 8)
                          //   EE    = needs EVEX (hw 16–31)
                          //   CCCC  = register class (high byte)
```
### Class constants (high byte)
`REG_NONE`, `REG_GPR64`, `REG_GPR32`, `REG_GPR16`, `REG_GPR8`, `REG_GPR8H`
(legacy AH/CH/DH/BH), `REG_XMM`, `REG_YMM`, `REG_ZMM`, `REG_K` (opmask),
`REG_SEG`, `REG_CR` (control), `REG_DR` (debug), `REG_BND` (MPX), `REG_MM`
(MMX), `REG_ST` (x87).
### Sentinels
`NONE :: Register(0xFFFF)`, `RIP :: Register(0xFFFE)`.
### Typed register enums (compile-time safety, value == hardware number)
`GPR64`, `GPR32`, `GPR16`, `GPR8`, `GPR8H` (`AH=4..BH=7`), `XMM`, `YMM`,
`ZMM` (each 0–31), `KREG` (K0–K7), `SREG` (ES,CS,SS,DS,FS,GS), `MM`
(MM0–7), `CREG` (CR0,2,3,4,8), `DREG` (DR0–3,6,7), `ST` (ST0–7), `BND`
(BND0–3).
### Named register constants
Every register has a package-level constant: `RAX`–`R15`, `EAX`–`R15D`,
`AX`–`R15W`, `AL`–`R15B`, `AH/CH/DH/BH`, `XMM0`–`XMM31`, `YMM0`–`YMM31`,
`ZMM0`–`ZMM31`, `K0`–`K7`, `ES/CS/SS/DS/FS/GS`, `CR0/2/3/4/8`,
`DR0/1/2/3/6/7`, `BND0`–`BND3`, `MM0`–`MM7`, `ST0`–`ST7`, plus `RIP`.
### Utility functions (all branchless, `contextless`)
| Proc | Signature | Purpose |
|---|---|---|
| `reg_hw` | `(Register) -> u8` | hardware number (low 5 bits) |
| `reg_class` | `(Register) -> u16` | class (high byte) |
| `reg_needs_rex` | `(Register) -> bool` | hw >= 8 |
| `reg_needs_rex_ext` | `(Register) -> bool` | hw >= 8 and class < K |
| `reg_needs_evex` | `(Register) -> bool` | hw >= 16 |
| `reg_is_gpr` | `(Register) -> bool` | any GPR class |
| `reg_is_vector` | `(Register) -> bool` | XMM/YMM/ZMM |
| `reg_is_high_byte` | `(Register) -> bool` | AH/CH/DH/BH |
| `reg_size` | `(Register) -> u16` | size in **bits** |
### Register-from-number constructors
`gpr64_from_num`, `gpr32_from_num`, `gpr16_from_num` `(u8) -> Register`;
`gpr8_from_num(num: u8, has_rex: bool) -> Register` (handles AH↔SPL
aliasing); `xmm_from_num`, `ymm_from_num`, `zmm_from_num`,
`mm_from_num`. Each returns `NONE` if out of range. Pure casts, no table.
---
## 2. Operands (`operands.odin`)
### Operand kind
```odin
Operand_Kind :: enum u8 { NONE, REGISTER, MEMORY, IMMEDIATE, RELATIVE }
```
### Memory operand (packed)
```odin
Memory :: bit_field u64 {
    base_hw:            u8   | 5,
    base_ext:           bool | 1,
    index_hw:           u8   | 5,
    index_ext:          bool | 1,
    scale_enc:          u8   | 2,
    displacement:       i32  | 32,
    segment:            u8   | 3,
    addr_size_override: bool | 1,
    base_class:         u8   | 5,
    index_class:        u8   | 5,
}
MEM_BASE_RIP   :: 30
MEM_BASE_NONE  :: 31
MEM_INDEX_NONE :: 31
```
**Constructor:** `mem_make(base, index: Register, scale: u8, displacement: i32, segment: Register) -> Memory`
**Convenience constructors** (current names after the in-tree refactor):
`mem_base_only(base)`, `mem_base_disp(base, disp)`,
`mem_base_index(base, index, scale)`,
`mem_base_index_disp(base, index, scale, disp)`, `mem_rip_disp(disp)`.
> ⚠️ The README and `tests/test.odin` still use the *old* names
> (`mem_base`, `mem_base_displacement`, `mem_base_index_displacement`,
> `mem_rip_relative`). `mem_base` is now an **accessor**, not a
> constructor. See the "Known drift" note at the end.
**Accessors:** `mem_scale`, `mem_is_rip_relative`, `mem_has_base`,
`mem_has_index` `(Memory) -> …`; `mem_base`, `mem_index` `(Memory) -> Register`.
### The unified operand
```odin
Operand :: struct #packed {          // 16 bytes
    using _: struct #raw_union {
        reg:       Register,
        mem:       Memory,
        immediate: i64,
        relative:  i64,              // offset or label id
    },
    kind:  Operand_Kind,
    size:  u8,                       // operand size in bytes (1,2,4,8,16,32,64)
    flags: Operand_Flags,
    _pad:  [4]u8,
}
Broadcast :: enum u8 { NONE, B1TO2, B1TO4, B1TO8, B1TO16 } // EVEX
Operand_Flags :: bit_field u16 {     // EVEX-specific
    mask:      u8        | 3,        // opmask K1–K7
    zeroing:   bool      | 1,        // merge vs zero masking
    broadcast: Broadcast | 3,
    er_sae:    u8        | 2,        // embedded rounding / SAE
}
```
### Generic operand constructors
`op_reg(r)`, `op_mem(m, size)`, `op_mem_from_parts(base, index, scale, disp, size)`,
`op_imm8/16/32/64(v)`, `op_rel8/32(offset)`, `op_label(label_id, size=4)`.
### Typed operand constructors (compile-time class safety)
`op_gpr64`, `op_gpr32`, `op_gpr16`, `op_gpr8`, `op_gpr8h`, `op_xmm`,
`op_ymm`, `op_zmm`, `op_kreg`, `op_sreg`, `op_mm`, `op_creg`, `op_dreg`,
`op_st`, `op_bnd` — each takes the matching typed enum and returns an
`Operand` (e.g. `op_gpr64(.XMM0)` is a *compile error*).
---
## 3. Instructions (`instructions.odin`)
```odin
Rep :: enum u8 { NONE, REP, REPNE }
Instruction_Flags :: bit_field u8 {
    lock: bool|1, rep: Rep|2, segment: u8|3, addr32: bool|1, data16: bool|1,
}
Instruction :: struct #packed {      // 72 bytes
    ops:           [4]Operand,
    mnemonic:      Mnemonic,
    operand_count: u8,
    flags:         Instruction_Flags,
    length:        u8,               // filled by decoder
    _pad:          [3]u8,
}
```
### Generic instruction builders (`inst_*`, all `contextless`)
| Builder | Shape |
|---|---|
| `inst_none(m)` | no operands |
| `inst_r(m, r)` | one register |
| `inst_m(m, mem, size)` | one memory |
| `inst_i(m, imm, imm_size)` | one immediate |
| `inst_rel(m, label_id, size=4)` | branch to label |
| `inst_rel_offset(m, offset, size)` | branch to raw offset |
| `inst_r_r(m, dst, src)` | reg, reg |
| `inst_r_m(m, dst, src_mem, size)` | reg, mem |
| `inst_m_r(m, dst_mem, size, src)` | mem, reg |
| `inst_r_i(m, dst, imm, imm_size)` | reg, imm |
| `inst_m_i(m, dst_mem, size, imm, imm_size)` | mem, imm |
| `inst_r_r_r(m, dst, s1, s2)` | 3× reg (VEX/EVEX) |
| `inst_r_r_m(m, dst, s1, m2, size)` | reg, reg, mem |
| `inst_r_r_i(m, dst, src, imm, imm_size)` | reg, reg, imm |
| `inst_r_m_i(m, dst, mem, msize, imm, isize)` | reg, mem, imm |
| `inst_m_r_i(m, mem, msize, src, imm, isize)` | mem, reg, imm |
| `inst_r_m_r(m, dst, m1, msize, s2)` | reg, mem, reg |
| `inst_r_r_r_r(m, dst, s1, s2, s3)` | 4× reg |
| `inst_r_r_r_i(m, dst, s1, s2, imm, isize)` | 3 reg + imm |
| `inst_r_r_m_i(m, dst, s1, m2, msize, imm, isize)` | 2 reg + mem + imm |
| `inst_r_r_m_r(m, dst, s1, m2, msize, s3)` | 2 reg + mem + reg |
### Dynamic-array emitters (`emit_*`, in `encoder.odin`)
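Each `emit_*` is the appending counterpart of an `inst_*` shape, named
with the operand letters run together (`inst_r_r` → `emit_rr`):
`emit_none, emit_r, emit_rr, emit_ri, emit_rm, emit_mr, emit_m, emit_mi,
emit_rel, emit_rrr, emit_rrm, emit_rri, emit_rrrr, emit_i, emit_rmi,
emit_mri, emit_rel_offset`. Each takes
`(instructions: ^[dynamic]Instruction, mnemonic, …)` and appends the
built instruction. The list has no counterpart for the `r_m_r`,
`r_r_r_i`, `r_r_m_i`, and `r_r_m_r` shapes.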
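The relationship between a builder and its emitter can be sketched as follows. This is a toy model under stated assumptions: `Reg` and the cut-down `Instruction` are placeholders, and the bodies are a guess at the pattern, not the package's implementation.

```odin
package main

Mnemonic :: enum u16 { INVALID = 0, MOV, ADD }

Reg :: distinct u16 // placeholder for the real Register/Operand types

// Cut-down stand-in for the real 72-byte Instruction.
Instruction :: struct {
	ops:           [4]Reg,
	mnemonic:      Mnemonic,
	operand_count: u8,
}

// Builder: returns a value, needs no context.
inst_r_r :: proc "contextless" (m: Mnemonic, dst, src: Reg) -> Instruction {
	inst: Instruction
	inst.mnemonic      = m
	inst.ops[0]        = dst
	inst.ops[1]        = src
	inst.operand_count = 2
	return inst
}

// Emitter: builds the same instruction and appends it to the caller's array.
emit_rr :: proc(instructions: ^[dynamic]Instruction, m: Mnemonic, dst, src: Reg) {
	append(instructions, inst_r_r(m, dst, src))
}

main :: proc() {
	program: [dynamic]Instruction
	defer delete(program)

	emit_rr(&program, .MOV, Reg(0), Reg(3))
	emit_rr(&program, .ADD, Reg(0), Reg(1))
	assert(len(program) == 2)
}
```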
---
## 4. Mnemonics (`mnemonics.odin`, generated)
```odin
Mnemonic :: enum u16 { INVALID = 0, MOV, MOVABS, MOVZX, /* …, ~1176 total */ }
```
Grouped by family (data transfer, arithmetic, logical, …, SSE, AVX,
AVX-512, BMI, FMA, AES, …). `INVALID = 0` is the sentinel.
---
## 5. Labels & references (`labels.odin`)
Lightweight **array-index** model (`Label_Definition`) used by
`encode()`/`decode()`. The label-construction procedures live in
`isa/labels.odin` and are parametric over the Instruction type, so they
work directly for any arch without per-arch wrappers.
### Array-index model (used by encode/decode)
```odin
Label_Definition :: distinct u32 // label_id -> instruction index, then byte offset
LABEL_UNDEFINED :: Label_Definition(0xFFFFFFFF)
```
`label(labels: ^[dynamic]Label_Definition, instructions: ^[dynamic]Instruction) -> u32`
(define at current position), `label_forward(labels) -> u32` (reserve).
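A minimal sketch of the array-index model, assuming a label id is simply the index into the label array and a definition is the index of the next instruction to be appended. The procedure bodies are assumptions, `Instruction` is a placeholder, and `label_bind` is a hypothetical name for binding a reserved label.

```odin
package main

Instruction :: struct { mnemonic: u16 } // placeholder

Label_Definition :: distinct u32
LABEL_UNDEFINED  :: Label_Definition(0xFFFFFFFF)

// Define a label at the current position (the next instruction index).
label :: proc(labels: ^[dynamic]Label_Definition, instructions: ^[dynamic]Instruction) -> u32 {
	id := u32(len(labels^))
	append(labels, Label_Definition(len(instructions^)))
	return id
}

// Reserve an id whose position is not known yet.
label_forward :: proc(labels: ^[dynamic]Label_Definition) -> u32 {
	id := u32(len(labels^))
	append(labels, LABEL_UNDEFINED)
	return id
}

// Hypothetical: bind a reserved id to the current position.
label_bind :: proc(labels: ^[dynamic]Label_Definition, id: u32, instructions: ^[dynamic]Instruction) {
	labels^[id] = Label_Definition(len(instructions^))
}

main :: proc() {
	instructions: [dynamic]Instruction
	labels:       [dynamic]Label_Definition
	defer delete(instructions)
	defer delete(labels)

	top  := label(&labels, &instructions) // id 0 -> instruction 0
	done := label_forward(&labels)        // id 1 -> undefined for now
	append(&instructions, Instruction{}, Instruction{})
	label_bind(&labels, done, &instructions) // id 1 -> instruction 2

	assert(labels[top] == 0 && labels[done] == 2)
}
```

Because a definition is only an index, forward references cost nothing until encoding, when the indices are rewritten to byte offsets.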
### Named labels
```odin
Label_Map :: struct { labels: [dynamic]Label_Definition, names: map[string]u32 }
```
`label_map_init(^, allocator)`, `label_map_destroy(^)`,
`label_named(^, name, instructions) -> u32`, `label_reserve(^, name) -> u32`,
`label_set(^, name, instructions)`.
---
## 6. Encoding types (`encoding_types.odin`)
These describe **how** an instruction is encoded; they are the schema of
`ENCODING_TABLE` and are shared by encoder and decoder.
```odin
Operand_Type :: enum u8 { // ~70 values
NONE, R8,R16,R32,R64, RM8,RM16,RM32,RM64, M,M8..M512,
IMM8,IMM16,IMM32,IMM64, IMM8SX, REL8,REL32,
AL_IMPL,AX_IMPL,EAX_IMPL,RAX_IMPL,CL_IMPL,DX_IMPL,ONE_IMPL,
SREG, CR, DR, XMM,YMM,ZMM, XMM_M32,XMM_M64,XMM_M128,YMM_M256,ZMM_M512,
MM,MM_M64, ST0_IMPL,STI, XMM0_IMPL, K,K_M8..K_M64,
MOFFS8..MOFFS64, PTR16_16,PTR16_32,PTR16_64, M16_16,M16_32,M16_64,
}
Operand_Encoding :: enum u8 { // where an operand's bits go
NONE, MR, REG, VVVV, OP_R, IB,IW,ID,IQ, IMPL, IS4, AAA,
}
Escape :: enum u8 { NONE, _0F, _0F38, _0F3A }
VEX_Type :: enum u8 { NONE, VEX, EVEX, XOP }
VEX_W :: enum u8 { WIG, W0, W1 }
VEX_L :: enum u8 { LIG, L0, L1, L2 }
Encoding_Flags :: bit_field u16 {
esc: Escape|2, prefix: u8|2, vex_type: VEX_Type|2, vex_w: VEX_W|2,
vex_l: VEX_L|2, default_64: bool|1, force_rex_w: bool|1, no_rex: bool|1,
lock_ok: bool|1, rep_ok: bool|1, modrm_reg_ext: bool|1,
}
Encoding :: struct #packed { // 14 bytes — one encoding form
mnemonic: Mnemonic, ops: [4]Operand_Type, enc: [4]Operand_Encoding,
opcode: u8, ext: u8, flags: Encoding_Flags,
}
PREFIX_66 :: 1
PREFIX_F3 :: 2
PREFIX_F2 :: 3
```
Helper: `encoding_flags(esc=…, prefix=…, …) -> Encoding_Flags`.
### Shared status / interop types
```odin
Relocation_Type :: enum u8 { NONE, REL8, REL32, ABS32, ABS64 }
Relocation :: struct #packed { // 16 bytes (ELF-rela-like)
offset: u32, label_id: u32, addend: i32,
type: Relocation_Type, size: u8, inst_idx: u16,
}
Error_Code :: enum u8 {
NONE,
// encode
INVALID_MNEMONIC, NO_MATCHING_ENCODING, OPERAND_MISMATCH,
IMMEDIATE_OUT_OF_RANGE, BUFFER_OVERFLOW, LABEL_OUT_OF_RANGE,
INVALID_OPERAND_COUNT,
// decode
BUFFER_TOO_SHORT, INVALID_OPCODE, INVALID_MODRM, INVALID_SIB,
INVALID_PREFIX, INVALID_VEX, INVALID_EVEX, TOO_MANY_PREFIXES,
}
Error :: struct #packed { inst_idx: u32, code: Error_Code, _pad: [3]u8 } // 8 bytes
Result :: struct { byte_count: u32, success: bool }
```
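The byte sizes quoted in the comments can be pinned down with compile-time assertions. The sketch below re-declares `Relocation` and `Error` as documented above, with `Error_Code` abbreviated to two members for brevity; whether the package itself carries such checks is not stated here.

```odin
package main

Relocation_Type :: enum u8 { NONE, REL8, REL32, ABS32, ABS64 }

Relocation :: struct #packed {
	offset:   u32,
	label_id: u32,
	addend:   i32,
	type:     Relocation_Type,
	size:     u8,
	inst_idx: u16,
}

Error_Code :: enum u8 { NONE, INVALID_MNEMONIC } // abbreviated for the sketch

Error :: struct #packed {
	inst_idx: u32,
	code:     Error_Code,
	_pad:     [3]u8,
}

// Checked at compile time: 4+4+4+1+1+2 and 4+1+3 bytes.
#assert(size_of(Relocation) == 16)
#assert(size_of(Error) == 8)

main :: proc() {
	// Field offsets follow directly from the packed layout.
	assert(offset_of(Relocation, type) == 12)
	assert(offset_of(Relocation, inst_idx) == 14)
}
```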
Helper: `op_type_to_size(Operand_Type) -> u8`.
---
## 7. Encoder (`encoder.odin`)
```odin
MAX_INST_SIZE :: 15
encode :: proc(
instructions: []Instruction,
label_defs: []Label_Definition, // in: inst index; MODIFIED to byte offsets
code: []u8, // output machine code
relocs: ^[dynamic]Relocation, // unresolved relocations appended
errors: ^[dynamic]Error,
resolve: bool = true, // patch resolvable relocs in place
base_address: u64 = 0, // for ABS relocations
) -> Result
```
Two-pass: (1) encode each instruction into `code`, recording byte offsets
and emitting pending relocations; (1.5) rewrite `label_defs` from
instruction indices to byte offsets; (2) resolve relocations, appending
the unresolvable ones to `relocs`. Pure / no shared state →
trivially parallelizable.
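The pass structure can be illustrated on a toy instruction stream where only the encoded size and an optional branch target matter. `Toy_Inst`, `NO_LABEL`, and `resolve_branches` are names invented for this sketch; the real `encode()` works on `Instruction` values and writes bytes rather than returning displacements.

```odin
package main

Label_Definition :: distinct u32
NO_LABEL :: u32(0xFFFF_FFFF) // hypothetical marker for non-branches

Toy_Inst :: struct {
	size:   u32, // encoded length in bytes (stand-in for real encoding)
	target: u32, // label id for branches, NO_LABEL otherwise
}

// Sketch only: assumes every label is defined (no LABEL_UNDEFINED entries).
resolve_branches :: proc(insts: []Toy_Inst, label_defs: []Label_Definition, rels: []i32) {
	assert(len(rels) == len(insts))
	offsets := make([]u32, len(insts) + 1)
	defer delete(offsets)

	// Pass 1: lay out instructions and record each byte offset.
	pos: u32
	for inst, i in insts {
		offsets[i] = pos
		pos += inst.size
	}
	offsets[len(insts)] = pos // a label may sit after the last instruction

	// Pass 1.5: rewrite label definitions from instruction index to byte offset.
	for &def in label_defs {
		def = Label_Definition(offsets[u32(def)])
	}

	// Pass 2: displacement is measured from the end of the branch.
	for inst, i in insts {
		rels[i] = 0
		if inst.target == NO_LABEL {
			continue
		}
		end := offsets[i] + inst.size
		rels[i] = i32(label_defs[inst.target]) - i32(end)
	}
}

main :: proc() {
	insts := []Toy_Inst{
		{5, 1},        // forward branch to label 1
		{2, NO_LABEL},
		{3, NO_LABEL},
		{5, 0},        // backward branch to label 0
	}
	label_defs := []Label_Definition{1, 3} // instruction indices on entry
	rels := make([]i32, len(insts))
	defer delete(rels)

	resolve_branches(insts, label_defs, rels)
	assert(label_defs[0] == 5 && label_defs[1] == 10) // now byte offsets
	assert(rels[0] == 5 && rels[3] == -10)
}
```

The in-place rewrite of `label_defs` mirrors the `MODIFIED to byte offsets` note on the parameter: after the call, the caller holds byte offsets and the original instruction indices are gone.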
Buffer-sizing helpers: `encode_max_code_size(n) -> int` (`n*15`),
`encode_max_relocation_count(n) -> int` (`n`).
Internal matcher (file-local, inlined): `encoding_matches_inline`,
`operand_matches_inline`, `reg_matches_inline`, `mem_matches_inline`,
`imm_matches_inline`, `implicit_operand_matches`, `is_implicit_op_inline`,
`get_user_op_inline`.
---
## 8. Decoder (`decoder.odin`)
```odin
Instruction_Info :: struct { // parallel metadata, one per decoded inst
offset: u32,
rex: u8, has_lock: bool, rep: Rep, segment: Register,
vex_type: VEX_Type, vex_l: VEX_L, vex_w: VEX_W,
evex_b: bool, evex_z: bool, opmask: u8,
}
decode :: proc(
data: []u8,
relocs: []Relocation, // optional in: label IDs to reuse for branch targets
instructions: ^[dynamic]Instruction, // out
inst_info: ^[dynamic]Instruction_Info, // out (parallel)
label_defs: ^[dynamic]Label_Definition, // out: inferred branch labels
errors: ^[dynamic]Error,
) -> Result
```
Two-pass: (1) decode each instruction (prefixes → opcode → operands),
collecting branch targets; (2) infer labels for in-region branch targets,
reusing IDs from `relocs` when available.
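The second pass can be sketched as deduplicating in-region branch targets into label definitions. This is an illustration under assumptions: `infer_labels` and `label_for_target` are hypothetical names, a target equal to the region size is treated as out of region, and the reuse of IDs from `relocs` is left out.

```odin
package main

Label_Definition :: distinct u32

// Return the label id for a byte offset, creating a definition if none exists.
label_for_target :: proc(label_defs: ^[dynamic]Label_Definition, target: u32) -> u32 {
	for def, id in label_defs^ {
		if u32(def) == target {
			return u32(id)
		}
	}
	append(label_defs, Label_Definition(target))
	return u32(len(label_defs^) - 1)
}

// Give every in-region target a label; report how many targets were in region.
infer_labels :: proc(targets: []i64, region_size: u32, label_defs: ^[dynamic]Label_Definition) -> (in_region: int) {
	for t in targets {
		if t < 0 || t >= i64(region_size) {
			continue // branch leaves the decoded region: no label
		}
		label_for_target(label_defs, u32(t))
		in_region += 1
	}
	return
}

main :: proc() {
	label_defs: [dynamic]Label_Definition
	defer delete(label_defs)

	// Targets collected by pass 1, as absolute offsets into `data`.
	targets := []i64{4, 100, 4, -2, 0}
	count := infer_labels(targets, 16, &label_defs)

	assert(count == 3)           // 4, 4, and 0 are in region
	assert(len(label_defs) == 2) // the repeated target shares one label
}
```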
`Decoder_State` (file-internal) holds prefix/VEX/EVEX decode state. The
decoder relies on the generated tables in §10. Its procs are mostly file-internal:
`decode_prefixes`, `decode_vex2/3`, `decode_evex`, `decode_opcode(_vex)`,
`decode_operands(_vex)`, `decode_single_operand(_vex)`,
`decode_memory_operand`, `decode_register`, `decode_implicit_operand`.
---
## 9. Printer (`printer.odin`)
Modified Intel syntax: size suffix on the mnemonic (`.b .w .d .q .x .y
.z`) instead of `PTR`, clean `[base + index*scale + disp]` memory.
```odin
Token_Kind :: enum u8 { WHITESPACE, NEWLINE, LABEL_DEF, LABEL_REF, OFFSET,
MNEMONIC, REGISTER, IMMEDIATE, MEMORY_BRACKET, MEMORY_OPERATOR,
MEMORY_DISP, MEMORY_SCALE, PUNCTUATION, COMMENT }
Token :: struct { offset: u32, length: u16, kind: Token_Kind, instruction_index: u16 }
Print_Options :: struct {
uppercase: bool, hex_prefix: string, hex_lowercase: bool,
label_prefix: string, show_offsets: bool, indent: string,
separator: string, space_after_comma: bool,
}
DEFAULT_PRINT_OPTIONS :: Print_Options{ }
Print_Result :: struct { text: string, tokens: []Token }
```
Helpers: `mnemonic_to_string(m, lowercase) -> string`,
`register_name(r, lowercase) -> string`, `token_kind_to_string`,
`size_to_suffix(size) -> u8`.
### Output variants

All variants share the same trailing parameter set
`tokens=nil, options=nil, label_names=nil`.

| Family | Sink |
|---|---|
| `sbprint` / `sbprintln` | into a `^strings.Builder` |
| `print` / `println` | stdout |
| `aprint` / `aprintln` | newly allocated string (`allocator` param) |
| `tprint` / `tprintln` | temp-allocator string |
| `bprint` / `bprintln` | caller `[]u8` buffer |
| `fprint` / `fprintln` | `^os.File` |
| `wprint` / `wprintln` | `io.Writer` |
All take `(instructions: []Instruction, inst_info: []Instruction_Info,
label_defs: []Label_Definition, …)`.
---
## 10. Generated tables & builders
### `encoding_table.odin` (hand-written master)
```odin
ENCODING_TABLE: [Mnemonic][]Encoding = { .MOV = { /* forms */ }, /* … */ }
```
The single source of truth. `encode()` does `ENCODING_TABLE[mnemonic]`
(O(1)) then linear-scans the forms via `encoding_matches_inline`.
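The lookup pattern, an enum-indexed array of slices followed by a short linear scan, can be sketched with a toy table. `Shape` and `Form` are hypothetical stand-ins for real operand matching and `Encoding`, and `find_form` is not a package procedure.

```odin
package main

Mnemonic :: enum u8 { INVALID = 0, MOV, ADD }
Shape    :: enum u8 { R_R, R_I } // hypothetical stand-in for operand matching

Form :: struct {
	shape:  Shape,
	opcode: u8,
}

// Toy forms; a real table entry carries operand types, encodings and flags.
mov_forms := [?]Form{{.R_R, 0x89}, {.R_I, 0xB8}}
add_forms := [?]Form{{.R_R, 0x01}, {.R_I, 0x81}}

// Index by mnemonic (constant time), then scan that mnemonic's few forms.
find_form :: proc(table: [Mnemonic][]Form, m: Mnemonic, shape: Shape) -> (form: Form, ok: bool) {
	for f in table[m] {
		if f.shape == shape {
			return f, true
		}
	}
	return {}, false
}

main :: proc() {
	table: [Mnemonic][]Form
	table[.MOV] = mov_forms[:]
	table[.ADD] = add_forms[:]

	form, ok := find_form(table, .ADD, .R_I)
	assert(ok && form.opcode == 0x81)

	// INVALID has an empty slice, so the scan finds nothing.
	_, found := find_form(table, .INVALID, .R_R)
	assert(!found)
}
```

The scan stays cheap because it only ever walks the forms of one mnemonic, never the whole table.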
### `decoding_tables.odin` (generated from `ENCODING_TABLE`)
```odin
ModRM_Info :: struct #packed { mod, reg, rm: u8, has_sib: bool, disp_size: u8 }
SIB_Info :: struct #packed { /* scale, index, base */ }
Decode_Entry :: struct { esc: Escape, prefix, opcode, ext: u8,
mnemonic: Mnemonic, ops: [4]Operand_Type,
enc: [4]Operand_Encoding, flags: Encoding_Flags }
VEX_Decode_Entry :: struct { /* Decode_Entry fields, plus: */ vex_w: VEX_W, vex_l: VEX_L }
Decode_Index :: struct { start: u16, count: u8 } // range into entries
MODRM_TABLE[256], SIB_TABLE[256]
LEGACY_DECODE_ENTRIES[1266], VEX_DECODE_ENTRIES[667], EVEX_DECODE_ENTRIES[418]
DECODE_INDEX_LEGACY[4][256], DECODE_INDEX_ESC_0F/_0F38/_0F3A[4][256]
VEX_INDEX_0F/_0F38/_0F3A[4][256], EVEX_INDEX_0F/_0F38/_0F3A[4][256]
```
`[prefix][opcode] -> Decode_Index` gives O(1) opcode resolution; the
small `count` range is scanned for ModR/M-ext, operand-size, or VEX.W/L
disambiguation.
### `mnemonic_builders.odin` (generated, ~7,477 procs + ~2,338 overload groups)
Typed memory wrappers `Mem8 … Mem512` (distinct structs over `Memory`)
with constructors `mem8 … mem512`. Per-form typed procs like
`inst_mov_r64_r64(dst: GPR64, src: GPR64) -> Instruction`, each grouped
into an overload set:
```odin
inst_mov :: proc{ inst_mov_r8_r8, inst_mov_r64_r64, inst_mov_r64_imm64, /* … */ }
emit_mov :: proc{ emit_mov_r8_r8, /* … */ }
```
So `x86.inst_mov(.RAX, .RBX)` resolves the right encoding at compile time
with full type checking, no runtime dispatch.
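A self-contained sketch of the overload-set mechanism, using two toy forms. The register enums and the reduced `Instruction` are assumptions, and the arguments are written with explicit enum names to keep the sketch unambiguous.

```odin
package main

// Illustrative subsets of the typed register enums.
GPR64 :: enum u8 { RAX, RCX, RDX, RBX }
GPR32 :: enum u8 { EAX, ECX, EDX, EBX }

// Toy instruction: just enough to tell the two forms apart.
Instruction :: struct {
	size:     u8,
	dst, src: u8,
}

inst_mov_r64_r64 :: proc "contextless" (dst, src: GPR64) -> Instruction {
	return Instruction{size = 8, dst = u8(dst), src = u8(src)}
}

inst_mov_r32_r32 :: proc "contextless" (dst, src: GPR32) -> Instruction {
	return Instruction{size = 4, dst = u8(dst), src = u8(src)}
}

// Overload set: the compiler selects a member from the argument types.
inst_mov :: proc{ inst_mov_r64_r64, inst_mov_r32_r32 }

main :: proc() {
	a := inst_mov(GPR64.RAX, GPR64.RBX) // selects inst_mov_r64_r64
	b := inst_mov(GPR32.EAX, GPR32.ECX) // selects inst_mov_r32_r32
	assert(a.size == 8 && b.size == 4)

	// inst_mov(GPR64.RAX, GPR32.EAX) // no matching member: compile error
}
```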
---
## 11. Tools (`x86/tools/`)
| File | Package | Role |
|---|---|---|
| `gen_decode_tables.odin` | `main` (`-file`) | walk `ENCODING_TABLE` → emit `decoding_tables.odin` |
| `gen_mnemonic_builders.odin` | `main` (`-file`) | walk `ENCODING_TABLE` → emit `mnemonic_builders.odin` |
| `verify_tables.odin` | `main`, imports `x86 "../"` | check decode tables consistent with `ENCODING_TABLE` |
Tests live in `x86/tests/test.odin` (`package x86_tests`, `import x86 "../"`),
run with `odin run x86/tests`.
---
## Known drift (pre-existing, not from the move)
The working tree had uncommitted edits to `operands.odin`/`printer.odin`
that **renamed the memory constructors** but did not update callers:
- `mem_base_displacement` → `mem_base_disp`
- `mem_base_index_displacement` → `mem_base_index_disp`
- `mem_rip_relative` → `mem_rip_disp`
- `mem_base` repurposed from *constructor* to *accessor*
Result: the library compiles, but `tests/test.odin` (and the README
examples) reference the old names and currently fail to type-check.
Fixing requires either restoring the old constructor names or sweeping
the tests/README to the new ones — a deliberate decision left to you.