mirror of
https://github.com/odin-lang/Odin.git
synced 2026-06-20 00:52:33 +00:00
core:rexcode
469
core/rexcode/docs/cross_arch_design.md
Normal file
@@ -0,0 +1,469 @@
# rexcode — Cross-Architecture API Design

> How to grow rexcode from an x86-only encoder/decoder into a multi-target
> library (x86, RISC-V, ARM64, MIPS, …) **without** flattening every
> architecture to a lowest common denominator and **without** adding
> runtime overhead to the single-target hot path.
>
> Companion to [x86_api.md](x86_api.md). Written ahead of the RISC-V
> subpackage.

---

## 0. The guiding principle

> **Share the bookkeeping, specialize the bytes.**

An encoder/decoder is two things stitched together:

1. **Orchestration & bookkeeping** — labels, relocations, the two-pass
   encode/decode loops, error/result reporting, the print framework,
   buffer management, the table-gen tooling pattern. This is *the same
   problem on every ISA* and should be written once.
2. **The instruction model & the bytes** — what a register/memory/operand
   *is*, what the encoding tables look like, and the actual
   bit/byte-twiddling of `encode_one`/`decode_one`. This is *irreducibly
   per-architecture* and must stay native and zero-cost.

Every decision below follows from drawing the line in exactly that place.
We do **not** try to invent one `Instruction` type that fits all ISAs —
that path forces x86's `segment`/SIB and ARM's writeback and RISC-V's
split immediates into one bloated struct, and it is precisely the
"compromise performance/effectiveness" outcome to avoid. Instead, each
arch owns its concrete types, and uniformity comes from a **naming
contract** (§6) plus a small **shared core** (§4) plus **opt-in**
generic glue (§5, §7).

---

## 1. The universal shape

Strip away the x86 specifics and every target needs the same nine things:

| # | Concept | Example in x86 |
|---|---|---|
| 1 | A **register** = (class, hw number, size) | `Register` distinct u16 |
| 2 | **Operands** tagged reg / mem / imm / relative | `Operand` + `Operand_Kind` |
| 3 | An **instruction** = mnemonic + operands + flags | `Instruction` |
| 4 | A **mnemonic** enum | `Mnemonic` (u16, INVALID=0) |
| 5 | **Labels** + forward refs + named labels | `Label_Definition`, `Label_Map` |
| 6 | **Relocations** left over after local resolution | `Relocation` |
| 7 | `encode([]Inst) -> bytes (+relocs +errors)` | `encode()` |
| 8 | `decode(bytes) -> []Inst (+info +labels +errors)` | `decode()` |
| 9 | `print([]Inst) -> text (+tokens)` | `print()`/`tprint()`/… |

Plus two cross-cutting concerns: **errors/result** reporting and a
**table-driven core** fed by **codegen tooling**.

The *shape* of items 5–9 (their signatures and the types they pass around)
is architecture-independent. That is the surface we standardize.

---

## 2. Where architectures actually diverge

This is the heart of the analysis. Ranked from "diverges hardest" to
"barely diverges."

### 2.1 Encoding mechanics — **maximal divergence**

| ISA | Width | Mechanism |
|---|---|---|
| x86 | 1–15 B, variable | legacy prefixes → REX/VEX/EVEX → escape → opcode → ModRM → SIB → disp → imm |
| RISC-V | 4 B (2 B for "C") | pack fixed bitfields; ~6 formats (R/I/S/B/U/J) |
| ARM64 | 4 B fixed | pack per-class bitfields; many classes; bitmask-imm encoder |
| MIPS | 4 B fixed | 3 formats (R/I/J), very regular |

The ~500-line body of x86's `encode()` and the whole `Encoding`/`Encoding_Flags`
schema (esc/prefix/vex_*) are **x86-only**. RISC-V's `encode_one` is a
dozen lines of shifts. **Conclusion: the `encode_one`/`decode_one` core
and the `Encoding` struct do not generalize — but the loop that drives
them does (§7).**
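
To make the "dozen lines of shifts" concrete, the following is a minimal sketch of packing a single RISC-V R-type word. The procedure name `pack_r_type` and its parameter list are assumptions made for illustration and are not part of rexcode; only the field positions come from the base RISC-V R-type format.

```odin
package riscv_rtype_sketch

// Hypothetical helper (not a rexcode API): packs the six R-type fields
// into one 32-bit instruction word.
// Layout: funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0]
pack_r_type :: proc "contextless" (opcode, rd, funct3, rs1, rs2, funct7: u32) -> u32 {
    return ((funct7 & 0x7F) << 25) |
           ((rs2    & 0x1F) << 20) |
           ((rs1    & 0x1F) << 15) |
           ((funct3 & 0x07) << 12) |
           ((rd     & 0x1F) <<  7) |
           (opcode  & 0x7F)
}

main :: proc() {
    // add x3, x1, x2  (opcode 0x33, funct3 0, funct7 0)
    word := pack_r_type(0x33, 3, 0, 1, 2, 0)
    assert(word == 0x002081B3)
}
```

Nothing in this packer touches labels, relocations, or buffers, which is why the per-instruction core can stay tiny while the surrounding driver is shared.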

### 2.2 Memory addressing — **high divergence**

| ISA | Addressing modes |
|---|---|
| x86 | `[base + index*scale + disp32]`, RIP-relative, segment override, addr-size override |
| RISC-V | `disp12(base)` only — no index, no scale |
| MIPS | `imm16(base)` only |
| ARM64 | `[base]`, `[base,#imm]`, `[base,Xm{,LSL#n}]`, `[base,Wm,SXTW]`, pre/post-index `[base,#imm]!` / `[base],#imm`, PC-rel literal |

The x86 `Memory` bit_field (with `segment`, `addr_size_override`,
index+scale) is deeply x86-flavored. RISC-V's memory is `{base, i32 disp}`.
ARM adds **writeback** (a mode x86 cannot express) and extend/shift on the
index. **Conclusion: `Memory` is per-arch.** What generalizes is only the
*role*: a `MEMORY`-kind operand carrying an arch-defined payload.

### 2.3 Immediates & operand size — **moderate divergence**

- The *value* (an `i64`) generalizes perfectly.
- The *encoding* does not: RISC-V scatters immediate bits across fields
  (B-type, J-type) and shifts them; ARM has bitmask-immediate and shifted
  forms. All of that lives inside `encode_one`; the `Operand` just holds
  the clean value.
- **Size association differs:** x86 carries an explicit `size: u8` and
  uses it to select an encoding; RISC-V bakes width into the mnemonic
  (`LW` vs `LD`) and ARM64 into the register view (`W0` vs `X0`). Keep
  `size` in the shared operand shape as a *carrier*; let each arch decide
  how much it matters.
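
As an illustration of "the operand holds the clean value, `encode_one` scatters the bits", here is a hedged sketch of the RISC-V B-type immediate layout. The name `pack_b_type` and the `(word, ok)` return convention are assumptions for this example, not rexcode's API; the bit positions are those of the base B-type format.

```odin
package riscv_btype_sketch

// Hypothetical helper (not a rexcode API): encodes a conditional branch.
// The caller supplies a plain signed byte offset; the scattering into
// imm[12|10:5] and imm[4:1|11] happens only here.
pack_b_type :: proc "contextless" (opcode, funct3, rs1, rs2: u32, offset: i32) -> (word: u32, ok: bool) {
    // B-type offsets are even and fit in a signed 13-bit range.
    if (offset & 1) != 0 || offset < -4096 || offset > 4094 {
        return 0, false
    }
    imm := u32(offset) // two's-complement bit pattern of the offset
    word = (((imm >> 12) & 0x01) << 31) | // imm[12]
           (((imm >>  5) & 0x3F) << 25) | // imm[10:5]
           ((rs2    & 0x1F) << 20) |
           ((rs1    & 0x1F) << 15) |
           ((funct3 & 0x07) << 12) |
           (((imm >>  1) & 0x0F) <<  8) | // imm[4:1]
           (((imm >> 11) & 0x01) <<  7) | // imm[11]
           (opcode & 0x7F)
    return word, true
}

main :: proc() {
    // beq x1, x2, +8
    w, ok := pack_b_type(0x63, 0, 1, 2, 8)
    assert(ok && w == 0x00208463)
}
```

An odd or out-of-range offset is rejected rather than truncated, which is the kind of condition an arch-specific error code such as `MISALIGNED_IMMEDIATE` (§2.8) would report.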

### 2.4 Relocations — **moderate divergence (structurally aligned)**

The `Relocation` *struct* (offset, symbol/label, addend, type, size)
mirrors ELF `rela` and is universal. The *type enum* is per-arch and much
larger on RISC-V (paired `PCREL_HI20`/`PCREL_LO12`, `CALL`, `BRANCH`,
`JAL`, `HI20`, `LO12_I/S`, …) because PC-relative addressing needs
instruction *pairs* (AUIPC+ADDI). **Conclusion: share the struct shape,
make the type enum a per-arch parameter.**
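
The reason RISC-V needs *paired* relocations can be shown with the arithmetic that splits one PC-relative offset across AUIPC and ADDI. The sketch below is illustrative only; `split_pcrel` is a hypothetical name, and it assumes the offset is far enough from the `i32` limits that adding `0x800` does not overflow.

```odin
package riscv_pcrel_sketch

// Hypothetical helper (not a rexcode API): splits a 32-bit PC-relative
// offset into the upper 20 bits (for AUIPC) and a signed 12-bit remainder
// (for ADDI or a load/store). Because the low part is sign-extended by the
// hardware, the high part is rounded by adding 0x800 first.
split_pcrel :: proc "contextless" (offset: i32) -> (hi20: i32, lo12: i32) {
    hi20 = (offset + 0x800) >> 12   // arithmetic shift on a signed value
    lo12 = offset - (hi20 << 12)    // always within [-2048, 2047]
    return
}

main :: proc() {
    hi, lo := split_pcrel(0x1800)
    assert(hi == 2 && lo == -2048)  // 2*4096 - 2048 == 0x1800
}
```

The two halves land in different instructions, so resolving them means patching two sites from one symbol; that pairing is the RISC-V-specific part layered on the shared relocation list.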

### 2.5 Registers — **low/structural divergence**

The `(class, hw_number)`-packed `distinct u16` scheme generalizes well.
What differs:
- x86: REX/EVEX extension bits, AH↔SPL aliasing, RIP pseudo-reg.
- RISC-V: clean 5-bit fields, `x0`=hardwired zero, ABI names
  (`zero/ra/sp/gp/tp/t0../s0../a0..`), separate `f`/`v` files.
- ARM64: reg #31 means **SP or XZR depending on instruction** (a
  decode/print-time disambiguation x86 never needs); `w`/`x` and
  `b/h/s/d/q` views.

**Conclusion: share the *layout convention* + `reg_hw`/`reg_class`
accessors; per-arch owns classes, enums, names, and extension semantics.**
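
A minimal sketch of that shared layout convention follows. It assumes the class sits in the high byte and the hardware number in the low five bits, as described for x86 in x86_api.md; the class values `REG_X`/`REG_F` and the constructor `make_register` are hypothetical and chosen only for the example.

```odin
package register_layout_sketch

Register :: distinct u16

// Hypothetical class tags (high byte). Real values are owned by each arch.
REG_X :: 0x0100
REG_F :: 0x0200

// Hypothetical constructor: packs (class, hw number) into one u16.
make_register :: proc "contextless" (class: u16, hw: u8) -> Register {
    return Register(class | u16(hw & 0x1F))
}

// Shared accessors: these two are the part every arch can reuse unchanged.
reg_hw :: proc "contextless" (r: Register) -> u8 {
    return u8(u16(r) & 0x1F)
}

reg_class :: proc "contextless" (r: Register) -> u16 {
    return u16(r) & 0xFF00
}

main :: proc() {
    a0 := make_register(REG_X, 10)
    assert(reg_hw(a0) == 10)
    assert(reg_class(a0) == REG_X)
}
```

Because `Register` is a `distinct` type, a raw `u16` cannot be passed where a register is expected without an explicit conversion, which keeps the packing an implementation detail.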

### 2.6 Mnemonics — **content differs, shape identical**

Per-arch `enum u16`, `INVALID=0`. Nothing to share but the convention.

### 2.7 Labels — **no divergence**

`labels.odin` is pure bookkeeping. The array-index model
(`Label_Definition`, `label`, `label_forward`, `label_set_at`,
`Label_Map`, `label_named`, `label_reserve`, `label_set`) lives in
`isa/labels.odin` and is parametric over the Instruction type. **Fully
shared.** Each arch's `encode()` rewrites label_defs from instruction
indices to byte offsets between pass 1 and pass 2.
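
The index-to-offset rewrite between the passes can be sketched as below. This is an assumption-laden toy: label definitions are modeled as bare `u32` values and `rewrite_labels` is a hypothetical name, whereas the real `Label_Definition` type is not shown in this document.

```odin
package label_rewrite_sketch

// Hypothetical stand-in for the pass-1.5 step. On entry each element of
// label_defs is an instruction index; on exit it is a byte offset.
// inst_offsets[i] is the byte offset of instruction i, with one extra
// trailing entry so a label may point just past the last instruction.
rewrite_labels :: proc(label_defs: []u32, inst_offsets: []u32) -> bool {
    for &def in label_defs {
        if int(def) >= len(inst_offsets) {
            return false // label refers to an instruction that does not exist
        }
        def = inst_offsets[def]
    }
    return true
}

main :: proc() {
    offsets := [4]u32{0, 3, 5, 10} // three variable-length instructions + end
    labels  := [3]u32{0, 2, 3}
    ok := rewrite_labels(labels[:], offsets[:])
    assert(ok && labels == [3]u32{0, 5, 10})
}
```

The step never inspects instruction contents, only the offsets recorded in pass 1, so it behaves the same for fixed-width and variable-width ISAs.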

### 2.8 Errors / Result — **low divergence**

`Result` is universal. `Error` is universal in shape. `Error_Code` splits
into a **shared core** (`NONE, BUFFER_OVERFLOW, INVALID_MNEMONIC,
NO_MATCHING_ENCODING, BUFFER_TOO_SHORT, INVALID_OPCODE, LABEL_OUT_OF_RANGE,
…`) and **arch-specific** extras (`INVALID_MODRM/SIB/VEX/EVEX,
TOO_MANY_PREFIXES` on x86; RISC-V would add `MISALIGNED_IMMEDIATE`,
`INVALID_ROUNDING_MODE`, …).

### 2.9 Printer — **framework universal, formatting per-arch**

Shareable: `Token`, `Token_Kind` (the kinds are generic), `Print_Options`,
the builder/number-formatting helpers, and the whole family of output
sinks (`sbprint/print/aprint/tprint/bprint/fprint/wprint` + `ln`). Per-arch:
`register_name`, `print_memory` (syntax differs wildly),
`mnemonic_to_string`, and the size-suffix convention (x86's `.b/.w/.d` is
x86-only; RISC-V puts width in the mnemonic).

### Divergence summary

| Component | Verdict | What's shared | What's per-arch |
|---|---|---|---|
| Labels | ✅ shared | everything | — |
| Result / Error struct | ✅ shared | struct shapes | error-code extras |
| Relocation struct | ✅ shared | struct shape | type enum |
| Printer framework | ◑ split | tokens, options, sinks, num-fmt | reg/mem/mnemonic formatting |
| Register scheme | ◑ split | layout + `reg_hw`/`reg_class` | classes, enums, names, ext bits |
| Operand model | ◑ split | kind tag + union discipline + `size` carrier | `Memory`, `flags` payloads |
| Encode/decode **driver** | ◑ shared via generics | two-pass loops, label/reloc resolution | the per-instruction hook |
| `Instruction` | ✗ per-arch | shape convention only | concrete struct |
| `Mnemonic` | ✗ per-arch | convention (u16, INVALID=0) | the enum |
| `Encoding` + tables | ✗ per-arch | codegen *pattern* | schema + data |
| `encode_one`/`decode_one` | ✗ per-arch | nothing | all of it |
| Memory addressing | ✗ per-arch | operand *role* | the model |

---

## 3. Why not the "obvious" unifications

Three tempting designs that **violate** the no-compromise rule:

1. **One universal `Operand`/`Memory` for all ISAs.** Forces the union of
   x86 SIB+segment, ARM writeback+extend, and RISC-V's nothing into a
   single struct. Bloats every operand, leaks `segment` into RISC-V, and
   still can't represent ARM writeback cleanly. ✗

2. **A runtime `interface`/vtable the encoder calls per instruction.**
   Adds an indirect call to the hottest loop (x86 does ~17 M inst/s — a
   per-instruction `proc` pointer is a measurable tax) and defeats
   inlining. ✗ on the default path.

3. **`any`/tagged-union `Instruction` passed through a generic `encode`.**
   Same monomorphization loss + runtime type checks in the hot loop. ✗

The design instead gets uniformity from **compile-time** mechanisms
(naming contract + parametric polymorphism), and reserves runtime dispatch
for an **opt-in** facade (§5.3) that only multi-target *tools* pay for.

---

## 4. Proposed package layout

```
rexcode/
  isa/                    # shared, architecture-independent core
    labels.odin           # Label, Label_Definition, Label_Map, resolution
    reloc.odin            # Relocation (type field is generic/u8)
    status.odin           # Result, Error, shared Error_Code core
    print.odin            # Token, Token_Kind, Print_Options, sinks, num-fmt
    register.odin         # distinct-u16 layout convention + reg_hw/reg_class
    pipeline.odin         # parametric encode_stream/decode_stream (§7)
    target.odin           # optional runtime Target vtable (§5.3)

  x86/                    # exists today; refactor to import isa
    registers.odin  operands.odin  instructions.odin  mnemonics.odin
    encoding_types.odin  encoder.odin  decoder.odin  printer.odin
    encoding_table.odin  decoding_tables.odin  mnemonic_builders.odin
    tests/  tools/

  riscv/                  # next: same shape as x86/
    registers.odin  operands.odin  instructions.odin  mnemonics.odin
    encoding_types.odin  encoder.odin  decoder.odin  printer.odin
    encoding_table.odin  decoding_tables.odin  mnemonic_builders.odin
    tests/  tools/

  arm64/  mips/  …        # future, same template
```

- **`isa` depends on nothing.** Each arch package depends on `isa` and
  **re-exports** the shared types (e.g. `x86.Result`, `x86.Label_Map`)
  so a consumer of `x86` sees one coherent namespace and never imports
  `isa` directly unless writing arch-generic tooling.
- Each arch package is **self-contained** (its own tests/tools), matching
  the move already done for x86.

---

## 5. Three layers of generality (pick per use case)

### 5.1 Layer A — direct single-arch use (default, zero overhead)

```odin
import "rexcode/x86"
code: [4096]u8
res := x86.encode(insts[:], labels[:], code[:], &relocs, &errors)
```
Fully static, fully inlined, exactly as fast as today. **99% of consumers
live here.**

### 5.2 Layer B — source-portable code via the naming contract

Because every arch package exposes the *same names with the same
signatures* (§6), code that only touches the shared vocabulary
(`Label_Map`, `encode`, `tprint`, `Result`, `Relocation`) can be written
against `import arch "rexcode/x86"` and re-pointed at `rexcode/riscv` by
changing one import — as long as the arch-specific operand construction is
isolated (e.g. behind your own per-arch helper). Still 100% compile-time,
zero overhead.

### 5.3 Layer C — runtime multi-target facade (opt-in, for tools)

For a disassembler or JIT that selects the arch *at runtime*, `isa`
provides a vtable populated by each arch:

```odin
// isa/target.odin
Target :: struct {
    name:       string,
    decode:     proc(data: []u8, out: ^Decoded) -> Result,   // bytes → generic Decoded
    print:      proc(d: ^Decoded, opts: ^Print_Options) -> string,
    inst_align: u32,   // 1 for x86, 4 for arm64/mips, 2 for riscv with C (4 without; see §8)
    max_inst:   u32,   // 15 for x86, 4 for riscv (8 for C-pairs), 4 for arm64
}
// each arch:  x86.TARGET: isa.Target = { … }
```
This boundary trades in **bytes and a generic `Decoded` view**, not the
concrete `Instruction`, so it never forces a unified instruction struct.
It carries a proc-pointer indirection — acceptable for a tool that has
already paid a `switch arch` somewhere, and never on Layer A's path.
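
How a tool might pick a target at runtime can be sketched with a cut-down version of the struct above. This example is self-contained and deliberately omits the `decode`/`print` proc pointers, since `Decoded`, `Result`, and `Print_Options` are not defined in this document; the `targets` array and `find_target` are hypothetical names.

```odin
package target_lookup_sketch

// Reduced, hypothetical version of isa.Target: only the plain-data fields.
Target :: struct {
    name:       string,
    inst_align: u32,
    max_inst:   u32,
}

// Values taken from the comments in the Target sketch (x86 and arm64 only).
targets := [?]Target{
    {name = "x86",   inst_align = 1, max_inst = 15},
    {name = "arm64", inst_align = 4, max_inst = 4},
}

// Linear lookup by name; the cost is paid once per tool invocation,
// not once per instruction.
find_target :: proc(name: string) -> (t: Target, ok: bool) {
    for candidate in targets {
        if candidate.name == name {
            return candidate, true
        }
    }
    return {}, false
}

main :: proc() {
    t, ok := find_target("arm64")
    assert(ok && t.inst_align == 4 && t.max_inst == 4)
}
```

A real facade would store the per-arch `decode` and `print` procedures in the same record, so the selection above is the only place the tool branches on architecture.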

---

## 6. The naming contract (the most important artifact)

Every architecture package **MUST** expose these names with these
signatures. This is what makes the family feel like one library and what
the RISC-V implementation is built against as a checklist.

### Types (concrete per arch, identical names)

```
Register      Memory        Operand       Operand_Kind
Instruction   Mnemonic      Encoding      Instruction_Info
```

### Re-exported shared types (from `isa`)

```
Label  Label_Definition  Label_Map  LABEL_UNDEFINED
Relocation  Relocation_Type  Error  Error_Code  Result
Token  Token_Kind  Print_Options  DEFAULT_PRINT_OPTIONS
```

### Operand constructors

```
op_reg(r)   op_mem(m, size)   op_imm(v, size)   op_label(id, size)
mem_*(…)            # arch-specific set; at minimum mem_base_disp
                    # (mem_base in x86 is an accessor, not a constructor;
                    #  use mem_base_only for the no-displacement case)
op_<class>(typed)   # typed safe constructors where the arch has classes
```

### Instruction builders & emitters

Builder names spell out each operand kind separated by underscores
(matches x86's existing convention):

```
inst_none / inst_r / inst_r_r / inst_r_i / inst_r_m / inst_m_r / …
emit_none / emit_r / emit_rr / emit_ri / emit_rm / emit_mr / …
# NB: emit_* uses concatenated suffixes (legacy x86 spelling)
inst_<mnemonic>(…) / emit_<mnemonic>(…)   # generated typed overloads
```

### Entry points (identical signatures across arches)

```odin
encode(instructions: []Instruction, label_defs: []Label_Definition,
       code: []u8, relocs: ^[dynamic]Relocation, errors: ^[dynamic]Error,
       resolve := true, base_address: u64 = 0) -> Result

decode(data: []u8, relocs: []Relocation,
       instructions: ^[dynamic]Instruction, inst_info: ^[dynamic]Instruction_Info,
       label_defs: ^[dynamic]Label_Definition, errors: ^[dynamic]Error) -> Result

print/println/aprint/tprint/bprint/fprint/wprint(+ln)(
       instructions: []Instruction, inst_info: []Instruction_Info,
       label_defs: []Label_Definition, tokens=nil, options=nil, label_names=nil)
```

### Register/label/print helpers

```
reg_hw  reg_class  reg_size  register_name  mnemonic_to_string
label  label_forward  label_named  label_reserve  label_set
```

> Anything an arch genuinely lacks (e.g. RISC-V has no `mem_base_index`)
> is simply **absent**, not stubbed. Portable (Layer B) code stays within
> the intersection; arch-aware code uses the extras.

---

## 7. Zero-cost code reuse via parametric polymorphism

The encode/decode **drivers** are arch-independent control flow. Factor
them into `isa` as procedures generic over the instruction type `$I`,
parameterized by an arch-provided per-instruction hook. Odin monomorphizes
these at compile time → **no runtime cost, real code sharing.**

```odin
// isa/pipeline.odin  (sketch)
encode_stream :: proc(
    instructions: []$I,
    label_defs:   []Label_Definition,
    code:         []u8,
    relocs:       ^[dynamic]Relocation,
    errors:       ^[dynamic]Error,
    encode_one:   proc(inst: ^I, out: []u8, code_pos: u32,
                       relocs: ^[dynamic]Relocation, errors: ^[dynamic]Error) -> (n: u32, ok: bool),
    resolve := true, base_address: u64 = 0,
) -> Result {
    // PASS 1:   for each inst → record offset, call encode_one, advance
    // PASS 1.5: rewrite label_defs inst-index → byte-offset   (identical on every arch)
    // PASS 2:   resolve relocations / patch / spill unresolved (identical on every arch)
}
```

x86's current `encode()` becomes a thin wrapper that passes its
`encode_one` (the prefix/ModRM/SIB body); RISC-V's wrapper passes its
12-line bitfield packer. The label/relocation machinery — the part that's
easy to get subtly wrong — is written and tested **once**.
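
The driver/hook split can be tried out in miniature. The following toy is a sketch under stated assumptions: it keeps only pass 1 (no labels, relocations, or errors), and `Toy_Inst`, `toy_encode_one`, and the reduced `encode_stream` signature are invented for this example rather than taken from rexcode.

```odin
package pipeline_sketch

// Hypothetical fixed-width instruction: already a finished 32-bit word.
Toy_Inst :: struct {
    word: u32,
}

// Reduced driver, generic over the instruction type I. It only walks the
// stream, calls the arch hook, and advances the write position.
encode_stream :: proc(
    instructions: []$I,
    code:         []u8,
    encode_one:   proc(inst: ^I, out: []u8) -> (n: int, ok: bool),
) -> (written: int, ok: bool) {
    pos := 0
    for &inst in instructions {
        n, one_ok := encode_one(&inst, code[pos:])
        if !one_ok {
            return pos, false // stop at the first instruction that fails
        }
        pos += n
    }
    return pos, true
}

// Arch hook: stores the word little-endian, or fails if the buffer is short.
toy_encode_one :: proc(inst: ^Toy_Inst, out: []u8) -> (n: int, ok: bool) {
    if len(out) < 4 {
        return 0, false
    }
    w := inst.word
    out[0] = u8(w)
    out[1] = u8(w >> 8)
    out[2] = u8(w >> 16)
    out[3] = u8(w >> 24)
    return 4, true
}

main :: proc() {
    insts := [2]Toy_Inst{{0x002081B3}, {0x00208463}}
    buf: [8]u8
    n, ok := encode_stream(insts[:], buf[:], toy_encode_one)
    assert(ok && n == 8 && buf[0] == 0xB3 && buf[4] == 0x63)
}
```

The driver never looks inside `Toy_Inst`; swapping in a different instruction type and hook requires no change to `encode_stream`, which is the property the full design relies on.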

Caveats (arch-specific passes that stay out of the shared driver):
- **RISC-V pseudo-ops** (`li`, `call`, `la`, `j`) expand to one or more
  real instructions (`li` of a full 64-bit constant can take several);
  needs an arch pre-lowering pass.
- **Branch relaxation** (short↔long form) is arch-specific.
- **ARM literal pools / constant islands** are an extra emission phase.

These plug in *around* the shared driver, not inside it.

---

## 8. Concrete RISC-V mapping (RV64GC as the first target)

What each contract item becomes, to validate the design before coding:

| Contract item | RISC-V realization |
|---|---|
| `Register` | `distinct u16`, classes `REG_X` (x0–31), `REG_F` (f0–31), `REG_V` (v0–31). No REX/EVEX bits. `x0` semantic = zero. |
| typed enums | `XREG{ZERO,RA,SP,GP,TP,T0,T1,T2,S0,S1,A0..A7,S2..S11,T3..T6}`, `FREG`, `VREG` |
| `Memory` | `struct { base: Register, disp: i32 }` — no index/scale/segment |
| `mem_*` | `mem_base_only(base)`, `mem_base_disp(base, disp)` only (same constructor names as the §6 contract) |
| `Operand` | same kind-tagged shape; `size` mostly informational (width is in the mnemonic) |
| `Mnemonic` | `enum u16` — RV32I/64I + M,A,F,D,C,V (`ADDI, LW, LD, BEQ, JAL, AUIPC, FADD_D, …`) |
| `Encoding` | `struct { format: Format, opcode, funct3, funct7: u8, … }`, `Format{R,I,S,B,U,J,R4,…}` |
| `encode_one` | switch on `format`, pack fields, scatter immediate bits |
| `Encoding_Flags` | tiny (e.g. `is_compressible`, `rounding_ok`) vs x86's 11 fields |
| `Relocation_Type` | `R_RISCV_BRANCH, JAL, CALL, PCREL_HI20, PCREL_LO12_I/S, HI20, LO12_I/S, RVC_BRANCH/JUMP, …` |
| `Instruction_Info` | `offset`, `is_compressed: bool`, rounding mode — no prefix/VEX fields |
| printer | `register_name` uses ABI names; `print_memory` emits `disp(base)`; width lives in the mnemonic (no `.b/.w` suffix) |
| tables | `gen_decode_tables` becomes near-trivial: a fixed-field instruction decodes by `(opcode, funct3, funct7)` keys |
| `MAX_INST_SIZE` | `4` (or `8` to cover a compressed pair); `inst_align` = 2 |

Notable RISC-V-only concerns the design already accommodates:
- **Split immediates** → hidden in `encode_one`; operand stays a clean value.
- **Paired PC-relative relocs** (AUIPC+ADDI) → expressed via the shared
  `Relocation` struct with RISC-V's type enum; resolution of the *pair* is
  a RISC-V detail layered on the shared reloc list.
- **Compressed (C) extension** → variable 2/4-byte width handled by
  `decode_one` returning a length, exactly like x86's variable length —
  the shared decode driver already threads instruction length.
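
The length decision for the compressed extension is small enough to show in full. This is a sketch, not rexcode code: `inst_length` is a hypothetical name, and it only distinguishes the 16-bit and 32-bit encodings, ignoring the longer instruction lengths that the RISC-V specification reserves.

```odin
package riscv_length_sketch

// Hypothetical helper: returns the byte length of the instruction that
// starts with the given little-endian halfword. In the RISC-V base
// encoding, a 32-bit instruction has both low bits set; anything else in
// those two bits marks a 16-bit compressed instruction.
inst_length :: proc "contextless" (first_halfword: u16) -> int {
    return 4 if (first_halfword & 0b11) == 0b11 else 2
}

main :: proc() {
    assert(inst_length(0x4501) == 2) // low bits 01: compressed
    assert(inst_length(0x81B3) == 4) // low bits 11: full-width
}
```

A `decode_one` built on this reports 2 or 4 to the shared driver, which then advances by that amount just as it does for x86's variable-length instructions.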

If RISC-V slots cleanly into the contract (it does above), the contract is
sound for the regular fixed-width ISAs (ARM64, MIPS) too.

---

## 9. Recommended next steps

1. **Stabilize x86 first.** Resolve the constructor-rename drift noted in
   [x86_api.md](x86_api.md#known-drift) (tests/README vs `operands.odin`)
   so x86 is the clean reference the contract is extracted from.
2. **Extract `isa`** by lifting the *already-arch-independent* files:
   `labels.odin`, the `Relocation`/`Error`/`Result` types, and the printer
   framework (tokens/options/sinks/number-formatting). Make `x86`
   re-export them. This is a low-risk refactor that proves the split.
3. **Add the parametric `encode_stream`/`decode_stream`** to `isa` and
   reduce x86's `encode`/`decode` to wrappers. Validate against the
   existing test suite (same bytes out).
4. **Write the RISC-V package against the contract** (§6) and the mapping
   (§8), reusing `isa` wholesale. Build its `encoding_table.odin` by hand,
   then port the two generators.
5. **Only if a runtime-multi-target tool appears**, add the `Target`
   vtable (§5.3). Don't build it speculatively.

The deliverable order matters: every step is independently shippable, and
x86 keeps working (and keeps its performance) throughout.

---

## 10. One-paragraph summary

Make `isa` own the parts that are the same on every ISA — labels,
relocations, errors/result, the print framework, and (via Odin
parametric polymorphism) the encode/decode driver loops. Make each arch
package own its registers, memory model, operands, mnemonics, encoding
tables, and the actual `encode_one`/`decode_one` bytes. Bind the family
together with a strict **naming contract** so packages are drop-in
swappable at source level with zero runtime cost, and reserve a single
opt-in runtime `Target` vtable for the rare tool that needs to choose an
architecture dynamically. x86 keeps every cycle of its current
performance; RISC-V (and later ARM/MIPS) gets the boring 60% for free and
writes only the 40% that is genuinely its own.
79
core/rexcode/docs/mips_platforms.md
Normal file
@@ -0,0 +1,79 @@
# MIPS targets and extensions — platform catalog

> What's worth supporting in `rexcode/mips/` (or a sibling subpackage) and
> what isn't, framed around the actual hardware that runs MIPS.

## Mainline consoles (MIPS-family CPUs)

| Platform | CPU | Base ISA | Custom extension | Status |
|---|---|---|---|---|
| **PS1 / PSX** | Sony R3000A | MIPS I (no MMU) | **GTE** (COP2) — geometry transformation engine | ✅ done |
| **PSX IOP / PS3 IOP** | LSI CW33300 / "IOP" | MIPS I | (none — same as PS1 CPU) | ✅ covered by MIPS I |
| **N64** | NEC VR4300i | MIPS III + partial MIPS IV FPU | none on main CPU | ✅ covered by MIPS III + IV + FPU |
| **N64 RSP** | RCP "Reality Signal Processor" | custom MIPS R4000 subset | **VU** (128-bit vector unit, 32 vec regs); also drops mult/div/FPU/TLB | ⚠ **needs its own subpackage** — different ISA |
| **N64 RDP** | (display processor) | not a CPU, command-stream — not in scope | | |
| **PS2 EE** | Sony R5900 (Toshiba) | MIPS III + MIPS IV (MOVN/MOVZ) | **MMI** (128-bit packed SIMD via MMI0-3), **LQ/SQ**, second HI/LO, VU0-macro | ✅ done (MMI; VU0-macro forms TBD) |
| **PS2 VU0 / VU1** | "Vector Unit" | not MIPS — VLIW pair (upper + lower microcode) | — | 🚧 **separate ISA** — sibling `vu/` subpackage if needed |
| **PS2 IOP** | (R3000A reused) | MIPS I | — | ✅ covered |
| **PSP** | Sony "Allegrex" | MIPS32 R2 (+ R2 bitfield + rotates + SEB/SEH + BITREV) | **VFPU** (vector FPU, 128 32-bit regs in 8×4×4 matrices), Allegrex-specific BITREV/etc. | ⚠ Mnemonics enumerated, encodings TBD |
| **PSP Media Engine** | (second Allegrex) | same as Allegrex | same VFPU | (covered when PSP CPU is) |
| **PSV / Vita PS1-mode** | Cortex-A9 emulating R3000 | — (host is ARM) | — | |

## Arcade and other

| Platform | CPU | Base ISA | Extension | Status |
|---|---|---|---|---|
| **SNK Hyper Neo Geo 64** | NEC VR4300 | MIPS III | none | ✅ covered |
| **Konami Hornet** (arcade) | IBM PowerPC 403GA | not MIPS (PowerPC) | none | out of scope for `mips/` |
| **Sega Model 3** step 1.x | PowerPC 603e | not MIPS (PowerPC) | none | out of scope for `mips/` |

## Modern / embedded MIPS with vendor extensions

| Platform | CPU | Base | Extension | Status |
|---|---|---|---|---|
| **Ingenic XBurst** (Jz47xx) — old MP3/Android handhelds | XBurst | MIPS32 R2 | **MXU** (Multimedia Unit, custom SIMD), DSP ASE | 🚧 DSP enumerated, **MXU is XBurst-only** — defer |
| **Broadcom MIPS** (older routers) | bcm473x / bcm63xx | MIPS32 R2/R5 | DSP ASE common | DSP enumerated; encodings TBD |
| **Atheros / Qualcomm** (router SoC) | MIPS32 R2 | MIPS32 R2 | DSP common | as above |
| **MediaTek MIPS** (older routers) | MIPS32 R2 | MIPS32 R2 | DSP | as above |
| **Loongson 2/3** (China desktop) | Loongson | MIPS64 + custom | **Loongson MMI** (note: different from PS2 MMI!), **LSX** (128-bit), **LASX** (256-bit). Modern Loongson uses LoongArch instead. | 🚧 niche, defer |
| **Microchip PIC32** | MIPS M4K / microAptiv | MIPS32 R1/R2 + microMIPS | none | ✅ covered (microMIPS not in scope) |
| **Cavium Octeon** (server) | OCTEON | MIPS64 R2 | **OCTEON specific** (crypto, packet) | defer |

## Workstations (historical)

| Vendor | CPU | ISA | Notes |
|---|---|---|---|
| SGI Indy/Indigo/Octane/Origin | R4000/R5000/R8000/R10000/R12000/R14000 | MIPS III–IV | stock MIPS — ✅ covered |
| DECstation | R3000 / R4000 | MIPS I–III | ✅ covered |
| Various Unix workstations | MIPS family | various | ✅ covered |

## **NOT** MIPS (mentioned because users sometimes ask)

- **GBA / DS / 3DS / Switch** — ARM. Out of scope for `mips/`.
- **Sega Saturn** — dual SH-2. **Dreamcast** — SH-4. Not MIPS.
- **3DO** — ARM60. Not MIPS.
- **Atari Jaguar** — 68k + custom Tom/Jerry RISCs. Not MIPS.
- **Apple PowerBook / Macintosh** — PowerPC / Motorola 68k. Not MIPS.
- **Sega Genesis / Mega Drive** — 68000. **Sega 32X** — SH-2. **Sega CD** — 68k. Not MIPS.

## Recommended priority for `rexcode`

Given typical demand (emulation, decompiling old console games, romhacking, RE):

1. **What's done is the bulk of console value:** PS1, PS2, N64 main CPU, FPU, COP0.
2. **N64 RSP** — high value for N64 emulation/microcode work. Should be `rexcode/rsp/` (separate ISA — see below).
3. **PSP VFPU encodings** — high value for PSP emulation, completes the Allegrex story. Stays inside `mips/`.
4. **DSP ASE encodings** — useful for modern router/embedded reversing. Stays inside `mips/`.
5. **PS2 VU microcode** — distinct from MIPS (VLIW). Worth `rexcode/vu/` only if a real consumer appears.
6. **MSA encodings** — modern MIPS only (relevant to some Linux distros for MIPS workstations). Lower priority.
7. **Loongson / Octeon / MXU** — defer until someone needs them.

## Why N64 RSP wants its own subpackage

The RSP is a **subset** of MIPS (no MULT/DIV/FPU/TLB; no doubleword ops) **plus** a heavily custom COP2 vector unit. Trying to share `mips/` with it would mean:

- The shared Mnemonic enum picks up ~60 RSP-only vector ops (VMULF/VMACF/VADDC/VCH/VCL/VCR/VRCP/VRCPL/VRSQ/VRSQL/VRNDP/VRNDN/...) plus vector load/store variants (LBV/LSV/LDV/LQV/LRV/LPV/LUV/LHV/LFV/LWV/LTV + their store equivalents). Polluting the MIPS namespace.
- The RSP's COP2 encoding *collides* with PS1 GTE bit patterns (both use op=0x12 with the CO bit) so a single decode table can't disambiguate without an ISA gate.
- The RSP's vector loads encode element offset + size in the cofun bits in ways that have no MIPS analogue.
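
The COP2 collision in the second bullet can be illustrated with a small check. This sketch relies only on what the text states (major opcode `0x12` with the CO bit set); the `Cop2_Flavor` enum, both procedure names, and the returned table labels are hypothetical and exist only to show why a decoder would need an explicit gate.

```odin
package cop2_gate_sketch

// Hypothetical gate value a shared decoder would have to carry around.
Cop2_Flavor :: enum {
    GTE, // PS1 geometry engine
    RSP, // N64 vector unit
}

// True when the word is a COP2 "coprocessor operation": major opcode
// (bits 31:26) equal to 0x12 and the CO bit (bit 25) set.
is_cop2_co :: proc "contextless" (word: u32) -> bool {
    return (word >> 26) == 0x12 && ((word >> 25) & 1) == 1
}

// The same bit pattern must be routed to different tables depending on
// the flavor, because the bits alone do not identify the ISA.
cop2_table_name :: proc "contextless" (word: u32, flavor: Cop2_Flavor) -> string {
    if !is_cop2_co(word) {
        return "mips"
    }
    switch flavor {
    case .GTE: return "gte"
    case .RSP: return "rsp"
    }
    return "mips"
}

main :: proc() {
    word: u32 = 0x4A000000 // opcode 0x12, CO bit set, remaining bits zero
    assert(cop2_table_name(word, .GTE) == "gte")
    assert(cop2_table_name(word, .RSP) == "rsp")
}
```

Giving the RSP its own package removes the `flavor` parameter entirely: each package's decode table only ever sees its own COP2 space.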

Cleaner: `rexcode/rsp/` as a sibling subpackage. It will reuse `isa/` (labels, relocs, errors, print framework) and parallel `mips/`'s shape (registers / operands / instructions / mnemonics / encoding_table / encoder / decoder / printer). Users targeting N64 import either `mips` (for the R4300 main CPU) or `rsp` (for RSP microcode) — or both, side-by-side.
518
core/rexcode/docs/x86_api.md
Normal file
@@ -0,0 +1,518 @@
# rexcode `x86` — Complete API Extraction

> Snapshot of the entire public surface of the `x86` subpackage
> (`rexcode/x86/`), grouped by module. This is the reference the
> cross-architecture design ([cross_arch_design.md](cross_arch_design.md))
> is built against.

The package is **table-driven**: a hand-written master encoding table
(`ENCODING_TABLE`) is the single source of truth, from which the decode
tables and the typed builder procedures are *generated*. The runtime is
zero-allocation (caller owns every buffer) and the hot paths are fully
inlined.

```
                 ENCODING_TABLE  (hand-written, source of truth)
                        │
        ┌───────────────┼────────────────┐
gen_decode_tables              gen_mnemonic_builders
        │                                │
decoding_tables.odin           mnemonic_builders.odin
(decode() reads these)         (typed inst_*/emit_* helpers)
```

Pipeline at a glance:

```
 []Instruction ──encode()──▶ []u8  (+ []Relocation, []Error)
       ▲                       │
       │                       ▼
    builders                decode()
       │                       │
 inst_*/emit_*                 ▼
    []Instruction + []Instruction_Info + []Label_Definition
                               │
                               ▼
                      print()/tprint()/… ──▶ text (+ []Token)
```
|
||||
|
||||
---

## 1. Registers (`registers.odin`)

### Core type

```odin
Register :: distinct u16   // bit layout: 0b_0000_CCCC_EEEN_NNNN
//   NNNNN = hardware register number (0–31)
//   E     = needs REX/VEX .B/.R/.X extension (hw >= 8)
//   EE    = needs EVEX (hw 16–31)
//   CCCC  = register class (high byte)
```

### Class constants (high byte)

`REG_NONE`, `REG_GPR64`, `REG_GPR32`, `REG_GPR16`, `REG_GPR8`, `REG_GPR8H`
(legacy AH/CH/DH/BH), `REG_XMM`, `REG_YMM`, `REG_ZMM`, `REG_K` (opmask),
`REG_SEG`, `REG_CR` (control), `REG_DR` (debug), `REG_BND` (MPX), `REG_MM`
(MMX), `REG_ST` (x87).

### Sentinels

`NONE :: Register(0xFFFF)`, `RIP :: Register(0xFFFE)`.

### Typed register enums (compile-time safety, value == hardware number)

`GPR64`, `GPR32`, `GPR16`, `GPR8`, `GPR8H` (`AH=4..BH=7`), `XMM`, `YMM`,
`ZMM` (each 0–31), `KREG` (K0–K7), `SREG` (ES,CS,SS,DS,FS,GS), `MM`
(MM0–7), `CREG` (CR0,2,3,4,8), `DREG` (DR0–3,6,7), `ST` (ST0–7), `BND`
(BND0–3).

### Named register constants

Every register has a package-level constant: `RAX`…`R15`, `EAX`…`R15D`,
`AX`…`R15W`, `AL`…`R15B`, `AH/CH/DH/BH`, `XMM0`…`XMM31`, `YMM0`…`YMM31`,
`ZMM0`…`ZMM31`, `K0`…`K7`, `ES/CS/SS/DS/FS/GS`, `CR0/2/3/4/8`,
`DR0/1/2/3/6/7`, `BND0`…`BND3`, `MM0`…`MM7`, `ST0`…`ST7`, plus `RIP`.

### Utility functions (all branchless, `contextless`)

| Proc | Signature | Purpose |
|---|---|---|
| `reg_hw` | `(Register) -> u8` | hardware number (low 5 bits) |
| `reg_class` | `(Register) -> u16` | class (high byte) |
| `reg_needs_rex` | `(Register) -> bool` | hw >= 8 |
| `reg_needs_rex_ext` | `(Register) -> bool` | hw >= 8 and class < K |
| `reg_needs_evex` | `(Register) -> bool` | hw >= 16 |
| `reg_is_gpr` | `(Register) -> bool` | any GPR class |
| `reg_is_vector` | `(Register) -> bool` | XMM/YMM/ZMM |
| `reg_is_high_byte` | `(Register) -> bool` | AH/CH/DH/BH |
| `reg_size` | `(Register) -> u16` | size in **bits** |
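
To make the bit layout concrete, here is a minimal, self-contained sketch of how four of these helpers could be written against the documented layout. It is not the package's source: the class value `REG_CLASS_DEMO` is hypothetical (this document does not list the numeric `REG_*` values), `reg_class` is assumed to return the high byte unshifted, and the REX/EVEX checks are derived from the hardware number rather than from the `E` bits.

```odin
package register_sketch

import "core:fmt"

Register :: distinct u16

// Hypothetical class value, for illustration only.
REG_CLASS_DEMO :: u16(0x0100)

// Hardware register number: the low 5 bits (NNNNN).
reg_hw :: proc "contextless" (r: Register) -> u8 {
	return u8(r & 0x1F)
}

// Register class: the high byte, assumed here to be returned unshifted.
reg_class :: proc "contextless" (r: Register) -> u16 {
	return u16(r) & 0xFF00
}

// Registers 8 and up need a REX/VEX extension bit.
reg_needs_rex :: proc "contextless" (r: Register) -> bool {
	return reg_hw(r) >= 8
}

// Registers 16 and up are only reachable through EVEX.
reg_needs_evex :: proc "contextless" (r: Register) -> bool {
	return reg_hw(r) >= 16
}

main :: proc() {
	r12 := Register(REG_CLASS_DEMO | 12) // demo class, hardware number 12
	fmt.println(reg_hw(r12), reg_class(r12) == REG_CLASS_DEMO)
	fmt.println(reg_needs_rex(r12), reg_needs_evex(r12))
}
```

Because every helper is a mask and a compare on a 16-bit value, none of them needs a lookup table or a context pointer.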

### Register-from-number constructors

`gpr64_from_num`, `gpr32_from_num`, `gpr16_from_num` `(u8) -> Register`;
`gpr8_from_num(num: u8, has_rex: bool) -> Register` (handles AH↔SPL
aliasing); `xmm_from_num`, `ymm_from_num`, `zmm_from_num`,
`mm_from_num`. Each returns `NONE` if out of range. Pure casts, no table.

---

## 2. Operands (`operands.odin`)

### Operand kind

```odin
Operand_Kind :: enum u8 { NONE, REGISTER, MEMORY, IMMEDIATE, RELATIVE }
```

### Memory operand (packed)

```odin
Memory :: bit_field u64 {
	base_hw:            u8   | 5,
	base_ext:           bool | 1,
	index_hw:           u8   | 5,
	index_ext:          bool | 1,
	scale_enc:          u8   | 2,
	displacement:       i32  | 32,
	segment:            u8   | 3,
	addr_size_override: bool | 1,
	base_class:         u8   | 5,
	index_class:        u8   | 5,
}
MEM_BASE_RIP   :: 30
MEM_BASE_NONE  :: 31
MEM_INDEX_NONE :: 31
```

**Constructor:** `mem_make(base, index: Register, scale: u8, displacement: i32, segment: Register) -> Memory`

**Convenience constructors** (current names after the in-tree refactor):
`mem_base_only(base)`, `mem_base_disp(base, disp)`,
`mem_base_index(base, index, scale)`,
`mem_base_index_disp(base, index, scale, disp)`, `mem_rip_disp(disp)`.

> ⚠️ The README and `tests/test.odin` still use the *old* names
> (`mem_base`, `mem_base_displacement`, `mem_base_index_displacement`,
> `mem_rip_relative`). `mem_base` is now an **accessor**, not a
> constructor. See the "Known drift" note at the end.

**Accessors:** `mem_scale`, `mem_is_rip_relative`, `mem_has_base`,
`mem_has_index` `(Memory) -> …`; `mem_base`, `mem_index` `(Memory) -> Register`.

### The unified operand

```odin
Operand :: struct #packed {           // 16 bytes
	using _: struct #raw_union {
		reg:       Register,
		mem:       Memory,
		immediate: i64,
		relative:  i64,               // offset or label id
	},
	kind:  Operand_Kind,
	size:  u8,                        // operand size in bytes (1,2,4,8,16,32,64)
	flags: Operand_Flags,
	_pad:  [4]u8,
}

Broadcast :: enum u8 { NONE, B1TO2, B1TO4, B1TO8, B1TO16 }   // EVEX

Operand_Flags :: bit_field u16 {      // EVEX-specific
	mask:      u8        | 3,         // opmask K1–K7
	zeroing:   bool      | 1,         // merge vs zero masking
	broadcast: Broadcast | 3,
	er_sae:    u8        | 2,         // embedded rounding / SAE
}
```
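
The `// 16 bytes` comment can be checked mechanically. The sketch below copies the declarations above into a standalone file and pins the size with a compile-time `#assert`. `Register` and `Memory` are reduced to plain integer stand-ins (an assumption made only to keep the file self-contained), which does not change the arithmetic: an 8-byte union plus 1 + 1 + 2 + 4 bytes of trailing fields.

```odin
package operand_layout_sketch

import "core:fmt"

// Stand-ins with the same widths as the documented types.
Register :: distinct u16
Memory   :: distinct u64 // the real type is a packed `bit_field u64`

Operand_Kind :: enum u8 { NONE, REGISTER, MEMORY, IMMEDIATE, RELATIVE }
Broadcast    :: enum u8 { NONE, B1TO2, B1TO4, B1TO8, B1TO16 }

Operand_Flags :: bit_field u16 {
	mask:      u8        | 3,
	zeroing:   bool      | 1,
	broadcast: Broadcast | 3,
	er_sae:    u8        | 2,
}

Operand :: struct #packed {
	using _: struct #raw_union {
		reg:       Register,
		mem:       Memory,
		immediate: i64,
		relative:  i64,
	},
	kind:  Operand_Kind,
	size:  u8,
	flags: Operand_Flags,
	_pad:  [4]u8,
}

// Fails the build if the layout ever drifts from the documented size.
#assert(size_of(Operand) == 16)

main :: proc() {
	op: Operand
	op.kind       = .IMMEDIATE
	op.size       = 4
	op.immediate  = -1 // reached through the `using` raw union
	op.flags.mask = 3  // EVEX opmask K3
	fmt.println(size_of(Operand), op.kind, op.immediate)
}
```

The same pattern would apply to the other documented sizes (`Instruction`, `Encoding`, `Relocation`, `Error`).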

### Generic operand constructors

`op_reg(r)`, `op_mem(m, size)`, `op_mem_from_parts(base, index, scale, disp, size)`,
`op_imm8/16/32/64(v)`, `op_rel8/32(offset)`, `op_label(label_id, size=4)`.

### Typed operand constructors (compile-time class safety)

`op_gpr64`, `op_gpr32`, `op_gpr16`, `op_gpr8`, `op_gpr8h`, `op_xmm`,
`op_ymm`, `op_zmm`, `op_kreg`, `op_sreg`, `op_mm`, `op_creg`, `op_dreg`,
`op_st`, `op_bnd`: each takes the matching typed enum and returns an
`Operand` (e.g. `op_gpr64(.XMM0)` is a *compile error*).

---

## 3. Instructions (`instructions.odin`)

```odin
Rep :: enum u8 { NONE, REP, REPNE }

Instruction_Flags :: bit_field u8 {
	lock: bool|1, rep: Rep|2, segment: u8|3, addr32: bool|1, data16: bool|1,
}

Instruction :: struct #packed {       // 72 bytes
	ops:           [4]Operand,
	mnemonic:      Mnemonic,
	operand_count: u8,
	flags:         Instruction_Flags,
	length:        u8,                // filled by decoder
	_pad:          [3]u8,
}
```

### Generic instruction builders (`inst_*`, all `contextless`)

| Builder | Shape |
|---|---|
| `inst_none(m)` | no operands |
| `inst_r(m, r)` | one register |
| `inst_m(m, mem, size)` | one memory |
| `inst_i(m, imm, imm_size)` | one immediate |
| `inst_rel(m, label_id, size=4)` | branch to label |
| `inst_rel_offset(m, offset, size)` | branch to raw offset |
| `inst_r_r(m, dst, src)` | reg, reg |
| `inst_r_m(m, dst, src_mem, size)` | reg, mem |
| `inst_m_r(m, dst_mem, size, src)` | mem, reg |
| `inst_r_i(m, dst, imm, imm_size)` | reg, imm |
| `inst_m_i(m, dst_mem, size, imm, imm_size)` | mem, imm |
| `inst_r_r_r(m, dst, s1, s2)` | 3× reg (VEX/EVEX) |
| `inst_r_r_m(m, dst, s1, m2, size)` | reg, reg, mem |
| `inst_r_r_i(m, dst, src, imm, imm_size)` | reg, reg, imm |
| `inst_r_m_i(m, dst, mem, msize, imm, isize)` | reg, mem, imm |
| `inst_m_r_i(m, mem, msize, src, imm, isize)` | mem, reg, imm |
| `inst_r_m_r(m, dst, m1, msize, s2)` | reg, mem, reg |
| `inst_r_r_r_r(m, dst, s1, s2, s3)` | 4× reg |
| `inst_r_r_r_i(m, dst, s1, s2, imm, isize)` | 3 reg + imm |
| `inst_r_r_m_i(m, dst, s1, m2, msize, imm, isize)` | 2 reg + mem + imm |
| `inst_r_r_m_r(m, dst, s1, m2, msize, s3)` | 2 reg + mem + reg |

### Dynamic-array emitters (`emit_*`, in `encoder.odin`)

The `emit_*` procedures mirror the `inst_*` shapes, with the underscores
between shape letters dropped (`inst_r_r` ↔ `emit_rr`): `emit_none, emit_r,
emit_rr, emit_ri, emit_rm, emit_mr, emit_m, emit_mi, emit_rel, emit_rrr,
emit_rrm, emit_rri, emit_rrrr, emit_i, emit_rmi, emit_mri, emit_rel_offset`.
This list covers 17 of the 21 shapes above; no emitter is listed for
`inst_r_m_r`, `inst_r_r_r_i`, `inst_r_r_m_i`, or `inst_r_r_m_r`. Each takes
`(instructions: ^[dynamic]Instruction, mnemonic, …)` and appends the built
instruction.

---

## 4. Mnemonics (`mnemonics.odin`, generated)

```odin
Mnemonic :: enum u16 { INVALID = 0, MOV, MOVABS, MOVZX, …, /* ~1176 total */ }
```

Grouped by family (data transfer, arithmetic, logical, …, SSE, AVX,
AVX-512, BMI, FMA, AES, …). `INVALID = 0` is the sentinel.

---

## 5. Labels & references (`labels.odin`)

Lightweight **array-index** model (`Label_Definition`) used by
`encode()`/`decode()`. The label-construction procedures live in
`isa/labels.odin` and are parametric over the Instruction type, so they
work directly for any arch without per-arch wrappers.

### Array-index model (used by encode/decode)

```odin
Label_Definition :: distinct u32   // label_id -> instruction index, then byte offset
LABEL_UNDEFINED  :: Label_Definition(0xFFFFFFFF)
```

`label(labels: ^[dynamic]Label_Definition, instructions: ^[dynamic]Instruction) -> u32`
defines a label at the current position; `label_forward(labels) -> u32`
reserves a label ID to be defined later.
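
The array-index model is small enough to show in full. The sketch below is a standalone illustration, not the code in `isa/labels.odin`: `define_here`, `reserve`, and `rewrite_to_offsets` are hypothetical names, and the instruction stream is reduced to a count plus a slice of byte offsets. It shows the two states a `Label_Definition` goes through: an instruction index while building, then a byte offset after the encoder's rewrite step.

```odin
package label_sketch

import "core:fmt"

Label_Definition :: distinct u32
LABEL_UNDEFINED  :: Label_Definition(0xFFFFFFFF)

// Hypothetical stand-in for `label`: bind a new label to the current position,
// i.e. the index the next appended instruction will get.
define_here :: proc(labels: ^[dynamic]Label_Definition, inst_count: int) -> u32 {
	append(labels, Label_Definition(inst_count))
	return u32(len(labels) - 1)
}

// Hypothetical stand-in for `label_forward`: hand out an ID with no position yet.
reserve :: proc(labels: ^[dynamic]Label_Definition) -> u32 {
	append(labels, LABEL_UNDEFINED)
	return u32(len(labels) - 1)
}

// Rewrite instruction indices to byte offsets in place.
// `inst_offsets` needs one extra trailing entry so that a label bound to the
// end of the stream (index == instruction count) stays in range.
rewrite_to_offsets :: proc(labels: []Label_Definition, inst_offsets: []u32) {
	for &l in labels {
		if l == LABEL_UNDEFINED { continue }
		l = Label_Definition(inst_offsets[u32(l)])
	}
}

main :: proc() {
	labels: [dynamic]Label_Definition
	defer delete(labels)

	top  := define_here(&labels, 0)     // label at instruction 0
	done := reserve(&labels)            // forward reference, bound later
	labels[done] = Label_Definition(2)  // now bound to instruction 2

	// Byte offsets of three instructions, plus the end-of-stream offset.
	inst_offsets := []u32{0, 3, 5, 6}
	rewrite_to_offsets(labels[:], inst_offsets)

	fmt.println(top, done, labels[:])
}
```

A label ID is just an index into the `labels` array, so referencing a label from an operand costs one `u32` and no string lookup.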

### Named labels

```odin
Label_Map :: struct { labels: [dynamic]Label_Definition, names: map[string]u32 }
```

`label_map_init(^, allocator)`, `label_map_destroy(^)`,
`label_named(^, name, instructions) -> u32`, `label_reserve(^, name) -> u32`,
`label_set(^, name, instructions)`.

---

## 6. Encoding types (`encoding_types.odin`)

These describe **how** an instruction is encoded; they are the schema of
`ENCODING_TABLE` and are shared by encoder and decoder.

```odin
Operand_Type :: enum u8 {             // ~70 values
	NONE, R8,R16,R32,R64, RM8,RM16,RM32,RM64, M,M8..M512,
	IMM8,IMM16,IMM32,IMM64, IMM8SX, REL8,REL32,
	AL_IMPL,AX_IMPL,EAX_IMPL,RAX_IMPL,CL_IMPL,DX_IMPL,ONE_IMPL,
	SREG, CR, DR, XMM,YMM,ZMM, XMM_M32,XMM_M64,XMM_M128,YMM_M256,ZMM_M512,
	MM,MM_M64, ST0_IMPL,STI, XMM0_IMPL, K,K_M8..K_M64,
	MOFFS8..MOFFS64, PTR16_16,PTR16_32,PTR16_64, M16_16,M16_32,M16_64,
}

Operand_Encoding :: enum u8 {         // where an operand's bits go
	NONE, MR, REG, VVVV, OP_R, IB,IW,ID,IQ, IMPL, IS4, AAA,
}

Escape   :: enum u8 { NONE, _0F, _0F38, _0F3A }
VEX_Type :: enum u8 { NONE, VEX, EVEX, XOP }
VEX_W    :: enum u8 { WIG, W0, W1 }
VEX_L    :: enum u8 { LIG, L0, L1, L2 }

Encoding_Flags :: bit_field u16 {
	esc: Escape|2, prefix: u8|2, vex_type: VEX_Type|2, vex_w: VEX_W|2,
	vex_l: VEX_L|2, default_64: bool|1, force_rex_w: bool|1, no_rex: bool|1,
	lock_ok: bool|1, rep_ok: bool|1, modrm_reg_ext: bool|1,
}

Encoding :: struct #packed {          // 14 bytes, one encoding form
	mnemonic: Mnemonic, ops: [4]Operand_Type, enc: [4]Operand_Encoding,
	opcode: u8, ext: u8, flags: Encoding_Flags,
}
PREFIX_66 :: 1
PREFIX_F3 :: 2
PREFIX_F2 :: 3
```
Helper: `encoding_flags(esc=…, prefix=…, …) -> Encoding_Flags`.

### Shared status / interop types

```odin
Relocation_Type :: enum u8 { NONE, REL8, REL32, ABS32, ABS64 }
Relocation :: struct #packed {        // 16 bytes (ELF-rela-like)
	offset: u32, label_id: u32, addend: i32,
	type: Relocation_Type, size: u8, inst_idx: u16,
}

Error_Code :: enum u8 {
	NONE,
	// encode
	INVALID_MNEMONIC, NO_MATCHING_ENCODING, OPERAND_MISMATCH,
	IMMEDIATE_OUT_OF_RANGE, BUFFER_OVERFLOW, LABEL_OUT_OF_RANGE,
	INVALID_OPERAND_COUNT,
	// decode
	BUFFER_TOO_SHORT, INVALID_OPCODE, INVALID_MODRM, INVALID_SIB,
	INVALID_PREFIX, INVALID_VEX, INVALID_EVEX, TOO_MANY_PREFIXES,
}
Error  :: struct #packed { inst_idx: u32, code: Error_Code, _pad: [3]u8 }  // 8 bytes
Result :: struct { byte_count: u32, success: bool }
```
Helper: `op_type_to_size(Operand_Type) -> u8`.

---

## 7. Encoder (`encoder.odin`)

```odin
MAX_INST_SIZE :: 15

encode :: proc(
	instructions: []Instruction,
	label_defs:   []Label_Definition,   // in: inst index; MODIFIED to byte offsets
	code:         []u8,                 // output machine code
	relocs:       ^[dynamic]Relocation, // unresolved relocations appended
	errors:       ^[dynamic]Error,
	resolve:      bool = true,          // patch resolvable relocs in place
	base_address: u64  = 0,             // for ABS relocations
) -> Result
```

`encode()` runs in two passes with a fix-up step in between: (1) encode
each instruction into `code`, recording byte offsets and emitting pending
relocations; (1.5) rewrite `label_defs` from instruction indices to byte
offsets; (2) resolve relocations, appending the unresolvable ones to
`relocs`. The procedure is pure (no shared state), so independent calls
are trivially parallelizable.
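
Pass (2) is easiest to picture with a single relocation. The sketch below is a generic illustration of resolving a `REL32` field, not the encoder's actual code: `resolve_rel32` is a hypothetical name, and it assumes the displacement is measured from the end of the 4-byte field (the usual x86 convention when the field is the last part of the instruction). The `Relocation` struct is copied from §6.

```odin
package reloc_sketch

import "core:fmt"

Relocation_Type :: enum u8 { NONE, REL8, REL32, ABS32, ABS64 }

Relocation :: struct #packed {
	offset:   u32, // byte offset of the field to patch
	label_id: u32,
	addend:   i32,
	type:     Relocation_Type,
	size:     u8,
	inst_idx: u16,
}

// Patch one REL32 field in place. Returns false if the relocation is not a
// REL32, does not fit in `code`, or the displacement overflows 32 bits.
resolve_rel32 :: proc(code: []u8, r: Relocation, label_offsets: []u32) -> bool {
	if r.type != .REL32 || int(r.offset) + 4 > len(code) {
		return false
	}
	target := i64(label_offsets[r.label_id])
	next   := i64(r.offset) + 4 // assumption: field ends the instruction
	disp   := target - next + i64(r.addend)
	if disp < i64(min(i32)) || disp > i64(max(i32)) {
		return false
	}
	d := u32(i32(disp))
	// x86 immediates and displacements are little-endian.
	code[r.offset + 0] = u8(d)
	code[r.offset + 1] = u8(d >> 8)
	code[r.offset + 2] = u8(d >> 16)
	code[r.offset + 3] = u8(d >> 24)
	return true
}

main :: proc() {
	// jmp rel32 ; nop ; ret   (the jump should land on `ret` at byte 6)
	code := [7]u8{0xE9, 0, 0, 0, 0, 0x90, 0xC3}
	label_offsets := []u32{6}

	r := Relocation{offset = 1, label_id = 0, type = .REL32, size = 4}
	ok := resolve_rel32(code[:], r, label_offsets)

	// disp = 6 - (1 + 4) = 1, so the jump skips the one-byte nop.
	fmt.println(ok, code)
}
```

A relocation whose label is still `LABEL_UNDEFINED`, or whose target lies outside the buffer, is the kind that would be left in `relocs` for the caller.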

Buffer-sizing helpers: `encode_max_code_size(n) -> int` (`n*15`),
`encode_max_relocation_count(n) -> int` (`n`).
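
Putting the pieces together, a caller might drive `encode()` roughly as follows. This is a sketch built only from the signatures in this document. The import path `core:rexcode/x86` is an assumption, as is the idea that the generic builders accept the package-level register constants directly. No labels are used, so `nil` is passed for `label_defs`.

```odin
package encode_example

import "core:fmt"
import x86 "core:rexcode/x86" // assumed import path

main :: proc() {
	// mov rax, rbx ; mov rcx, rax, built with the generic builders from §3
	instructions := []x86.Instruction{
		x86.inst_r_r(.MOV, x86.RAX, x86.RBX),
		x86.inst_r_r(.MOV, x86.RCX, x86.RAX),
	}

	// The caller owns every buffer: size the output for the worst case.
	code := make([]u8, x86.encode_max_code_size(len(instructions)))
	defer delete(code)

	relocs: [dynamic]x86.Relocation
	errors: [dynamic]x86.Error
	defer delete(relocs)
	defer delete(errors)

	result := x86.encode(instructions, nil, code, &relocs, &errors)
	if !result.success {
		for e in errors {
			fmt.eprintln("instruction", e.inst_idx, "failed:", e.code)
		}
		return
	}

	// Only the first `byte_count` bytes of `code` are meaningful.
	for b in code[:result.byte_count] {
		fmt.printf("%02x ", b)
	}
	fmt.println()
}
```

Sizing with `encode_max_code_size` over-allocates (15 bytes per instruction) but avoids a `BUFFER_OVERFLOW` error without a counting pre-pass.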

Internal matcher (file-local, inlined): `encoding_matches_inline`,
`operand_matches_inline`, `reg_matches_inline`, `mem_matches_inline`,
`imm_matches_inline`, `implicit_operand_matches`, `is_implicit_op_inline`,
`get_user_op_inline`.

---

## 8. Decoder (`decoder.odin`)

```odin
Instruction_Info :: struct {          // parallel metadata, one per decoded inst
	offset: u32,
	rex: u8, has_lock: bool, rep: Rep, segment: Register,
	vex_type: VEX_Type, vex_l: VEX_L, vex_w: VEX_W,
	evex_b: bool, evex_z: bool, opmask: u8,
}

decode :: proc(
	data:         []u8,
	relocs:       []Relocation,                // optional in: name labels
	instructions: ^[dynamic]Instruction,       // out
	inst_info:    ^[dynamic]Instruction_Info,  // out (parallel)
	label_defs:   ^[dynamic]Label_Definition,  // out: inferred branch labels
	errors:       ^[dynamic]Error,
) -> Result
```

Two-pass: (1) decode each instruction (prefixes → opcode → operands),
collecting branch targets; (2) infer labels for in-region branch targets,
reusing IDs from `relocs` when available.

`Decoder_State` (file-internal) holds prefix/VEX/EVEX decode state. The
decoder relies on the generated tables in §10. Mostly file-internal procs:
`decode_prefixes`, `decode_vex2/3`, `decode_evex`, `decode_opcode(_vex)`,
`decode_operands(_vex)`, `decode_single_operand(_vex)`,
`decode_memory_operand`, `decode_register`, `decode_implicit_operand`.

---

## 9. Printer (`printer.odin`)

Modified Intel syntax: size suffix on the mnemonic (`.b .w .d .q .x .y
.z`) instead of `PTR`, clean `[base + index*scale + disp]` memory.

```odin
Token_Kind :: enum u8 {
	WHITESPACE, NEWLINE, LABEL_DEF, LABEL_REF, OFFSET,
	MNEMONIC, REGISTER, IMMEDIATE, MEMORY_BRACKET, MEMORY_OPERATOR,
	MEMORY_DISP, MEMORY_SCALE, PUNCTUATION, COMMENT,
}

Token :: struct { offset: u32, length: u16, kind: Token_Kind, instruction_index: u16 }

Print_Options :: struct {
	uppercase: bool, hex_prefix: string, hex_lowercase: bool,
	label_prefix: string, show_offsets: bool, indent: string,
	separator: string, space_after_comma: bool,
}
DEFAULT_PRINT_OPTIONS :: Print_Options{ … }

Print_Result :: struct { text: string, tokens: []Token }
```

Helpers: `mnemonic_to_string(m, lowercase) -> string`,
`register_name(r, lowercase) -> string`, `token_kind_to_string`,
`size_to_suffix(size) -> u8`.
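
The suffix scheme can be sketched as a plain size-to-character mapping. The pairing below is an assumption: it lines up the seven suffixes `.b .w .d .q .x .y .z` with the seven operand sizes listed for `Operand.size` (1, 2, 4, 8, 16, 32, 64 bytes) in order. The procedure is a hypothetical re-implementation, not the one in `printer.odin`.

```odin
package suffix_sketch

import "core:fmt"

// Map an operand size in bytes to a one-character mnemonic suffix.
// Returns 0 for sizes that have no suffix.
size_to_suffix :: proc "contextless" (size: u8) -> u8 {
	switch size {
	case 1:  return 'b' // byte
	case 2:  return 'w' // word
	case 4:  return 'd' // doubleword
	case 8:  return 'q' // quadword
	case 16: return 'x' // 128-bit (XMM-sized)
	case 32: return 'y' // 256-bit (YMM-sized)
	case 64: return 'z' // 512-bit (ZMM-sized)
	}
	return 0
}

main :: proc() {
	sizes := []u8{1, 2, 4, 8, 16, 32, 64}
	for s in sizes {
		// e.g. an 8-byte memory operand would print as `mov.q` rather than
		// `mov qword ptr`.
		fmt.printf("%d bytes -> mov.%c\n", s, rune(size_to_suffix(s)))
	}
}
```

Carrying the size on the mnemonic keeps the memory operand itself free of `PTR` keywords, which is what makes the `[base + index*scale + disp]` form uniform.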

### Output variants

All variants share the same trailing parameter set:
`tokens=nil, options=nil, label_names=nil`.

| Family | Sink |
|---|---|
| `sbprint` / `sbprintln` | into a `^strings.Builder` |
| `print` / `println` | stdout |
| `aprint` / `aprintln` | newly allocated string (`allocator` param) |
| `tprint` / `tprintln` | temp-allocator string |
| `bprint` / `bprintln` | caller `[]u8` buffer |
| `fprint` / `fprintln` | `^os.File` |
| `wprint` / `wprintln` | `io.Writer` |

All take `(instructions: []Instruction, inst_info: []Instruction_Info,
label_defs: []Label_Definition, …)`.

---

## 10. Generated tables & builders

### `encoding_table.odin` (hand-written master)

```odin
ENCODING_TABLE: [Mnemonic][]Encoding = { .MOV = { …forms… }, … }
```
The single source of truth. `encode()` does `ENCODING_TABLE[mnemonic]`
(O(1)) then linear-scans the forms via `encoding_matches_inline`.

### `decoding_tables.odin` (generated from `ENCODING_TABLE`)

```odin
ModRM_Info :: struct #packed { mod, reg, rm: u8, has_sib: bool, disp_size: u8 }
SIB_Info   :: struct #packed { /* scale, index, base */ }
Decode_Entry :: struct { esc: Escape, prefix, opcode, ext: u8,
                         mnemonic: Mnemonic, ops: [4]Operand_Type,
                         enc: [4]Operand_Encoding, flags: Encoding_Flags }
VEX_Decode_Entry :: struct { …Decode_Entry fields + vex_w: VEX_W, vex_l: VEX_L }
Decode_Index :: struct { start: u16, count: u8 }   // range into entries

MODRM_TABLE[256], SIB_TABLE[256]
LEGACY_DECODE_ENTRIES[1266], VEX_DECODE_ENTRIES[667], EVEX_DECODE_ENTRIES[418]
DECODE_INDEX_LEGACY[4][256], DECODE_INDEX_ESC_0F/_0F38/_0F3A[4][256]
VEX_INDEX_0F/_0F38/_0F3A[4][256], EVEX_INDEX_0F/_0F38/_0F3A[4][256]
```
`[prefix][opcode] -> Decode_Index` gives O(1) opcode resolution; the
small `count` range is scanned for ModR/M-ext, operand-size, or VEX.W/L
disambiguation.
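
The two-level lookup can be shown with a toy table. In the sketch below only `Decode_Index` matches the documented type. `Toy_Entry`, `NO_EXT`, and `lookup` are hypothetical, and the entry list holds three well-known legacy opcodes (`90` NOP, `F7 /2` NOT, `F7 /3` NEG) so that the ModR/M-extension scan has something to disambiguate.

```odin
package decode_index_sketch

import "core:fmt"

Decode_Index :: struct { start: u16, count: u8 } // range into the entry list

// Toy entry: just enough fields to show disambiguation by ModR/M.reg.
Toy_Entry :: struct { opcode: u8, ext: u8, name: string }

NO_EXT :: 0xFF // hypothetical marker: opcode has no /digit extension

ENTRIES := [?]Toy_Entry{
	{0x90, NO_EXT, "nop"},
	{0xF7, 2,      "not"},
	{0xF7, 3,      "neg"},
}

// Step 1: O(1) index by [prefix][opcode].
// Step 2: scan the (small) range for the entry whose extension matches.
lookup :: proc(index: ^[4][256]Decode_Index, prefix, opcode, modrm_reg: u8) -> (name: string, ok: bool) {
	idx := index[prefix][opcode]
	for i in 0..<int(idx.count) {
		e := ENTRIES[int(idx.start) + i]
		if e.ext == NO_EXT || e.ext == modrm_reg {
			return e.name, true
		}
	}
	return "", false
}

main :: proc() {
	// Zero-initialised: count == 0 means "no entry for this opcode".
	index: [4][256]Decode_Index
	index[0][0x90] = {start = 0, count = 1}
	index[0][0xF7] = {start = 1, count = 2}

	name, ok := lookup(&index, 0, 0xF7, 3)
	fmt.println(name, ok)

	name, ok = lookup(&index, 0, 0x0F, 0)
	fmt.println(name, ok)
}
```

The index tables stay dense and fixed-size (`[4][256]`), while the entry list only holds forms that actually exist, which is why `Decode_Index` can be a 16-bit start plus an 8-bit count.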

### `mnemonic_builders.odin` (generated, ~7,477 procs + ~2,338 overload groups)

Typed memory wrappers `Mem8 … Mem512` (distinct structs over `Memory`)
with constructors `mem8 … mem512`. Per-form typed procs like
`inst_mov_r64_r64(dst: GPR64, src: GPR64) -> Instruction`, each grouped
into an overload set:

```odin
inst_mov :: proc{ inst_mov_r8_r8, inst_mov_r64_r64, inst_mov_r64_imm64, … }
emit_mov :: proc{ emit_mov_r8_r8, … }
```
So `x86.inst_mov(.RAX, .RBX)` resolves to the right encoding form at
compile time, with full type checking and no runtime dispatch.
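
The mechanism behind this is Odin's explicit procedure groups. The standalone sketch below reproduces it at toy scale: the enums are cut down to four registers each, and the builders return a `Form` tag instead of a real `Instruction` (both simplifications are assumptions made to keep the file self-contained). Enum values are written out in full (`GPR64.RAX`) rather than as implicit selectors.

```odin
package overload_sketch

import "core:fmt"

// Cut-down typed register enums (value == hardware number).
GPR64 :: enum u8 { RAX, RCX, RDX, RBX }
GPR32 :: enum u8 { EAX, ECX, EDX, EBX }

// Toy stand-in for `Instruction`: records which form was selected.
Form :: enum u8 { MOV_R64_R64, MOV_R32_R32, MOV_R64_IMM64 }

inst_mov_r64_r64   :: proc "contextless" (dst, src: GPR64) -> Form { return .MOV_R64_R64 }
inst_mov_r32_r32   :: proc "contextless" (dst, src: GPR32) -> Form { return .MOV_R32_R32 }
inst_mov_r64_imm64 :: proc "contextless" (dst: GPR64, imm: i64) -> Form { return .MOV_R64_IMM64 }

// Explicit overload set: the compiler picks a member from the argument types.
inst_mov :: proc{ inst_mov_r64_r64, inst_mov_r32_r32, inst_mov_r64_imm64 }

main :: proc() {
	fmt.println(inst_mov(GPR64.RAX, GPR64.RBX))
	fmt.println(inst_mov(GPR32.EAX, GPR32.ECX))
	fmt.println(inst_mov(GPR64.RAX, i64(42)))

	// Mixing classes has no matching member and is rejected at compile time:
	// inst_mov(GPR64.RAX, GPR32.EAX)
}
```

Since the group is resolved during type checking, the call compiles down to a direct call of the selected per-form procedure.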

---

## 11. Tools (`x86/tools/`)

| File | Package | Role |
|---|---|---|
| `gen_decode_tables.odin` | `main` (`-file`) | walk `ENCODING_TABLE` → emit `decoding_tables.odin` |
| `gen_mnemonic_builders.odin` | `main` (`-file`) | walk `ENCODING_TABLE` → emit `mnemonic_builders.odin` |
| `verify_tables.odin` | `main`, imports `x86 "../"` | check decode tables consistent with `ENCODING_TABLE` |

Tests live in `x86/tests/test.odin` (`package x86_tests`, `import x86 "../"`),
run with `odin run x86/tests`.

---

## Known drift (pre-existing, not from the move)

The working tree had uncommitted edits to `operands.odin`/`printer.odin`
that **renamed the memory constructors** but did not update callers:

- `mem_base_displacement` → `mem_base_disp`
- `mem_base_index_displacement` → `mem_base_index_disp`
- `mem_rip_relative` → `mem_rip_disp`
- `mem_base` repurposed from *constructor* to *accessor* (the base-only
  constructor is now `mem_base_only`)

Result: the library compiles, but `tests/test.odin` (and the README
examples) reference the old names and currently fail to type-check.
Fixing this requires either restoring the old constructor names or
sweeping the tests/README to the new ones; that choice is deliberately
left to you.