The previous runtime_detect.zig called std.zig.system.resolveTargetQuery
which pulled in the entire Zig target/CPU model table infrastructure for
every architecture (~4,000 symbols, ~175 KB of data tables, ~130 KB of
code). This bloated the binary by ~500 KB and shifted code layout enough
to cause a measurable icache/branch-predictor regression in unrelated
hot paths like the terminal parser (~20% more cycles for identical
instruction counts).
Replace with minimal, direct CPU feature detection per architecture:
CPUID + XGETBV inline assembly on x86, sysctlbyname on Darwin AArch64,
and getauxval/prctl via std.os.linux (direct syscalls, no libc) on
Linux for AArch64, PPC, S390x, RISC-V, and LoongArch.
Split into per-architecture files under src/detect/ for
maintainability.
This uses a custom fork of `hwy/targtes.cpp` that uses an extern
function written in Zig to use Zig's standard CPU detection to avoid
a dependency on Apple SDK headers.
This is on the path to removing Apple SDK requirements to build
libghostty-vt, but will require a lot more work outside of this. The goal
is to get this out of our external dependencies first and then we can
work on removing the internal side.