This fixes various minor things that didn't seem right or could be
improved:
- XXH3_state is documented to have a strict alignment requirement of 64
bytes, and thus came with a disclaimer not to use `new` because it
wouldn't be aligned correctly. It now has an `#align(64)` so that it
will be.
- An `_internal` proc being marked `#force_no_inline` (every other one
is `#force_inline`)
- Unnecessarily casting the product of two u32s through u128 (and
ultimately truncating to u64 anyway)
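
As a rough illustration of the first and last points, here is a minimal
sketch. `Aligned_State` and `mul32_to_64` are hypothetical stand-ins
made up for this example, not the actual names or layout in the
package. It assumes only that a struct can carry `#align(64)` and that
widening both operands to u64 is enough, since the product of two u32s
is at most (2^32 - 1)^2, which is below 2^64.

```odin
package main

import "core:fmt"

// Hypothetical stand-in for the real state struct; only the alignment
// attribute matters for this sketch.
Aligned_State :: struct #align(64) {
	acc: [8]u64,
}

// Compile-time check that the attribute took effect.
#assert(align_of(Aligned_State) == 64)

// Hypothetical helper: widen both u32 operands to u64 before
// multiplying. The full product always fits, so no u128 is needed.
mul32_to_64 :: proc(a, b: u32) -> u64 {
	return u64(a) * u64(b)
}

main :: proc() {
	// Worst case: both operands are the largest u32.
	a := max(u32)
	b := max(u32)

	direct  := mul32_to_64(a, b)
	via_128 := u64(u128(a) * u128(b)) // the old, roundabout form

	// Both forms agree, so the u128 detour adds nothing.
	assert(direct == via_128)
	fmt.println(direct == via_128)
}
```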