This change allows programs built with Odin and its core library to be distributed in binary form without also shipping the LICENSE alongside them.
This makes a large difference on big datasets on my system (about 2x
faster with SSE2, 3x with AVX2), but the gain may be hardware-dependent
(e.g. on instruction cache size).
The trade-off is somewhat larger code for the large-data case (~75%
larger).
This includes various minor things that didn't seem right or could be
improved, including:
- `XXH3_state` is documented to have a strict alignment requirement of 64
bytes, and thus came with a disclaimer not to use `new` because the
result wouldn't be aligned correctly. It now has an `#align(64)`, so
`new` aligns it correctly.
- One `_internal` proc was marked `#force_no_inline`, while every other
one is `#force_inline`.
- The product of two u32s was unnecessarily cast through u128 (and
ultimately truncated to u64 anyway).
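
The alignment and multiplication points above can be illustrated with a
small C11 sketch (C rather than Odin, so that it stands alone). The names
`demo_state` and `mul32_to_64` are hypothetical, not identifiers from the
Odin package, and the struct is only a stand-in for a state type with a
64-byte alignment requirement.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a hash state that must be 64-byte aligned.
   Putting the requirement on the type itself means every correctly typed
   variable picks it up without the caller having to remember it. */
typedef struct {
    alignas(64) uint64_t acc[8];
    uint8_t buffer[256];
    size_t  buffered;
} demo_state;

/* Compile-time check that the type really carries the alignment. */
static_assert(alignof(demo_state) == 64, "state must be 64-byte aligned");

/* Full 32x32 -> 64-bit product. Widening one operand to 64 bits is
   enough: the product of two 32-bit values always fits in 64 bits, so a
   detour through a 128-bit type adds nothing. */
static uint64_t mul32_to_64(uint32_t a, uint32_t b) {
    return (uint64_t)a * b;
}
```

Even the largest inputs, `0xFFFFFFFF * 0xFFFFFFFF`, give
`0xFFFFFFFE00000001`, which fits in a u64 without truncation.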
This uses compile-time features to decide how wide a SIMD vector to
use. It currently checks for amd64/i386 and sizes its vectors for
SSE2/AVX2/AVX512 accordingly.
The generalized SIMD functions could also be useful for multiversioning
of the hash procs, to allow for run-time dispatch based on available CPU
features.
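
The compile-time vector sizing described above might look roughly like
this in C. This is a sketch under assumptions, not the Odin code:
`VEC_BYTES`, `STRIPE_BYTES`, `LANES` and `accumulate_stripe` are
hypothetical names, and plain addition stands in for the real mixing
step. The `__SSE2__`/`__AVX2__`/`__AVX512F__` macros are the ones GCC and
Clang predefine when those instruction sets are enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pick the widest vector the build target guarantees. */
#if defined(__AVX512F__)
  #define VEC_BYTES 64
#elif defined(__AVX2__)
  #define VEC_BYTES 32
#elif defined(__SSE2__)
  #define VEC_BYTES 16
#else
  #define VEC_BYTES 8   /* scalar fallback: one u64 lane */
#endif

#define STRIPE_BYTES 64                       /* hypothetical block size */
#define LANES (VEC_BYTES / sizeof(uint64_t))  /* u64 lanes per vector */

static_assert(STRIPE_BYTES % VEC_BYTES == 0,
              "a stripe must be a whole number of vectors");

/* Add one 64-byte stripe into the accumulators, walking it in
   vector-sized steps. Both loops have compile-time trip counts, so the
   inner one is a candidate for a single SIMD instruction. */
static void accumulate_stripe(uint64_t acc[8], const uint64_t in[8]) {
    for (size_t v = 0; v < STRIPE_BYTES / VEC_BYTES; v++) {
        for (size_t lane = 0; lane < LANES; lane++) {
            size_t i = v * LANES + lane;
            acc[i] += in[i];
        }
    }
}
```

The result is the same for every vector width; only the loop shape the
compiler sees changes.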
Randomize the size of the input passed to each `update` call.
The test prints a line such as "Using user-selected seed {18109872483301276539,2000259725719371} for update size randomness."
If a streaming test then fails, you can repeat it using:
`odin run . -define:RAND_STATE=18109872483301276539 -define:RAND_INC=2000259725719371`
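
A rough C sketch of why passing the two values back in reproduces a run:
the same generator state always yields the same sequence of update
sizes. The use of PCG32 here is an assumption suggested by the
state/increment pair, and `pcg32`, `pcg32_next` and `next_chunk` are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal PCG32 generator. The (state, inc) pair plays the role of the
   two seed values; whether the Odin tests use this exact generator is
   an assumption. */
typedef struct { uint64_t state, inc; } pcg32;

static uint32_t pcg32_next(pcg32 *r) {
    uint64_t old = r->state;
    r->state = old * 6364136223846793005ULL + (r->inc | 1u);
    uint32_t xorshifted = (uint32_t)(((old >> 18) ^ old) >> 27);
    uint32_t rot = (uint32_t)(old >> 59);
    return (xorshifted >> rot) | (xorshifted << ((32u - rot) & 31u));
}

/* Length of the next chunk to feed to update: a random value in
   [1, max_chunk], clamped to what is left. Returns 0 once the input is
   exhausted. max_chunk must be greater than 0. */
static size_t next_chunk(pcg32 *r, size_t remaining, size_t max_chunk) {
    size_t len = 1 + pcg32_next(r) % max_chunk;
    return len < remaining ? len : remaining;
}
```

Calling `next_chunk` in a loop until it returns 0 splits a buffer into
random pieces, and two generators started from the same pair split it
identically.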
Test XXH32, XXH64, XXH3-64 and XXH3-128 for large inputs, with both all-at-once and streaming APIs.
XXH32_create_state and XXH64_create_state now implicitly call their "reset state" variants to simplify the streaming API to 3 steps:
- create state / defer destroy
- update
- digest (finalize)
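
The three steps above can be sketched in C as follows. FNV-1a (64-bit)
is swapped in for xxHash to keep the example self-contained, so only the
shape of the API is the point; all `hash_*` names are hypothetical and
are not the Odin procs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* FNV-1a 64-bit constants (stand-in hash, not xxHash). */
#define FNV_OFFSET 14695981039346656037ULL
#define FNV_PRIME  1099511628211ULL

typedef struct { uint64_t h; } hash_state;

static void hash_reset(hash_state *s) { s->h = FNV_OFFSET; }

/* Step 1: create. It resets internally, so callers need no separate
   reset call before the first update. Returns NULL on allocation
   failure. */
static hash_state *hash_create(void) {
    hash_state *s = malloc(sizeof *s);
    if (s) hash_reset(s);
    return s;
}

/* Step 2: update, callable any number of times with any chunk size. */
static void hash_update(hash_state *s, const void *data, size_t len) {
    const uint8_t *p = data;
    for (size_t i = 0; i < len; i++) {
        s->h ^= p[i];
        s->h *= FNV_PRIME;
    }
}

/* Step 3: digest (finalize). */
static uint64_t hash_digest(const hash_state *s) { return s->h; }

/* C has no defer, so the destroy call is explicit. */
static void hash_destroy(hash_state *s) { free(s); }

/* One-shot version, for comparison with the streaming path. */
static uint64_t hash_oneshot(const void *data, size_t len) {
    hash_state s;
    hash_reset(&s);
    hash_update(&s, data, len);
    return hash_digest(&s);
}
```

Feeding a buffer through `hash_update` in several pieces and then calling
`hash_digest` gives the same value as `hash_oneshot` over the whole
buffer, which is the property the streaming tests check.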
These are tested with zero-filled arrays of 1, 2, 4, 8 and 16 megabytes.
All produce the same hashes as the one-shot version and the official xxhsum tool.
3778/3778 tests successful.