Make changes similar to those made in sets.nim: hcode, rightSize, the
rawGet/rawGetKnownHC result protocol, and a nextTry probe sequence that is the
cache-friendlier h=h+1. That probe sequence in turn allows changing deletion to
local rehashing, which fixes the infinite loop bug. Local rehashing also has
the desirable properties of graceful table aging when deletes do happen and of
insert-only usage patterns no longer paying any time/space cost to check
deleted status.
Unlike collections.sets, this module has add() for duplicate key inserts and
a 3rd type of table, CountTable. The first wrinkle is handled by introducing
a rawGetDeep for unconditionally adding entries along collision chains. The
point of CountTable seems to be space efficiency at 2 items per slot. These
changes retain that by keeping the val==0 => EMPTY rule and not caching hash
codes. putImpl is expanded in-place for CountTable since the new putImpl() is
too different. { Depending on table size relative to caches & key expense,
regular Table[A,B] may become faster than CountTable, especially if the basic
count update could be something like inc(mGetOrPut(t, key, 0)). }
Unit tests pass, but in this module those are more of a demo than a probe
for bugs. This should be exercised/tested a little more before merging.
I got a warning about deprecated names here. I also know that other names probably need to change (T/P prefixes), but I am unsure about the exact rules. I may do that later if you like.
I don't know if the (15|16...) is supposed to work on OSX. I have "libmysqlclient.18.dylib" in my lib directory and get "could not load: libmysqlclient.(15|16|17[18).dylib" on execution. After removing the pattern I can run my little example program and it works as "libmysqlclient.dylib" is a softlink to the current version anyway.
The estimation of the initialSize as simply array len + 10 was too small for
all but the smallest sets. It would not elide/skip one final enlarge(). That
last one is actually always the most expensive enlarge(). Indeed, in a series
where one starts from tiny and builds up the table, that last one is about 50%
of all the enlarging time in general. So, this simple and reasonable
optimization (compared to just starting at 64) was only helping about half as
much as it could.
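
The "about 50%" figure follows from the geometric series of doubling costs. A
small sketch of the arithmetic, assuming that the table doubles on every
enlarge() and that each enlarge costs time proportional to the new capacity
(both are assumptions of this example, as is the name last_enlarge_fraction):

```python
def last_enlarge_fraction(start, final):
    """Share of total enlarge cost spent in the last doubling.

    Assumes capacity doubles from `start` until it reaches at least `final`,
    and that each enlarge costs time proportional to the new capacity.
    """
    costs = []
    cap = start
    while cap < final:
        cap *= 2
        costs.append(cap)
    if not costs:
        return 0.0  # no enlarge was needed at all
    return costs[-1] / sum(costs)


# Growing from 64 slots to 65536 slots by repeated doubling.
fraction = last_enlarge_fraction(64, 65536)
```

Under those assumptions the final doubling accounts for roughly half of the
total, so skipping every enlarge except the last saves only about half of the
work.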
Introduce a rightSize() proc to be the inverse to mustRehash(). Export it
to clients since pre-sizing is externally useful in set construction and the
current mustRehash rules are opaque and beyond the control of clients.
Also add test module logic to check that rightSize() and mustRehash() are
inverses in the appropriate sense. This is not really done in a
block/assertion-throwing unit test, since it is a performance nice-to-have
rather than a matter of basic correctness. (Also, fix a too vs. two typo in a
doc comment.)
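
A sketch of what "inverse in the appropriate sense" could mean, in Python for
illustration. The load rule in must_rehash (rehash when more than 2/3 full or
when fewer than 4 slots are free) and the formula in right_size are assumptions
for this example, not quoted from the module. The old estimate is likewise
assumed to be len + 10 rounded up to a power of two.

```python
def must_rehash(length, counter):
    # Assumed rule: too full past 2/3 load, or fewer than 4 free slots.
    return length * 2 < counter * 3 or length - counter < 4


def next_power_of_two(n):
    p = 1
    while p < n:
        p *= 2
    return p


def right_size(count):
    # Smallest-effort inverse of the assumed rule: leave 1.5x room plus 4 slots,
    # then round up to a power of two.
    return next_power_of_two(count * 3 // 2 + 4)


def presizing_problems(limit):
    """Counts for which pre-sizing would still trigger an enlarge."""
    return [c for c in range(limit)
            if right_size(c) <= c or must_rehash(right_size(c), c)]


problems = presizing_problems(1000)            # expected to be empty
old_estimate_rehashes = must_rehash(next_power_of_two(100 + 10), 100)
```

With these assumed rules, a table created at right_size(n) holds n items
without a further enlarge, while the old len + 10 estimate for 100 items still
triggers one.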