67 Commits

Author SHA1 Message Date
Timothee Cour
63f1c38f4e hashes: support object default hash (#17175) 2021-02-26 08:45:37 +01:00
dawidkotlin
95664e1524 add example of hashing an object by all of its fields with `fields` (#16643)
* add example of hashing an object by all of its fields with `fields`

* Update lib/pure/hashes.nim

* Update lib/pure/hashes.nim

* Update lib/pure/hashes.nim

Co-authored-by: flywind <43030857+xflywind@users.noreply.github.com>
Co-authored-by: Timothee Cour <timothee.cour2@gmail.com>
2021-02-19 07:59:33 +01:00
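
The commit above describes hashing an object by all of its fields with `fields`. A minimal sketch of that idea follows; the `Point` type and its field names are hypothetical, chosen only for illustration, and this is not necessarily the exact example added to `lib/pure/hashes.nim`. It uses the `!&` (mix) and `!$` (finish) operators from `std/hashes`.

```nim
import std/hashes

type
  Point = object        # hypothetical example type, not from the commit
    x, y: int
    label: string

proc hash(p: Point): Hash =
  ## Mixes the hash of every field of `p` into one value.
  var h: Hash = 0
  for f in fields(p):   # `fields` iterates over all fields at compile time
    h = h !& hash(f)    # mix this field's hash into the running value
  result = !$h          # finish the hash

let a = Point(x: 1, y: 2, label: "a")
let b = Point(x: 1, y: 2, label: "a")
doAssert hash(a) == hash(b)   # equal field values give equal hashes
```

This only works when every field type already has a `hash` proc in scope.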
flywind
a2855b66ae JS: make hash float support IE/Safari (#16872) 2021-02-01 13:19:25 +01:00
flywind
111092e8aa refactor hash in JS backend (#16863) 2021-01-30 14:14:38 +01:00
konsumlamm
7b632f9ccb Improve documentation for the hashes module (#16720)
* Improve documentation for hashes

* Fix runnableExamples

* Apply suggestions
2021-01-15 22:42:01 +00:00
flywind
e869767aa7 fix #16061 (#16551) 2021-01-02 17:13:01 +01:00
flywind
d8b1ffc857 fix #16542 (#16549)
* fix #16542
2021-01-02 14:32:37 +01:00
flywind
cbc793b30b move tests to testament (#16101)
* move tests to testament

* minor

* fix random

* disable test random
2020-11-24 19:06:41 +01:00
c-blake
a9bd4c4e80 Alternate to https://github.com/nim-lang/Nim/pull/15915 (#15937)
* Alternate PR to https://github.com/nim-lang/Nim/pull/15915 to
resolve the problem mentioned there (`hash() == 0`) as well as
to close https://github.com/nim-lang/Nim/issues/15624

* Address https://github.com/nim-lang/Nim/pull/15937#discussion_r522759669
{ though this was only a move from 2 copies to 3 copies. ;-) }
2020-11-13 14:04:40 +01:00
Clyybber
ac65986aae Fix #14394 (#14395) 2020-05-18 17:43:06 +01:00
hlaaftana
fbc97e712a move since from inclrtl to std/private/since (#14188)
* move since from inclrtl to std/private/since
* move since import in system below for HCR
2020-05-02 23:51:59 +02:00
cooldome
289d48e5fe bug fix (#14149) [backport:1.2]
Co-authored-by: cooldome <ariabushenko@bk.ru>
2020-04-28 18:02:14 +02:00
Andreas Rumpf
242d39d27f fixes #12834 (#14017) 2020-04-19 14:42:45 +02:00
Andreas Rumpf
60ec5c89c5 added a .since annotation to hashIdentity 2020-04-15 23:35:10 +02:00
c-blake
a0b33f9408 Add hashWangYi1 (#13823)
* Unwind just the "pseudorandom probing" (whole hash-code-keyed variable
stride double hashing) part of recent sets & tables changes, which has
still been causing bugs over a month later (e.g., two days ago
https://github.com/nim-lang/Nim/issues/13794) and still has
several "figure this out" implementation question comments in it (see
just the diffs of this PR).

This topic has been discussed in many places:
  https://github.com/nim-lang/Nim/issues/13393
  https://github.com/nim-lang/Nim/pull/13418
  https://github.com/nim-lang/Nim/pull/13440
  https://github.com/nim-lang/Nim/issues/13794

Alternative/non-mandatory stronger integer hashes (or vice-versa opt-in
identity hashes) are a better solution that is more general (no illusion
of one hard-coded sequence solving all problems) while retaining the
virtues of linear probing such as cache obliviousness and age-less tables
under delete-heavy workloads (still untested after a month of this change).

The only real solution for truly adversarial keys is a hash keyed off of
data unobservable to attackers.  That all fits better with a few families
of user-pluggable/define-switchable hashes which can be provided in a
separate PR more about `hashes.nim`.

This PR carefully preserves the better (but still hard coded!) probing
of the  `intsets` and other recent fixes like `move` annotations, hash
order invariant tests, `intsets.missingOrExcl` fixing, and the move of
`rightSize` into `hashcommon.nim`.

* Fix `data.len` -> `dataLen` problem.

* This is an alternate resolution to https://github.com/nim-lang/Nim/issues/13393
(which arguably could be resolved outside the stdlib).

Add version1 of Wang Yi's hash specialized to 8 byte integers.  This gives
simple help to users having trouble with overly colliding hash(key)s.  I.e.,
  A) `import hashes; proc hash(x: myInt): Hash = hashWangYi1(int(x))`
      in the instantiation context of a `HashSet` or `Table`
or
  B) more globally, compile with `nim c -d:hashWangYi1`.

No hash can be all things to all use cases, but this one is A) vetted to
scramble well by the SMHasher test suite (a necessarily limited but far
more thorough test than prior proposals here), B) only a few ALU ops on
many common CPUs, and C) has an easy fallback, via "grade school multi-digit
multiplication", for weaker deployment contexts.

Some people might want to stampede ahead unbridled, but my view is that a
good plan is to
  A) include this in the stdlib for a release or three to let people try it
     on various key sets nim-core could realistically never access/test
     (maybe mentioning it in the changelog so people actually try it out),
  B) have them report problems (if any),
  C) if all seems good, make the stdlib more novice friendly by adding
     `hashIdentity(x)=x` and changing the default `hash() = hashWangYi1`
     with some `when defined` rearranging so users can `-d:hashIdentity`
     if they want the old behavior back.
This plan is compatible with any number of competing integer hashes if
people want to add them.  I would strongly recommend they all *at least*
pass the SMHasher suite since the idea here is to become more friendly to
novices who do not generally understand hashing failure modes.

* Re-organize to work around `when nimvm` limitations; Add some tests; Add
a changelog.md entry.

* Add less than 64-bit CPU when fork.

* Fix decl instead of call typo.

* First attempt at fixing range error on 32-bit platforms: still do the
arithmetic in doubled-up 64-bit and truncate the hash to the lower
32 bits, but still return `uint64` so the type stays the same.  So, a
type-correct but truncated hash value.  Update `thashes.nim` as well.

* A second try at making 32-bit mode CI work.

* Use a more systematic identifier convention than Wang Yi's code.

* Fix test that was wrong for as long as `toHashSet` used `rightSize` (a
very long time, I think).  `$a`/`$b` depend on iteration order, which
varies with the table's range-reduced hash order, which in turn varies
with the range for some `hash()`.  With 3 elements, 3!=6 is small and we've just gotten
lucky with past experimental `hash()` changes.  An alternate fix here
would be to not stringify but use the HashSet operators, but it is not
clear that doesn't alter the "spirit" of the test.

* Fix another stringified test depending upon hash order.

* Oops - revert the string-keyed test.

* Fix another stringify test depending on hash order.

* Add a better than always zero `defined(js)` branch.

* It turns out to be easy to just work all in `BigInt` inside JS and thus
guarantee the same low order bits of output hashes (for `isSafeInteger`
input numbers).  Since `hashWangYi1` output bits are equally random in
all their bits, this means that tables will be safely scrambled for table
sizes up to 2**32 or 4 gigaentries which is probably fine, as long as the
integer keys are all < 2**53 (also likely fine).  (I'm unsure why the
cutoff for infidelity with the C/C++ back ends is 32, not 53 bits.)

Since HashSet & Table only use the low order bits, a quick corollary of
this is that `$` on most int-keyed sets/tables will be the same in all
the various back ends which seems a nice-to-have trait.

* These string hash tests fail for me locally.  Maybe this is what causes
the CI hang for testament pcat collections?

* Oops. That failure was from me manually patching string hash in hashes.  Revert.

* Import more test improvements from https://github.com/nim-lang/Nim/pull/13410

* Fix bug where I swapped order when reverting the test.  Ack.

* Oh, just accept either order like more and more hash tests.

* Iterate in the same order.

* `return` inside `emit` made us skip `popFrame` causing weird troubles.

* Oops - do Windows branch also.

* `nimV1hash` -> multiply-mnemonic, type-scoped `nimIntHash1` (mnemonic
resolutions are "1 == identity", 1 for Nim Version 1, 1 for
first/simplest/fastest in a series of possibilities.  Should be very
easy to remember.)

* Re-organize `when nimvm` logic to be a strict `when`-`else`.

* Merge other changes.

* Lift constants to a common area.

* Fall back to identity hash when `BigInt` is unavailable.

* Increase timeout slightly (probably just real-time perturbation of CI
system performance).
2020-04-15 20:11:18 +02:00
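
Option A in the commit message above (defining a `hash` overload that calls `hashWangYi1` where a `HashSet` or `Table` is instantiated) can be sketched as follows. The distinct type `MyInt` and the sample keys are hypothetical; only `hashWangYi1`, `Hash`, and `HashSet` come from the standard library. This illustrates the opt-in pattern and makes no claim about collision behaviour.

```nim
import std/[hashes, sets]

type MyInt = distinct int            # hypothetical key type

proc `==`(a, b: MyInt): bool {.borrow.}

proc hash(x: MyInt): Hash =
  ## Opt in to the stronger integer hash for this key type only.
  hashWangYi1(int(x))

var s = initHashSet[MyInt]()
for i in 0 ..< 8:
  s.incl MyInt(i * 1024)             # keys sharing low-order bits

doAssert MyInt(2048) in s
doAssert MyInt(3) notin s
```

Option B (`-d:hashWangYi1`) is a compile-time switch named in that message. Later bullets in the same commit mention renamed defines, so the flag that applies may depend on the Nim version.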
Miran
4aecc6b346 fix #12508, unaligned access on sparc64 (#13594) 2020-03-09 14:08:50 +01:00
Timothee Cour
6a0e87eb38 cleanup Ordinal (#13501) 2020-02-27 10:43:13 +01:00
Timothee Cour
8c22518d67 [backport] pseudorandom probing for hash collision (#13418) 2020-02-19 17:19:55 +01:00
Miran
352232e62d style fix: change 'JS' to 'js' to make it consistent (#13168) 2020-01-16 14:14:03 +01:00
Miran
734da9e1df fixes #11764, faster hashing of (u)int (#12407) 2019-10-15 16:31:07 +02:00
narimiran
15895ebc3f [backport] run nimpretty on hashes 2019-09-30 13:58:10 +02:00
Miran
ab48d7901e hashes: implement murmur3 (#12022)
* hashes: implement murmur3
* refactoring; there is only one murmurHash and it works at compile-time via VM hooks
* fixes JS tests
* makes toOpenArrayByte work with C++
* make it bootstrap in C++ mode for 0.20
2019-09-01 00:04:10 +02:00
Arne Döring
afbcd1b330 int128 on firstOrd, lastOrd and lengthOrd (#11701)
* fixes #11847
2019-08-07 15:53:16 +02:00
Araq
07d465ca42 [refactoring] remove unused imports in the compiler and in some stdlib modules 2019-07-18 00:36:03 +02:00
Araq
c94647aeca styleCheck: make the compiler and large parts of the stdlib compatible with --styleCheck:error 2019-07-10 12:42:41 +02:00
Miran
bf9f1f7b45 [bugfix] hashes: fix regression for nested containers (#11426)
Move forward declarations earlier.
2019-06-08 00:34:11 +02:00
Arne Döring
88b5dd3362 right shift is now by default sign preserving (#11322)
* right shift is now by default sign preserving
* fix hashString and semfold
* enable arithmetic shift right globally for CI
* fix typo
* remove xxx
* use oldShiftRight as flag
* apply feedback
* add changelog entry
2019-05-29 16:48:00 +02:00
narimiran
247fa431de hashes: quickfix one test 2019-05-27 20:46:33 +02:00
Andy Davidoff
b62f4b1b0c fix spelling [ci skip] (#11307) 2019-05-22 20:50:44 +02:00
Miran
1251e1ad16 faster hashing (#11203)
* faster hashing

* multibyte hashing for:
  * string and string slices
  * cstring
  * string, ignoring case
  * string, ignoring style
  * openArray of byte or char

* address the review comments

* use optimized version for all ints
* add more tests
* make it work in VM
* put warnings about differences between CT and runtime
* minor style tweaks
2019-05-21 21:26:27 +02:00
narimiran
792dfac521 hashes: fix inconsistent tests, fixes #10771 2019-03-03 10:53:37 +01:00
Miran
ca7980f301 improved documentation for several modules (#10752)
More detailed documentation for:
* md5
* hashes

Mostly cosmetic improvements for:
* threadpool
* typetraits
* channels
* threads
2019-03-01 12:57:55 +01:00
Araq
bbb0fd4eb7 remove deprecated stuff from the stdlib; introduce better deprecation warnings 2018-05-05 21:45:07 +02:00
Araq
72115c2b09 fixes #5969 2017-06-09 13:39:42 +02:00
Andreas Rumpf
b652b3cd52 remove en-dash from the language 2017-04-02 15:21:10 +02:00
Fabian Keller
5774145f5d added hash for uints (#5435) 2017-02-26 00:17:21 +01:00
Ruslan Mustakov
92665e6e9a Add hash proc for cstrings (#5386) 2017-02-13 13:38:30 +01:00
JamesP
07eaafca69 added hash procs for handling portions of strings/arrays/seqs.
added tests at bottom of file
changed some doco layout

Makes hashing iteratively through buffers faster when you
don't have to pass copied portions of the buffer to the
hash function
2015-10-07 13:03:31 +10:00
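
The commit above describes hashing a portion of a buffer without copying it first. A small sketch, assuming the three-argument `hash(sBuf, sPos, ePos)` overload currently in `std/hashes`, which hashes the inclusive range `sBuf[sPos..ePos]`; the variable name `buf` is illustrative.

```nim
import std/hashes

let buf = "abracadabra"

# "abra" occurs at indices 0..3 and again at 7..10, so hashing the two
# ranges in place gives the same value without making substring copies.
doAssert hash(buf, 0, 3) == hash(buf, 7, 10)
```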
apense
d0f2ce3ae8 Added comma
"e.g." and "i.e." both usually take commas after, as they would in normal English ("for example, ..." and "that is, ..." respectively)
2015-07-06 00:53:49 -04:00
apense
c38956a850 THash -> Hash correction 2015-07-06 00:49:34 -04:00
Fabian Keller
414d69ccea added hash function for ordinal types 2015-07-03 11:19:17 +02:00
pdw
6914244f30 lib/pure/e-o - Dropped 'T' from types 2015-06-04 13:18:35 +02:00
Oscar Campbell
dd30bab480 Restructure branching slightly. Fix error message. 2015-06-01 23:49:04 +02:00
Oscar Campbell
1b4db5a34c Implement #2811 - Unicode en-dash (U+2013) as hump/snake alt. 2015-05-31 01:31:06 +02:00
Nycto
4f88238761 Fix floats in tuples in HashSets
Previously, the added tests would fail to compile with
errors complaining that 'hash(float)' didn't exist
2015-04-24 08:25:58 -07:00
Johanna Berewinkel
04906d6993 Changed some characters (&! -> !&) in the documentation in lib/pure/hashes.nim 2015-03-05 12:01:42 +01:00
Araq
11b6958755 big rename 2014-08-27 23:42:51 +02:00
Grzegorz Adam Hankiewicz
f45a1dbf1d Adds brief intro to hashes module. 2014-06-06 20:58:51 +02:00
Andreas Rumpf
f862e80be9 added 'hash' for set[T] 2014-04-13 00:32:10 +02:00
Araq
b731e6ef1c case consistency: cs:partial bootstraps on windows 2013-12-29 03:19:10 +01:00