As #9239 points out, the old implementation had some serious flaws.
The new implementation is a port of the MIT-licensed one used by
Chromium OS. It has been tested against the FIPS-provided vectors and
by generating huge files like the ones mentioned in the issue above.
While I tried my best to account for big-endian (BE) machines, the
code has only been tested on a little-endian (LE) one.
(cherry picked from commit 18023c023d)