Hi everyone,
I’ve been working on a fix for the test_hash_functions failures on big-endian architectures (s390x, powerpc) reported in Debian Issue #1128.
I have mapped out the fixes, but I need a maintainer’s decision before I open the PR because it involves modifying vendored third-party C code.
The Root Cause: The test suite fails because the C reference generators and the Fortran implementations handle memory differently on big-endian machines:

- nmhash / waterhash: Fortran normalizes reads to little-endian; the C headers use raw memory reads (endian-dependent).
- SpookyV2 / pengyhash: Both Fortran (transfer) and C (memcpy) use native memory reads, but the seed handling and partial-block reads differ slightly.
Dilemma: To make the tests pass on big-endian, the easiest fix is to modify the third-party C reference files in test/hash_functions/ (e.g., adding __builtin_bswap32 to waterhash.h and tweaking SpookyV2.cpp to match Fortran’s behavior).
However, modifying third-party C headers carries a maintenance risk if stdlib ever updates these files from upstream.
How would the maintainers like to proceed?
- Option A: Modify the vendored C reference headers in the test/ folder so they exactly match Fortran's current behavior. (I have this implementation ready.)
- Option B: Change the Fortran implementations to match the C code's native-endian behavior. (This avoids touching C code, but changes Fortran hash outputs on big-endian machines.)
- Option C: On big-endian machines, skip the C-reference comparison entirely and run only self-consistency tests (avalanche testing, deterministic output checking).
Let me know which path aligns best with stdlib’s maintenance goals, and I will submit the PR!