Design decision needed for Big-Endian test failures

Hi everyone,

I’ve been working on a fix for the test_hash_functions failures on big-endian architectures (s390x, powerpc) reported in Debian Issue #1128.

I have mapped out the fixes, but I need a maintainer’s decision before I open the PR because it involves modifying vendored third-party C code.

The Root Cause: The test suite fails because the C reference generators and the Fortran implementations handle memory differently on big-endian machines:

  1. nmhash / waterhash: The Fortran implementations normalize multi-byte reads to little-endian; the C headers use raw native-order memory reads (endian-dependent).

  2. SpookyV2 / pengyhash: Both Fortran (transfer) and C (memcpy) use native memory reads, but the seed handling and partial-block reads differ slightly.

Dilemma: To make the tests pass on big-endian, the easiest fix is to modify the third-party C reference files in test/hash_functions/ (e.g., adding __builtin_bswap32 to waterhash.h and tweaking SpookyV2.cpp to match Fortran’s behavior).

However, modifying third-party C headers carries a maintenance risk if stdlib ever updates these files from upstream.

How would the maintainers like to proceed?

  • Option A: Modify the vendored C reference headers in the test/ folder so they exactly match Fortran’s current behavior. (I have this implementation ready).

  • Option B: Change the Fortran implementations to match the C code’s native-endian behavior. (This avoids touching C code, but changes Fortran hash outputs on big-endian machines).

  • Option C: On big-endian machines, skip the C-reference comparison entirely, and only run self-consistency tests (avalanche testing, deterministic output checking).

Let me know which path aligns best with stdlib’s maintenance goals, and I will submit the PR!
