Writing a binary file in little-endian?

For a small project (see "Anecdotal Fortran... :-)", post #8 by vmagnin), I need to write binary files in little-endian order. In the following program, where I assume the integers are 4, 2 and 1 bytes (they are on my machine), I have successfully done what I want (but…):

program endian
    implicit none
    integer :: status
    integer(4) :: four
    integer(2) :: two
    integer(1) :: one

    four = 4
    two = 2
    one = 1

    open(unit=1, file='endian.bin', access='stream', status='replace', &
        action='write', iostat=status)

    write(1, iostat=status) four, two, one

    close(1, iostat=status)
end program endian

as you can see here:

$ gfortran -Wall fortran_endian.f90 && ./a.out && hexdump -C endian.bin
00000000  04 00 00 00 02 00 01                              |.......|
00000007

Now, my question is: will my program write in little-endian on all machines? I have not found the term “little-endian” in the Fortran 2018 draft. And I therefore guess it is not guaranteed by the standard… Probably the endianness used in files is just the same as in RAM?

I have found some information that confirms what I was thinking:
https://www.star.le.ac.uk/~cgp/streamIO.html
and here a method using the equivalence statement to detect and reverse endianness:
https://atmos.washington.edu/~salathe/osx_unix/endian.html

I am interested in your thoughts and comments about that problem. To write a WAV file, I need to write 2-byte and 4-byte little-endian integers.

Concerning only the size of the integers, do you agree that I should use ISO_FORTRAN_ENV and int32 and int16 integers?

I know that Intel allows you to specify a default endianness for all output on the OPEN statement, i.e. OPEN(NEWUNIT=iunit, CONVERT="BIG_ENDIAN", etc. I think this is an extension, though, and not part of the standard. I believe gfortran also supports CONVERT. If I remember correctly, the CONVERT extension has been around for a while in various compilers; I first saw it several years ago in the Cray compilers. Also, as far as I know, the IBM POWER chips are big-endian, but just about everything Intel or AMD x86-64 is little-endian, so unless you are also targeting POWER chips I doubt you need to worry about endianness. Speaking just for myself, I would like to see CONVERT officially made part of the standard, as well as the ability to specify an encoding like base64 or XDR. One major application (the VTK toolkit) requires big-endian format for all unformatted/binary files in the VTK legacy format, and the newer VTK XML format needs base64 encoding to embed binary data into the XML files.
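
A minimal sketch of the CONVERT approach, assuming a compiler that accepts the extension (gfortran and Intel do; it is not standard Fortran). The module and subroutine names are hypothetical. The idea is to keep the non-standard specifier in a single helper, so the rest of the code only sees an ordinary unit number:

```fortran
module le_stream_sketch
    implicit none
    private
    public :: open_le_stream

contains

    ! Opens `path` for unformatted stream output with little-endian conversion.
    ! CONVERT= is a vendor extension, not part of the Fortran standard, so this
    ! OPEN statement is the only non-portable spot.
    subroutine open_le_stream(path, unit, status)
        character(len=*), intent(in) :: path
        integer, intent(out) :: unit, status

        open(newunit=unit, file=path, access='stream', form='unformatted', &
             status='replace', action='write', convert='little_endian', &
             iostat=status)
    end subroutine open_le_stream

end module le_stream_sketch
```

After `call open_le_stream('endian.bin', u, status)`, an ordinary `write(u) four, two, one` should then produce little-endian bytes whatever the host byte order. A compiler without the extension will reject the OPEN statement at compile time.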


About the integer sizes.
Although most compilers use the size in bytes as the kind number, this is not the case for all of them, nagfor for instance. So yes, you should use ISO_FORTRAN_ENV, but I don’t know if int8 or int16 exist.


I routinely use -fconvert=big-endian with gfortran and -convert big_endian with ifort. I prefer using compiler options over compiler-dependent language extensions as it makes the code itself easier to manage.


Thanks @rwmsu and @milancurcic for the -fconvert flag. I think I will, as a first step, begin assuming the machine is little-endian and tell the other people to use the -fconvert=little-endian flag, if (lucky people) they have it!

I have an article to read concerning that subject, but no time yet:
Cohen, D. ‘On Holy Wars and a Plea for Peace’. Computer 14, no. 10 (October 1981): 48–54.
It is also available here, with a different date: 1 April 1980 (probably the original version!)
https://dcc.ufrj.br/~gabriel/progpar/danny_co.pdf

Thanks @gardhor,
yes, they do:
https://gcc.gnu.org/onlinedocs/gfortran/ISO_005fFORTRAN_005fENV.html

INT8, INT16, INT32, INT64:
Kind type parameters to specify an INTEGER type with a storage size of 8, 16, 32, and 64 bits. It is negative if a target platform does not support the particular kind. (Fortran 2008 or later.)

Thanks for this clarification.
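
As a sketch of what this looks like in practice, here is the original example rewritten with the ISO_FORTRAN_ENV kinds instead of the literal kind numbers 4, 2 and 1. The module and subroutine names are hypothetical. Note that this only pins down the sizes (4 + 2 + 1 = 7 bytes); the byte order is still whatever the host uses:

```fortran
module portable_kinds_sketch
    use, intrinsic :: iso_fortran_env, only: int8, int16, int32
    implicit none
    private
    public :: write_sample

contains

    ! Writes a 32-bit, a 16-bit and an 8-bit integer to a stream file.
    ! If a kind is unsupported its value is negative and the declarations
    ! below fail at compile time, rather than silently changing size.
    subroutine write_sample(path, status)
        character(len=*), intent(in) :: path
        integer, intent(out) :: status
        integer :: u
        integer(int32) :: four
        integer(int16) :: two
        integer(int8)  :: one

        four = 4_int32
        two  = 2_int16
        one  = 1_int8

        open(newunit=u, file=path, access='stream', form='unformatted', &
             status='replace', action='write', iostat=status)
        if (status /= 0) return

        write(u, iostat=status) four, two, one
        close(u)
    end subroutine write_sample

end module portable_kinds_sketch
```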

Re. finding endianness. Assuming your compiler supports INT8 and INT16, a quick way to check whether you are doing big-endian I/O is:

  1. Create two INT8 variables (byte1 and byte2).
  2. Create an INT16 variable (twobytes) and set it equal to 1_INT16.
  3. Open a scratch file for unformatted read/write and write out twobytes, i.e.
     Open(newunit=nunit, STATUS="scratch", FORM="unformatted")
     Write(nunit) twobytes
  4. Rewind nunit and then read byte1 and byte2:
     Read(nunit) byte1, byte2
  5. If byte1 = 0 and byte2 = 1, the file I/O format is big-endian.

This comes from some code found on the www.cfdbooks.com web site of Hiroaki Nishikawa (AKA Katate Masatsuka).


Here is a program following the instructions of @rwmsu:

program test_endianness

use, intrinsic :: iso_fortran_env, only: int8, int16
implicit none

integer(int8) :: byte1, byte2
integer(int16) :: twobytes

integer :: unit

twobytes = 1_int16

open(newunit=unit,status='scratch',form='unformatted')
write(unit) twobytes

rewind(unit)
read(unit) byte1, byte2

close(unit)

if (byte1 == 0 .and. byte2 == 1) then
  print *, "Big Endian"
else
  print *, "Little Endian"
end if

end program

Just some background: unformatted I/O typically uses the native endianness of the CPU hardware. That way the I/O amounts to a block move of bits with no rearrangement. If you want the endianness reversed (typically BIG and LITTLE are the only options), you are usually writing a file intended to be read by a different system that uses the other endianness. Most compilers provide a compiler option for that, or some extension to the OPEN statement (as others have noted above).


@billlong

I’m curious whether Cray or other vendors ever considered adding a base64 (or other) encoding I/O option as a compiler extension. This would be a big help for writing things like the VTK XML format files, where you want to mix base64-encoded binary with standard text. I presume it would have to be an option on the read and/or write statement, something like

Write(iunit, *) ''
Write(iunit, *, ENCODE="base64") some_real_array

etc.

Having that capability in a compiler would have saved me the time and the almost 1000 lines of code it took to develop my own base64 encoding/decoding package.


Thanks everybody,
yesterday I successfully generated my first sinusoidal waves in Fortran:

$ file output.wav
output.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, stereo 44100 Hz
$ play output.wav

output.wav:

 File Size: 21.2M     Bit Rate: 1.41M
  Encoding: Signed PCM    
  Channels: 2 @ 16-bit   
Samplerate: 44100Hz      
Replaygain: off         
  Duration: 00:02:00.00  

In:2.32% 00:00:02.79 [00:01:57.21] Out:123k  [!=====|=====!] Hd:0.0 Clip:0   

I will push a first version of that personal project to GitHub within one week, using fpm of course. Maybe some people here are interested in (modestly) walking with me in the footsteps of Stockhausen, Kraftwerk, Jean-Michel Jarre, Daft Punk, and other giants! I will make a post on the Discourse when it is ready.


@vmagnin sounds really fun!

Apart from electronic music, you might be able to use it for data sonification.

A few interesting articles about the topic:

https://sonification.de/handbook/


Thanks @ivanpribec
data sonification has been in my thoughts for many years (although I didn’t know the term), but I have never done it. It could be another serious motivation to work on that geek toy project (but I believe music is also something serious!).

@milancurcic could sonify their Miami waves :ocean: into WAV :loud_sound:, using only Fortran.


I often have to read and write data in the big-endian order. I do not trust options like -fconvert because they either apply to all external units and that is completely unacceptable, or one has to specify the units to apply at compile time and that is not acceptable for me either.

So I make the conversion manually

https://bitbucket.org/LadaF/elmm/src/master/src/endianness.f90

I check whether the machine is little-endian or big-endian, and if it is little-endian, the BigEnd() function makes the conversion.
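
For readers who do not follow the link, here is a minimal sketch of that kind of manual conversion. It is not the code from the linked repository: the module and function names below are hypothetical, and it assumes the usual 8-bit bytes with INT8, INT16 and INT32 all available. Bytes are reordered with the standard TRANSFER intrinsic:

```fortran
module byte_order_sketch
    use, intrinsic :: iso_fortran_env, only: int8, int16, int32
    implicit none
    private
    public :: host_is_little_endian, swap_bytes, to_big_endian

    interface swap_bytes
        module procedure swap_bytes_16, swap_bytes_32
    end interface swap_bytes

    interface to_big_endian
        module procedure to_big_endian_16, to_big_endian_32
    end interface to_big_endian

contains

    ! True if the least significant byte of an integer is stored first.
    pure logical function host_is_little_endian()
        integer(int8) :: b(2)
        b = transfer(1_int16, 0_int8, 2)
        host_is_little_endian = (b(1) == 1_int8)
    end function host_is_little_endian

    ! Reverse the two bytes of a 16-bit integer.
    elemental function swap_bytes_16(x) result(y)
        integer(int16), intent(in) :: x
        integer(int16) :: y
        integer(int8) :: b(2)
        b = transfer(x, 0_int8, 2)
        y = transfer(b(2:1:-1), x)
    end function swap_bytes_16

    ! Reverse the four bytes of a 32-bit integer.
    elemental function swap_bytes_32(x) result(y)
        integer(int32), intent(in) :: x
        integer(int32) :: y
        integer(int8) :: b(4)
        b = transfer(x, 0_int8, 4)
        y = transfer(b(4:1:-1), x)
    end function swap_bytes_32

    ! Swap only when the host is little-endian; otherwise return x unchanged.
    elemental function to_big_endian_16(x) result(y)
        integer(int16), intent(in) :: x
        integer(int16) :: y
        if (host_is_little_endian()) then
            y = swap_bytes_16(x)
        else
            y = x
        end if
    end function to_big_endian_16

    elemental function to_big_endian_32(x) result(y)
        integer(int32), intent(in) :: x
        integer(int32) :: y
        if (host_is_little_endian()) then
            y = swap_bytes_32(x)
        else
            y = x
        end if
    end function to_big_endian_32

end module byte_order_sketch
```

Because the functions are elemental, a whole array can be converted at the point of output, for example `write(u) to_big_endian(samples)`, and the conversion applies only to the units where it is wanted. A little-endian counterpart (as needed for WAV) would swap when the host is big-endian instead.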


There is really no need to read and write anything to external files just to perform the test.

Endianness describes the order of the bytes in memory, so it can be checked without any file I/O at all.
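
A minimal sketch of such an in-memory test, using the standard TRANSFER intrinsic to look at the two bytes of 1_int16 in storage order. It assumes INT8 and INT16 are supported; the module and function names are hypothetical:

```fortran
module memory_endian_sketch
    use, intrinsic :: iso_fortran_env, only: int8, int16
    implicit none
    private
    public :: is_little_endian

contains

    pure logical function is_little_endian()
        integer(int8) :: bytes(2)

        ! Reinterpret the storage of 1_int16 as two single bytes, in memory order.
        bytes = transfer(1_int16, 0_int8, 2)

        ! Little-endian hosts store [1, 0]; big-endian hosts store [0, 1].
        is_little_endian = (bytes(1) == 1_int8)
    end function is_little_endian

end module memory_endian_sketch
```

This reports the byte order of the host memory only. If the file is opened with a conversion option such as CONVERT or -fconvert, the byte order on disk can differ from what this function returns.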
