Today I realized something strange. See this simple program that writes a real array in a binary file:
program writeit
implicit none
integer, parameter :: n = 32
real, dimension(n) :: a
integer :: iunit
a(:) = 1.
open(newunit=iunit,file='a.bin',status='replace',access='direct',recl=n*storage_size(1.)/8)
write(iunit,rec=1) a(:)
close(iunit)
end program writeit
To my surprise, while in the gnu, cray and pgi compilers, the file size is, as expected = 32*4 = 128 bytes, with the intel compiler the corresponding file size is four times bigger: 512 bytes. It seems that the intel compiler (and perhaps others) assumes that recl is in units of 4 bytes (link below) unless the code is compiled with a specific flag. Shouldnāt the unit defining the record length be standardized? Iām sure there is a good reason for intel to choose this default value, but I do not see it. This has caused me a bit of trouble today.
The solution as most of you know is to use the inquire statement to fetch the record length of the real number, but if one is not careful this can also cause issues: for a long time I was using the inquire statement to fetch the size of a real in bytes, and now I realize why at that time I was also having an issue when using an intel compiler.
I thought this may be useful to some of you who may not be aware of this.
I believe to avoid any troubles about direct access files, you have to use the inquire with the full record and not with a single real. So that, youāll have:
Still, I believe, it is better to use the full record in the āinquireā. In particular, when the record is composed of several variables with different types/kinds.
The Intel compiler is DEC-heritage, and DEC compilers used the size of a ānumeric storage unitā for RECL=. Fortran 77 said:
If the file is being connected for unformatted input/output, the length is measured in processor-dependent units.
This didnāt change until Fortran 2003, where it said (and still says):
If the file is being connected for unformatted input/output, the length is measured in file storage units.
Since the older standards didnāt nail this down, DEC used, numeric storage units, or the size of an integer. When F2003 changed this, it created an incompatibility that would break many existing programs.
So the solution was to add a new option -assume byterecl that changed the units to āfile storage unitsā (or bytes). This is implied when you use -standard-semantics, the option to tell the compiler to change all defaults that conflict with the current standard.
Thank you @sblionel for making it clear! So it is actually part of the current standard that recl is defined in āstorage unitsā. Then I also learned that I have been wrongly assuming that compiler defaults were complying to more recent standards. Best to explicitly use flags that impose them just in case.
@pcosta, specifically the standard (as of F2003) defines a new storage unit called āfile storage unitā.
Compilers that change behavior of standard-conforming programs tend to be unpopular. Over the decades, the standard has added language to specify things it previously left unsaid, and sometimes this differed from what implementations decided on and what programmers depended on.
I know that the standard doesnāt use the concept of ābyteā, but is the āfile storage unitā guaranteed to be a byte, or could it be different?
The number of bits in a file storage unit is given by the constant FILE_STORAGE_SIZE (16.10.2.11) defined in the intrinsic module ISO_FORTRAN_ENV. It is recommended that the file storage unit be an 8-bit octet where this choice is practical.
The Intel compilers also assume no recursion by default and no reallocation on assignment, unless you use the proper flag (or just -standard-semantics).
I understand the dilemma that vendors have. They must choose between supporting legacy features by default or supporting the current (and sometimes changing) standard by default. Programmers who use a single compiler might prefer the first approach, while those who use multiple compilers tend to prefer the latter approach. Iām in the latter group, where the preferred default would be to support the current standard, but I think vendors should make it simple to support nonstandard (or standard, as the case may be) legacy features too. I think something like a ācircaā option might be nice. Something like
$ ifx -circa 2006 myfile.f
for example would enable whatever the compiler default features were for the ifx/ifort compiler in the year 2006. With this approach, the programmer would not be required to work through every single feature change to get just the right option combinations for the year 2006, he would only need to know that his code compiled correctly in 2006 in order to get it to compile correctly now. This kind of feature would also make it easier for vendors to incorporate new standard features into their compilers without worrying about upsetting the programmers who use their compilers who depend on continuity from version to version.
Note that in the particular case that is discussed here, the problem is that some codes were written in a non-portable way, by assuming that file storage unit was 4 bytes. The right way was (and still is) to inquire the record length of the data object(s) to be written. Itās just one more line in the code.
-circa 2006 was just a suggested syntax. Maybe -legacy 2006 would be better? Yes, if a vendor changed defaults multiple times within a year, then this would require more effort on the programmer to pick and choose which features he wants. Also, if a vendor changed defaults in the middle of a calendar year, then there would be some ambiguity about the feature set. The difference between this and something like a -stand option is that the programmer might not know exactly which standard features he wants to support, or even if they are standard features, he might only know that in 2006 his code compiled correctly with a specific compiler and now it doesnāt. As noted above, with things like allocation on assignment and recursion, he might actually be invoking options where some specific standard behavior is disabled rather than enabled if that is what that particular vendor happened to choose for that year as the default.
There are good and bad aspects to other approaches as well as those described, but note that the Intel compiler allows you to globally or personally or per directory specify which default switches are
on via config files as well an environment variables; and that some programs allow you to specify compatibility with previous versions, much like many compilers let you specify a standard to conform to or a set of extensions.
You can simulate something close to that (groups of compiler switches available via a single name or switch) with the Intel compiler by having multiple config files
Note the vms or dec extensions were close to a standard unto themselves, as were the CDC/Cray extensions so note many compilers have a -dec or -vms switch to change defaults so the compiler acts much like previous Digital compilers.
That all being said the current trend very much appears to be to enforce the standards with a --std switch or equivalent and only use extensions when they are too good to ignore, as mentioned early on in this thread.
So not sure I would want to promote it, but bundling up switch options into named combinations that could be used to get one compiler as close to another, such as --def=ifort11 might be useful for some. Since that is not likely to happen, a wrapper script for multiple compilers is more feasible. In many ways fpm and its --profile switch approaches this for common default behaviors. Hopefully when it soon has user-specified profiles someone could make profiles somewhat like described, where something like
is feasible if imperfect. I suspect that is about as close as you are likely to get to what you describe so I would think the most useful approach for the community is to support fpm providing user-defined fpm profiles.
That is not possible in this universe. When the program produces the expected output, the experienced programmer thinks āDang! the bug is still hiding!ā.