Type kind system limitations

I have always found the type kind system imperfect, as the selected_xxx_kind() functions say nothing about the performance of the returned kinds: a kind can be hardware native (i.e. with existing CPU instructions to load/store it and perform arithmetic on it) or software emulated, and even a hardware-native one can suffer performance hits because of poor alignment.

I admit that this is a false problem in practice, at a time when virtually all machines implement the IEEE types. But I have used IBM machines in the past where two 32-bit and two 64-bit floating-point formats were available in the XLF compiler: the IEEE ones, and the legacy IBM ones. As far as I can remember, performance with the IBM ones was lower, and they were present mainly for compatibility purposes (old codes relying on this particular format, and legacy files written with this format). With selected_real_kind() there would be no control at all on the returned kind…

The ISO_FORTRAN_ENV module does not really solve this problem. Or maybe it partly solves it, while also introducing other problems. One can expect that REAL64 is hardware native (still without any guarantee), but now the code relies on the assumption that a 64-bit kind exists at all. Again, a false problem in practice at a time when all machines follow the IEEE conventions. Still, it is not fully satisfactory, as Fortran is supposed to be hardware-agnostic. I’m a bit worried by all these codes with REAL32, REAL64… in case a machine appears in some future that is no longer based on IEEE, or not even on 8-bit bytes.
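For concreteness, here is a minimal sketch of the two styles being contrasted (assuming the processor provides a real64 kind, as essentially all current compilers do): the first constant asks for a numerical property and is valid on any conforming processor, while the second assumes a real kind with exactly 64 bits of storage exists (the constant is negative otherwise).

program kind_styles
use iso_fortran_env, only: real64
implicit none
! precision-based request: portable even to non-IEEE or non-8-bit-byte hardware
integer, parameter :: wp = selected_real_kind(p=15)
! storage-based request: assumes a kind with 64-bit storage exists
integer, parameter :: dp = real64
print *, wp, dp   ! usually the same value on today's machines
end program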

And then I discovered only today that C had types like:

  • int32_t : occupies exactly 32 bits
  • int_least32_t : the smallest type that occupies at least 32 bits; this one is close to the mechanism of selected_int_kind() (but the latter is better, as the criterion is a numerical one, not a memory-size one)
  • int_fast32_t : the fastest type that occupies at least 32 bits

There’s apparently not the equivalent for floating-point types, though…

The ISO_FORTRAN_ENV module could provide similar kind constants, even for the REAL type: REAL32, REALLEAST32, REALFAST32

And/or, an additional specifier could be added to selected_xxx_kind(), e.g. selected_real_kind(p=15, priority="performance") would return the fastest kind with at least 15 digits of precision (priority could be "size" | "balanced" | "performance")
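To make the idea concrete, here is a rough sketch of how such a wrapper could look today; the function name selected_real_kind_pr, the priority semantics, and the assumption that the 64-bit IEEE kind is the hardware-native “fast” choice are all mine, since only the compiler could actually know the performance characteristics.

module kind_priority
use iso_fortran_env, only: real64
implicit none
contains
  integer function selected_real_kind_pr(p, priority) result(k)
    integer, intent(in) :: p
    character(*), intent(in) :: priority
    k = selected_real_kind(p)   ! smallest kind meeting the requested precision
    ! assumption: the 64-bit IEEE kind is hardware native and at least as fast
    ! as any other kind that also satisfies p
    if (priority == "performance" .and. p <= precision(1.0_real64)) k = real64
  end function
end module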

That’s a pretty good idea. The problem is that, as you mentioned, current hardware gives limited choice.

Consider the output of:

program test
integer p,r
do p=1,40,4
  write(*,'(10I4)') (selected_real_kind(p,r),r=10,100,10)
end do
end program

output:

   4   4   4   8   8   8   8   8   8   8
   4   4   4   8   8   8   8   8   8   8
   8   8   8   8   8   8   8   8   8   8
   8   8   8   8   8   8   8   8   8   8
  16  16  16  16  16  16  16  16  16  16
  16  16  16  16  16  16  16  16  16  16
  16  16  16  16  16  16  16  16  16  16
  16  16  16  16  16  16  16  16  16  16
  16  16  16  16  16  16  16  16  16  16
  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1

selected_real_kind gives the illusion of choice, but in reality there are only three possibilities at the moment. Unless CPU hardware starts to permit sliding floating-point formats in the future, adding more options to selected_real_kind just adds to the illusion.


Running a slightly modified program with gfortran on Windows gives the four real kind numbers 4, 8, 10, and 16:

program test
use iso_fortran_env, only: integer_kinds, real_kinds
implicit none
integer :: p, r, ikind, old_ikind
print "(a,*(1x,i0))", "real_kinds:", real_kinds
write (*,"(a4,*(I4))") "p/r", (r, r=10,100,10)
do p=1,40,4
  write(*,"(*(I4))") p, (selected_real_kind(p,r),r=10,100,10)
end do
print "(/,a,*(1x,i0))", "integer_kinds:", integer_kinds
print "(/,2a6)", "r", "kind"
do r=2,39
   ikind = selected_int_kind(r)
   if (r == 2 .or. ikind /= old_ikind) print "(2i6,i40)", r, ikind
   old_ikind = ikind
end do
print "(/,8x,'kind  huge')"
print*, 1, huge(0_1)
print*, 2, huge(0_2)
print*, 4, huge(0_4)
print*, 8, huge(0_8)
print*, 16, huge(0_16)
end program

output:

real_kinds: 4 8 10 16
 p/r  10  20  30  40  50  60  70  80  90 100
   1   4   4   4   8   8   8   8   8   8   8
   5   4   4   4   8   8   8   8   8   8   8
   9   8   8   8   8   8   8   8   8   8   8
  13   8   8   8   8   8   8   8   8   8   8
  17  10  10  10  10  10  10  10  10  10  10
  21  16  16  16  16  16  16  16  16  16  16
  25  16  16  16  16  16  16  16  16  16  16
  29  16  16  16  16  16  16  16  16  16  16
  33  16  16  16  16  16  16  16  16  16  16
  37  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1

integer_kinds: 1 2 4 8 16

     r  kind
     2     1
     3     2
     5     4
    10     8
    19    16
    39    -1

        kind  huge
           1  127
           2  32767
           4  2147483647
           8  9223372036854775807
          16  170141183460469231731687303715884105727

Given the restricted possibilities on modern computers, how useful are the specialised C types? Do you really get different types for int32_t and int_fast32_t?

I agree, but

  • this is the same illusion with all these type definitions in C: int32_t, int_least32_t, and int_fast32_t are likely to all map to int in practice
  • who knows what we will have in some (near/not so distant/distant) future in terms of hardware?
It turns out that yes:
#include <stdint.h>
#include <stdio.h>
int main() {
    printf("%lu %lu %lu %lu\n",sizeof(int) 
                              ,sizeof(int32_t)
                              ,sizeof(int_least32_t)
                              ,sizeof(int_fast32_t));
}

Both gcc and icx on PC return:

4 4 4 8

That is, int_fast32_t is mapped to a 64-bit integer type rather than to int.

I cannot claim to understand the hardware properties that lead to this result, at least not with a straight face, but is this due to alignment? I mean, in general, arithmetic with 32-bit integers should be less work than with 64-bit ones? So, like so much in computing, it would boil down to accessing the data, not to the basic computations.

Probably yes, it is related to alignment and to the time needed to load/store the values into the registers. Once in the registers it makes no difference, as the scalar CPU instructions operate on 64 bits anyway.

That said, it’s different for the vector registers, which can handle twice as many 32-bit objects in the same amount of time compared to 64-bit objects. Also, 32-bit objects are preferable in memory-bound codes. So, it can get complicated, depending on what we want to do…
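As a minimal, non-rigorous sketch of the memory-bound point (the array size and the simple array update are arbitrary choices of mine): the 32-bit version below moves half as many bytes through memory as the 64-bit one, and on a bandwidth-limited machine that difference, not the arithmetic, is what typically shows up in the timings.

program bandwidth_sketch
use iso_fortran_env, only: real32, real64, int64
implicit none
integer, parameter :: n = 50000000
real(real32), allocatable :: a32(:)
real(real64), allocatable :: a64(:)
integer(int64) :: t0, t1, rate
allocate(a32(n), a64(n))
a32 = 0.0_real32; a64 = 0.0_real64
call system_clock(t0, rate)
a32 = a32 + 1.0_real32          ! streams 4 bytes per element in and out
call system_clock(t1)
print *, "real32 update (s):", real(t1 - t0, real64) / rate
call system_clock(t0)
a64 = a64 + 1.0_real64          ! streams 8 bytes per element in and out
call system_clock(t1)
print *, "real64 update (s):", real(t1 - t0, real64) / rate
print *, a32(n), a64(n)         ! keep the compiler from dropping the updates
end program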

Me too, but not because of the performance issue. One can imagine creating an executable with a Fortran compiler that is intended to run on multiple CPUs, or even in a cross-compile situation where it will run on an entirely different CPU from the one it was compiled on. Some of those CPUs may perform better with one integer kind while others perform better with a different kind. The compiler would not be expected to predict the future. The developer/programmer and the user might know those things, but the compiler can’t.

My suggestion for selected_int_kind() is to allow the programmer to simply specify the minimum/exact number of bits in the integer. If we ever get 1-bit and 4-bit integer types in Fortran, that is the way they should be specified. If Fortran ever decides to support unsigned integers, then there should be some kind of option to specify that too. I’m not holding my breath on any of those things; programmers have been asking for them since the Fortran 8x days, and they aren’t here yet.
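As a rough sketch of how the bit-based request can be approximated with today’s intrinsic (the helper name int_kind_with_bits is hypothetical, and the bit-to-decimal-digit conversion is only approximate near kind boundaries): an n-bit signed integer holds about (n-1)*log10(2) decimal digits, which is exactly the criterion selected_int_kind already understands.

module bits_to_kind
implicit none
contains
  integer function int_kind_with_bits(nbits) result(k)
    integer, intent(in) :: nbits
    ! translate a bit count into the decimal range expected by selected_int_kind
    k = selected_int_kind(int((nbits - 1) * log10(2.0)))
  end function
end module

program demo
use bits_to_kind
implicit none
print *, int_kind_with_bits(8), int_kind_with_bits(16), &
         int_kind_with_bits(32), int_kind_with_bits(64)   ! 1 2 4 8 with gfortran
end program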

8-bit, 12-bit, and 16-bit floating-point types also present challenges to the selected_real_kind() intrinsic. It already has a RADIX argument, so it is ahead of its integer counterpart.

Another missing feature of selected_*_kind() is the ability to match different types and kinds with storage sequence association. The default integer, logical, and real kinds are all required to occupy the same size storage unit, but how do you specify the right logical kind to match, say, INT16, or REAL128, or any other nondefault kind?
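A small illustration of that gap (just a sketch): one can list the logical kinds a compiler offers and the storage size of an INT16 integer, but there is no intrinsic that returns the logical kind whose storage size matches, and the logical_kinds array cannot be used for this at run time because a kind parameter must be a constant expression.

program logical_match
use iso_fortran_env, only: int16, logical_kinds
implicit none
print "(a,*(1x,i0))", "logical kinds available: ", logical_kinds
print "(a,i0)", "bits in an INT16 integer: ", storage_size(0_int16)
! which element of logical_kinds (if any) occupies 16 bits cannot be queried
! portably, since kind parameters must be constant expressions
end program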

Nonetheless, the underlying performance is part of the general picture.

A compiler can’t know on its own, of course, but the people who are writing the compilers do know. Mapping int_fast32_t to a 64-bit integer for code compiled for the x86 platform is a decision of the compiler writers, and the mapping may change in future versions, depending on what the new x86 hardware will be.

I’m not sure. The size in bits is somehow a low-level concept that does not fit the philosophy of Fortran, IMO. When such constraints are needed, I think that constants in iso_fortran_env are a better approach (and they can differentiate kinds that have the same size, e.g. real16_ieee and bfloat16).

My understanding is that storage association is a legacy feature, which is no longer desired in the standard. For instance, it seems to me that the storage size of complex(kind(1d0)) is unspecified (in particular it is not said that it shall occupy 4 numeric storage units).
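As a quick illustration of that point (a sketch, not a statement of what the standard requires): on a typical compiler a complex(kind(1d0)) happens to occupy 4 numeric storage units, but nothing in the storage association rules guarantees it.

program complex_storage
use iso_fortran_env, only: numeric_storage_size
implicit none
print *, "numeric storage unit (bits):   ", numeric_storage_size
print *, "complex(kind(1d0)) size (bits):", storage_size((1.0d0, 0.0d0))
! on common compilers this prints 32 and 128, i.e. 4 numeric storage units
end program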

> I’m not sure. The size in bits is somehow a low-level concept that does not fit the philosophy of Fortran IMO.
I think this is an over-generalization, which I would even call wrong. Fortran was developed when memory was at a premium, and understanding exactly how much memory your code takes at every stage has been a crucial part of Fortran scientific programming for as long as I can remember (since the early 80s in my case). Back in the day we would recycle variables to save memory (and you would get extra points from your prof for that), and we certainly paid attention to whether the main arrays were real*4 or real*8. Currently I do large-scale simulations, and I need to know what I can fit into the 32 GB on my node. Six 1024^3 arrays of real*4? Can I increase my box?
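For the record, here is the back-of-the-envelope check for that example (a sketch; the six-array count and the 1024^3 box are just the numbers quoted above): six 1024^3 arrays of real*4 take 24 GiB, which indeed fits in 32 GB with some headroom.

program memory_budget
use iso_fortran_env, only: real32, int64
implicit none
integer(int64) :: bytes
! six 1024^3 arrays of 4-byte reals
bytes = 6_int64 * 1024_int64**3 * (storage_size(1.0_real32) / 8)
print "(a,f6.2,a)", "total: ", real(bytes) / 1024.0**3, " GiB"
end program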