PDT memory footprint

I have come to believe that compilers store arrays of PDTs like arrays of ordinary derived types, that is, repeating the parameter values (at least the length type parameters) for each element, even though they are the same for all elements and should be stored only once.

If true, it bolsters my view that PDTs are more or less abandoned by the compiler developers. Their support is neither widespread nor complete. Every time I try to incorporate them, it ends up either in failure or in having no advantage over allocatable components.

How many users find them helpful, especially PDTs with length type parameters? Is it time to deprecate them?

I’ve always questioned why PDTs were saddled with the length parameter for anything other than a character string. With the introduction of deferred-length (allocatable) strings, the length parameter makes little sense if you have allocatable arrays and strings. Yes, you have to take a couple of extra steps to allocate the required arrays or strings, but that’s a small price to pay for something that works. In my experience, being able to vary type/kind is a lot more important than length. Unfortunately, decisions about what values to assign to KIND parameters, made long before PDTs were introduced, make a truly flexible PDT with varying types impossible for most compilers to implement without rewriting a lot of the compiler and breaking backward compatibility for some of the commercial vendors’ users. With the introduction of things like assumed-type variables, a well-designed PDT would handle the generic programming requirements for a lot of people. Unfortunately, we are stuck with what I consider a poorly thought out and, in most compilers, poorly implemented feature that the compiler developers whine about being too hard to implement. My personal opinion is that the real reason is that implementing PDTs correctly and in a bulletproof fashion would require a major overhaul of some compilers, and that requires a major investment in time and/or money that the powers that be can’t justify.

Often the array components of a derived type must have consistent dimensions. PDTs let you express that clearly, and I hope that compiler support for them will improve. With regular derived types you can and should write a subroutine to allocate the components consistently, but if the components are PUBLIC, the user may change the component sizes and break the derived type.
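As a minimal sketch of that point (the module and type names `samples_mod` and `samples_t` are made up for illustration, and this assumes a compiler whose PDT support handles length parameters), both components below are tied to the same length parameter, so their sizes cannot drift apart after the object is created:

```fortran
module samples_mod
   implicit none
   ! Hypothetical type: x and y share the length parameter n,
   ! so they always have the same size.
   type :: samples_t(n)
      integer, len :: n
      real :: x(n)
      real :: y(n)
   end type samples_t
end module samples_mod

program demo_consistent
   use samples_mod
   implicit none
   type(samples_t(8)) :: s                   ! length fixed at declaration
   type(samples_t(:)), allocatable :: t      ! deferred length

   allocate (samples_t(4) :: t)              ! length chosen at run time
   s%x = 0.0
   s%y = 1.0
   t%x = 0.0
   t%y = 1.0
   ! The length parameter can be inquired like a component.
   print *, s%n, size(s%x), size(s%y)
   print *, t%n, size(t%x), size(t%y)
end program demo_consistent
```

With allocatable components instead, nothing in the type itself would stop `x` and `y` from being reallocated to different sizes.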

Given that there are probably few production codes using PDTs, small breaking changes in PDTs could be entertained if that eased their implementation. Any suggestions?

Unfortunately, the major change that would make PDTs usable for me is not to PDTs themselves but to the values assigned to KIND parameters, particularly the ones in ISO_FORTRAN_ENV. Those values make what I would like to see done with PDTs impossible. I’ve pointed this out in a few other posts here, but I’ll repeat it one more time. The values of the kind parameters for integer and real types are the same (i.e. REAL32=4 and INT32 also equals 4, etc.) in most compilers (except for NAG). I believe it was in F2008 that the TYPE statement was allowed to be used to specify intrinsic types, but you still have to name the intrinsic type inside it, so there is no reason to use it, i.e.

type(real(real32)) :: a ! is the same as real(real32) :: a

What I really want to be able to do is just

type (real32) :: a

or

type(int32) :: ia

which makes more sense to me but is impossible to implement when real32 and int32 both have a value of 4.

Now imagine a PDT (say for a linked list) where you could just do the following:

type alist_t(list_kind)
   integer, kind :: list_kind
   type(list_kind) :: list_value ! i.e. value becomes generic for intrinsic types
   type(alist_t(list_kind)), pointer :: next
   type(alist_t(list_kind)), pointer :: previous
end type

! then if you want a list of int32 you just do

type(alist_t(INT32)) :: ilist

! or real32

type(alist_t(REAL32)) :: rlist

Since INT32 and REAL32 both have the value 4, and I’m not naive enough to think that the majority of Fortran compilers that adopted this 30 years ago are going to change now, something like this will never be implemented. All because only NAG had the courage and forethought to use something besides the number of bytes for KIND parameter values. I see this as another problem caused by the overzealous devotion to the “can’t break backward compatibility” mantra. I guess the decision to use the number of bytes, instead of some logical labels that would differentiate between integer and real values, was driven by trying to appease all the people who cling to real(4) or real*4 etc. in their codes.

Note also that I think expanding things like assumed type beyond just C interop support could really open the door to a truly usable generic programming facility (maybe built around PDTs) in Fortran, one that would be Fortran-centric and not try to emulate C++.

It is difficult to imagine how it could be different: an array element must have exactly the same representation in memory as a scalar of the same type, whatever the type.

I believe there is some wiggle room, in that it’s not technically required by the standard, but it would sure make the compiler’s job a lot harder and would likely be inefficient in a lot of common cases. For example,

call something(pdt_arr(2))

would probably need to do copy-in/copy-out to get things lined up properly for the callee.

Yes, and things like p => pdt_arr(2) would be even trickier (if possible at all).

PDTs seem like a really good idea, but as you say, the implementations are not portable across very many compilers. Often, they can be replaced with allocatable components, but then there are also many significant differences. Array slice access is one example. Something like a(:)%b%c(j) is allowed when the member c(:) is an array component sized by a PDT length parameter, but not when it is an allocatable array.

Of course, one can also argue that the array slice syntax should be allowed for an allocatable array in the first place, but that is a separate issue. With a PDT member, the array slice has constant spacing between the elements, whereas with an allocatable (if it were allowed) the spacing in memory would be nonuniform, so it is more than just a syntax issue.

Another similar situation is with I/O. One can simply do

   write(*,*) a

when a is a PDT but not when its members are allocatables. Again, one can argue that this is an unnecessary and arbitrary restriction on the use of allocatables within the language, but that is nonetheless a nontrivial difference between allocatables and PDTs within the current language.
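To make the two differences concrete, here is a small sketch (the type names `inner_t` and `outer_t` are invented for illustration). It should be valid according to the standard as far as I can tell, but given the state of PDT support it may well fail to compile or run correctly with a particular compiler:

```fortran
program pdt_vs_alloc
   implicit none
   ! Hypothetical types for illustration only.
   type :: inner_t(n)
      integer, len :: n
      real :: c(n)          ! sized by the length parameter, not allocatable
   end type inner_t
   type :: outer_t
      type(inner_t(3)) :: b
   end type outer_t

   type(outer_t) :: a(4)
   integer :: i

   do i = 1, size(a)
      a(i)%b%c = real(i)
   end do

   ! Only the part a(:) has nonzero rank, so this is an ordinary
   ! section with constant stride between the selected elements.
   print *, a(:)%b%c(2)

   ! List-directed output of the whole array; no ultimate component
   ! is allocatable or pointer, so no defined I/O is needed.
   write (*, *) a
end program pdt_vs_alloc
```

If `c` were declared `real, allocatable :: c(:)` instead, both the section reference and the plain `write` would be rejected.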

The way I think about length-type PDTs is that they “break” a long-standing decision in Fortran to separate type information and length information. Traditionally, type compatibility could be checked at compile time, but length compatibility was left for run time (meaning that a confident programmer could avoid the performance penalty of enforcing it). By making length information a property of the type (the PDT), and not the object, there was hope that the compiler could prove more programs correct. I don’t think the hope has materialized: most implementations are bug-prone, compiler developers have spent far too much time trying to get it right when they could have been working on other things, and users have been left unsure of the achievable runtime performance (or correctness) of various syntactic forms. I don’t expect length-type PDTs to be removed from the Standard, but I don’t expect users to deploy them either.

I find PDTs are actually similar to the character type in this respect.

IMO PDTs were a good idea, with an apparently good design. Not being a compiler writer at all, I don’t know what makes them difficult to implement correctly. Is it really inherent to the design, or is it the same situation as with array syntax (almost) 35 years ago? It took a long time before compilers could deliver the same performance with array syntax as with loops. It didn’t take 20 years, though, because array syntax has many advantages and people eventually wanted to use it… In contrast, as underlined above, allocatable components offer a feature similar to the parameterized length, and one that is known to work correctly. So there’s no pressure to get PDTs working correctly.

One problem with this approach is that the compiler couldn’t check the validity of many parts of the code without knowing the type.

If REAL32 and INT32 had different values (say REAL32=100 and INT32=3), how is doing type(REAL32) and type(INT32) any different from doing real(real32) and integer(int32), and how could a compiler not be able to understand the type of the defined variables? You have two pieces of information that define a unique label: a unique name (REAL32) and a unique value associated with that name (100). As it stands now, all but NAG (at least of the compilers I know about) have REAL32 = 4 and INT32 = 4. Yes, they have different names, but you can’t distinguish between them if all you are looking at is the value. I presume that’s why the standard requires you to write type(real(real32)): the combination of real and real32 uniquely identifies the type. Current PDTs don’t know the value type until you create a specific instance of the PDT and give it a KIND parameter argument. I’m just saying give INT32 and REAL32 different values and you can just use type(real32) and type(int32).
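For anyone who wants to check what their own compiler does, a quick sketch like the following prints the two kind values; the actual numbers are processor dependent, so no particular output is implied here:

```fortran
program kind_values
   use, intrinsic :: iso_fortran_env, only: int32, real32
   implicit none
   ! The standard does not fix these values; many compilers use the
   ! byte count for both, which is what makes them collide.
   print *, 'int32  =', int32
   print *, 'real32 =', real32
   print *, 'same value? ', int32 == real32
end program kind_values
```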

OK, I get it… I was actually thinking ahead, to take advantage of the generic type definition, and write generic code for this type.

Are you referring to case 1 or 2 of the snippet below?

type :: list(n)
   integer, len :: n
   real :: a(n)
end type

type :: container
   type(list(10)) :: values
end type

! Case 1
type(list(10)) :: a(5)

! Case 2
type(container) :: b(5)

print *, storage_size(a)
print *, storage_size(b)   ! ICE with gfortran 13.2

end