Polymorphic behavior of derived types

dwwork · October 8, 2022, 4:19pm

I’ve been trying to better understand polymorphic behavior of derived types in fortran. I know there are compiler writers and fortran standard experts here, so hopefully they can help explain some questions I have. The godbolt link below has all the code. I will go through it a bit at a time.

godbolt link

The dostuff() procedure below updates the member variable, then calls dostuff2(), which is polymorphic, and dostuff3(), which is not (non_overridable).

type :: A
  integer :: mem = 0
  contains
  procedure, pass :: dostuff
  procedure, pass :: dostuff2
  procedure, pass, non_overridable :: dostuff3
end type

contains

subroutine dostuff(self, i)
  class(A), intent(inout) :: self
  integer, intent(in) :: i

  self%mem = self%mem +1501*i
  call self%dostuff2(i)
  call self%dostuff3(i)
end subroutine

subroutine dostuff2(self, i)
  class(A), intent(inout) :: self
  integer, intent(in) :: i

  self%mem = self % mem + 3001*i
end subroutine

subroutine dostuff3(self, i)
  class(A), intent(inout) :: self
  integer, intent(in) :: i

  self%mem = self%mem + 4501 *i
end subroutine

Both gfortran and ifort successfully inline dostuff3(), but the call to dostuff2() is definitely going through a vtable. Here’s the gfortran assembly, which is clearer IMHO:

        imul    edx, ebp, 1501         // update self %mem in dostuff()
        add     DWORD PTR [rax], edx
        mov     rax, QWORD PTR [rdi+8] // get vtable pointer
        imul    ebp, ebp, 4501         // precompute dostuff3()
        call    [QWORD PTR [rax+56]]   // call dostuff2()

This is all perfectly fine and makes sense. My problem with it is that I had no idea I was always running through a virtual table to make these functions calls! I mean, it makes sense NOW, but it never occurred to me that ALL of these functions were virtual functions, even when I never intended the type to be extended!

My takeaway from this is that almost all type-bound procedures should be marked non_overridable. Maybe you don’t use types this way, but I do.

The next interesting tidbit is this snippet of code

subroutine poly(obj,i)
  class(A), intent(inout) :: obj
  integer, intent(in) :: i

  call obj%dostuff(i)
end subroutine

subroutine not_poly(obj,i)
  type(A), intent(inout) :: obj
  integer, intent(in) :: i

  call obj%dostuff(i)
end subroutine

The only difference here is that the first uses class(A), while the second uses type(A). I wanted to see if type(A) meant “This is a non-polymorphic object and should be treated as such”. What is the difference between using type(A) vs. class(A)?

For the poly() subroutine, both gfortran and ifort immediately delegate to the virtual table. Makes sense.

For the not_poly() subroutine, gfortran successfully devirtualizes the call to dostuff(), but NOT the call to dostuff2(). It does inline the call to dostuff3().

For you experts out there, is this expected behavior, or is it a failure of the compiler to devirtualize the second level of function calls?

For the not_poly() subroutine, intel…well…I have no idea what intel is doing, but it still looks like it is devirtualizing the first call to dostuff(), but not the other 2. There are 2 function calls that … probably … come from the virtual table? I’m really not sure. Does anyone understand the assembly output from intel for this subroutine? (Below, or in the godbolt link above.)

Thanks for the help!

mov       QWORD PTR [56+rsp], offset flat: _TBPLIST_PACK_0
xor       eax, eax                                      
mov       rcx, QWORD PTR [56+rsp]                       
mov       QWORD PTR [48+rsp], offset flat: _DYNTYPE_PACK_0
imul      edx, DWORD PTR [r14], 1501                    
mov       QWORD PTR [64+rsp], rax                       
mov       QWORD PTR [80+rsp], rax                       
mov       QWORD PTR [96+rsp], rax                       
mov       QWORD PTR [88+rsp], rax                       
mov       QWORD PTR [72+rsp], rax                       
mov       QWORD PTR [104+rsp], offset flat: _INFO_LIST_PACK_0
mov       QWORD PTR [112+rsp], rax                      
mov       QWORD PTR [8+rsp], 4                          
mov       QWORD PTR [32+rsp], rax                     
mov       QWORD PTR [rsp], rdi                          
mov       QWORD PTR [16+rsp], rax                   
mov       QWORD PTR [24+rsp], 3                      
add       DWORD PTR [rdi], edx                          
lea       rdi, QWORD PTR [rsp]                          
call      QWORD PTR [8+rcx]           // call here to vtable?                  
mov       rax, QWORD PTR [56+rsp]                 
lea       rdi, QWORD PTR [rsp]                         
mov       rsi, r14                                     
call      QWORD PTR [16+rax]      // call here to dostuff3()?

FortranFan · October 8, 2022, 4:48pm

@dwwork,

Please see this thread at the GitHub Fortran proposals site and note the preamble whereby the Fortran standard currently has each derived type as an inextensible type which leads to severe consequences for the actual practitioners for Fortran, some of which you have noted in this thread:

github.com/j3-fortran/fortran_proposals

Feature proposal: open to derive an inextensible derived type

opened 07:26PM - 22 Oct 19 UTC

FortranFan

Clause 7

I submitted this paper **[19-186](https://j3-fortran.org/doc/year/19/19-186.txt)…** for the joint WG5/J3 meeting in Tokyo (August 5 - 9, 2019) but the paper was ignored on account of the worklist for Fortran 202X being closed. I don't know how to get it considered now for Fortran 202Y, perhaps various "thumbs up" by readers here might help push its case? Introduction ------------ 18-007r1 states in section 7.5.7.1 Extensible, extended, and abstract types, "A derived type, other than the type C_PTR or C_FUNPTR from the intrinsic module ISO_C_BINDING, that 7 does not have the BIND attribute or the SEQUENCE attribute is an extensible type." An extension type can thus be derived from any other user derived type which does not have the BIND or the SEQUENCE attributes. Consider a user derived type in Fortran that does not have the BIND or the SEQUENCE attributes and which is employed toward calculational needs involving data structures of some complexity in scientific and technical computing, particulary in industry: the allowance per current standard to extend said type is a matter of great concern in terms of security and predictability of computer operations and results, especially to senior software design architects, computational technology leaders, and budget and business managers. The facility in current Fortran standard to be able to extend all derived types except as stated above makes it feasible, at least conceptually, to manipulate the data and states of objects of such types and to corrupt or otherwise misuse them via type extension, either intentionally or unknowingly. It then becomes difficult, if not impossible, to develop technical software involving specialized derived types because it leaves open the possibility of violation of a technical/business understanding across or within teams not to override some type behavior or functionality via type inheritance. This situation hinders the adoption of Fortran in new engineering and/or scientific software projects where the design paradigm of object-orientation is important but where the needs of the projects also include the requirement to encapsulate the business/technical logic using **data structures which are 'sealed'** so that objects of such structures can then be consumed across the program or libraries without concern of easy alteration. In addition to the above-stated use case of added security and reliability of a 'sealed' derived type, it is also expected that a few or all of the processor implementations will be able to provide some performance benefit with the use of bound procedures with such 'sealed' types. This is on account of some or all processors succeeding in providing more efficient code via nonpolymorphic descriptors of the passed-object dummy argument utilized in procedure calls.

Instead of the authors being placed in a situation where “all type-bound procedures should be marked non_overridable”, the above proposal seeks a more convenient option for practitioners to derive an inextensible type whereby the type-bound procedures will all be effectively non_overridable. I have been making the case with the Fortran standard committee for this for over 5 years now and I have been tremendously disappointed with the lack of community support on this. It is such an easy thing to add to the language, a low-hanging fruit really.

A couple of other points re: your questions:

the compiler implementations are really a “mixed bag” when it comes to code optimizations with language features introduced starting Fortran 2003 (in some situations, one might argue that is case with even those dating back to Fortran 90). It is especially the case with the object-oriented and SPMD parallel programming (coarray) and concurrent execution (DO CONCURRENT) paradigms. The compilers are happy if they can even get bug-free implementations of the standard semantics, let alone advance to code optimizations! That is just the state of Fortran. Things are starting to improve of late but gfortran really needs a lot further community involvement and with Intel Fortran, users need to buy paid support and submit more and more requests.
With Intel Fortran, you may also want to inquire at their community forum: Intel® Fortran Compiler - Intel Community and if you can afford, via their support subscription site (Online Case Management Lightning)

FedericoPerini · October 8, 2022, 6:42pm

Interesting thread, I’m glad to see others care about this level of optimization

I don’t see a problem using non_overridable, the purpose of this keyword is exactly to allow for better optimization. To your point: in your code, you’re always using the polymorphic version of the function (call self%dostuff2). Should your class be extended, and the parent routine dostaff not overridden, the compiler should call the implementation of dostuff2 from the extended type (or from a non_overridable one down the inheritance tree), so, it cannot be optimized out.

But if you know you want to call the routine for the same class, beyond using non_overridable, you can always call the non-polymorphic version of the routine, i.e., the one you have in the same module:

subroutine dostuff(self, i)
  class(A), intent(inout) :: self
  integer, intent(in) :: i

  self%mem = self%mem +1501*i
  call dostuff2(self,i) ! you're calling function dostuff2 always
  call self%dostuff3(i)
end subroutine

This will be most likely optimized out

rgba · October 9, 2022, 11:37pm

I was just about to link this thread in your proposal, as a comment. I’ve seen it some time ago (August or early September) and left some “” here and there. Didn’t know how to support further, since I had no relevant contribution to add to the discussion, nothing more than “I actually like a lot the word sealed and cannot understand why you all are looking for alternatives”, at least.

dwwork · October 10, 2022, 2:18pm

I knew I couldn’t be the only one who noticed this! Thanks for the relevant history. It seems odd to me that a language focused on performance would have the slower vtable lookup be the default. (Not to mention all the other issues you’ve brought up in your proposal)

dwwork · October 10, 2022, 2:21pm

Oh, that is interesting! I had no idea you could do that! I have confirmed that it gives the expected result. I was about to suggest that any non-polymorphic ‘private’ methods should just be module procedures that accept the type as the first argument, and code it in a C-style OOP fashion, but your suggestion also does that.

FedericoPerini · October 13, 2022, 8:55am

I think you’re 100% spot-on on this. One VERY straightforward extension to the language, super-easy to implement, would be to use the non_overridable keyword for the whole set of type-bound procedures, in the same way that public and private can already be used:


   type, extends(GeneralizedWidget_t) :: SpecializedWidget_t
   ! This type extends a generlized widget type but it is marked itself
   ! inextensible to 'protect' its specializations.

      private
      ..
   contains

      private
      non_overridable  ! set default behavior for all type-bound procedures
      ..
      procedure, pass(this), public :: SuperWidgetProc
      ..
   end type

Topic		Replies	Views
How to avoid multiple type definitions due to different kinds? Help	17	1651	March 14, 2022
How to correctly use type extension and polymorphism? Homework	24	2875	December 3, 2021
Is it possible to detect the type of a variable? Help	23	3167	July 19, 2022
Call overridden procedure of abstract parent type Help	23	2975	January 30, 2021
Extending derived bind(c) types	20	1218	September 25, 2023

Polymorphic behavior of derived types

Related topics