Arguments check : caller / callee

Hello,
To illustrate, here 2 files :

  • main.f90
      program test
      integer a,b
      real(kind=4) r1
      a = 1
      b = 2
      call sub1(a,b,1)
      CALL sub2(a,b,1)
      CALL sub3(r1)
      ! call sub1(a,b)
      end

      subroutine sub2(a,b)
      integer, intent(in) :: a,b
      print*,'a,b=',a,b
      end subroutine sub2

      subroutine sub3(a)
      integer, intent(in) :: a
      print*,'a=',a
      end subroutine sub3
  • sub1.f90
      subroutine sub1(a,b)
      integer, intent(in) :: a,b
      print*,'a,b=',a,b
      end subroutine sub1
  • Compilation :

gfortran -Wall sub1.f90 main.f90
main.f90:13:22:

13 | CALL sub2(a,b,1)
| 1
Error: More actual than formal arguments in procedure call at (1)
main.f90:14:19:

14 | CALL sub3(r1)
| 1
Error: Type mismatch in argument ā€˜aā€™ at (1); passed REAL(4) to INTEGER(4)

  • Remark:
    ** arguments are checked inside the same file
    ** arguments are not checked when the subroutine is defined in another file

Do you know if there is an option (compilation flag or free external script) to get this check even if the subroutine is defined in another file ?

You can (or should) define sub1() in a module and use that module in the main file.

Yes using an interface is a solution. But my code is a old fortran code and I cannot use interfaces or modules

1 Like

But my code is a old fortran code and I cannot use interfaces or modules

Why?. You can always create explicit interfaces for standalone routines (ie external routines not already in a module) ala a C header file with function prototypes and put those in a module. As long as you donā€™t change the argument list (ie signature) of the actual routine, the compiler can use the interfaces to check for valid arguments. Your are free to modify the working parts of the routine as much as you want and recompile as often as you want without having to modify the interfaces. This is basically a ā€œpoor mansā€ way of doing submodules. I sometimes use this approach during code development because it gives many of the benefits of submodules without worrying about how mature or robust the submodule implementation is for a given compiler. Once the code is working I have the option of moving everything into a submodule or merging into a standard module.

Unfortunately interfaces are really not possible due to our memory manager. All arrays are stored in a big chunk of memory which is of type integer ! => Type mismatch almost everywhere ā€¦

Hi @rfournier ,
though I know this might require a lot (and by a lot, I mean a lot) of work, I can tell you that in my team we went for complete rewriting in which we have fully removed such massive static chunk of memory, in place of allocatables/stack arrays, module variables, etc, each one of them then having its correct type, being it built-in or derived.
This helped being able to use morden language construct/features, as well as writing much more rubust, safe, extensible code. With a very tiny (if not almost inexistent) execution time increase.
Might be useless comment after all, but in the remote case, consider it for future developments. You would surely thank yourselves for it.

Maybe Iā€™m misunderstanding something, but I think you just asked for contradictory behavior. I.e. ā€œIā€™d like the compiler to check types when calling procedures (even when it canā€™t see them)ā€ but also ā€œI have lots of type mismatches when calling procedures, and Iā€™d like it to continue ignoring thoseā€.

1 Like

It is true that as @everythingfunctional points out, it looks like you ask for the opposite at the same time. Which is of course not possible.
The command you use for compiling including -Wall indeed includes checks for interfaces. Thus, generating the error you get.
If, instead, you accept your statement ā€œUnfortunately interfaces are really not possible [ā€¦]ā€, then you would need to:

  • declare sub1 as external in the main program;
  • add /warn:nointerfaces (NOTE: this is for Intel on Windows, should be -Wno-implicit-interface for gfortran) after -Wall, so not to check interfaces.

Then, it is to the programmerā€™s responsibility to correctly interpret and use data passed to unchecked-interfaces procedures.

@rfournier,

As pointed out by others, some work will be needed somewhere to do as you asked. There will be some pain for the gain you seek. It can be an INTERFACE construct you introduce in sections of the code that you can indeed change:

..
   interface
      subroutine sub1(a,b)
         integer, intent(in) :: a,b
      end subroutine sub1
      subroutine sub2(a,b)
         integer, intent(in) :: a,b
      end subroutine sub2
      subroutine sub3(a)
         integer, intent(in) :: a
      end subroutine sub3
   end interface
..

This way you achieve what you ask: ā€œget this check even if the subroutine is defined in another fileā€

C:\temp>type p.f
program test
   integer a,b
   real(kind=4) r1
   interface
      subroutine sub1(a,b)
         integer, intent(in) :: a,b
      end subroutine sub1
      subroutine sub2(a,b)
         integer, intent(in) :: a,b
      end subroutine sub2
      subroutine sub3(a)
         integer, intent(in) :: a
      end subroutine sub3
   end interface
   a = 1
   b = 2
   call sub1(a,b,1)
   CALL sub2(a,b,1)
   CALL sub3(r1)
end

C:\temp>gfortran -c -Wall -ffree-form p.f
l.f:17:19:

   17 |    call sub1(a,b,1)
      |                   1
Error: More actual than formal arguments in procedure call at (1)
l.f:18:19:

   18 |    CALL sub2(a,b,1)
      |                   1
Error: More actual than formal arguments in procedure call at (1)
l.f:19:16:

   19 |    CALL sub3(r1)
      |                1
Error: Type mismatch in argument 'a' at (1); passed REAL(4) to INTEGER(4)

C:\temp>

I have the same type of problem with my own legacy codes. One approach is to define local pointers of the correct TKR, and associate those pointers with the appropriate slice of your work array. Here is a short working example.

program localp
   use, intrinsic :: iso_fortran_env, only: wp=>real32
   use, intrinsic :: iso_c_binding, only: c_f_pointer, c_loc
   implicit none
   integer, parameter :: n = 100
   real(wp), target :: work(n)
   integer, pointer :: iwp(:)

   !call sub( n, work )  ! original call violates TKR.

   call c_f_pointer( c_loc(work), iwp, [n] )
   call sub( n, iwp )  ! updated call with correct TKR.
contains
   subroutine sub( n, iwork )
      integer, intent(in)    :: n
      integer, intent(inout) :: iwork(n)
      integer :: i
      do i = 1, n
         iwork(i) = (i*(i+1))/2
      enddo
      write(*,'(*(g0))') 'iwork(1)=', iwork(1), ' iwork(', n, ')=', iwork(n)
      return
   end subroutine sub
end program localp

$ gfortran localp.f90 && a.out
iwork(1)=1 iwork(100)=5050

For this to work, you must define each pointer with the c_f_pointer(c_loc()...) sequence before the local pointer is used. In principle, it only works for interoperable types, but in practice it will also work for many other cases, such as derived types (which you will not have in legacy codes). This uses the C interoperability features of the language, but there is actually no C code involved at all, everything is done in fortran.

The above example uses an explicit shape dummy array. This is probably what you have in your legacy code. However, this approach is more general, for example it also works correctly with assumed shape dummy arrays. My work arrays in my legacy codes are of type REAL, so that is the way I wrote the above example, but this approach also works the other way, with an integer work array and a real pointer.

@RonShepard : I was thinking to the same strategy, but with pointers, I am afraid about array copy when a pointer is passed as subroutine argument, see : Fortran Pointer as argument in a subroutine - Help - Fortran Discourse (fortran-lang.discourse.group)
Do you experience any drawbacks (performance ā€¦?) when using pointers ?

Usually, compilers only make copies when the actual argument is not contiguous and when the dummy argument must be (e.g. an explicit shape array, or an assumed size array). You can tell if a copy is being made by looking at the c_loc() values of the actual and of the dummy arguments (i.e. print out the location of the first array element). I say ā€œusuallyā€ because the argument association mechanism might change depending on compiler options or other external factors. I think in this case of legacy code, that is all that is available for the programmer to work with.

In this discussion we concluded that this approach was not standard conforming.

You cannot ask the compiler to check the number of arguments without checking the types, itā€™s all or nothing.

If your big chunk of memory is a static array, then you can use the equivalence feature:

integer :: iwork(N)   ! N being known at compile time
real :: rwork(N)

equivalence(iwork,rwork)

Then iwork and rwork share the same memory location

If the arrays iwork and rwork are equivalence-d and then passed as separate arguments to a subprogram, then in the subprogram you have to ensure that the Fortran aliasing rules are obeyed. This can be troublesome and may lead to code that will be hard to debug. If the subprogram is in a library for which no source code is available, it can be nearly impossible to verify if aliasing can occur.

@mecej4

Itā€™s no different from passing several times the same integer array. The typical use of a unique memory chunk was something like

call foo(iwork(i1),n1,iwork(i2),n2,...)
....
subroutine foo(a,n,b,m,...)
    integer n, m
    integer a(n)
    real b(m)

It was the developer responsabilty to ensure that there was no overlap between iwork(i1:i1+n1-1) and iwork(i2:i2+n2-1), and consequenly that the dummy arguments were not aliased. Equivalencing with a real array doesnā€™t change that:

equivalence(iwork,rwork)
call foo(iwork(i1),n1,rwork(i2),n2,...)
....

All my Fortran reference books are still boxed up after I cleaned out my office when I retired earlier this year, so Iā€™m relying on memory from the last time I used BLANK COMMON which was about 45 years ago but would something built around BLANK COMMON work (assuming the OPs current memory management scheme is based on setting the memory statically to some max size at compile time. Probably a lot of work to make sure the arrays being passed are define correctly in a BLANK COMMON statement in the routine calling foo in the above examples. You would also have to make sure you set the BLANK COMMON to the maximum size you will ever use in your main routine.

Several years ago I tried to implement (for want of a better word) a heap class that would define a rank one chunk of memory (separate real and integer arrays) and then use methods that would return contiguous pointers to sections of the actual arrays. The goal was to avoid repeated allocations inside a subroutine and mimic the old statically defined global work array approach but avoid the long argument lists that occur when you try to associate sections of the work array with separate dummy arguments. The heap could grow dynamically up to a fixed size or locked against further growth. Also, my hope was that that using the CONTIGUOUS attribute would help with vectorization and other optimizations etc. but I didnā€™t see any improvement with the compiler I was using at the time. I also realized I needed some kind of reference counting scheme to make the whole process bullet proof so I didnā€™t try to develop it further.

The issue in that previous discussion was storage sequence association between types integer and character, for which the standard allows such things as big/little endian addressing and padding bits to be processor dependent and/or unspecified entirely and it allows alignment address constrants. For example, the byte address of an arbitrary character entity might not be a valid integer address (e.g. the integer address might be required to be a multiple of 2 or 4). In this current discussion the issue involves real and integer arrays, and the standard does impose further storage sequence requirements in this case. A valid integer address will also be a valid real address (and visa versa) due to the storage sequence requirements between these two types.

This does introduce the additional alignment feature of various type+kind entities. If you have a work array of type integer (say int32), then it might be possible to have an integer address that is not allowed to be a double precision address (e.g. real64 might be required to have an address that is a multiple of 8, while the int32 address is only required to be a multiple of 4). In my legacy codes where I use work arrays like this, I typically used work arrays of type real*8 (or real64 in modern terms). If there are any alignment constraints imposed on the real*8 addresses, then they will automatically be satisfied with this convention. However, this then introduces the complications associated with other real kinds and of the various integer and logical kind combinations. In F77, this was addressed with preprocessor directives, but with modern fortran (f2008 and later) it can all be done within the standard using the storage_size() intrinsic. At least for the intrinsic types, the storage_size() values are evaluated at compile time, and the resulting expressions are then simplified and any associated dead code is removed. Compared to the old f77 situation, this is almost an ideal compromise between flexibility, portability, and runtime efficiency. Nowadays, it might be better to use an int64 work array instead of a real64 work array. That option was not available in the legacy f77 code because there was only one integer type then, but now the standard requires also a large integer kind to be supported.

This was one of the popular approaches with f77. The problem was that the blank common dimension had to be known at compile time, so when other memory allocation extensions became available in the 1980s that allowed the size to be determined at run time, those superseded the blank common approach.

I would not recommend now going back to any memory management scheme based on blank common. Allocatable arrays, combined with c_f_pointer(c_loc()ā€¦) are much more flexible.

I fully understand your points, and indeed I cannot see any reason why this would not work in practice. But ā€œI cannot see any reasonā€ is not ā€œthereā€™s no reasonā€, as long as the standard conformance is not proven.

In any case, being forced to use C-related stuff to achieve that is somehow worrying, and a f_f_pointer() routine would be justified instead:

call f_f_pointer(x,y[,shape]) 
! equivalent to call c_f_pointer(c_loc(x),y,shape)

Some old tricks I do not see mentioned are to append all of the code into a single file or (depending on the compiler) specify all the source files on a single compiler command. It is certain ly not specified by any standard but compilers often provide more thorough warning messages when presented with all the source at once; and often provide additional optimizations.

In particular some compilers can provide much more inlining when all the source is presented at once. I have seen single source files composed of nothing but include directives of all the source that are maintained (often automatically with scripts) for this purpose. In the past you could get dramatic performance improvements with some codes if you compiled them as a single file. I believe most remaining modern compilers optimize better now than they used to when working with separately compiled object files but it might still provide significant optimizations with some codes.

That was useful even before all the new Modern Fortran features evolved. It is still useful with old code to identify how much interface mismatch exists when just beginning, because those can be time-consuming or very difficult to remove depending on just how fancy the work array usage was.

There are often compiler switches to create automatic interface source code that can be used to create informative messages. They are most useful when you are actually trying to find the interface issues in order to remove them, not to live with them. As one example, for ifort:

     -gen-interfaces

              Tells the compiler to generate an
              interface block for each routine in a source file.

In some cases creating a work ā€œtypeā€ that includes different types can be a crutch when working with code that aliases storage to different types . That is, if you had a work array ā€œfred(100)ā€ that was used for both integer and real storage, creating a type like

type dual
    real  :: re(100)
    integer :: wh(100)
endtype 
type(dual) fred

can be useful. With a minimal number of code edits to use ā€œfred%reā€ when reals are used and ā€œfred%whā€ when integers are used you can often get the code to be standard-conforming and then work from there. It can obviously require several times as much memory for the work arrays but means you can start using modern strict checking options that you otherwise use as they cause compilation errors as noted above.

I have found those two very common pre-f90 uses (using storage as different types and passing scalars to array arguments) are in almost every older code I have used or tried to ā€œmodernizeā€.
They can be very time-consuming or difficult to remove; and can prevent placing collections of routines into modern modules which is very unfortunate. I did not go through everything above so maybe these ā€œtricksā€ have been mentioned; but I have found them very valuable as a first step in creating modules from old libraries, which has a huge number of advantages in preventing interface errors and so on (so is often a goal when working with older pre-module code).