Is Fortran Memory-Safe?

In another thread (about the US administration advocating the use of memory safe languages) I mentioned a chart I used to maintain which compared different compilers with regard to their run-time error checking.

That chart is very old, and I have now put an updated version at Is Fortran “Memory Safe”? - Fortran UK, paying particular attention to the memory safety issue. I know that Ian and Jane at Fortranplus have also been publishing data, but for different compilers.

  • I tested Silverfrost 9.0, gfortran 12.3, Intel ifx 2024.0.2, nvFortran 24.3 and NAG 7.1 on a Windows machine (using WSL for nvFortran)
  • The FH1 test was removed (I believe F2018 permits a file to be opened simultaneously on multiple units)
  • I used a different benchmark for timing, and added an extra row for optimized timings.
  • I removed the floating point exception tests, because I think they don’t really relate to memory safety. I kept the integer overflow test.
  • I added a row for garbage collection, which I believe the Linux version NAG 7.1 has. (I haven’t tested it.)

My conclusion is that Fortran can be regarded as “Memory Safe” if the run time tests in that chart are enabled. I think some tests may even go beyond what other reportedly memory safe languages do. The main issue I see is the possibility of memory leaks. I think these can only occur when allocating memory using pointers (allocatable arrays and scalars cannot leak), and it’s possible NAG has that covered too.

11 Likes

I have not checked the chart in much detail, but it looks quite interesting. Have you thought about the use of assumed-size arrays? They make it impossible (if I understand it correctly) to make run-time checks on the actual array bounds.
Another thing that comes to mind: deallocating memory that is being pointed to by a pointer but was not allocated via that pointer.

Assumed-size arrays can only be dummies. The argument association can be tracked at run-time, at very little cost (as the NAG compiler demonstrates), and all the extents can be known, however deep in the call tree you are. Then, bounds-checking everything is possible, if you can afford the run-time.

Same with pointers, the NAG compiler will catch the situation you describe, if you ask for the appropriate runtime check.

I don’t understand why people think such feats are “impossible”, decades after they have been demonstrated.

2 Likes

Nice work.

I’m interested in seeing a similar chart for Linux, particularly with the newly released 2024.1 version of ifx, which should fix the false positives with the LLVM Memory Sanitizer. (This feature isn’t available on Windows for some reason.)

Which of these problems Valgrind on Linux can detect also interests me.

I downloaded the tests and might run them on Linux this weekend to answer my own questions, time permitting.

1 Like

I believe WATFIV was doing most of these checks over 50 years ago.

2 Likes

For me, it is simply because I have not seen it at work for my programs and I have no idea what would be required :slight_smile: But the way you state it makes a lot of sense.

Is there any common fortran compiler that includes all of the aliasing checks at runtime that if violated would cause UB? If not, Fortran is not memory safe.

There isn’t, but there could be. What does that tell you about Fortran itself?

Running code will always be unpredictable. That’s, WHY we run it.

The users, since 1959, have had opportunity to say which features they find most intellectually challenging and would like more help with (in the form of runtime checks or cleaner semantics). And tools and language features have developed accordingly.

That tells you that Fortran is not memory safe. Memory safety isn’t just bounds checks. It’s whether there exist easy ways to corrupt memory that don’t look obviously wrong. Fortran’s assumption that users will never alias arrays is a massive source of memory unsafety.

2 Likes

Absolutely correct. It’s trivial to violate array bounds with assumed size arguments into routines. Sometimes that results in a crash, sometimes you can just write directly into the next variable over in memory.

1 Like

It’s trivial to violate array bounds with assumed size arguments into routines.

The NAG compiler checks that in Debug mode.

I think a compiler could check all these things, the language does not prevent it. Once we can compile valid codes, I would like to extend LFortran to catch all these bugs for invalid codes.

So I would say Fortran as a language is memory safe.

3 Likes

Yeah, it’s been brought up before in this thread and elsewhere that most of the mainstream compilers have options to check for these kinds of things. I use them myself. In the minds of many people, myself included, that is not enough to say that the language is memory safe. There is no “official implementation,” because there is no strongly backed community compiler that actively tests new features and actually develops the language. It’s kind of stuck in the standard committee circlejerk process, where implementation and standard specification are entirely separate things. I hope that LFortran can become this community resource and actively developed place.

So, because there is no official implementation, and nothing in the standard says anything about most of these memory unsafe methods, I think that aligns with the view that the language is not inherently memory safe.

Somewhat aside, is any language? I know you can still intentionally leak memory in Rust, so maybe it would have to be a garbage collected language with always on runtime checks for things like aliasing and array bounds? At that point any memory unsafe behavior would be a bug in the garbage collector.

1 Like

There are two tests in the chart that require the compiler to detect aliasing problems, and one compiler passes both tests. However, I accept that these are not exhaustive - I think it’s not hard to devise other tests that current compilers fail. There’s an excellent article here that goes into some of the intricacies.

However, I think it’s an exaggeration to say this is a “massive source of memory unsafety”. I can’t recall any instance where I’ve had to track down a bug that was caused by aliasing rules, whereas bounds violations, uninitialized variables and the like are part of the daily grind.

Fortran is not like Java - it doesn’t aim to have all implementations produce identical outputs. For example the rules on short-circuiting logic evaluation allow compilers to go different ways, and we accept that optimization may change our outputs. This need not be a failure in memory-safety: you’re not letting the squirrel out of the box, but allowing it to bounce off different walls.

I guess the definitions of what memory-safe actually means don’t always match. For example the NSA Software Memory Safety report states that:

Examples of memory safe language include Python®, Java®, C#, Go, Delphi/Object Pascal, Swift®,
Ruby™, Rust®, and Ada.

And yet in Python you clearly have memory aliasing of certain objects:

>>> L1 = [1,2,3]
>>> L2 = L1
>>> L2[2] = 4
>>> print(L1)
[1, 2, 4]

Aliasing bugs do happen in practice, particularly in older codes which didn’t enforce array declarations and usage as strictly. Here was a bug I encountered previously, where U and V were aliasing by accident, due to a wrong declaration in a calling routine:

SUBROUTINE UVINIT( NPDE, NPTS, X, U, NV, V)
IMPLICIT NONE
INTEGER :: NPDE, NPTS, NV
DOUBLE PRECISION :: X(NPTS), U(NPDE,NPTS), TIME, V(NV)
U(1,:) = 0
U(1,NPTS) = 1
V(1) = 0               ! <-- eliminates effect of previous line
END SUBROUTINE

So far I haven’t investigated if any of the Fortran compilers could catch this bug.

aliasing isn’t the problem. the problem is a standard that says that it is fair game to optimize under an assumption that aliasing doesn’t occur.

The standard pushes the problem into the programmer’s hands by stating it’s the programmers responsibility to comply. How should a Fortran standard guarantee safety, if ultimately even the hardware itself is vulnerable to exploits? For example the prefetcher attack shown recently for the M1 cpu.

The Fortran standard tries to remain hardware-agnostic, so I’m kind of doubtful it even defines what optimization means. It’s the compiler-writers and users of the language which are willing to trade safety for optimization opportunities.

The way I see it, if you are after safety, you will always need to put some level of trust in your toolchain. If Fortran is used in contexts which are safety-critical, this might mean having a compiler supporting a runtime check -fcheck=alias implementing some formal aliasing verification techniques. (Edit: the commercial static analyzer FORCHECK is mentioned in the article.)

The NAG Fortran compiler supports some level of aliasing checks:

The -C=alias option will produce a runtime error when it is detected that assignment to a dummy argument affects another dummy argument. At this release this is only detected for scalar dummy arguments.

With some compilers you can opt out of the default aliasing assumption. With Intel the option is -assume dummy_aliases. GCC used to support a -fargument-alias flag, but it’s been removed.

Edit: correction concerning gcc’s -fargument-alias flag, which has been removed.

2 Likes
SUBROUTINE UVINIT( NPDE, NPTS, X, U, NV, V)  ! U & V accidentally aliased
IMPLICIT NONE
INTEGER :: NPDE, NPTS, NV
DOUBLE PRECISION :: X(NPTS), U(NPDE,NPTS), TIME, V(NV)
U(1,:) = 0
U(1,NPTS) = 1
V(1) = 0               ! <-- eliminates effect of previous line
END SUBROUTINE

I suppose that in other “memory safe” languages this would be not be a bug. Horrible - but not a bug. Fortran defines it as a bug, and may with optimization produce code that generates a different even more horrible result. That is the outcome that would represent a “bug caused by aliasing rules”.

Surely the best outcome would be to define this as a bug, and have the compiler tell you about it. Which languages do that?

Since its not obvious (at least to me) where the aliasing is occuring in your example I assume it’s because the routine is being called something like this.

call uvinit(npde, npts, x, u, nv, u(1,1))

Unfortunately, you see this kind of thing a lot in old code. When I started my first real engineering job I was required to take some classes about running and programming on the Johnson Space Center’s Univac 1108 systems. One of the classes was on debugging and had an example where you could effectively add 1+1 and get an answer of 3.

  1. I don’t see why modern Fortran programs could not be examined using the formalism of separation logic, developed specifically for pointer manipulating programs. The challenge is that there aren’t a lot of widely available tools for using it. AFAICT, it is more general than concepts from Rust (ie. the borrow checker) in that it can deal with data structures such as circularly linked lists. Such structures are either used in unsafe segments, or more complicated alternatives are used.
  1. The Rust conception of “memory safety” excludes prevention of memory leaks; programs that leak memory are still considered “safe.” This substantially limits the attractiveness of Rust from a correctness POV in my opinion.

That’s what happens, but with one more step of indirection involved,

      NVST = NPDE*NPTS + 1   ! start of ODE equations
C ...
      IV = NPDE*NPTS       ! number of ODEs from the PDE discretization 
      IF (NV.GT.0) THEN    ! if we have additional coupled ODE equations
         IV = NVST
C ...
      END IF
C ...
      CALL CSET(NPDE,NPTS,U,WK(I6),WK,WK(I2),WK(I5),NEL,NPTL,WK(I4),
     *          WK(I12),XBK,IBK,WK(I3),U(IV),NV)

If NV = 0, then U(IV) is incorrectly passed as dummy argument V, because of the buggy declaration:

       SUBROUTINE CSET(NPDE,NPTS,U,X,OMEGA,DU,XBK,NEL,NPTL,XC,CCR,XBH, 
      *                IBK,DUTEM,V,NV) 
C      .. Scalar Arguments .. 
       INTEGER         IBK, NEL, NPDE, NPTL, NPTS, NV 
C      .. Array Arguments .. 
       DOUBLE PRECISION CCR(NPTL), DU(NPTL,NPTL), DUTEM(NPTL,NPTL), 
      *                OMEGA(NPTL,NPTL), U(NPDE,NPTS), V(1), X(NPTS)   ! (!) V(NV)
C ...
C ...
       CALL UVINIT(NPDE,NPTS,X,U,NV,V) 

Finally you end up with the situation in UVINIT, that U and V accidentally alias. This is from a package published in ACM TOMS, showing that human peer-review is not good enough. With the help of bounds-checking I was able to pinpoint the problem.

2 Likes