What about security issues in Fortran?

When we think about Fortran, security issues are generally not our first concern. Many of us have probably never really thought about it. For the kind of applications we work on, we may not feel really concerned.

But today, it came to my mind, just for fun. If a developer is using Fortran instead of C, what security issues would he face?

For example, in C you can have the famous buffer overflow issue, and problems concerning C pointers:
https://www.quora.com/What-are-security-issues-in-the-C-language

But what about Fortran? :thinking:

Steve Lionel tweeted about this two years ago: https://twitter.com/DoctorFortran/status/1063160236415090689

Since Fortran arrays have lower and upper bounds, and character strings have a fixed length, compilers should be able to check things pretty strictly.

I don’t know what the status is with assumed-size arrays or pointer arrays.

Assumed-size arrays (* bounds) are not checked (typically), but assumed-shape arrays (: bounds) are. This includes pointers, which is a major difference between Fortran and C.

Fortran DOES have security issues, but fewer than C. There is an ISO working group, WG23 “Vulnerabilities”, working on a document that calls out issues and makes recommendations for many languages, including Fortran. See http://www.open-std.org/jtc1/sc22/wg23/docs/documents

Two questions:

  1. Assumed-size (*) - what do you mean by typically not checked? Could they be? How? I mean the last bound, of course.

  2. Assumed-shape (:) - bounds ‘are’ or ‘may be’ checked? That’s not the same. Compilers used to have, and apparently still do, options to force bounds checking (-fbounds-check in GFortran, now spelled -fcheck=bounds). It is by no means enforced and, as it adds overhead, it is often omitted. I do not recall many makefiles including the option.

  1. Some compilers, NAG for example, have an option to pass extent info as a hidden argument and the called procedure can check it. This requires that all Fortran procedures be built with this option. The standard puts the burden on the programmer to ensure they don’t access past the extent of the actual argument.

  2. What I meant to say here is that assumed-shape arrays have extent information passed that can be checked. You’re right that this is typically not an option enabled by default, but it doesn’t require potentially-incompatible calling conventions.
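To illustrate the difference between the two kinds of dummy argument, here is a minimal sketch. The module and procedure names (extent_demo, get_shape, get_size) are made up for illustration. With an assumed-shape dummy the extent arrives with the array, so either the callee or a compiler option such as gfortran's -fcheck=bounds can validate an index. With an assumed-size dummy the callee has to trust an extent supplied by the caller.

```fortran
module extent_demo
  implicit none
contains

  ! Assumed-shape dummy: the extent is passed along with the array,
  ! so size(a) is available and the index can be validated here.
  subroutine get_shape(a, i, val, ok)
    integer, intent(in)  :: a(:)
    integer, intent(in)  :: i
    integer, intent(out) :: val
    logical, intent(out) :: ok
    val = 0
    ok = (i >= 1 .and. i <= size(a))
    if (ok) val = a(i)
  end subroutine get_shape

  ! Assumed-size dummy: no extent arrives for the last dimension,
  ! so the caller must supply n and is trusted to get it right.
  subroutine get_size(a, n, i, val, ok)
    integer, intent(in)  :: a(*)
    integer, intent(in)  :: n
    integer, intent(in)  :: i
    integer, intent(out) :: val
    logical, intent(out) :: ok
    val = 0
    ok = (i >= 1 .and. i <= n)
    if (ok) val = a(i)
  end subroutine get_size

end module extent_demo

program main
  use extent_demo
  implicit none
  integer :: x(4) = [10, 20, 30, 40]
  integer :: v
  logical :: ok

  call get_shape(x, 3, v, ok)
  print *, "assumed-shape, i=3:", v, ok
  call get_shape(x, 7, v, ok)
  print *, "assumed-shape, i=7:", v, ok
  call get_size(x, size(x), 7, v, ok)
  print *, "assumed-size,  i=7:", v, ok
end program main
```

If the caller passed a wrong n to get_size, the manual check would be wrong too, which is exactly the burden the standard leaves on the programmer.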

Assumed-size arrays cannot be “checked” as such but some processors can propagate the extent information and install run-time checks for explicit-shape arrays. This is what the NAG compiler does with -C=calls. This works even though the array is passed through an assumed-size layer. Example:

Program main
  Implicit None
  Integer :: a(4, 3) = -1

  Call f1(a)
End Program
Subroutine f1(a1)
  Integer :: a1(3, *)

  Call f2(a1)
End Subroutine
Subroutine f2(a2)
  Integer :: a2(3, 5)

  Print *, size(a2), a2(3, 5)
End Subroutine

The check is not against the a2(3,5) reference in the PRINT statement but the a2(3,5) declaration. The bounds checking option -C=array would NOT catch the reference in PRINT.

The -C=calls is fail-safe in the following sense. If some subprograms are not compiled with it, the code will still run but without the benefit of the check. The -C=undefined check is not fail-safe in that sense.

I think Steve Lionel may have confused the mode of operation of these two options. The -C=calls option does not involve modifying the calling convention.

I believe C++ and C arrays are not bounds checked, but that the C++ vector is. I think the use of bare arrays is discouraged in C++. They should be encapsulated in a class. Has anyone written a Fortran analog to the C++ vector that can be used when built-in bounds-checking is more important than speed? My programs to analyze financial data have arrays of stock symbols. I don’t do much computation with them, and it would be nice if there were a safe container for them. By contrast, I would not want to store matrices of stock returns in a slower container than an array. Admittedly, most Fortran compilers do have good bounds-checking options.

Ivan Pribec created an issue on stdlib concerning possible Fortran implementations of such containers - currently focussing on lists of strings - List of strings (implementation ideas) · Issue #322 · fortran-lang/stdlib · GitHub.

One particular problem I can imagine here is how to manage errors/exceptions.
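As a rough idea of what such a container could look like for the stock-symbol case, here is a minimal sketch. It is not the stdlib design: the names symbol_list, push, at and length, and the fixed length of 8 characters, are all assumptions made for this example. Access is always checked, whatever the compiler options. For the error-handling question, the sketch lets the caller pass an optional stat argument, and stops the program only when stat is absent.

```fortran
module checked_vector_mod
  implicit none
  private
  public :: symbol_list

  integer, parameter :: symbol_len = 8   ! assumed maximum symbol length

  type :: symbol_list
    private
    character(len=symbol_len), allocatable :: items(:)
  contains
    procedure :: push
    procedure :: at
    procedure :: length
  end type symbol_list

contains

  ! Append one symbol (padded or truncated to symbol_len characters).
  subroutine push(self, symbol)
    class(symbol_list), intent(inout) :: self
    character(len=*), intent(in) :: symbol
    character(len=symbol_len) :: padded
    padded = symbol
    if (.not. allocated(self%items)) allocate(self%items(0))
    self%items = [self%items, padded]   ! reallocation on assignment
  end subroutine push

  integer function length(self)
    class(symbol_list), intent(in) :: self
    length = 0
    if (allocated(self%items)) length = size(self%items)
  end function length

  ! Checked element access. With stat present, an invalid index sets
  ! stat to 1 and returns a blank string; without it, the program stops.
  function at(self, i, stat) result(symbol)
    class(symbol_list), intent(in) :: self
    integer, intent(in) :: i
    integer, intent(out), optional :: stat
    character(len=symbol_len) :: symbol
    symbol = ""
    if (present(stat)) stat = 0
    if (i < 1 .or. i > self%length()) then
      if (.not. present(stat)) error stop "symbol_list%at: index out of range"
      stat = 1
      return
    end if
    symbol = self%items(i)
  end function at

end module checked_vector_mod

program main
  use checked_vector_mod
  implicit none
  type(symbol_list) :: symbols
  character(len=8) :: sym
  integer :: stat

  call symbols%push("IBM")
  call symbols%push("MSFT")
  print *, symbols%length(), trim(symbols%at(2))

  sym = symbols%at(5, stat)        ! out of range, reported through stat
  print *, "stat for index 5:", stat
end program main
```

Growing the array one element at a time is slow for large lists, which matches the trade-off described above: acceptable for a list of symbols, not for matrices of returns.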

Another potential security issue is the use of sequence association, as in

    program test
      real :: a(10)=[1,2,3,4,5,6,7,8,9,0]
      call sub(a(3))
    end program test
    subroutine sub(t)
      real :: t(10)
      print *, t(10)
    end subroutine

BTW, to my surprise newer gfortran (I checked 9, 10) gives an error on this code
(Actual argument contains too few elements for dummy argument 't' (8/10))
if it is put in a single source file. Surely not when sub is compiled separately. Is it standard-conforming for a compiler to act as if there were some sort of explicit interface when there is none? I’d guess that external subprograms should compile identically whether they are in separate source files or in a single one.

It’s nothing to do with explicit interface, I think. The more problematic case is when the CALL is conditional on some input data, so might never actually happen.

But I think the compiler is perfectly entitled to refuse to compile this. The NAG compiler does the same. Think of it as a short-circuit to the runtime error that would happen.

If, at some point at runtime, you depart from the Fortran rules, the standard does not specify an interpretation of your program and the compiler might as well not bother producing a non-Fortran program. If you don’t, then go ahead and comment out the erroneous CALL or PRINT.

Realistically, compilers offer some switch to reduce the level of error-checking. For NAG, it’s -dusty.
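For comparison, here is a sketch of how the sequence-association example above could be rewritten so that the extent is visible to the callee. It assumes the subroutine can be moved into a module (called sub_mod here, a made-up name) and its dummy changed to assumed shape. This is one possible approach, not the only one.

```fortran
module sub_mod
  implicit none
contains
  subroutine sub(t)
    real, intent(in) :: t(:)   ! assumed shape: extent comes from the caller
    ! size(t) is the number of elements actually passed, so the last
    ! valid element is t(size(t)), not a hard-coded t(10).
    print *, size(t), t(size(t))
  end subroutine sub
end module sub_mod

program test
  use sub_mod
  implicit none
  real :: a(10) = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
  call sub(a(3:))   ! passes the 8 elements a(3)..a(10) as an array section
end program test
```

With this explicit interface, passing the scalar a(3) is rejected at compile time because the dummy requires a rank-1 actual argument, and a bounds-checking option can catch any reference beyond t(8) at run time.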

Some Fortran compilers use thunks to implement the use of internal procedures as actual arguments or procedure pointer targets. These thunks contain executable code that passes an extra actual argument to the internal procedure for its implementation of host association. Some Fortran compilers allocate these thunks on the stack, which requires that their stack be executable, and that’s a famous security risk.

I’d be astonished if any current compilers did this, as the stack is marked as “non-executable” in current OSes. I touch on this in Doctor Fortran in “Think, Thank, Thunk” - Doctor Fortran (stevelionel.com).

I share your astonishment, but it’s the case that GNU Fortran, Intel Fortran, nagfor, and XLF all have executable stacks during execution on Linux. You can observe this by compiling this mixed program with host association and thunks that dumps the memory segment map while the thunk is running.

module callit
 contains
  subroutine callthunk(t)
    interface
      subroutine t
      end subroutine
    end interface
    call t
  end subroutine
end module

module m
  use callit
  interface
    subroutine showmaps() bind(c)
    end subroutine
  end interface
 contains
  subroutine outer
    integer hostassoc
    hostassoc = 666
    call callthunk(inner)
   contains
    subroutine inner
      print *, hostassoc
      call showmaps
    end subroutine
  end subroutine
end module

use m
call outer
end

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static void dump(const char *fn) {
  int fd = open(fn, O_RDONLY);
  if (fd < 0) {
    fprintf(stderr, "open('%s') failed: %s\n", fn, strerror(errno));
    return;  /* nothing to read or close if open() failed */
  }
  char buff[512];
  for (;;) {
    ssize_t got = read(fd, buff, sizeof buff);
    if (got < 0) {
      fprintf(stderr, "read failed: %s\n", strerror(errno));
    }
    if (got <= 0) {
      break;
    }
    fwrite(buff, 1, got, stdout);
  }
  close(fd);
}

void showmaps (void) {
  dump("/proc/self/maps");
}

Look for the ‘x’ in the line of the output with [stack], like

3ffffabf0000-3ffffac20000 rwxp 00000000 00:00 0                          [stack]
----------------------------^  the x means this segment is executable

Interesting. I don’t use Linux but had assumed that it adopted the same protection Windows did. ifort on Windows allocates the thunks on the heap.

Ten months ago I posted “What about security issues in Fortran?” and it was just a theoretical game in my mind: in my whole career I have always written my own Fortran programs, with only a little copy/paste of functions found on the net. But these last few days, installing fpm on several machines and playing with it, I realized it’s now a practical question, and a deep one.

Is Fortran harmless?

Definitely not since it can access the file system. But the more dangerous feature is probably execute_command_line() (or its ancestors like the system() GNU extension).
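For readers who have not used the intrinsic, here is a minimal and deliberately harmless sketch. The module and subroutine names (cmd_demo, run_command) are made up, and the command assumes a shell that provides echo. The point is that whatever string is passed runs through the system shell with the same permissions as the Fortran program itself.

```fortran
module cmd_demo
  implicit none
contains
  ! Runs a command through the system shell and reports how it went.
  ! The command runs with the same permissions as the calling program.
  subroutine run_command(cmd, exit_code, cmd_status)
    character(len=*), intent(in) :: cmd
    integer, intent(out) :: exit_code, cmd_status
    exit_code = 0   ! exitstat is intent(inout), so give it a value first
    call execute_command_line(cmd, wait=.true., exitstat=exit_code, &
                              cmdstat=cmd_status)
  end subroutine run_command
end module cmd_demo

program main
  use cmd_demo
  implicit none
  integer :: code, stat

  ! Deliberately harmless command.
  call run_command("echo hello from the shell", code, stat)
  print *, "exit code:", code, " cmdstat:", stat
end program main
```

A cmdstat of zero means the command could be run at all. The exit code is whatever the command itself returned.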

The deletion game

Let’s play in a Linux virtual machine:

$ fpm new killing_joke
$ cd killing_joke

In the killing_joke.f90, let’s add just one Fortran line:

call execute_command_line("rm * -rf")

:bomb::boom:

$ fpm run
 Hello, killing_joke!
$ tree
.

0 directories, 0 files

Let’s create another project:

$ fpm new armaggedon

with that line (NEVER TYPE THAT COMMAND! :japanese_ogre:):

call execute_command_line("rm ~/* -rf")

A little fpm run and your home is empty… (in fact, you still have the hidden files :woozy_face:).

Happily, I have never run Fortran programs as root. And fpm does not need to be launched as root (interestingly, the same day I am thinking about that, there is that discussion on fpm). So your system partition is safe…

Other attacks

With execute_command_line() you can of course call commands like wget or curl to download malicious code or upload some files somewhere :pirate_flag:. If there is an encryption command available, you may also encrypt some directories (but we can hope such commands are configured to be called with sudo) => ransomware… Note also that in your Linux home are stored: your emails, your ssh keys (for example your GitHub keys), etc.

It’s all about trust… and vigilance

Of course, the problem is not specific to Fortran. A good practice is to use only the official repos of your OS, because you trust them: you imagine that there are some security mechanisms (algorithmic and social) to detect malicious code. You also rely on them for simple code quality (with rm an unfortunate error can quickly happen!).

If you download programs from other locations, if it’s open source you can theoretically read the source to be sure the risk is null, which implies you know the language and the source is not too long. You also trust the protocols and tools: https, git clone, GitHub… And finally, you trust people (but note that on GitHub anybody can fork a repository).

Thanks for your warnings. On Windows Subsystem for Linux, I am sometimes annoyed by having to “sudo” various commands or to type ./a.out instead of just a.out, but such restrictions are there for a reason.

On Windows, isn’t one effectively running commands as root? The majority of Windows users never open a command prompt, but many programmers do. I wonder if there is advice for using Windows CMD safely.

What are some strings one could search for in Fortran programs for potential vulnerabilities? I can think of at least EXECUTE_COMMAND_LINE, SYSTEM, SYSTEMQQ.

@beliavsky
My knowledge of Windows is quite limited but NTFS is a modern system which can deal with users and groups. I remember having been asked a password when trying to access the system directories on C:\ from the file explorer. Probably it would be the same thing if you try to navigate into those directories from CMD?

I have never tried WSL, and I don’t know if the sudo rights concern only the directories of the Linux system or also the whole Windows filesystem.

But note that even without sudo, on a Linux system you can cause great damage, as you can delete all the user’s files!

And yes, looking for the strings you cited could be a good practice when using big codes, using for example grep:

$ grep -inRI EXECUTE_COMMAND_LINE
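Where grep is not available, the same kind of search can be sketched in Fortran itself. This is only an illustration: the module name risky_scan and the helpers to_upper and is_risky are made up, and the pattern list is just the strings mentioned above. "SYSTEM" also covers SYSTEMQQ, but it produces false positives such as system_clock.

```fortran
module risky_scan
  implicit none
  private
  public :: to_upper, is_risky

  ! Substrings to flag; "SYSTEM" also matches SYSTEMQQ (and system_clock).
  character(len=*), parameter :: patterns(2) = &
      [character(len=20) :: "EXECUTE_COMMAND_LINE", "SYSTEM"]

contains

  ! ASCII-only upper-casing, so the search is case-insensitive.
  pure function to_upper(s) result(u)
    character(len=*), intent(in) :: s
    character(len=len(s)) :: u
    integer :: k, c
    u = s
    do k = 1, len(s)
      c = iachar(s(k:k))
      if (c >= iachar("a") .and. c <= iachar("z")) u(k:k) = achar(c - 32)
    end do
  end function to_upper

  ! True if the line contains any of the flagged substrings.
  pure logical function is_risky(line)
    character(len=*), intent(in) :: line
    character(len=len(line)) :: upper
    integer :: p
    upper = to_upper(line)
    is_risky = .false.
    do p = 1, size(patterns)
      if (index(upper, trim(patterns(p))) > 0) is_risky = .true.
    end do
  end function is_risky

end module risky_scan

program scan
  use risky_scan
  implicit none
  character(len=1024) :: fname, line   ! longer lines are truncated
  integer :: unit, ios, lineno

  call get_command_argument(1, fname)
  if (len_trim(fname) == 0) then
    print *, "usage: scan <source file>"
    stop
  end if

  open(newunit=unit, file=trim(fname), status="old", action="read", iostat=ios)
  if (ios /= 0) error stop "cannot open file"

  lineno = 0
  do
    read(unit, "(a)", iostat=ios) line
    if (ios /= 0) exit
    lineno = lineno + 1
    if (is_risky(line)) print "(i0, ': ', a)", lineno, trim(line)
  end do
  close(unit)
end program scan
```

Like the grep command, this only finds direct textual uses. It does not see calls hidden behind preprocessing, C interoperability, or a precompiled library.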

See also that discussion about the necessity to use use, intrinsic ::

Some interesting discussions about Cargo and security:
https://www.reddit.com/r/rust/comments/lzw5br/cargo_security/
https://www.reddit.com/r/rust/comments/b4tdfm/cargo_package_security/
https://www.cryptologie.net/article/505/why-not-rust-for-security/

Not at all, on modern Windows. A lot of stuff is actually owned by SYSTEM, which is the one account that has free rein on the system, but it’s not an account that belongs to any interactive user. The Administrator (sort of “root”) account (and anyone in the Administrators group) has a lot of control over the machine, but it’s not usually direct control. For example, to remove files not belonging to Administrators, one has to first take ownership of them as an explicit step. And even Administrator accounts run non-elevated by default, and any action that requires elevation brings up the annoying User Account Control dialog box, effectively asking “are you sure you want to elevate?”. A default non-elevated Administrator account will do much less damage by running the equivalent of “rm -rf /” than a typical Unix root account. To get protections on Linux similar to what’s available on Windows, one must run something like SELinux.

I don’t think C++ vector’s array element access operator [] is bounds checked. To get bounds checking, one would need to use at() function: vector::at - C++ Reference