Practical use of "scratch" file?

The Fortran open statement has a STATUS='SCRATCH' option which, according to Scratch Files in Fortran Wiki, opens a temporary file.

However:

  • The scratch file is automatically deleted upon closing it, so it cannot be reused later in the program;
  • It’s not possible (in pure Fortran) to query the path of the temporary file, so it’s not possible to rename it or copy it.
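For reference, the behaviour described in the two points above can be sketched as follows. This is only a minimal illustration, not taken from the wiki page: the program and variable names are made up, and whether `inquire` reports a name for a scratch unit is processor dependent, so no particular output is assumed for that part.

```fortran
program scratch_roundtrip
    implicit none
    integer :: u, i, ios
    real :: x(5), y(5)
    logical :: has_name
    character(len=256) :: fname

    x = [(real(i), i = 1, 5)]

    ! file= must not be given together with status='scratch'.
    open(newunit=u, status='scratch', form='unformatted', &
         access='sequential', action='readwrite', iostat=ios)
    if (ios /= 0) error stop 'could not open scratch file'

    write(u) x     ! store the data
    rewind(u)      ! back to the start; the unit stays open
    read(u) y      ! read the same record back

    ! Ask the processor whether the unit has a name at all.
    fname = ''
    inquire(unit=u, named=has_name, name=fname)
    if (has_name) print *, 'processor reports name: ', trim(fname)

    print *, 'data recovered: ', all(x == y)

    close(u)       ! the scratch file is deleted here
end program scratch_roundtrip
```

The data can be written and read back as many times as needed while the unit stays open; it is only the `close` (or program termination) that discards the file.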

I was thinking of the following use case:

  • A program that wants to store data for later use and cannot keep it in memory for some reason (but does not want it to persist after program termination);
  • A program generating a file from a template: the output is written to a temporary file until the generation procedure has completed without error. Then the temporary file is moved to a user folder and renamed.

While scratch files seem to be a good idea at first glance, I could not yet find an application for them.

Note that, in the Microsoft C example at Creating and Using a Temporary File - Win32 apps | Microsoft Learn, the temporary file is not automatically deleted when CloseHandle is called, unlike with the Fortran close statement, making it possible to copy it or reopen it later.

Does anyone have experience with scratch files?

I’ve never used scratch files and I struggle to see much of a use for them either. My biggest use of temporary files is probably when bootstrapping/calibrating a model/code that requires a file (e.g. a config file) as an input. However, I typically need to know the filename and path to pass to the model/code, so this renders scratch files useless, if I understand correctly.

I suspect this is a legacy from the days when memory was scarce. I first saw Fortran on a mainframe with 96K of memory, split into fixed partitions. Scratch files (say for sort work or merges) would be allocated on tape drives. Write, rewind, read again. Discard when the job completed.

They were used far more frequently in the past when memory was smaller. They are still used most commonly by users working with very large problems; typically ones that would require terabytes of memory otherwise. They are very often binary direct access files, another feature much more rarely used today than in the past. Typically it is important that the user be able to specify where the files reside as they are almost always large and so space and performance are an issue. This is typically done via setting an environment variable such as $TMPDIR. Several compilers have extensions that let you save the files if you desire, typically for unexpected check-pointing or debugging. The IBM compiler probably has the most extensions in that regard.
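A binary direct access scratch file of the kind described here might look like the sketch below. The block size, record count and names are invented for the illustration; a real out-of-core code would use far larger records and many more of them.

```fortran
program scratch_direct
    use, intrinsic :: iso_fortran_env, only: real64
    implicit none
    integer, parameter :: n = 1000   ! elements per record (hypothetical size)
    integer :: u, rl, k
    real(real64) :: block(n)

    ! Let the processor report the record length needed for one block.
    inquire(iolength=rl) block

    open(newunit=u, status='scratch', access='direct', &
         form='unformatted', recl=rl, action='readwrite')

    ! Spill four blocks to disk, one per record.
    do k = 1, 4
        block = real(k, real64)
        write(u, rec=k) block
    end do

    ! Records can be fetched back in any order.
    read(u, rec=3) block
    print *, 'first element of record 3: ', block(1)

    close(u)   ! the scratch file is removed here
end program scratch_direct
```

Where the file physically lives is up to the processor, which is why the environment-variable mechanism mentioned above matters for large files.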

Usually, as mentioned, many applications can keep their scratch data in memory now that HPC machines can commonly have hundreds of GB of memory.

In my experience they were and are used as a type of high latency but vast capacity extended memory; so if your problems can fit in today’s much (much) larger available memory it is unlikely you have a need for the feature.

I have learned by experience that what I do not use or find useful is not necessarily useless. Last week, I replaced an internal namelist IO with a scratch file for better automatic namelist IO error messages by the compiler. There is no extra work to create temporary files or folders, and there is no need to delete them explicitly afterward.
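A minimal sketch of the namelist idea, assuming a made-up group `settings` with two variables (none of these names come from the post): the namelist text is written to a scratch unit, the unit is rewound, and the group is read back with an ordinary namelist read.

```fortran
program namelist_via_scratch
    implicit none
    integer :: u, ios, n
    real :: tol
    character(len=256) :: msg
    ! Hypothetical input; in practice this text would come from the user.
    character(len=*), parameter :: text = '&settings n = 10, tol = 1.0e-6 /'
    namelist /settings/ n, tol

    n = 0
    tol = 0.0

    open(newunit=u, status='scratch', form='formatted', action='readwrite')
    write(u, '(a)') text
    rewind(u)

    ! iostat/iomsg are used here to keep the sketch from aborting;
    ! leaving them out lets the runtime produce its own error report.
    read(u, nml=settings, iostat=ios, iomsg=msg)
    if (ios /= 0) then
        print *, 'namelist error: ', trim(msg)
    else
        print *, 'n = ', n, ' tol = ', tol
    end if

    close(u)
end program namelist_via_scratch
```

Nothing has to be named, created or cleaned up by hand, which is the convenience being described.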

Of course there are many uses for scratch files, files whose contents are used within the program execution and are automatically deleted afterwards. My problem with fortran scratch files has always been that you cannot specify the name of the file. On most systems, including unix/posix systems, the name contains the device and directory path information. A large scratch file typically needs to be located on a particular device, not necessarily on the system /tmp location, in order to have sufficient capacity, i/o bandwidth, shared/private access permissions, etc. The fact that the file name cannot be specified has always made fortran scratch files useless to me.

On unix/posix machines, a common workaround for this is to open a file in the normal way, with a directory/file name specified. Then, immediately after the file is opened, the unlink system call is executed. This removes the file's name from the file system, but the file is still available to the running fortran program. Upon a close, the file space is released immediately. The downside of this approach is that you cannot monitor the file size externally while the program is running; you can only monitor the total disk usage (e.g. with df).
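The open-then-unlink workaround could be sketched roughly as below, assuming a POSIX system where the C library's `unlink` can be reached through `bind(C)`. The interface name `c_unlink` and the file name `work_area.tmp` are placeholders chosen for this example; in practice the path would point at whichever device has the required capacity.

```fortran
program unlink_after_open
    use, intrinsic :: iso_c_binding, only: c_int, c_char, c_null_char
    implicit none

    interface
        ! POSIX: int unlink(const char *path);
        function c_unlink(path) bind(C, name='unlink') result(rc)
            import :: c_int, c_char
            character(kind=c_char), intent(in) :: path(*)
            integer(c_int) :: rc
        end function c_unlink
    end interface

    character(len=*), parameter :: fname = 'work_area.tmp'  ! placeholder path
    integer :: u
    real :: a(3), b(3)

    open(newunit=u, file=fname, status='replace', &
         form='unformatted', action='readwrite')

    ! Remove the directory entry right away; the open unit keeps the data alive.
    if (c_unlink(fname // c_null_char) /= 0) error stop 'unlink failed'

    a = [1.0, 2.0, 3.0]
    write(u) a
    rewind(u)
    read(u) b
    print *, 'still readable after unlink: ', all(a == b)

    close(u)   ! the disk space is released by the operating system here
end program unlink_after_open
```

If the program crashes after the `unlink`, the operating system still reclaims the space when the process ends, which is the property wanted from a scratch file, but with the location under the user's control.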

I have never used a scratch file in 50 years of Fortran. It has always been better to keep temporary files and delete them if not required.
There is no advantage in scratch files on the computers I have used, as they still take disk space, just like a named file.
Their advantage could have been when doing batch runs on old mainframes, where scratch files might have avoided the JCL requirement to name them. Running a large job with a deck of cards was before my time !

I learned Fortran while working as a student computer operator in college. Cards everywhere. I remember one sociology professor had a monster program that used multiple tape drives as scratch files. But that’s back in the era of small memory, expensive disk space, and batch processing.

Disk catalog management was a pain before the days of VSAM and, later, more modern systems; scratch space for throwaway use was common in my experience.

Thank you all for these very interesting answers.

They make me think that our wiki page Scratch Files in Fortran Wiki could benefit from an update saying that scratch files are essentially a legacy feature with little use in modern Fortran.

I disagree. This would suggest this is an obsolete feature from a language point of view, which is not the case at all. It can be considered as an obsolete feature from a hardware point of view, as modern machines have much more RAM than before. Nonetheless, they are not completely useless.

One practical use I’ve seen (in fpm if I remember) is capturing command line output as in:

cmd = "git config --get user.name > " // temp_filename
call execute_command_line(cmd,...)
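For context, a self-contained version of this pattern might look like the sketch below. It is not the fpm code; the value of `temp_filename` and the surrounding logic are assumptions. Note that it relies on a named temporary file removed with `close(..., status='delete')`, since the shell redirection needs a path to write to.

```fortran
program capture_output
    implicit none
    ! Placeholder name; a real program would pick a unique path.
    character(len=*), parameter :: temp_filename = 'capture_output.tmp'
    character(len=256) :: line
    integer :: u, ios, stat

    ! Run the command and redirect its standard output to the named file.
    call execute_command_line('git config --get user.name > ' // temp_filename, &
                              exitstat=stat)

    open(newunit=u, file=temp_filename, status='old', iostat=ios)
    if (ios /= 0) error stop 'could not open the capture file'

    ! The file may be empty if the command failed or printed nothing.
    read(u, '(a)', iostat=ios) line
    if (ios == 0) then
        print *, 'captured: ', trim(line)
    else
        print *, 'nothing captured, exit status: ', stat
    end if

    close(u, status='delete')   ! remove the temporary file explicitly
end program capture_output
```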

I don’t see this practical use.
Scratch files do not have a name, so how can you capture the information?

Having an un-named file is not of any use.
If you don’t open a file unit in Gfortran, but use the unit number, then the file will be opened and given a name. (may apply to formatted or unformatted sequential only?)

Un-named scratch files have no use for me. Just give them a name and decide after.

And for me they have some use, even if not frequent. I used them in a code about 10 years ago, and this code is still actively used. Scratch files are not an essential part of the language and it’s possible to do without them, but they can be convenient (no need to pick a filename and to pay attention to name collisions).

The nice feature of scratch files, for me, is that they are automatically deleted upon close or program termination. For most of my career, I have faced file capacity limitations in my calculations. In quantum chemistry, scratch files are used to store long lists of repulsion integrals and wave function coefficients. Some of these are just temporary, used while the program is running, and then deleted afterwards. If a disk file exceeds capacity, then the program crashes. If the temp file was opened with a file name, then the close(unit=n,status='delete') is never executed and the file will persist after the crash, preventing other jobs and other users from running their codes on that device. In some cases, the full disk might crash the operating system itself. The user of the program must then try to locate the files and delete them to allow the system to continue running. A fortran scratch file could eliminate that step because the file is deleted automatically. But, as I stated previously, the fact that you cannot control the device and directory path for a fortran scratch file makes them useless for this task too.

That never happened ! They were long gone and left the mess for someone else. The days when 300 Mb CDC disks were big but never big enough.

I found a practical use of scratch in my code here.

I broke down the user input for my code into several input buffers opened with status='scratch'. Then my code would read the information from those input buffers. It is very convenient because I don’t need to name those input buffers and they are destroyed as soon as I no longer need them.

A belated thought: since one cannot print from one’s application, as far as I know, only by means of notepad, scratch-files might be useful.

Note: there’s really no need to do such a thing. See: GitHub - jacobwilliams/popen-fortran: Simple Fortran module for popen
