Cursor rules for Fortran

Hi, lately I have been using Cursor (an AI-assisted programming tool based on VS Code) for my programming. It lets you add rules for a specific programming language. Some examples are at this link: GitHub - PatrickJS/awesome-cursorrules: 📄 A curated list of awesome .cursorrules files

I tried to build one for Fortran based on recommended practices I gathered. Could you review it or suggest additions?

---
description: 
globs: **/*.f,**/*.f90,**/*.f95,**/*.f03,**/*.f08,**/*.for,**/*.ftn,CMakeLists.txt,*.cmake,Makefile
alwaysApply: false
---
# Fortran Programming Guidelines

## Basic Principles

- Use English for all code and documentation.
- Use `implicit none` in every program unit and declare all variables explicitly.
- Create necessary derived types and modules.
- Use comments to document public modules, subroutines, and functions.
- Follow consistent indentation in control structures.
- Use structured programming techniques.
- Avoid programming tricks that subvert the intended purpose of the language.
- Do not write programs that modify themselves as they execute.

## Nomenclature

- Use lowercase for all Fortran constructs (do, subroutine, module, etc.).
- Follow short mathematical notation for mathematical variables/functions (Ylm, Gamma, etc.).
- Use all lowercase for other names: try to keep names to one or two syllables.
- Use underscores to clarify longer names (spline_interp, stop_error, etc.).
- Avoid abbreviations that could be ambiguous (e.g., "int" could mean integration or integer).
- Variables should be defined with units specified in comments.
- Use consistent naming conventions across modules.
- Do not use the same name with different uppercase/lowercase spelling.
- Avoid using the letters O and l and the digits 0 and 1 in names where they might be confused.

## Subroutines and Functions

- Write short subroutines and functions with a single purpose.
- Name functions with a verb followed by an object (e.g., compute_mean).
- If a function returns a logical value, name it is_x, has_x, can_x, etc.
- Always use `intent` for all arguments (in, out, inout).
- Functions should have no side effects - all function arguments should be intent(in).
- Avoid nesting blocks by:
  - Early checks and returns.
  - Extraction to utility subroutines.
- Use a single level of abstraction.
- Use 'Javadoc'-style comments to define inputs, outputs, and purpose.
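
A minimal sketch of several rules in this section (explicit `intent`, side-effect-free functions, an early return instead of a nested block, and doc comments). The names `stats_utils`, `compute_mean`, and `is_empty` are hypothetical, and `dp` is assumed to be the double precision kind:

```fortran
module stats_utils
  implicit none
  private
  public :: compute_mean, is_empty

  integer, parameter :: dp = kind(1.0d0)  ! assumption: double precision kind

contains

  !> Return .true. if the array has no elements.
  !! @param[in] values  input data
  pure function is_empty(values) result(empty)
    real(kind=dp), intent(in) :: values(:)
    logical :: empty
    empty = size(values) == 0
  end function is_empty

  !> Compute the arithmetic mean of an array.
  !! @param[in] values  input data
  !! @return mean of values, or 0 if the array is empty
  pure function compute_mean(values) result(mean)
    real(kind=dp), intent(in) :: values(:)
    real(kind=dp) :: mean

    mean = 0.0_dp
    if (is_empty(values)) return   ! early check avoids a nested block

    mean = sum(values) / real(size(values), kind=dp)
  end function compute_mean

end module stats_utils

program demo_stats
  use stats_utils, only: compute_mean
  implicit none
  print *, compute_mean([1.0d0, 2.0d0, 3.0d0])
end program demo_stats
```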

## Data

- Don't abuse primitive types and encapsulate data in derived types.
- Avoid data validations in functions and use modules with internal validation.
- Prefer immutability for data.
- Use parameter for constants.
- Specify array bounds explicitly when needed.
- Use allocatable arrays rather than pointers when possible.
- Check array sizes at the beginning of subroutines if necessary.

## Modules

- Modules should always have their own file.
- Related functions, subroutines, and variables should be grouped together into modules.
- Avoid using module level variables when possible.
- Always specify `use, only` when importing from another module.
- Set entities within a module to private by default and declare public only when necessary.
- Avoid COMMON blocks.
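
As an illustration of the rules in this section, here is a small sketch of a module that is private by default and a caller that imports with `use, only`. The module name `geometry` and the function `circle_area` are made up for this example:

```fortran
module geometry
  implicit none
  private                        ! everything is private by default
  public :: circle_area          ! export only what callers need

  integer, parameter :: dp = kind(1.0d0)   ! assumption: double precision kind
  real(kind=dp), parameter :: pi = acos(-1.0_dp)

contains

  pure function circle_area(radius) result(area)
    real(kind=dp), intent(in) :: radius   ! radius, in metres
    real(kind=dp) :: area                 ! area, in square metres
    area = pi * radius**2
  end function circle_area

end module geometry

program demo_geometry
  use geometry, only: circle_area   ! import only the names actually used
  implicit none
  print *, circle_area(2.0d0)
end program demo_geometry
```

Because `pi` and `dp` are not listed as `public`, they stay internal to `geometry` and cannot clash with names in the calling code.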

## Exceptions and Error Handling

- Build error checks into the program.
- Use clear error messages.
- Implement consistent error handling mechanisms.
- Consider using error codes or status variables for expected failures.
- Document error handling procedures.
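
One possible way to apply the status-variable rule is sketched below. The module `safe_math`, the subroutine `safe_sqrt`, and the status constants are hypothetical names, not part of any library:

```fortran
module safe_math
  implicit none
  private
  public :: safe_sqrt, status_ok, status_negative_input

  integer, parameter :: dp = kind(1.0d0)   ! assumption: double precision kind
  integer, parameter :: status_ok = 0
  integer, parameter :: status_negative_input = 1

contains

  !> Square root that reports an expected failure through a status code.
  subroutine safe_sqrt(x, root, status)
    real(kind=dp), intent(in)  :: x
    real(kind=dp), intent(out) :: root
    integer,       intent(out) :: status

    root = 0.0_dp
    if (x < 0.0_dp) then
      status = status_negative_input
      return
    end if

    root = sqrt(x)
    status = status_ok
  end subroutine safe_sqrt

end module safe_math

program demo_safe_math
  use safe_math, only: safe_sqrt, status_ok
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(kind=dp) :: root
  integer :: status

  call safe_sqrt(-4.0_dp, root, status)
  if (status /= status_ok) then
    write (*, '(a, i0)') 'error: safe_sqrt failed with status ', status
  end if
end program demo_safe_math
```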

## Memory Management

- Properly deallocate allocatable arrays when no longer needed.
- Check if arrays are already allocated before allocation.
- Avoid dynamic memory and pointers when possible as they make code difficult to navigate.
- Use allocatable arrays instead of pointers when feasible.
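
A short sketch of the allocation checks described in this section, using a hypothetical work array `work` of assumed size 100:

```fortran
program demo_alloc
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! assumption: double precision kind
  real(kind=dp), allocatable :: work(:)
  integer :: n, alloc_status

  n = 100

  ! Guard against allocating an array that is already allocated.
  if (allocated(work)) deallocate(work)

  ! Request a status code so a failed allocation can be reported.
  allocate(work(n), stat=alloc_status)
  if (alloc_status /= 0) error stop 'allocation of work failed'

  work = 0.0_dp
  print *, size(work)

  ! Release the memory once the array is no longer needed.
  deallocate(work)
end program demo_alloc
```

Note that an unsaved local allocatable inside a procedure is deallocated automatically when the procedure returns, so explicit `deallocate` matters most for long-lived arrays.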

## Testing

- Write unit tests for each public subroutine and function.
- Use test doubles to simulate dependencies.
- Write integration tests for each module.
- Test boundary conditions and error cases.

## Project Structure

- Use modular architecture.
- Organize code into logical directories.
- Use a build system like CMake.
- Use namespaces (modules) to organize code logically.
- Create utility modules for common functions.

## Standard Compliance

- Follow modern Fortran standards (Fortran 2003, 2008, or newer).
- Avoid obsolete features.
- Use modern syntax when available (e.g., array constructors with square brackets).
- Avoid GOTO statements and numeric labels.

## Code Formatting

- Use consistent indentation (typically 2-3 spaces).
- Indent blocks in control structures (do, if, select case).
- Use blank lines to separate related parts of a program.
- Use white space freely in statements (around delimiters).
- Fortran keywords must be clearly set off by a blank character.
- Indent the executable statements in a DO loop at least three spaces from the DO statement.
- All elements related to a specific level of the control structure should be aligned to the same column.
- Repeat function/subroutine/module name after `end`.
- Use `end do` and `end if` consistently.
- Variable declarations must use a double colon `::` even if no other attributes are present.
- If multiple declarations occur with the same attributes, list the attributes in the same order.

By default some LLMs will produce Fortran code with default reals. To avoid this, I often instruct them to

Declare real variables as `real(kind=dp)`, with `dp` a kind constant imported from module `kind_mod`, which I will provide
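
The contents of `kind_mod` are not shown in this post, so the following is only a guess at what such a module might look like. It assumes `dp` should be an IEEE double precision kind, and the program `demo_kind` is made up for illustration:

```fortran
module kind_mod
  implicit none
  private
  public :: dp
  ! Assumption: at least 15 decimal digits and exponent range 307,
  ! which selects IEEE double precision on common compilers.
  integer, parameter :: dp = selected_real_kind(15, 307)
end module kind_mod

program demo_kind
  use kind_mod, only: dp
  implicit none
  real(kind=dp) :: x

  ! Literals carry the _dp suffix so they are not default reals.
  x = 1.0_dp / 3.0_dp
  print *, x
end program demo_kind
```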

I tell them to put procedures in modules, and in my Fortran agents I tell them to iterate until the code compiles with the gfortran options

-O0 -fmax-errors=1 -Wall -Werror=unused-parameter -Werror=unused-variable -Werror=unused-function -Wno-maybe-uninitialized -Wno-surprising -fbounds-check -static -g

Common mistakes that LLMs make are to

  1. Assume pi is a built-in constant
  2. Declare variables after executable code, without using block
  3. Use print, write, or random_number in pure procedures
  4. Think that random_number is a function rather than a subroutine
  5. Declare superfluous variables such as i and j
  6. Declare variables more than once
  7. Fail to explicitly declare as public the module entities that need to be public. Relying on the default public accessibility is allowed by the standard, but I want code that compiles with gfortran -fmodule-private.
  8. Have two variables that differ only by case in the same scope. In other languages those variables are distinct, but Fortran is case-insensitive.

You can instruct them not to do this.
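
For reference, here is a small sketch of the correct forms for items 1, 2, and 4 of the list above. The program name and variable names are hypothetical:

```fortran
program demo_llm_pitfalls
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! assumption: double precision kind
  ! pi is not built in; define it as a named constant.
  real(kind=dp), parameter :: pi = acos(-1.0_dp)
  real(kind=dp) :: u

  ! random_number is a subroutine, so it is called, not used in an expression.
  call random_number(u)
  print *, 'uniform sample in [0,1): ', u

  ! A declaration after executable statements needs a block construct.
  block
    real(kind=dp) :: circumference
    circumference = 2.0_dp * pi * u
    print *, 'circumference: ', circumference
  end block
end program demo_llm_pitfalls
```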


I would suggest adding construct labels for all control structures. It has two benefits: the compiler enforces proper nesting and matching end statements, and it's easier for humans to parse.


I think doing so for all control structures is verbose, but it's a matter of taste. One could require that loops with cycle or exit have a construct label that is used in those statements, for clarity.
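
A small sketch of the narrower convention, where only loops that use cycle or exit carry a construct label. The labels `rows` and `cols` and the loop bounds are arbitrary choices for illustration:

```fortran
program demo_labels
  implicit none
  integer :: i, j, hits

  hits = 0
  rows: do i = 1, 5
    cols: do j = 1, 5
      if (j > i) cycle rows        ! skip to the next iteration of the outer loop
      if (i * j > 12) exit rows    ! leave both loops at once
      hits = hits + 1
    end do cols
  end do rows

  print *, hits
end program demo_labels
```

With the labels in place, the compiler rejects a mismatched `end do`, and the reader can see which loop each cycle or exit refers to without counting nesting levels.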