Deepseek and Grok LLMs for Fortran coding

I asked Grok 3 to do something related:

Write a function that has an input argument of a 1D array of reals and a format string such as “(f0.6)” and returns a deferred-length character array of sufficient LEN to hold the real numbers converted to strings.

It first gave a function that used only the first element of the array to determine the LEN, and Grok itself mentioned this limitation. When asked to use a LEN sufficient to hold all the numbers converted to character variables, Grok gave

MODULE string_conversion
  IMPLICIT NONE
  PRIVATE
  PUBLIC :: real_to_string_array  ! Expose only the function to users

CONTAINS

  FUNCTION real_to_string_array(real_array, fmt) RESULT(str_array)
    REAL, INTENT(IN) :: real_array(:)          ! Input 1D array of reals
    CHARACTER(LEN=*), INTENT(IN) :: fmt        ! Format string, e.g., "(f0.6)"
    CHARACTER(LEN=:), ALLOCATABLE :: str_array(:)  ! Deferred-length character array output
    INTEGER :: i, n, len_str, max_len
    CHARACTER(LEN=256) :: temp_str             ! Temporary string for formatting

    ! Get the size of the input array
    n = SIZE(real_array)

    ! Determine the maximum length needed by checking all elements
    max_len = 0
    DO i = 1, n
      WRITE(temp_str, fmt) real_array(i)       ! Format each real number
      len_str = LEN_TRIM(ADJUSTL(temp_str))    ! Get trimmed length of formatted string
      IF (len_str > max_len) max_len = len_str ! Update max length if current is larger
    END DO

    ! Allocate the deferred-length character array with the maximum length
    ALLOCATE(CHARACTER(LEN=max_len) :: str_array(n))

    ! Convert each real number to a string using the provided format
    DO i = 1, n
      WRITE(temp_str, fmt) real_array(i)       ! Write real to temporary string
      str_array(i) = TRIM(ADJUSTL(temp_str))   ! Trim and left-adjust the string
    END DO

  END FUNCTION real_to_string_array

END MODULE string_conversion

! Example program to test the module
PROGRAM test_real_to_string
  USE string_conversion    ! Use the module
  IMPLICIT NONE
  REAL :: numbers(5) = [1.234567, 89.123, -0.005, 12345.6789, -123456.789]
  CHARACTER(LEN=:), ALLOCATABLE :: strings(:)
  INTEGER :: i

  ! Call the function with format "(f0.6)"
  strings = real_to_string_array(numbers, "(f0.6)")

  ! Print the results
  DO i = 1, SIZE(strings)
    PRINT *, "Number ", i, ": '", strings(i), "' (Length: ", LEN(strings(i)), ")"
  END DO

  ! Clean up
  DEALLOCATE(strings)
END PROGRAM test_real_to_string

with output

 Number            1 : '1.234567      ' (Length:           14 )
 Number            2 : '89.123001     ' (Length:           14 )
 Number            3 : '-.005000      ' (Length:           14 )
 Number            4 : '12345.678711  ' (Length:           14 )
 Number            5 : '-123456.789062' (Length:           14 )
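
The trailing digits in values such as 89.123001 come from the input rather than from the formatting: with gfortran's defaults a plain REAL is single precision, so 89.123 is stored as the nearest 32-bit value before it is ever written. Here is a minimal Python sketch of the same effect, assuming IEEE 754 single precision. The helper name to_float32 is mine, not something from the Grok conversation, and Python's .6f is only a rough analog of f0.6 (Python keeps the leading zero in -0.005000, which gfortran drops).

```python
import struct

def to_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

# Same values as in the Fortran test program.
numbers = [1.234567, 89.123, -0.005, 12345.6789, -123456.789]

# Format each value after rounding it to 32 bits, then pad to a common width,
# which mirrors what the Fortran function does with max_len.
strings = [f"{to_float32(x):.6f}" for x in numbers]
width = max(len(s) for s in strings)
padded = [s.ljust(width) for s in strings]

for s in padded:
    print(f"'{s}' (Length: {len(s)})")
```

This reproduces the 89.123001 and 12345.678711 seen above, and the common width of 14 set by the longest string.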

I think having coders who are at least mid-level proficient in every programming language, available 24/7, for free, is a boon.

Grok can do it: Fortran Function with Deferred-Length Array | Shared Grok Conversation. It made a tiny mistake (it forgot to put allocatable on one of the declarations). I just posted the compiler error, and it fixed it. The last printout compiles and runs with both GFortran and LFortran as is:

$ gfortran b.f90 && ./a.out
 Function returned:            3
$ lfortran b.f90
Function returned:     3

Grok’s output is in a different league from what I tried, and more impressive. But when I see the “each” in “The length of each character string is deferred and set at allocation time”, I wonder whether it could verify if “two elements of the same character array can/cannot have different lengths”.

Sure, I just posted your comment to Grok: Fortran Function with Deferred-Length Array | Shared Grok Conversation.

Thus, in response to your query: two elements of the same deferred-length character array cannot have different lengths.

I think that’s correct.

Mercury Coder is a new LLM that uses diffusion (used in image generation) rather than autoregression to generate code, as explained in this video. It is extremely fast since it generates the whole code at once, but it did get stuck on the prompt about simulating and fitting a mixture of normals. For the prompt

Define a Fortran derived type for a dataframe with character variable column names and integer indices. Simulate such a dataframe with random uniform data and print statistics on the columns. Declare real variables as real(kind=dp), where dp is a module parameter set to kind(1.0d0).

it generated working code once I told it to use fixed-length rather than deferred-length character variables for the column names:

module DataFrameModule
    implicit none
    integer, parameter :: dp = kind(1.0d0)
    integer, parameter :: nlen = 10  ! Define the length of column names

    type :: DataFrame
        character(len=nlen), allocatable :: column_names(:)
        real(kind=dp), allocatable :: data(:, :)
        integer :: n_rows, n_cols
    contains
        procedure :: initialize => initialize_dataframe
        procedure :: simulate_data => simulate_dataframe_data
        procedure :: print_stats => print_dataframe_stats
    end type DataFrame

contains

    subroutine initialize_dataframe(this, n_rows, n_cols, column_names)
        class(DataFrame), intent(inout) :: this
        integer, intent(in) :: n_rows, n_cols
        character(len=*), intent(in) :: column_names(n_cols)
        this%n_rows = n_rows
        this%n_cols = n_cols
        allocate(this%column_names(n_cols))
        this%column_names = column_names
        allocate(this%data(n_rows, n_cols))
    end subroutine initialize_dataframe

    subroutine simulate_dataframe_data(this)
        class(DataFrame), intent(inout) :: this
        call random_number(this%data)
    end subroutine simulate_dataframe_data

    subroutine print_dataframe_stats(this)
        class(DataFrame), intent(in) :: this
        real(kind=dp) :: mean, std_dev
        integer :: j
        do j = 1, this%n_cols
            call calculate_mean_std(this%data(:, j), mean, std_dev)
            print *, "Column: ", trim(this%column_names(j))
            print *, "Mean: ", mean
            print *, "Standard Deviation: ", std_dev
            print *
        end do
    end subroutine print_dataframe_stats

    subroutine calculate_mean_std(data, mean, std_dev)
        real(kind=dp), intent(in) :: data(:)
        real(kind=dp), intent(out) :: mean, std_dev
        real(kind=dp) :: sum, sum_sq
        integer :: n, i
        n = size(data)
        sum = 0.0_dp
        sum_sq = 0.0_dp
        do i = 1, n
            sum = sum + data(i)
            sum_sq = sum_sq + data(i)**2
        end do
        mean = sum / n
        std_dev = sqrt((sum_sq / n) - mean**2)
    end subroutine calculate_mean_std

end module DataFrameModule

program Main
    use DataFrameModule
    implicit none
    type(DataFrame) :: df
    character(len=nlen), dimension(3) :: column_names = ["A", "B", "C"]
    integer :: n_rows, n_cols

    n_rows = 100
    n_cols = 3

    call df%initialize(n_rows, n_cols, column_names)
    call df%simulate_data()
    call df%print_stats()

end program Main

giving sample output

 Column: A
 Mean:   0.47517106404222736     
 Standard Deviation:   0.31178191704286989     

 Column: B
 Mean:   0.49751105740387586     
 Standard Deviation:   0.30548776910706921     

 Column: C
 Mean:   0.50007198678270182     
 Standard Deviation:   0.29607418178862621
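
As a sanity check on these numbers: a uniform variate on [0, 1) has mean 1/2 and standard deviation 1/sqrt(12), about 0.2887, so the sample values above are in the expected range for 100 rows. The generated calculate_mean_std uses the one-pass formula sqrt(mean(x^2) - mean(x)^2), which is the population (divide by n) standard deviation. Below is a small Python sketch, not part of the Mercury Coder output, that compares this formula with the two-pass form. The function names, the seed, and the guard against a slightly negative variance are my own additions.

```python
import math
import random

def std_one_pass(xs):
    """Population std via sqrt(mean(x^2) - mean(x)^2), the formula used above."""
    n = len(xs)
    mean = sum(xs) / n
    mean_sq = sum(x * x for x in xs) / n
    # Guard (not in the Fortran code): round-off can make the difference slightly negative.
    return math.sqrt(max(mean_sq - mean * mean, 0.0))

def std_two_pass(xs):
    """Population std via squared deviations from the mean."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / n)

random.seed(0)  # arbitrary seed, only for reproducibility
data = [random.random() for _ in range(100_000)]
theory = 1.0 / math.sqrt(12.0)

print("one-pass:", std_one_pass(data))
print("two-pass:", std_two_pass(data))
print("theory:  ", theory)
```

For data in [0, 1) the two formulas agree closely. The one-pass form tends to lose accuracy only when the mean is large compared with the spread.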

It is fun to experiment with given its speed, but I think other LLMs are a bit better for more involved coding problems.

A colleague (from the informatics department) once told me that he does not use LLMs because “if I need a wrong answer given with extreme confidence, I will just ask an engineer”.

Today’s featured LLM is Qwen, described here and the subject of this video. It is an open-source model that is relatively small and can run on a PC. It presents a lot of “thinking” before it shows code. For the normal mixture simulation and estimation prompt, feeding it gfortran error messages twice gave the following code in 5 minutes.

module mixture_module
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  type :: MixtureModel
    integer :: K
    real(dp), allocatable :: weights(:)
    real(dp), allocatable :: means(:)
    real(dp), allocatable :: sds(:)
  end type MixtureModel

contains

  pure function normal_pdf(x, mu, sigma) result(pdf)
    real(dp), intent(in) :: x, mu, sigma
    real(dp) :: pdf
    real(dp) :: z
    z = (x - mu) / sigma
    pdf = exp(-0.5_dp * z**2) / (sigma * sqrt(2.0_dp * acos(-1.0_dp)))
  end function normal_pdf

  subroutine fit_mixture(data, K, max_iter, tol, model)
    real(dp), intent(in) :: data(:)
    integer, intent(in) :: K, max_iter
    real(dp), intent(in) :: tol
    type(MixtureModel), intent(out) :: model

    real(dp), allocatable :: responsibilities(:,:), new_weights(:), new_means(:), new_sds(:)
    real(dp) :: prev_loglik, current_loglik, sample_mean, sample_sd, r, component_sum
    integer :: iter, n, comp, i, idx
    logical :: converged

    n = size(data)
    model%K = K
    allocate(model%weights(K), model%means(K), model%sds(K))
    
    ! Initialize parameters
    model%weights = 1.0_dp / K
    call random_seed()
    do comp = 1, K
      call random_number(r)
      idx = floor(r * n) + 1
      model%means(comp) = data(idx)
    end do
    sample_mean = sum(data) / n
    sample_sd = sqrt(sum((data - sample_mean)**2) / (n - 1))
    model%sds = sample_sd

    allocate(responsibilities(n, K), new_weights(K), new_means(K), new_sds(K))
    
    prev_loglik = -huge(1.0_dp)
    converged = .false.

    do iter = 1, max_iter
      ! E-step
      do i = 1, n
        component_sum = 0.0_dp
        do comp = 1, K
          responsibilities(i, comp) = model%weights(comp) * normal_pdf(data(i), model%means(comp), model%sds(comp))
          component_sum = component_sum + responsibilities(i, comp)
        end do
        responsibilities(i, :) = responsibilities(i, :) / component_sum
      end do

      ! M-step
      do comp = 1, K
        new_weights(comp) = sum(responsibilities(:, comp)) / n
        new_means(comp) = sum(responsibilities(:, comp) * data) / sum(responsibilities(:, comp))
        new_sds(comp) = sqrt(sum(responsibilities(:, comp) * (data - new_means(comp))**2) / sum(responsibilities(:, comp)))
      end do

      model%weights = new_weights
      model%means = new_means
      model%sds = new_sds

      ! Compute log-likelihood
      current_loglik = 0.0_dp
      do i = 1, n
        component_sum = 0.0_dp
        do comp = 1, K
          component_sum = component_sum + model%weights(comp) * normal_pdf(data(i), model%means(comp), model%sds(comp))
        end do
        current_loglik = current_loglik + log(component_sum)
      end do

      if (abs(current_loglik - prev_loglik) < tol) then
        converged = .true.
        exit
      end if
      prev_loglik = current_loglik
    end do

    deallocate(responsibilities, new_weights, new_means, new_sds)
  end subroutine fit_mixture

  subroutine print_mixture(model)
    type(MixtureModel), intent(in) :: model
    integer :: comp
    print *, "Component | Weight     | Mean       | Std Dev"
    print *, "--------------------------------------------"
    do comp = 1, model%K
      print '(I9, 3F12.6)', comp, model%weights(comp), model%means(comp), model%sds(comp)
    end do
  end subroutine print_mixture

end module mixture_module

program test_mixture
  use mixture_module
  implicit none

  integer, parameter :: K_true = 2
  integer, parameter :: N = 1000
  real(dp) :: data(N)
  type(MixtureModel) :: true_model, estimated_model
  integer :: i
  real(dp) :: u

  ! Set true parameters
  true_model%K = K_true
  allocate(true_model%weights(K_true), true_model%means(K_true), true_model%sds(K_true))
  true_model%weights = [0.3_dp, 0.7_dp]
  true_model%means = [0.0_dp, 5.0_dp]
  true_model%sds = [1.0_dp, 1.0_dp]

  ! Generate data
  call random_seed()
  do i = 1, N
    call random_number(u)
    if (u < true_model%weights(1)) then
      data(i) = true_model%means(1) + true_model%sds(1) * normal_rand()
    else
      data(i) = true_model%means(2) + true_model%sds(2) * normal_rand()
    end if
  end do

  ! Fit the mixture model
  call fit_mixture(data, K_true, 100, 1.0e-6_dp, estimated_model)

  ! Print results
  print *, "True parameters:"
  call print_mixture(true_model)
  print *, "Estimated parameters:"
  call print_mixture(estimated_model)

contains

  function normal_rand() result(z)
    real(dp) :: z, u1, u2, r
    call random_number(u1)
    call random_number(u2)
    r = sqrt(-2.0_dp * log(u1))
    z = r * cos(2.0_dp * acos(-1.0_dp) * u2)
  end function normal_rand

end program test_mixture

which compiles and gives sample output

 True parameters:
 Component | Weight     | Mean       | Std Dev
 --------------------------------------------
        1    0.300000    0.000000    1.000000
        2    0.700000    5.000000   10.000000
 Estimated parameters:
 Component | Weight     | Mean       | Std Dev
 --------------------------------------------
        1    0.711504    5.011159    9.874043
        2    0.288496    0.073350    1.129037
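
The estimated components come out in the opposite order from the true ones. That is expected: the mixture likelihood is unchanged when the components are relabeled, so EM may return them in any order. Here is a minimal Python sketch, using the numbers printed above, that sorts both sets of components by mean before comparing them. The tuple layout and the helper name align_by_mean are assumptions for illustration.

```python
# Each component is (weight, mean, sd), copied from the output above.
true_params = [(0.3, 0.0, 1.0), (0.7, 5.0, 10.0)]
estimated = [(0.711504, 5.011159, 9.874043), (0.288496, 0.073350, 1.129037)]

def align_by_mean(components):
    """Return the components sorted by mean so that labels are comparable."""
    return sorted(components, key=lambda c: c[1])

# Pair up components with matching rank of the mean and print the differences.
for t, e in zip(align_by_mean(true_params), align_by_mean(estimated)):
    errors = tuple(round(ev - tv, 6) for tv, ev in zip(t, e))
    print("true", t, "estimated", e, "error", errors)
```

Sorting by mean is a simple choice that works when the means are well separated, as they are here. With overlapping components one might match on weights or standard deviations instead.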

Google Gemini and other LLMs can compare two versions of a Fortran source file, describe the differences, and give a commit message. The Python script using Gemini

import subprocess
import sys
import google.generativeai as genai
import os
import textwrap

def print_wrapped_squeeze_blanks(text, width=80):
    # Split into lines and process each
    lines = text.split('\n')
    result = []
    prev_blank = False
    
    for line in lines:
        # Wrap long lines
        wrapped = [line] if len(line) <= width else textwrap.wrap(line, width)
        
        if line.strip():  # Non-blank line
            result.extend(wrapped)
            prev_blank = False
        elif not prev_blank:  # Blank line, but not after another blank
            result.append('')
            prev_blank = True
            
    print('\n'.join(result))

def get_git_diff(filename):
    """Get the diff between current file and last committed version."""
    try:
        # Get the diff using git command
        diff = subprocess.check_output(
            ['git', 'diff', 'HEAD', filename],
            text=True,
            stderr=subprocess.STDOUT
        )
        return diff if diff else "No changes detected in the file."
    except subprocess.CalledProcessError as e:
        return f"Error getting diff: {e.output}"
    except Exception as e:
        return f"Error: {str(e)}"

def generate_commit_message(diff):
    """Generate a commit message using Gemini API."""
    with open(r"c:\python\code\gemini_key.txt", "r") as key_file:
        api_key = key_file.read().strip()
    
    genai.configure(api_key=api_key)
    model_name = "gemini-1.5-flash"
    model = genai.GenerativeModel(model_name)  # Using a fast model
    
    # Create prompt for Gemini
    prompt = (
        "Describe the substantive differences between the local and committed "
        "versions of the file, given this git diff:\n\n"
        f"{diff}\n\n"  # keep the diff visually separate from the instructions
        "Then provide a one-line commit message, with the commit message appearing on its own line."
    )
    
    try:
        response = model.generate_content(prompt)
        return response.text.strip()
    except Exception as e:
        return f"Error generating commit message: {str(e)}"

def main():
    # Check if filename is provided as command line argument
    if len(sys.argv) != 2:
        print("Usage: python script.py <filename>")
        sys.exit(1)
    
    filename = sys.argv[1]
    
    # Verify file exists
    if not os.path.exists(filename):
        print(f"Error: File '{filename}' not found")
        sys.exit(1)
    
    # Get the diff
    diff = get_git_diff(filename)
    
    # Generate and print the commit message
    commit_message = generate_commit_message(diff)
    print_wrapped_squeeze_blanks(commit_message)

if __name__ == "__main__":
    main()

run on a source file basic_stats.f90 gave

The committed version adds functions and subroutines for computing and printing covariance matrices (cov, cov_mat, print_cov_mat). It also modifies the basic_stats module to include covariance-related variables.

Add covariance functions and subroutines
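
Because the prompt asks for the commit message on its own line, the last non-blank line of the response can be pulled out and handed to git. A hedged sketch: the helper extract_commit_message is hypothetical and not part of the script above, and it assumes the model really does put the message last, which is not guaranteed.

```python
def extract_commit_message(response_text: str) -> str:
    """Return the last non-blank line of an LLM response, or '' if there is none."""
    lines = [line.strip() for line in response_text.splitlines() if line.strip()]
    return lines[-1] if lines else ""

# The response shown above, used as sample input.
response = (
    "The committed version adds functions and subroutines for computing and "
    "printing covariance matrices (cov, cov_mat, print_cov_mat). It also modifies "
    "the basic_stats module to include covariance-related variables.\n"
    "\n"
    "Add covariance functions and subroutines\n"
)

message = extract_commit_message(response)
print(message)
```

The extracted line could then be passed to git commit -m, preferably after a human has read it.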

I previously mentioned a simple Groq cloud Fortran coding agent. There are now analogs for Gemini and OpenAI. The latter requires a credit card, and the repo gives an example prompt and the cost ($0.07) of iterating until compilable code was generated.
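
The loop such agents run (generate code, compile it, feed the compiler errors back, repeat) can be sketched as follows. This is not the code from those repos. Here generate stands in for any LLM call and compile_fn for a compiler invocation; both are hypothetical callables passed in so that the sketch runs without an API key. The gfortran_check helper assumes gfortran is on the PATH and is not used in the stub demo.

```python
import os
import subprocess
import tempfile
from typing import Callable, Optional, Tuple

def gfortran_check(source: str) -> Tuple[bool, str]:
    """Compile source with gfortran (assumed on PATH); return (ok, compiler messages)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "main.f90")
        with open(path, "w") as f:
            f.write(source)
        proc = subprocess.run(
            ["gfortran", "-c", path, "-o", os.path.join(tmp, "main.o")],
            capture_output=True, text=True, cwd=tmp,  # .mod files land in tmp
        )
        return proc.returncode == 0, proc.stderr

def iterate_until_compiles(
    prompt: str,
    generate: Callable[[str], str],
    compile_fn: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 5,
) -> Tuple[Optional[str], int]:
    """Ask for code, compile it, and feed errors back until it compiles.

    Returns (source, attempts) on success or (None, max_attempts) on failure.
    """
    request = prompt
    for attempt in range(1, max_attempts + 1):
        source = generate(request)
        ok, messages = compile_fn(source)
        if ok:
            return source, attempt
        # Build the follow-up request from the original prompt plus the errors.
        request = (
            f"{prompt}\n\nYour previous code:\n{source}\n\n"
            f"Compiler errors:\n{messages}\n\nPlease fix the code."
        )
    return None, max_attempts

# Stub demo: the fake model "fixes" its code once it sees compiler errors.
def fake_generate(request: str) -> str:
    if "Compiler errors" in request:
        return "program ok\nend program ok\n"
    return "progrm bad\n"

def fake_compile(source: str) -> Tuple[bool, str]:
    if source.startswith("program"):
        return True, ""
    return False, "Error: Unclassifiable statement"

source, attempts = iterate_until_compiles(
    "Write a Fortran program.", fake_generate, fake_compile
)
print("compiled after", attempts, "attempt(s)")
```

To try it against a real compiler, pass gfortran_check as compile_fn and replace fake_generate with a function that calls the LLM of your choice. Capping max_attempts also caps the API cost.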