Deepseek and Grok LLMs for Fortran coding

Deepseek is a new open-source LLM that is good at Fortran programming; I think it is comparable to ChatGPT o1.

Using the free chat interface to give the task

Write a Fortran subroutine that fits a finite mixture of normals with a specified # of components to univariate data, using the EM algorithm, and write a main program that tests it for data simulated from a known mixture distribution.

and feeding compiler error messages back to Deepseek, I soon got the working program below. For production use you would probably tell the LLM to assume that common algorithms such as generating normal deviates or sorting are already available, for example in stdlib, so that the LLM does not reinvent the wheel.
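
As a rough illustration of that hint (my sketch, not part of Deepseek's output), the generated code could then lean on stdlib instead of hand-rolled Box–Muller and bubble-sort routines; the module and procedure names below are those of recent stdlib releases, so check the stdlib docs for the exact interfaces.

program stdlib_hint
    ! minimal sketch: draw normal deviates and sort them with stdlib
    use stdlib_kinds, only: dp
    use stdlib_stats_distribution_normal, only: rvs_normal
    use stdlib_sorting, only: sort
    implicit none
    real(dp) :: x(1000)
    integer :: i

    do i = 1, size(x)
        x(i) = rvs_normal(0.0_dp, 1.0_dp)  ! one N(0,1) deviate per call
    end do
    call sort(x)
    print *, "smallest, largest:", x(1), x(size(x))
end program stdlib_hint

The program Deepseek actually produced, without that hint, is below.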

module mixture_of_normals
    implicit none
    integer, parameter :: dp = kind(1.0d0)  ! Double precision kind
    real(dp), parameter :: pi = 3.141592653589793238462643383279502884197_dp

contains

    ! Subroutine to fit a finite mixture of normals using the EM algorithm
    subroutine fit_mixture_of_normals(data, num_components, weights, means, variances, max_iter, tol, debug)
        real(dp), intent(in) :: data(:)  ! Input data
        integer, intent(in) :: num_components  ! Number of components
        real(dp), intent(out) :: weights(:), means(:), variances(:)  ! Output parameters
        integer, intent(in) :: max_iter  ! Maximum number of iterations
        real(dp), intent(in) :: tol  ! Convergence tolerance
        logical, intent(in), optional :: debug  ! Debugging flag
        logical :: debug_  ! Local debugging flag
        real(dp), allocatable :: responsibilities(:,:), log_likelihoods(:)
        real(dp) :: log_likelihood, prev_log_likelihood
        integer :: n, iter

        ! Initialize local debugging flag
        debug_ = .false.
        if (present(debug)) debug_ = debug

        ! Number of data points
        n = size(data)

        ! Allocate memory for responsibilities and log-likelihoods
        allocate(responsibilities(n, num_components), log_likelihoods(n))

        ! Initialize parameters
        call initialize_parameters(data, num_components, weights, means, variances)

        ! EM algorithm
        prev_log_likelihood = -huge(1.0_dp)
        do iter = 1, max_iter
            ! E-step: Compute responsibilities
            call compute_responsibilities(data, weights, means, variances, responsibilities, log_likelihoods)

            ! Compute log-likelihood
            log_likelihood = sum(log_likelihoods)

            ! Check for convergence
            if (abs(log_likelihood - prev_log_likelihood) < tol) exit

            ! M-step: Update parameters
            call update_parameters(data, responsibilities, weights, means, variances)

            ! Debugging: Print current parameters
            if (debug_) then
                print '(a20, i0)', "Iteration:", iter
                print '(a20, 100(f12.6, 1x))', "Weights:", weights
                print '(a20, 100(f12.6, 1x))', "Means:", means
                print '(a20, 100(f12.6, 1x))', "Variances:", variances
                print '(a20, f12.6)', "Log-likelihood:", log_likelihood
                print *, "----------------------------------------"
            end if

            prev_log_likelihood = log_likelihood
        end do

        deallocate(responsibilities, log_likelihoods)
    end subroutine fit_mixture_of_normals

    ! Initialize parameters
    subroutine initialize_parameters(data, num_components, weights, means, variances)
        real(dp), intent(in) :: data(:)
        integer, intent(in) :: num_components
        real(dp), intent(out) :: weights(:), means(:), variances(:)
        integer :: i, n  ! Declare 'n' here
        real(dp) :: data_min, data_max

        ! Number of data points
        n = size(data)
        data_min = minval(data)
        data_max = maxval(data)

        ! Initialize weights uniformly
        weights = 1.0_dp / num_components

        ! Initialize means and variances
        do i = 1, num_components
            means(i) = data_min + (data_max - data_min) * (i - 1) / (num_components - 1)
            variances(i) = 1.0_dp
        end do
    end subroutine initialize_parameters

    ! Compute responsibilities (E-step)
    subroutine compute_responsibilities(data, weights, means, variances, responsibilities, log_likelihoods)
        real(dp), intent(in) :: data(:), weights(:), means(:), variances(:)
        real(dp), intent(out) :: responsibilities(:,:), log_likelihoods(:)
        integer :: n, num_components, i, j
        real(dp), allocatable :: likelihood(:,:), total_likelihood(:)

        n = size(data)
        num_components = size(weights)

        ! Allocate memory for likelihood and total_likelihood
        allocate(likelihood(n, num_components), total_likelihood(n))

        ! Compute likelihood for each component
        do j = 1, num_components
            do i = 1, n
                likelihood(i, j) = weights(j) * normal_pdf(data(i), means(j), variances(j))
            end do
        end do

        ! Compute total likelihood for each data point
        total_likelihood = sum(likelihood, dim=2)

        ! Compute responsibilities
        do j = 1, num_components
            responsibilities(:, j) = likelihood(:, j) / total_likelihood
        end do

        ! Compute log-likelihood
        log_likelihoods = log(total_likelihood)

        deallocate(likelihood, total_likelihood)
    end subroutine compute_responsibilities

    ! Update parameters (M-step)
    subroutine update_parameters(data, responsibilities, weights, means, variances)
        real(dp), intent(in) :: data(:), responsibilities(:,:)
        real(dp), intent(inout) :: weights(:), means(:), variances(:)
        integer :: n, num_components, j
        real(dp), allocatable :: sum_responsibilities(:)

        n = size(data)
        num_components = size(weights)

        ! Allocate memory for sum_responsibilities
        allocate(sum_responsibilities(num_components))

        ! Update weights
        sum_responsibilities = sum(responsibilities, dim=1)
        weights = sum_responsibilities / n

        ! Update means
        do j = 1, num_components
            means(j) = sum(data * responsibilities(:, j)) / sum_responsibilities(j)
        end do

        ! Update variances
        do j = 1, num_components
            variances(j) = sum((data - means(j))**2 * responsibilities(:, j)) / sum_responsibilities(j)
        end do

        deallocate(sum_responsibilities)
    end subroutine update_parameters

    ! Probability density function of a normal distribution
    function normal_pdf(x, mean, variance) result(pdf)
        real(dp), intent(in) :: x, mean, variance
        real(dp) :: pdf

        pdf = exp(-0.5_dp * (x - mean)**2 / variance) / sqrt(2.0_dp * pi * variance)
    end function normal_pdf

end module mixture_of_normals

program test_mixture_of_normals
    use mixture_of_normals
    implicit none
    integer, parameter :: n = 10**5  ! Number of data points
    integer, parameter :: num_components = 2  ! Number of components
    integer, parameter :: num_simulations = 5  ! Number of simulations
    real(dp) :: data(n)
    real(dp) :: weights(num_components), means(num_components), variances(num_components)
    real(dp) :: true_weights(num_components), true_means(num_components), true_variances(num_components)
    integer :: i, j, idx(num_components)

    ! True parameters of the mixture distribution
    true_weights = [0.99_dp, 0.01_dp]
    true_means = [0.0_dp, 0.0_dp]
    true_variances = [1.0_dp, 100.0_dp]
    print*,"#obs:", n ! line added by me
    ! Print true parameters once
    print '(a20, a)', "True Parameters:", ""
    print '(a20, 100(f12.6, 1x))', "Weights:", true_weights
    print '(a20, 100(f12.6, 1x))', "Means:", true_means
    print '(a20, 100(f12.6, 1x))', "Variances:", true_variances
    print *, "========================================"

    ! Loop for multiple simulations and fittings
    do i = 1, num_simulations
        print '(a, i0)', "Simulation:", i
        print *, "----------------------------------------"

        ! Simulate data from a known mixture distribution
        call simulate_mixture_data(data, true_weights, true_means, true_variances)

        ! Fit mixture of normals to the data
        call fit_mixture_of_normals(data, num_components, weights, means, variances, max_iter=100, tol=1.0e-6_dp, &
           debug=.false.)

        ! Sort components by weight in descending order
        do j = 1, num_components
            idx(j) = j
        end do
        call sort_by_weight(weights, means, variances, idx, num_components)

        ! Print estimated parameters
        print '(a20, a)', "Estimated Parameters:", ""
        print '(a20, 100(f12.6, 1x))', "Weights:", weights
        print '(a20, 100(f12.6, 1x))', "Means:", means
        print '(a20, 100(f12.6, 1x))', "Variances:", variances
        print *, "========================================"
    end do

contains

    ! Simulate data from a known mixture distribution
    subroutine simulate_mixture_data(data, true_weights, true_means, true_variances)
        real(dp), intent(out) :: data(:)
        real(dp), intent(in) :: true_weights(:), true_means(:), true_variances(:)
        integer :: i

        ! Simulate data
        do i = 1, n
            call random_number(data(i))
            if (data(i) < true_weights(1)) then
                data(i) = true_means(1) + sqrt(true_variances(1)) * randn()
            else
                data(i) = true_means(2) + sqrt(true_variances(2)) * randn()
            end if
        end do
    end subroutine simulate_mixture_data

    ! Random number generator for standard normal distribution
    function randn() result(r)
        real(dp) :: r
        real(dp) :: u1, u2

        call random_number(u1)
        call random_number(u2)
        r = sqrt(-2.0_dp * log(u1)) * cos(2.0_dp * pi * u2)
    end function randn

    ! Sort components by weight in descending order
    subroutine sort_by_weight(weights, means, variances, idx, num_components)
        real(dp), intent(inout) :: weights(:), means(:), variances(:)
        integer, intent(inout) :: idx(:)
        integer, intent(in) :: num_components
        integer :: i, j
        real(dp) :: temp_weight, temp_mean, temp_variance
        integer :: temp_idx

        ! Bubble sort (simple sorting algorithm)
        do i = 1, num_components - 1
            do j = i + 1, num_components
                if (weights(i) < weights(j)) then
                    ! Swap weights
                    temp_weight = weights(i)
                    weights(i) = weights(j)
                    weights(j) = temp_weight

                    ! Swap means
                    temp_mean = means(i)
                    means(i) = means(j)
                    means(j) = temp_mean

                    ! Swap variances
                    temp_variance = variances(i)
                    variances(i) = variances(j)
                    variances(j) = temp_variance

                    ! Swap indices
                    temp_idx = idx(i)
                    idx(i) = idx(j)
                    idx(j) = temp_idx
                end if
            end do
        end do
    end subroutine sort_by_weight

end program test_mixture_of_normals

with sample output

 #obs:      100000
    True Parameters:
            Weights:    0.990000     0.010000
              Means:    0.000000     0.000000
          Variances:    1.000000   100.000000
 ========================================
Simulation:1
 ----------------------------------------
Estimated Parameters
            Weights:    0.989977     0.010023
              Means:    0.002358     0.044858
          Variances:    0.999245    98.262609
 ========================================
Simulation:2
 ----------------------------------------
Estimated Parameters
            Weights:    0.989784     0.010216
              Means:    0.004232     0.504152
          Variances:    1.001481   101.687353
 ========================================
Simulation:3
 ----------------------------------------
Estimated Parameters
            Weights:    0.989921     0.010079
              Means:   -0.001570     0.154525
          Variances:    1.003607   107.735091
 ========================================
Simulation:4
 ----------------------------------------
Estimated Parameters
            Weights:    0.989776     0.010224
              Means:   -0.001740    -0.078073
          Variances:    0.993489   101.462980
 ========================================
Simulation:5
 ----------------------------------------
Estimated Parameters
            Weights:    0.989377     0.010623
              Means:    0.001496     0.394484
          Variances:    0.995740    96.620077
 ========================================

I think we will soon see agents that use Deepseek to solve coding tasks, so that the user never sees code that does not compile.

5 Likes

Deepseek will spur OpenAI to release cheaper, more powerful models. Today they released ChatGPT o3-mini-high to $20/month subscribers. A video about it is here.

Using almost the same prompt as before

Write a Fortran subroutine in a module that fits a finite mixture of normals with a specified # of components to univariate data, using the EM algorithm, and write a main program that tests it for data simulated from a known mixture distribution. Declare real variables as real(kind=dp) with

integer, parameter :: dp = kind(1.0d0)

gave, after two iterations of gfortran error messages, the program

module mixture_model
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  subroutine fit_mixture(x, k, pi, mu, sigma2, max_iter, tol)
    implicit none
    ! Input arguments:
    real(kind=dp), intent(in)  :: x(:)       ! Data vector
    integer, intent(in)       :: k          ! Number of mixture components
    integer, intent(in), optional :: max_iter ! Maximum iterations
    real(kind=dp), intent(in), optional :: tol  ! Convergence tolerance

    ! Output arguments:
    real(kind=dp), intent(out) :: pi(k)      ! Mixing proportions
    real(kind=dp), intent(out) :: mu(k)      ! Means
    real(kind=dp), intent(out) :: sigma2(k)  ! Variances

    ! Local variables:
    integer :: n, i, j, iter, maxIters
    real(kind=dp) :: tolerance, sum_val, pdf_val, norm_const
    real(kind=dp) :: prev_ll, ll, diff
    real(kind=dp) :: rtmp, sum_gamma, tmp
    real(kind=dp), allocatable :: gamma(:,:)
    real(kind=dp), parameter :: pi_const = acos(-1.0_dp)

    n = size(x)
    if (present(max_iter)) then
      maxIters = max_iter
    else
      maxIters = 1000
    end if
    if (present(tol)) then
      tolerance = tol
    else
      tolerance = 1.0d-6
    end if

    allocate(gamma(n, k))

    !-------------------------------------------------
    ! Initialization
    !-------------------------------------------------
    ! Set equal mixing proportions.
    pi = 1.0_dp / real(k, dp)
    ! Initialize the means by choosing k random data points.
    do j = 1, k
      call random_number(rtmp)
      i = int(rtmp * n) + 1
      mu(j) = x(i)
    end do
    ! Initialize variances using the overall data variance.
    do j = 1, k
      sigma2(j) = sum( (x - mu(j))**2 ) / real(n, dp)
      if (sigma2(j) < 1.0d-6) sigma2(j) = 1.0d-6
    end do

    prev_ll = -huge(1.0_dp)
    !-------------------------------------------------
    ! EM Iterations
    !-------------------------------------------------
    do iter = 1, maxIters
      ! E-step: compute responsibilities gamma(i,j)
      do i = 1, n
        sum_val = 0.0_dp
        do j = 1, k
          norm_const = 1.0_dp / sqrt(2.0_dp * pi_const * sigma2(j))
          pdf_val = norm_const * exp(-0.5_dp * (x(i) - mu(j))**2 / sigma2(j))
          gamma(i, j) = pi(j) * pdf_val
          sum_val = sum_val + gamma(i, j)
        end do
        if (sum_val > 0.0_dp) then
          gamma(i, :) = gamma(i, :) / sum_val
        else
          gamma(i, :) = 1.0_dp / real(k, dp)
        end if
      end do

      ! M-step: update parameters using the responsibilities.
      do j = 1, k
        sum_gamma = sum(gamma(:, j))
        pi(j) = sum_gamma / real(n, dp)
        mu(j) = sum(gamma(:, j) * x) / sum_gamma
        sigma2(j) = sum(gamma(:, j) * (x - mu(j))**2) / sum_gamma
        if (sigma2(j) < 1.0d-6) sigma2(j) = 1.0d-6
      end do

      ! Compute the log-likelihood for convergence check.
      ll = 0.0_dp
      do i = 1, n
        tmp = 0.0_dp
        do j = 1, k
          norm_const = 1.0_dp / sqrt(2.0_dp * pi_const * sigma2(j))
          pdf_val = norm_const * exp(-0.5_dp * (x(i) - mu(j))**2 / sigma2(j))
          tmp = tmp + pi(j) * pdf_val
        end do
        ll = ll + log(tmp + 1.0d-12)
      end do

      diff = abs(ll - prev_ll)
      if (diff < tolerance) exit
      prev_ll = ll
    end do

    deallocate(gamma)
  end subroutine fit_mixture
end module mixture_model

program test_mixture
  use mixture_model
  implicit none
  ! No need to redeclare dp here because it's imported from mixture_model.
  integer, parameter :: n = 1000          ! Number of data points to simulate
  integer, parameter :: k_true = 2        ! True number of components
  integer, parameter :: k_fit  = 2        ! Number of components to fit

  real(kind=dp), allocatable :: x(:)
  real(kind=dp), dimension(k_fit) :: pi_est, mu_est, sigma2_est
  real(kind=dp) :: u, u1, u2
  real(kind=dp), dimension(k_true) :: pi_true, mu_true, sigma_true
  real(kind=dp), parameter :: pi_const = acos(-1.0_dp)
  integer :: i

  ! Set true mixture parameters.
  pi_true    = (/ 0.3_dp, 0.7_dp /)
  mu_true    = (/ 0.0_dp, 5.0_dp /)
  sigma_true = (/ 1.0_dp, 2.0_dp /)

  allocate(x(n))
  call random_seed()  ! Initialize the random number generator

  ! Simulate n data points from the true mixture.
  do i = 1, n
    call random_number(u)
    if (u < pi_true(1)) then
      call random_number(u1)
      call random_number(u2)
      x(i) = mu_true(1) + sigma_true(1) * sqrt(-2.0_dp * log(u1)) * &
             cos(2.0_dp * pi_const * u2)
    else
      call random_number(u1)
      call random_number(u2)
      x(i) = mu_true(2) + sigma_true(2) * sqrt(-2.0_dp * log(u1)) * &
             cos(2.0_dp * pi_const * u2)
    end if
  end do

  ! Call the EM algorithm to fit a mixture of k_fit normals.
  call fit_mixture(x, k_fit, pi_est, mu_est, sigma2_est)

  ! Print out the estimated parameters.
  print*, "Estimated mixing proportions:"
  do i = 1, k_fit
     print*, "  Component", i, ":", pi_est(i)
  end do

  print*, "Estimated means:"
  do i = 1, k_fit
     print*, "  Component", i, ":", mu_est(i)
  end do

  print*, "Estimated variances:"
  do i = 1, k_fit
     print*, "  Component", i, ":", sigma2_est(i)
  end do

  deallocate(x)
end program test_mixture

which appears to work correctly, giving sample output

 Estimated mixing proportions:
   Component           1 :  0.30593190730133246     
   Component           2 :  0.69406809269866721     
 Estimated means:
   Component           1 :  -5.9430891191276567E-004
   Component           2 :   5.1418473613103606     
 Estimated variances:
   Component           1 :  0.98673179043578729     
   Component           2 :   4.3167008671496028
1 Like

Nice! They both look like reasonable implementations. What I notice is that they have none of Fortran's advanced constructs (pure, elemental, etc.); probably there's not enough such content on GitHub!

3 Likes

The fault is never in the LLM but in our prompts :slight_smile: – I could have initially asked it to declare procedures as pure or elemental where possible. When I asked it to do so for the code it had generated, it declared normal_pdf as pure elemental and wrote

  ! Subroutine to fit a finite mixture of normals via the EM algorithm.
  ! This subroutine is not declared pure because it calls RANDOM_NUMBER.
  !--------------------------------------------------------------------
  subroutine fit_mixture(x, k, pi, mu, sigma2, max_iter, tol)

Starting a new session and prompting it to use more features of modern Fortran with

Write a Fortran subroutine in a module that fits a finite mixture of normals with a specified # of components to univariate data, using the EM algorithm, and write a main program that tests it for data simulated from a known mixture distribution. Declare real variables as real(kind=dp) with

integer, parameter :: dp = kind(1.0d0)

Define a derived type to store the parameters of a finite mixture, and write a subroutine to display the parameters in a formatted table. Use the subroutine to display the true and estimated parameters. Declare procedures as pure or elemental when feasible.

gave, after 4 iterations of being shown gfortran error messages,

module mixture_mod
  implicit none
  ! Declare double-precision kind and constant pi.
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: pi = 3.14159265358979323846_dp

  ! Derived type to store finite mixture parameters.
  type :: mixture_type
     integer :: ncomp
     real(dp), allocatable :: pi(:)
     real(dp), allocatable :: mu(:)
     real(dp), allocatable :: sigma(:)
  end type mixture_type

contains

  !---------------------------------------------------------------------
  ! Elemental pure function for the Normal density.
  elemental pure function normpdf(x, mu, sigma) result(val)
    real(dp), intent(in) :: x, mu, sigma
    real(dp) :: val
    val = 1.0_dp / (sqrt(2.0_dp*pi)*sigma) * exp(-0.5_dp*((x-mu)/sigma)**2)
  end function normpdf
  !---------------------------------------------------------------------

  !---------------------------------------------------------------------
  ! Subroutine to fit a finite mixture of normals using the EM algorithm.
  subroutine fit_mixture(data, k, mix)
    implicit none
    integer, parameter :: max_iter = 1000
    real(dp), intent(in) :: data(:)
    integer, intent(in) :: k
    type(mixture_type), intent(out) :: mix

    ! Declaration section.
    integer :: n, iter, i, j
    real(dp) :: tol, diff, loglik, loglik_old, pdf_val, weight_j, sum_pdf
    real(dp), allocatable :: gamma(:,:)
    real(dp) :: x_min, x_max, range

    n = size(data)
    tol = 1.0e-6_dp

    ! Allocate the responsibility (gamma) matrix.
    allocate(gamma(n,k))

    ! Allocate arrays in the derived type and set the number of components.
    mix%ncomp = k
    allocate(mix%pi(k))
    allocate(mix%mu(k))
    allocate(mix%sigma(k))

    ! --- Initialization ---
    x_min = minval(data)
    x_max = maxval(data)
    range = x_max - x_min
    do j = 1, k
       mix%pi(j)    = 1.0_dp/k
       mix%mu(j)    = x_min + range*(j - 0.5_dp)/k
       mix%sigma(j) = range/(2.0_dp*k)
    end do

    loglik_old = -huge(1.0_dp)
    do iter = 1, max_iter
       loglik = 0.0_dp
       ! --- E-step: compute responsibilities ---
       do i = 1, n
          sum_pdf = 0.0_dp
          do j = 1, k
             pdf_val = mix%pi(j) * normpdf(data(i), mix%mu(j), mix%sigma(j))
             gamma(i, j) = pdf_val
             sum_pdf = sum_pdf + pdf_val
          end do
          if (sum_pdf > 0.0_dp) then
             gamma(i, :) = gamma(i, :) / sum_pdf
             loglik = loglik + log(sum_pdf)
          end if
       end do

       diff = abs(loglik - loglik_old)
       if (diff < tol) exit
       loglik_old = loglik

       ! --- M-step: update parameters ---
       do j = 1, k
          weight_j = sum(gamma(:, j))
          if (weight_j > 0.0_dp) then
             mix%mu(j) = sum(gamma(:, j)*data(:)) / weight_j
          end if
       end do

       do j = 1, k
          weight_j = sum(gamma(:, j))
          if (weight_j > 0.0_dp) then
             mix%sigma(j) = sqrt(sum(gamma(:, j)*(data(:) - mix%mu(j))**2) / weight_j)
          end if
       end do

       mix%pi = sum(gamma, dim=1) / n
    end do

    deallocate(gamma)
  end subroutine fit_mixture
  !---------------------------------------------------------------------

  !---------------------------------------------------------------------
  ! Subroutine to display the parameters in a formatted table.
  subroutine display_mixture(mix)
    implicit none
    type(mixture_type), intent(in) :: mix
    integer :: j

    print *
    print '(A)', "Component   Mixing Proportion         Mean         Std. Deviation"
    do j = 1, mix%ncomp
       print '(I9, 3(2X,F16.6))', j, mix%pi(j), mix%mu(j), mix%sigma(j)
    end do
    print *
  end subroutine display_mixture
  !---------------------------------------------------------------------

end module mixture_mod

!=====================================================================
program test_mixture
  use mixture_mod
  implicit none
  ! All declarations must come before any executable statements.
  integer, parameter :: n = 50000, k = 3
  integer :: i, comp
  real(dp), allocatable :: data(:)
  real(dp) :: u
  real(dp), dimension(k) :: true_pi, true_mu, true_sigma
  real(dp), dimension(k) :: cumulative
  type(mixture_type) :: est_mix, true_mix

  ! --- Define the "true" mixture parameters ---
  true_pi    = (/0.3_dp, 0.4_dp, 0.3_dp/)
  true_mu    = (/-2.0_dp, 0.0_dp, 3.0_dp/)
  true_sigma = (/1.0_dp, 0.5_dp, 1.5_dp/)

  allocate(data(n))

  ! Create cumulative probabilities for sampling.
  cumulative(1) = true_pi(1)
  do i = 2, k
     cumulative(i) = cumulative(i-1) + true_pi(i)
  end do

  call random_seed()

  ! --- Simulate data from the mixture ---
  do i = 1, n
     call random_number(u)
     comp = 1
     do while (comp <= k .and. u > cumulative(comp))
        comp = comp + 1
     end do
     ! Generate a sample from the comp-th normal using the Box–Muller method.
     data(i) = true_mu(comp) + true_sigma(comp)*box_muller()
  end do

  ! Pack the true parameters into a mixture_type variable.
  true_mix%ncomp = k
  allocate(true_mix%pi(k))
  allocate(true_mix%mu(k))
  allocate(true_mix%sigma(k))
  true_mix%pi    = true_pi
  true_mix%mu    = true_mu
  true_mix%sigma = true_sigma

  ! --- Fit the mixture model ---
  call fit_mixture(data, k, est_mix)

  ! --- Display the results ---
  print *, "True Mixture Parameters:"
  call display_mixture(true_mix)
  print *, "Estimated Mixture Parameters:"
  call display_mixture(est_mix)

contains

  !---------------------------------------------------------------------
  ! Function to generate a standard normal variate using the Box–Muller method.
  function box_muller() result(r)
    real(dp) :: r
    real(dp) :: u1, u2
    call random_number(u1)
    call random_number(u2)
    if (u1 == 0.0_dp) u1 = 1.0e-10_dp
    r = sqrt(-2.0_dp*log(u1)) * cos(2.0_dp*pi*u2)
  end function box_muller
  !---------------------------------------------------------------------

end program test_mixture

with sample output

 True Mixture Parameters:

Component   Mixing Proportion         Mean         Std. Deviation
        1          0.300000         -2.000000          1.000000
        2          0.400000          0.000000          0.500000
        3          0.300000          3.000000          1.500000

 Estimated Mixture Parameters:

Component   Mixing Proportion         Mean         Std. Deviation
        1          0.300766         -1.974683          1.001844
        2          0.394030          0.005373          0.488677
        3          0.305204          2.977401          1.499262

It takes ChatGPT o3-mini-high about a minute to respond to prompts, so the "debugging" took 5 minutes. To streamline AI code generation we should have an agent that feeds compiler error messages back to the AI until the code compiles. My ChatGPT-Fortran-generator project from 2 years ago did this, and I should revisit it with the latest OpenAI models.
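
As a sketch of the non-LLM half of such an agent (the file name, flags, and shell redirection are placeholders, and the redirection assumes a Unix-like shell), the compile-and-collect-diagnostics step can itself be written in Fortran:

program compile_check
    ! compile a candidate source file and print the diagnostics that an
    ! agent would feed back to the LLM
    implicit none
    integer :: stat, ios
    character(len=1000) :: line

    call execute_command_line("gfortran -c candidate.f90 2> errors.txt", exitstat=stat)
    if (stat == 0) then
        print *, "candidate.f90 compiled cleanly"
    else
        print *, "compiler diagnostics to send back:"
        open (unit=10, file="errors.txt", status="old", action="read")
        do
            read (10, '(a)', iostat=ios) line
            if (ios /= 0) exit
            print '(a)', trim(line)
        end do
        close (10)
    end if
end program compile_check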

3 Likes

I gave o3-mini-high the same task, simple code but with parallel execution, on which Code Copilot previously did not do a great job, generating invalid instructions like call sync all.
The new model did noticeably better, needing just one round of compiler-error feedback to reach working code. So, it's improving.
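
For reference, sync all is an image control statement rather than a subroutine, so call sync all does not compile. A minimal valid coarray fragment looks something like this (compile with gfortran -fcoarray=single, or with OpenCoarrays for multiple images):

program coarray_demo
    implicit none
    integer :: x[*]  ! scalar coarray: one copy per image
    x = this_image()
    sync all         ! image control statement, no "call"
    if (this_image() == 1) print *, "images:", num_images(), " x on last image:", x[num_images()]
end program coarray_demo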

1 Like

I asked it to give me code to solve a 2x4 mesh with MITC4 shell elements. It gave a skeleton code that detailed the steps needed.

Then I asked for the details of the stiffness matrix of the MITC4 shell. With a slow internet connection the server is often busy, and I have to ask again in a new session.

When combining the two codes, I found that X has two meanings: in the first code it holds all 8 nodes, while in the second it holds only the 4 nodes of one element. So I have to take 4 nodes at a time and feed them into the second code.

And so on for the other details.

It seems I don't need to understand the details of how the matrix is calculated now; I just have to make sure the flow is right and check that the result is correct.

1 Like

Grok 3, from X/Twitter, has just been released, and it does well on some LLM benchmarks. It costs about $40/month. Grok 2 is free and seems good at Fortran coding. It is very fast. xAI has said it will open-source Grok 2 eventually.

The prompt

Do a Fortran simulation to find the optimal trimmed mean estimator of the location of the Cauchy distribution, trying trimming proportions of 0%, 10%, 20%, 30%, 40%, and 45%. Declare real variables as real(kind=dp) with dp a module constant, and put procedures in a module.

gave the code

module cauchy_trimmed_mean_module
    implicit none
    integer, parameter :: dp = kind(0.0d0) ! Double precision kind
    real(kind=dp), parameter :: PI = 4.0_dp * atan(1.0_dp)

contains

    function cauchy_pdf(x, location, scale) result(pdf)
        ! Probability density function of Cauchy distribution
        real(kind=dp), intent(in) :: x, location, scale
        real(kind=dp) :: pdf
        pdf = 1.0_dp / (PI * scale * (1.0_dp + ((x - location) / scale)**2))
    end function cauchy_pdf

    subroutine generate_cauchy_sample(n, location, scale, sample)
        ! Generate random sample from Cauchy distribution
        integer, intent(in) :: n
        real(kind=dp), intent(in) :: location, scale
        real(kind=dp), intent(out) :: sample(n)
        real(kind=dp) :: u
        integer :: i

        do i = 1, n
            call random_number(u)
            sample(i) = location + scale * tan(PI * (u - 0.5_dp))
        end do
    end subroutine generate_cauchy_sample

    function trimmed_mean(data, trim_prop) result(mean)
        ! Calculate trimmed mean for given trimming proportion
        real(kind=dp), intent(in) :: data(:)
        real(kind=dp), intent(in) :: trim_prop
        real(kind=dp) :: mean
        integer :: n, trim_num, i, j
        real(kind=dp), allocatable :: sorted_data(:)

        n = size(data)
        trim_num = int(trim_prop * n)
        allocate(sorted_data(n))

        ! Sort data
        sorted_data = data
        call quicksort(sorted_data)

        ! Calculate trimmed mean
        mean = sum(sorted_data(trim_num+1:n-trim_num)) / (n - 2*trim_num)

        deallocate(sorted_data)
    end function trimmed_mean

    recursive subroutine quicksort(arr)
        ! Quick sort subroutine
        real(kind=dp), intent(inout) :: arr(:)
        integer :: n, i, j
        real(kind=dp) :: pivot, temp
        n = size(arr)

        if (n > 1) then
            pivot = arr(n/2)
            i = 1
            j = n

            do while (i <= j)
                do while (arr(i) < pivot)
                    i = i + 1
                end do
                do while (arr(j) > pivot)
                    j = j - 1
                end do
                if (i <= j) then
                    temp = arr(i)
                    arr(i) = arr(j)
                    arr(j) = temp
                    i = i + 1
                    j = j - 1
                end if
            end do

            if (j > 0) call quicksort(arr(:j))
            if (i < n) call quicksort(arr(i:))
        end if
    end subroutine quicksort

    subroutine simulate_trimmed_mean(n_samples, n_simulations, location, scale, trim_props, results)
        ! Simulate and find optimal trimmed mean
        integer, intent(in) :: n_samples, n_simulations
        real(kind=dp), intent(in) :: location, scale, trim_props(:)
        real(kind=dp), intent(out) :: results(size(trim_props))
        real(kind=dp), allocatable :: sample(:), estimate(:)
        integer :: i, j
        real(kind=dp) :: mse, sum_sq_errors

        allocate(sample(n_samples), estimate(n_simulations))

        do i = 1, size(trim_props)
            sum_sq_errors = 0.0_dp
            do j = 1, n_simulations
                call generate_cauchy_sample(n_samples, location, scale, sample)
                estimate(j) = trimmed_mean(sample, trim_props(i))
                sum_sq_errors = sum_sq_errors + (estimate(j) - location)**2
            end do
            mse = sum_sq_errors / n_simulations
            results(i) = mse
        end do

        deallocate(sample, estimate)
    end subroutine simulate_trimmed_mean

end module cauchy_trimmed_mean_module

program main
    use cauchy_trimmed_mean_module
    implicit none
    integer, parameter :: n_samples = 1000, n_simulations = 100000
    real(kind=dp), parameter :: location = 0.0_dp, scale = 1.0_dp
    real(kind=dp), dimension(6) :: trim_props, mse_results
    integer :: i

    ! Trim proportions to test: 0%, 10%, 20%, 30%, 40%, 45%
    trim_props = [0.0_dp, 0.1_dp, 0.2_dp, 0.3_dp, 0.4_dp, 0.45_dp]

    ! Run simulation
    call simulate_trimmed_mean(n_samples, n_simulations, location, scale, trim_props, mse_results)

    ! Output results
    print *, "Simulation Results for Cauchy Distribution Location Estimation"
    print *, "True Location:", location
    print *, "Scale:", scale
    print *, "Number of Samples per Simulation:", n_samples
    print *, "Number of Simulations:", n_simulations
    print *, "Trim Proportion | Mean Squared Error"
    print *, "---------------|-------------------"
    do i = 1, size(trim_props)
        print *, trim_props(i)*100, "% |", mse_results(i)
    end do

    ! Find and print optimal trim proportion
    i = minloc(mse_results, dim=1)
    print *, ""
    print *, "Optimal Trim Proportion:", trim_props(i)*100, "%"
    print *, "With MSE:", mse_results(i)

end program main

which compiles and runs with gfortran (with some warnings about declared but unused variables), giving sample output

 Simulation Results for Cauchy Distribution Location Estimation
 True Location:   0.0000000000000000     
 Scale:   1.0000000000000000     
 Number of Samples per Simulation:        1000
 Number of Simulations:      100000
 Trim Proportion | Mean Squared Error
 ---------------|-------------------
   0.0000000000000000      % |   1430115.4351366321     
   10.000000000000000      % |   4.8349842361479342E-003
   20.000000000000000      % |   2.8873999307277897E-003
   30.000000000000000      % |   2.3813135474551627E-003
   40.000000000000000      % |   2.2826908544652394E-003
   45.000000000000000      % |   2.3443599454624460E-003
 
 Optimal Trim Proportion:   40.000000000000000      %
 With MSE:   2.2826908544652394E-003

According to the literature, the optimal trimming proportion is 38%, which is consistent with the simulation. The enormous MSE for 0% trimming is expected: the Cauchy distribution has no finite mean, so the untrimmed sample mean does not converge to the location parameter.

4 Likes

I have not paid for Grok 3, but now I have access to it. Like Grok 2 it is very fast. Giving it the prompt upthread

Write a Fortran subroutine in a module that fits a finite mixture of normals with a specified # of components to univariate data, using the EM algorithm, and write a main program that tests it for data simulated from a known mixture distribution. Declare real variables as real(kind=dp) with

integer, parameter :: dp = kind(1.0d0)

Define a derived type to store the parameters of a finite mixture, and write a subroutine to display the parameters in a formatted table. Use the subroutine to display the true and estimated parameters. Declare procedures as pure or elemental when feasible.

gave, after one iteration of feeding it error messages and my declaring the kind parameter dp public, the following working code:

module mixture_mod
    implicit none
    integer, parameter :: dp = kind(1.0d0)  ! Double precision kind
    private
    public :: fit_mixture, display_params, mixture_params, dp

    ! Derived type to store mixture parameters
    type mixture_params
        real(kind=dp), allocatable :: weights(:)  ! Mixture weights (sum to 1)
        real(kind=dp), allocatable :: means(:)    ! Component means
        real(kind=dp), allocatable :: variances(:) ! Component variances
    contains
        procedure :: init => initialize_params
    end type mixture_params

contains

    ! Subroutine to initialize mixture parameters (no longer PURE)
    subroutine initialize_params(this, n_components)
        class(mixture_params), intent(out) :: this
        integer, intent(in) :: n_components
        allocate(this%weights(n_components))
        allocate(this%means(n_components))
        allocate(this%variances(n_components))
        this%weights = 0.0_dp
        this%means = 0.0_dp
        this%variances = 1.0_dp  ! Default variance
    end subroutine initialize_params

    ! Subroutine to fit mixture of normals using EM algorithm
    subroutine fit_mixture(data, n_components, params, max_iter, tol)
        real(kind=dp), intent(in) :: data(:)  ! Input data
        integer, intent(in) :: n_components    ! Number of mixture components
        type(mixture_params), intent(out) :: params  ! Fitted parameters
        integer, intent(in) :: max_iter       ! Maximum iterations
        real(kind=dp), intent(in) :: tol      ! Convergence tolerance

        real(kind=dp), allocatable :: responsibilities(:,:)  ! N x K matrix
        real(kind=dp) :: log_likelihood, prev_log_likelihood
        integer :: n, k, iter, i
        real(kind=dp) :: sum_resp, norm_const

        n = size(data)  ! Number of data points
        allocate(responsibilities(n, n_components))

        ! Initialize parameters randomly
        call params%init(n_components)
        params%weights = 1.0_dp / n_components  ! Uniform weights
        params%means = linspace(minval(data), maxval(data), n_components)
        params%variances = 1.0_dp  ! Initial variances

        ! EM algorithm
        do iter = 1, max_iter
            prev_log_likelihood = log_likelihood

            ! E-step: Compute responsibilities
            log_likelihood = 0.0_dp
            do k = 1, n_components
                responsibilities(:, k) = params%weights(k) * normal_pdf(data, params%means(k), params%variances(k))
            end do

            ! Normalize responsibilities
            do i = 1, n
                norm_const = sum(responsibilities(i, :))
                if (norm_const > 0.0_dp) responsibilities(i, :) = responsibilities(i, :) / norm_const
                log_likelihood = log_likelihood + log(norm_const)
            end do

            ! Check convergence
            if (abs(log_likelihood - prev_log_likelihood) < tol) exit

            ! M-step: Update parameters
            do k = 1, n_components
                sum_resp = sum(responsibilities(:, k))
                params%weights(k) = sum_resp / n
                params%means(k) = sum(data * responsibilities(:, k)) / sum_resp
                params%variances(k) = sum(responsibilities(:, k) * (data - params%means(k))**2) / sum_resp
            end do

            if (any(params%variances <= 0.0_dp)) params%variances = max(params%variances, 1.0e-6_dp)  ! Avoid zero variance
        end do

        deallocate(responsibilities)

    contains

        ! Pure elemental function to compute normal PDF
        pure elemental function normal_pdf(x, mu, sigma2) result(pdf)
            real(kind=dp), intent(in) :: x, mu, sigma2
            real(kind=dp) :: pdf, sqrt_2pi, exponent
            sqrt_2pi = sqrt(2.0_dp * acos(-1.0_dp))
            exponent = -((x - mu)**2) / (2.0_dp * sigma2)
            pdf = exp(exponent) / (sqrt_2pi * sqrt(sigma2))
        end function normal_pdf

        ! Pure function to create linearly spaced points (helper)
        pure function linspace(start, end, n) result(points)
            real(kind=dp), intent(in) :: start, end
            integer, intent(in) :: n
            real(kind=dp) :: points(n)
            integer :: i
            real(kind=dp) :: step
            step = (end - start) / (n - 1)
            do i = 1, n
                points(i) = start + (i-1) * step
            end do
        end function linspace

    end subroutine fit_mixture

    ! Subroutine to display parameters in a formatted table (no longer PURE)
    subroutine display_params(params, label)
        type(mixture_params), intent(in) :: params
        character(len=*), intent(in) :: label
        integer :: k

        print *, "Mixture Parameters - ", trim(label)
        print *, "Component  Weight     Mean       Variance"
        print *, "-----------------------------------------"
        do k = 1, size(params%weights)
            write(*, '(I9, 3F12.6)') k, params%weights(k), params%means(k), params%variances(k)
        end do
        print *, "-----------------------------------------\n"

    end subroutine display_params

end module mixture_mod

program test_mixture
    use mixture_mod
    implicit none

    integer, parameter :: n_samples = 1000, n_components = 2
    real(kind=dp), allocatable :: data(:)
    type(mixture_params) :: true_params, estimated_params
    real(kind=dp) :: r
    integer :: i, seed
    real(kind=dp), parameter :: max_iter = 100, tol = 1.0e-6_dp

    ! Set random seed
    seed = 123456789
    call random_seed(seed)

    ! Simulate data from a known mixture
    allocate(data(n_samples))
    call true_params%init(n_components)

    ! True parameters -- I edited these
    true_params%weights = [0.1_dp, 0.9_dp]
    true_params%means = [-2.0_dp, 2.0_dp]
    true_params%variances = [5.0_dp, 0.5_dp]

    ! Generate mixed data
    do i = 1, n_samples
        call random_number(r)
        if (r < true_params%weights(1)) then
            call random_normal(data(i), true_params%means(1), sqrt(true_params%variances(1)))
        else
            call random_normal(data(i), true_params%means(2), sqrt(true_params%variances(2)))
        end if
    end do

    ! Fit mixture model
    call fit_mixture(data, n_components, estimated_params, int(max_iter), tol)

    ! Display results
    call display_params(true_params, "True")
    call display_params(estimated_params, "Estimated")

    deallocate(data)

contains

    ! Subroutine to generate random normal deviate
    subroutine random_normal(x, mu, sigma)
        real(kind=dp), intent(out) :: x
        real(kind=dp), intent(in) :: mu, sigma
        real(kind=dp) :: u1, u2

        call random_number(u1)
        call random_number(u2)
        x = mu + sigma * sqrt(-2.0_dp * log(u1)) * cos(2.0_dp * acos(-1.0_dp) * u2)
    end subroutine random_normal

end program test_mixture

with sample output

 Mixture Parameters - True
 Component  Weight     Mean       Variance
 -----------------------------------------
        1    0.100000   -2.000000    5.000000
        2    0.900000    2.000000    0.500000
 -----------------------------------------\n
 Mixture Parameters - Estimated
 Component  Weight     Mean       Variance
 -----------------------------------------
        1    0.095724   -1.645698    5.345505
        2    0.904276    2.011618    0.552886
 -----------------------------------------\n

Maybe it prints "\n" because it was trained on more C and C++ code, where ending a string this way is idiomatic, than on Fortran.
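
A small illustration (mine, not Grok's): the portable Fortran way to get that trailing blank line is the new_line intrinsic or simply a second print statement.

program newline_demo
    implicit none
    ! separator followed by a blank line, in one statement
    print '(a)', repeat('-', 41)//new_line('a')
    ! or in two statements: the separator, then an empty record
    print '(a)', repeat('-', 41)
    print *
end program newline_demo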

2 Likes

With the same prompt, Claude 3.7 Sonnet Extended ($20/month) thought for 2 minutes and gave a working program. Further iterations regarding formatting were much faster and produced the code

! Module for normal mixture model fitting using EM algorithm
module normal_mixture_em
    implicit none
    private
    
    ! Define double precision parameter
    integer, parameter, public :: dp = kind(1.0d0)
    
    ! Define constants
    real(kind=dp), parameter :: PI = 3.14159265358979323846_dp
    
    ! Define derived type for mixture model parameters
    type, public :: normal_mixture_type
        integer :: n_components                ! Number of components
        real(kind=dp), allocatable :: weights(:)     ! Mixing proportions
        real(kind=dp), allocatable :: means(:)       ! Component means
        real(kind=dp), allocatable :: stdevs(:)      ! Component standard deviations
    end type normal_mixture_type
    
    ! Public procedures
    public :: initialize_mixture, fit_mixture, display_parameters
    public :: generate_mixture_data, normal_pdf, log_likelihood
    public :: smart_initialize
    
contains
    ! Initialize mixture parameters
    subroutine initialize_mixture(mixture, n_components, init_weights, init_means, init_stdevs)
        type(normal_mixture_type), intent(inout) :: mixture
        integer, intent(in) :: n_components
        real(kind=dp), intent(in), optional :: init_weights(:)
        real(kind=dp), intent(in), optional :: init_means(:)
        real(kind=dp), intent(in), optional :: init_stdevs(:)
        
        mixture%n_components = n_components
        
        if (allocated(mixture%weights)) deallocate(mixture%weights)
        if (allocated(mixture%means)) deallocate(mixture%means)
        if (allocated(mixture%stdevs)) deallocate(mixture%stdevs)
        
        allocate(mixture%weights(n_components))
        allocate(mixture%means(n_components))
        allocate(mixture%stdevs(n_components))
        
        ! Initialize with default or provided values
        if (present(init_weights)) then
            mixture%weights = init_weights
        else
            mixture%weights = 1.0_dp / n_components
        end if
        
        if (present(init_means)) then
            mixture%means = init_means
        else
            mixture%means = 0.0_dp
        end if
        
        if (present(init_stdevs)) then
            mixture%stdevs = init_stdevs
        else
            mixture%stdevs = 1.0_dp
        end if
    end subroutine initialize_mixture
    
    ! Calculate normal PDF (pure elemental function)
    pure elemental function normal_pdf(x, mu, sigma) result(pdf_val)
        real(kind=dp), intent(in) :: x, mu, sigma
        real(kind=dp) :: pdf_val
        
        pdf_val = exp(-0.5_dp * ((x - mu) / sigma)**2) / (sigma * sqrt(2.0_dp * PI))
    end function normal_pdf
    
    ! Calculate mixture PDF for a single data point (pure function)
    pure function mixture_pdf(x, mixture) result(pdf_val)
        real(kind=dp), intent(in) :: x
        type(normal_mixture_type), intent(in) :: mixture
        real(kind=dp) :: pdf_val
        integer :: j
        
        pdf_val = 0.0_dp
        do j = 1, mixture%n_components
            pdf_val = pdf_val + mixture%weights(j) * normal_pdf(x, mixture%means(j), mixture%stdevs(j))
        end do
    end function mixture_pdf
    
    ! Calculate log-likelihood (pure function)
    pure function log_likelihood(data, mixture) result(loglik)
        real(kind=dp), intent(in) :: data(:)
        type(normal_mixture_type), intent(in) :: mixture
        real(kind=dp) :: loglik
        real(kind=dp) :: prob
        integer :: i, n
        
        n = size(data)
        loglik = 0.0_dp
        
        do i = 1, n
            prob = mixture_pdf(data(i), mixture)
            loglik = loglik + log(prob)
        end do
    end function log_likelihood
    
    ! EM algorithm to fit mixture model
    subroutine fit_mixture(mixture, data, max_iter, tol, loglik_history)
        type(normal_mixture_type), intent(inout) :: mixture
        real(kind=dp), intent(in) :: data(:)
        integer, intent(in) :: max_iter
        real(kind=dp), intent(in) :: tol
        real(kind=dp), allocatable, intent(out), optional :: loglik_history(:)
        
        real(kind=dp), allocatable :: responsibilities(:,:)
        real(kind=dp) :: old_loglik, new_loglik, diff
        integer :: n, iter
        
        n = size(data)
        
        ! Allocate responsibilities matrix
        allocate(responsibilities(n, mixture%n_components))
        
        ! First log-likelihood calculation
        old_loglik = log_likelihood(data, mixture)
        
        ! Allocate loglik_history if requested
        if (present(loglik_history)) then
            allocate(loglik_history(max_iter+1))
            loglik_history(1) = old_loglik
        end if
        
        ! EM iterations
        do iter = 1, max_iter
            ! E-step: Calculate responsibilities
            call e_step(mixture, data, responsibilities)
            
            ! M-step: Update parameters
            call m_step(mixture, data, responsibilities)
            
            ! Calculate new log-likelihood
            new_loglik = log_likelihood(data, mixture)
            
            ! Store log-likelihood if requested
            if (present(loglik_history)) then
                loglik_history(iter+1) = new_loglik
            end if
            
            ! Check convergence
            diff = new_loglik - old_loglik
            if (abs(diff) < tol) exit
            
            old_loglik = new_loglik
        end do
        
        ! Trim loglik_history if needed
        if (present(loglik_history) .and. iter < max_iter) then
            loglik_history = loglik_history(1:iter+1)
        end if
    end subroutine fit_mixture
    
    ! Smart initialization for EM algorithm
    subroutine smart_initialize(mixture, data)
        type(normal_mixture_type), intent(inout) :: mixture
        real(kind=dp), intent(in) :: data(:)
        real(kind=dp) :: data_min, data_max, step, data_mean, data_var
        integer :: i, n
        
        n = size(data)
        
        ! Simple initialization: spread means uniformly across data range
        data_min = minval(data)
        data_max = maxval(data)
        step = (data_max - data_min) / (mixture%n_components + 1)
        
        do i = 1, mixture%n_components
            mixture%means(i) = data_min + i * step
        end do
        
        ! Initialize weights to be equal
        mixture%weights = 1.0_dp / mixture%n_components
        
        ! Initialize stdevs to data standard deviation
        data_mean = sum(data) / n
        data_var = sum((data - data_mean)**2) / n
        mixture%stdevs = sqrt(data_var)
    end subroutine smart_initialize
    
    ! E-step: Calculate responsibilities
    subroutine e_step(mixture, data, resp)
        type(normal_mixture_type), intent(in) :: mixture
        real(kind=dp), intent(in) :: data(:)
        real(kind=dp), intent(out) :: resp(:,:)
        real(kind=dp) :: norm_const
        integer :: i, j, n
        
        n = size(data)
        
        do i = 1, n
            norm_const = 0.0_dp
            
            ! Calculate unnormalized responsibilities
            do j = 1, mixture%n_components
                resp(i, j) = mixture%weights(j) * normal_pdf(data(i), mixture%means(j), mixture%stdevs(j))
                norm_const = norm_const + resp(i, j)
            end do
            
            ! Normalize responsibilities
            if (norm_const > 0.0_dp) then
                resp(i,:) = resp(i,:) / norm_const
            else
                ! Handle numerical issues
                resp(i,:) = 1.0_dp / mixture%n_components
            end if
        end do
    end subroutine e_step
    
    ! M-step: Update parameters
    subroutine m_step(mixture, data, resp)
        type(normal_mixture_type), intent(inout) :: mixture
        real(kind=dp), intent(in) :: data(:)
        real(kind=dp), intent(in) :: resp(:,:)
        real(kind=dp) :: sum_resp, weighted_sum_sq
        integer :: j, n
        
        n = size(data)
        
        ! Update weights, means, and stdevs for each component
        do j = 1, mixture%n_components
            ! Sum of responsibilities for component j
            sum_resp = sum(resp(:, j))
            
            ! Update weight
            mixture%weights(j) = sum_resp / n
            
            ! Update mean
            if (sum_resp > 0.0_dp) then
                mixture%means(j) = sum(resp(:, j) * data) / sum_resp
            end if
            
            ! Update standard deviation
            if (sum_resp > 0.0_dp) then
                weighted_sum_sq = sum(resp(:, j) * (data - mixture%means(j))**2)
                mixture%stdevs(j) = sqrt(weighted_sum_sq / sum_resp)
                
                ! Avoid zero standard deviation
                if (mixture%stdevs(j) < 1.0e-6_dp) then
                    mixture%stdevs(j) = 1.0e-6_dp
                end if
            end if
        end do
        
        ! Normalize weights to ensure they sum to 1
        mixture%weights = mixture%weights / sum(mixture%weights)
    end subroutine m_step
    
    ! Display parameters in a formatted table
    subroutine display_parameters(mixture, title)
        type(normal_mixture_type), intent(in) :: mixture
        character(len=*), intent(in), optional :: title
        integer :: i
        
        if (present(title)) then
            print '(A)', trim(title)
            print '(A)', repeat('-', len_trim(title))
        end if
        
        print '(A5,A10,A15,A15)', 'Comp', 'Weight', 'Mean', 'Std Dev'
        print '(A)', repeat('-', 45)
        
        do i = 1, mixture%n_components
            print '(I5,F10.4,F15.6,F15.6)', i, mixture%weights(i), mixture%means(i), mixture%stdevs(i)
        end do
        
        print *, ''  ! Empty line
    end subroutine display_parameters
    
    ! Generate data from a mixture of normals
    subroutine generate_mixture_data(mixture, data)
        type(normal_mixture_type), intent(in) :: mixture
        real(kind=dp), intent(out) :: data(:)
        real(kind=dp) :: u, cumsum, x1, x2, z
        integer :: i, j, component
        
        do i = 1, size(data)
            ! Select component based on weights
            call random_number(u)
            cumsum = 0.0_dp
            component = mixture%n_components  ! Default to last component
            
            do j = 1, mixture%n_components
                cumsum = cumsum + mixture%weights(j)
                if (u <= cumsum) then
                    component = j
                    exit
                end if
            end do
            
            ! Generate normal random variable using Box-Muller transform
            call random_number(x1)
            call random_number(x2)
            z = sqrt(-2.0_dp * log(x1)) * cos(2.0_dp * PI * x2)
            
            ! Scale and shift to match component parameters
            data(i) = mixture%means(component) + mixture%stdevs(component) * z
        end do
    end subroutine generate_mixture_data
    
end module normal_mixture_em

! Main program to test the EM algorithm
program test_mixture_em
    use normal_mixture_em
    implicit none
    
    ! Parameters
    integer, parameter :: n_samples = 10**6
    integer, parameter :: n_components = 3
    integer, parameter :: max_iter = 100
    real(kind=dp), parameter :: tol = 1.0e-6_dp
    
    ! Variables
    type(normal_mixture_type) :: true_mixture, estimated_mixture
    real(kind=dp), allocatable :: data(:), loglik_history(:)
    integer :: i, seed_size
    integer, allocatable :: seed(:)
    
    ! Initialize random seed
    call random_seed(size=seed_size)
    allocate(seed(seed_size))
    seed = 123456789  ! Fixed seed for reproducibility
    call random_seed(put=seed)
    
    ! Print simulation parameters
    print '(A)', "Simulation Parameters"
    print '(A)', "---------------------"
    print '(A,I10)', "Number of observations:", n_samples
    print '(A,I6)', "Number of components:", n_components
    print '(A,I6)', "Maximum EM iterations:", max_iter
    print '(A,E10.2)', "Convergence tolerance:", tol
    print *, ""
    
    ! Initialize true mixture model
    call initialize_mixture(true_mixture, n_components)
    true_mixture%weights = [0.3_dp, 0.5_dp, 0.2_dp]
    true_mixture%means = [-5.0_dp, 2.0_dp, 8.0_dp]
    true_mixture%stdevs = [1.5_dp, 1.0_dp, 2.0_dp]
    
    ! Generate data from true mixture
    allocate(data(n_samples))
    call generate_mixture_data(true_mixture, data)
    
    ! Initialize estimated mixture model
    call initialize_mixture(estimated_mixture, n_components)
    
    ! Apply smart initialization before fitting
    call smart_initialize(estimated_mixture, data)
    
    ! Display initial guess before fitting
    call display_parameters(estimated_mixture, "Initial Parameter Guess")
    
    ! Fit the mixture model
    call fit_mixture(estimated_mixture, data, max_iter, tol, loglik_history)
    
    ! Display true and estimated parameters
    call display_parameters(true_mixture, "True Mixture Parameters")
    call display_parameters(estimated_mixture, "Estimated Mixture Parameters")
    
    ! Print log-likelihood history
    print '(A)', "Log-likelihood history (first 10 iterations):"
    do i = 1, min(10, size(loglik_history))
        print '(I5, 1X, F15.6)', i-1, loglik_history(i)  ! Use 1X for explicit space
    end do
    
end program test_mixture_em

with sample output

Simulation Parameters
---------------------
Number of observations:   1000000
Number of components:     3
Maximum EM iterations:   100
Convergence tolerance:  0.10E-05
 
Initial Parameter Guess
-----------------------
 Comp    Weight           Mean        Std Dev
---------------------------------------------
    1    0.3333      -5.690788       4.804305
    2    0.3333       1.757827       4.804305
    3    0.3333       9.206442       4.804305
 
True Mixture Parameters
-----------------------
 Comp    Weight           Mean        Std Dev
---------------------------------------------
    1    0.3000      -5.000000       1.500000
    2    0.5000       2.000000       1.000000
    3    0.2000       8.000000       2.000000
 
Estimated Mixture Parameters
----------------------------
 Comp    Weight           Mean        Std Dev
---------------------------------------------
    1    0.3002      -5.001405       1.500456
    2    0.4998       2.000873       0.998456
    3    0.2000       8.000607       2.001037
 
Log-likelihood history (first 10 iterations):
    0 -3201915.480510
    1 -2980430.735157
    2 -2977997.519079
    3 -2974250.857147
    4 -2967859.132988
    5 -2955794.663638
    6 -2930849.252435
    7 -2878688.517813
    8 -2797730.382526
    9 -2729296.550722

For this prompt Google Gemini 2.0 Flash Thinking Experimental (free) failed to create a working program, giving estimates of the variances that were much too low, even after the error was pointed out. For now I will primarily use Grok, DeepSeek, ChatGPT o3-mini-high, and Claude.

ETA: Google Gemini Advanced 2.0 Pro Experimental ($20/month) does solve the problem after one iteration, giving the code

module mixture_model

  implicit none

  integer, parameter :: dp = kind(1.0d0)

  type mixture_params
     integer :: n_components  ! Number of components
     real(kind=dp), allocatable :: weights(:)    ! Mixing weights
     real(kind=dp), allocatable :: means(:)      ! Means of components
     real(kind=dp), allocatable :: std_devs(:)  ! Standard deviations of components
  end type mixture_params

contains

  ! Subroutine to display mixture parameters
  subroutine display_mixture_params(params, title)
    type(mixture_params), intent(in) :: params
    character(len=*), intent(in) :: title

    integer :: k

    print *, title
    print *, "------------------------------------------"
    print *, "Component | Weight   | Mean     | Std. Dev."
    print *, "------------------------------------------"
    do k = 1, params%n_components
       print '(I9, 3F10.4)', k, params%weights(k), params%means(k), params%std_devs(k)
    end do
    print *, "------------------------------------------"
  end subroutine display_mixture_params


  ! Subroutine to fit a mixture of normals using EM
  subroutine fit_mixture_em(data, n_components, max_iterations, tolerance, estimated_params, initial_params, ll_log, print_ll_trace)
    real(kind=dp), intent(in) :: data(:)
    integer, intent(in) :: n_components, max_iterations
    real(kind=dp), intent(in) :: tolerance
    type(mixture_params), intent(out) :: estimated_params
    type(mixture_params), intent(in), optional :: initial_params
    real(kind=dp), allocatable, intent(out), optional :: ll_log(:)
    logical, intent(in), optional :: print_ll_trace  ! New: Option to print LL trace


    integer :: n_data, k, iter
    real(kind=dp), allocatable :: responsibilities(:,:), log_likelihoods(:)
    real(kind=dp) :: old_log_likelihood, new_log_likelihood, delta_log_likelihood
    real(kind=dp), allocatable :: component_pdfs(:,:)
    real(kind=dp), allocatable :: sum_component_pdfs(:)
    logical :: print_trace  ! Local variable for trace control

    n_data = size(data)

    ! Allocate arrays
    allocate(estimated_params%weights(n_components))
    allocate(estimated_params%means(n_components))
    allocate(estimated_params%std_devs(n_components))
    allocate(responsibilities(n_data, n_components))
    allocate(component_pdfs(n_data, n_components))
    allocate(sum_component_pdfs(n_data))

    if (present(ll_log)) then
      allocate(ll_log(max_iterations))
    endif
    allocate(log_likelihoods(max_iterations))

     ! Set print_trace based on optional argument
    print_trace = .false.  ! Default to not printing
    if (present(print_ll_trace)) then
        print_trace = print_ll_trace
    end if

    ! Initialization (either from initial_params or random)
    if (present(initial_params)) then
        ! Check dimensions
        if (initial_params%n_components /= n_components) then
            error stop "initial_params n_components mismatch"
        end if
        if (size(initial_params%weights) /= n_components .or. &
            size(initial_params%means) /= n_components .or. &
            size(initial_params%std_devs) /= n_components) then
            error stop "initial_params size mismatch"
        end if

        estimated_params%weights = initial_params%weights
        estimated_params%means = initial_params%means
        estimated_params%std_devs = initial_params%std_devs

    else ! Random initialization
       estimated_params%weights = 1.0_dp / n_components  ! Equal weights
       call random_number(estimated_params%means)
       estimated_params%means = minval(data) + (maxval(data) - minval(data)) * estimated_params%means ! Means within data range
       call random_number(estimated_params%std_devs)
       estimated_params%std_devs = 0.1_dp + estimated_params%std_devs * (maxval(data) - minval(data)) * 0.25_dp   ! Std. devs. reasonable scale
    endif
     estimated_params%n_components = n_components

    ! EM Algorithm
    old_log_likelihood = -huge(1.0_dp)
    do iter = 1, max_iterations

       ! E-step: Calculate responsibilities
       do k = 1, n_components
          component_pdfs(:, k) = estimated_params%weights(k) * normal_pdf(data, estimated_params%means(k), estimated_params%std_devs(k))
       end do

       ! Handle potential underflow in component_pdfs
        where (component_pdfs < tiny(1.0_dp)) component_pdfs = tiny(1.0_dp)

       ! Corrected responsibility calculation
        sum_component_pdfs = sum(component_pdfs, dim=2)
        do k = 1, n_components
            responsibilities(:, k) = component_pdfs(:, k) / sum_component_pdfs
        end do

       ! M-step: Update parameters
       do k = 1, n_components
          estimated_params%weights(k) = sum(responsibilities(:, k)) / n_data
          estimated_params%means(k) = sum(responsibilities(:, k) * data) / sum(responsibilities(:, k))
          estimated_params%std_devs(k) = sqrt(sum(responsibilities(:, k) * (data - estimated_params%means(k))**2) / sum(responsibilities(:, k)))

            ! Prevent std_devs from collapsing to zero.
          estimated_params%std_devs(k) = max(estimated_params%std_devs(k), 1.0d-6)
       end do

       ! Calculate log-likelihood
       new_log_likelihood = sum(log(sum(component_pdfs, dim=2)))
       log_likelihoods(iter) = new_log_likelihood

       if (present(ll_log)) then
          ll_log(iter) = new_log_likelihood
       endif

       ! Check for convergence
       delta_log_likelihood = new_log_likelihood - old_log_likelihood
       if (abs(delta_log_likelihood) < tolerance) then
          exit
       end if

         ! Print log-likelihood trace if requested
        if (print_trace) then
            print *, "Iteration ", iter, ": ", new_log_likelihood
        end if

       old_log_likelihood = new_log_likelihood
    end do

    ! Clean up allocations
    deallocate(responsibilities)
    deallocate(component_pdfs)
    deallocate(log_likelihoods)
    deallocate(sum_component_pdfs)

  contains
    ! Elemental function for the normal PDF
    elemental real(kind=dp) function normal_pdf(x, mu, sigma)
      real(kind=dp), intent(in) :: x, mu, sigma
      real(kind=dp), parameter :: pi = 4.0_dp * atan(1.0_dp)
      normal_pdf = (1.0_dp / (sigma * sqrt(2.0_dp * pi))) * exp(-0.5_dp * ((x - mu) / sigma)**2)
    end function normal_pdf

  end subroutine fit_mixture_em



  !  function to generate data from a mixture of normals
  subroutine generate_mixture_data(params, n_samples, data)
    type(mixture_params), intent(in) :: params
    integer, intent(in) :: n_samples
    real(kind=dp), intent(out) :: data(n_samples)

    integer :: i, k
    real(kind=dp) :: u, cum_weight
    real(kind=dp), allocatable :: cumulative_weights(:)

    allocate(cumulative_weights(params%n_components))
    cumulative_weights(1) = params%weights(1)
    do k = 2, params%n_components
       cumulative_weights(k) = cumulative_weights(k-1) + params%weights(k)
    end do

    do i = 1, n_samples
       call random_number(u)  ! Generate uniform random number [0, 1)
       k = 1
       do while (u > cumulative_weights(k) .and. k < params%n_components)
            k = k + 1
       end do
       call random_normal(data(i), params%means(k), params%std_devs(k))
    end do

    deallocate(cumulative_weights)

  contains
    !  function for generating a standard normal random variable
     subroutine random_normal(x, mu, sigma)
        real(kind=dp), intent(out) :: x
        real(kind=dp), intent(in) :: mu, sigma
        real(kind=dp) :: u1, u2, z

        call random_number(u1)
        call random_number(u2)

        ! Box-Muller transform
        z = sqrt(-2.0_dp * log(u1)) * cos(2.0_dp * 4.0_dp * atan(1.0_dp) * u2)

        x = mu + sigma * z
    end subroutine random_normal
  end subroutine generate_mixture_data

end module mixture_model


! Main program for testing
program test_mixture_fitting
  use mixture_model
  implicit none

  integer, parameter :: n_samples = 10**6
  integer, parameter :: n_components = 3
  integer, parameter :: max_iterations = 100
  real(kind=dp), parameter :: tolerance = 1.0e-6
  type(mixture_params) :: true_params, estimated_params, initial_params
  real(kind=dp), allocatable :: data(:)
    real(kind=dp), allocatable :: ll_log(:)
  integer :: seed(8)
  integer :: i
  logical :: print_ll_trace_flag  ! Flag to control LL trace printing

  ! Define true parameters
  allocate(true_params%weights(n_components))
  allocate(true_params%means(n_components))
  allocate(true_params%std_devs(n_components))
  true_params%n_components = n_components
  true_params%weights = [0.3_dp, 0.5_dp, 0.2_dp]
  true_params%means = [-2.0_dp, 2.0_dp, 5.0_dp]
  true_params%std_devs = [1.0_dp, 1.5_dp, 0.8_dp]

    ! Provide *some* initial parameters (not too far, not too close)
    allocate(initial_params%weights(n_components))
    allocate(initial_params%means(n_components))
    allocate(initial_params%std_devs(n_components))
    initial_params%n_components = n_components
    initial_params%weights = [0.4_dp, 0.3_dp, 0.3_dp]
    initial_params%means = [-1.0_dp, 3.0_dp, 6.0_dp]
    initial_params%std_devs = [1.2_dp, 1.7_dp, 0.9_dp]

  ! Generate data
  allocate(data(n_samples))

  ! Fixed seed for reproducibility.
  seed = [1234, 5678, 9012, 3456, 7890, 1234, 5678, 9012]
  call random_seed(put=seed)

  call generate_mixture_data(true_params, n_samples, data)


    ! Set the flag to control log-likelihood trace printing (change to .true. to enable)
    print_ll_trace_flag = .false.

  ! Fit the mixture model, passing the print_ll_trace_flag
    call fit_mixture_em(data, n_components, max_iterations, tolerance, estimated_params, &
         initial_params=initial_params, ll_log=ll_log, print_ll_trace=print_ll_trace_flag)

  ! Display results
    print *, "Initial Guess:"
    call display_mixture_params(initial_params, "Initial Parameters")
  call display_mixture_params(true_params, "True Parameters")
  call display_mixture_params(estimated_params, "Estimated Parameters")


    ! The log-likelihood trace is now controlled by the flag in fit_mixture_em

  ! Deallocate
  deallocate(true_params%weights)
  deallocate(true_params%means)
  deallocate(true_params%std_devs)
  deallocate(estimated_params%weights)
  deallocate(estimated_params%means)
  deallocate(estimated_params%std_devs)
  deallocate(data)
    if(allocated(ll_log)) deallocate(ll_log)

    deallocate(initial_params%weights)
    deallocate(initial_params%means)
    deallocate(initial_params%std_devs)

end program test_mixture_fitting

with sample output

 Initial Guess:
 Initial Parameters
 ------------------------------------------
 Component | Weight   | Mean     | Std. Dev.
 ------------------------------------------
        1    0.4000   -1.0000    1.2000
        2    0.3000    3.0000    1.7000
        3    0.3000    6.0000    0.9000
 ------------------------------------------
 True Parameters
 ------------------------------------------
 Component | Weight   | Mean     | Std. Dev.
 ------------------------------------------
        1    0.3000   -2.0000    1.0000
        2    0.5000    2.0000    1.5000
        3    0.2000    5.0000    0.8000
 ------------------------------------------
 Estimated Parameters
 ------------------------------------------
 Component | Weight   | Mean     | Std. Dev.
 ------------------------------------------
        1    0.3013   -1.9907    1.0057
        2    0.4972    2.0045    1.4884
        3    0.2015    4.9986    0.8012
 ------------------------------------------

I think Grok and DeepSeek are the best free Fortran coding tools.


So we stop asking for improvements once the answer matches what we expect. What happens if we pretend we are never satisfied?


Grok-3 is also very good at understanding physics. I write my notes down using MyST Markdown, attach my current file, and then ask for derivations and improvements. To me it feels like an average middle school student in terms of reasoning, but with solid Ph.D.-level knowledge of all of physics. It's a weird combination, but very powerful and already extremely useful.

I had some long-standing physics questions from 20 years ago but never had the occasion to sit down with an experienced colleague, and I was able to figure out a lot of the answers with Grok now. It already feels like magic, and it will only get better.


I have a naive question, perhaps off topic, but I am just curious.

Can Grok, ChatGPT, and similar LLMs really "understand" a problem and really "solve" it?
Or do they just more or less generate replies based on data that already exists on the internet?

I mean, for example, you ask one to write code to solve a problem. Most likely a similar problem has already been posted on the internet, and someone has already posted an answer. So do Grok, ChatGPT, etc. more or less just draw on that existing data to generate their answers?

Currently Grok, ChatGPT, and the like can help us solve existing problems quickly. But for genuinely new problems, it still takes a human to come up with innovative solutions, right? So humans need to keep posting new questions and new answers, and then the LLMs can continue to learn and evolve from them.

Do you think these tools will someday really have 'intelligence' and the ability to 'infer' like a human? Like, you teach one a first-principles or ab initio theory, say F = ma, and then it can really 'think' and solve 'new' problems based on its understanding of those first principles?

This was very close to my experience with Grok as well. I hadn't really used any of the AI chatbot tools before this, but Grok 3 really impressed me.

It seems very good at vector math and is extremely useful for brainstorming with an interactive, summarizing partner.

All of these tools can and do make mistakes in the minutiae, which is why I am still wary of trying to fully learn new things by interacting with them. But for topics you already know something about, where you just want someone to check your work, give opinions, or consider alternatives with you, Grok 3 is far better than my initial experiences with ChatGPT were.

It's crazy that this is free too, and only about 2.5 years after ChatGPT's meteoric rise to fame.


For me it solves physics problems for which I was not otherwise able to find any solution online, so it seems innovative to me, but whether it is truly "innovative" I don't know. The reasoning does not seem very advanced, roughly how an average middle school student would reason, but it is coupled with knowledge about almost everything. Yes, it does make simple mistakes. But overall it is already very useful to me.


My current understanding is: LLMs have developed high competency in narrow skills, e.g. writing prose, coding, math derivations, conceptual understanding of physics, etc. For each of these, I think it's fair to say that the narrow skill is developed as an extrapolation of existing training data. However, I also believe that emergent patterns of intelligence or reasoning arise when these narrow skills are combined to solve a new complex task.

My experience over the past 2 years of actively using AI in my work is similar to Ondrej's, although by now I would promote the reasoning level to that of an advanced undergraduate student or an entry-level graduate student.

I also find that models that use a chain-of-thought approach, that is, thinking through and breaking down the problem and then feeding that back to themselves as input, achieve superior results in reasoning and problem solving compared to one-shot answers.


Grok and other LLMs are good at translating Python/NumPy code to Fortran, but one thing they stumble over is the case-insensitivity of Fortran. Python code can have distinct variables T and t, unlike Fortran. If you have numerical Python code that you may want to translate to Fortran some day, I suggest checking that no two variables in the same scope differ only in case (a Fortran illustration of the clash appears after the script's output below). A toy script to detect such code is

import ast

class CaseInsensitiveVarChecker(ast.NodeVisitor):
    def __init__(self):
        self.scopes = [{}]

    def visit_FunctionDef(self, node):
        # Enter a new scope for the function
        self.scopes.append({})
        self.generic_visit(node)
        self.scopes.pop()

    def visit_Name(self, node):
        if isinstance(node.ctx, (ast.Store)):  # Variable assignment
            name = node.id
            current_scope = self.scopes[-1]
            # Check for case-insensitive matches
            for existing_name in current_scope:
                if existing_name.lower() == name.lower() and existing_name != name:
                    print(f"Warning: '{name}' differs only in case from '{existing_name}' "
                          f"at line {node.lineno}")
            current_scope[name] = node.lineno
        self.generic_visit(node)

def check_file(filename):
    with open(filename, 'r') as file:
        code = file.read()
    tree = ast.parse(code)
    checker = CaseInsensitiveVarChecker()
    checker.visit(tree)

# Example usage
code = """
def example():
    T = 10
    t = 20
    print(T, t)

x = 5
X = 10
"""
with open("test.py", "w") as f:
    f.write(code)

check_file("test.py")

giving output

Warning: 't' differs only in case from 'T' at line 4
Warning: 'X' differs only in case from 'x' at line 8
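
On the Fortran side the same pair of names collides at compile time, because T and t refer to the same identifier. A minimal illustration of the clash the script is meant to catch (this small program is my own example, not output of any LLM):

program case_clash
    implicit none
    real :: T
    real :: t   ! rejected: Fortran is case-insensitive, so this re-declares T
    T = 10.0
    t = 20.0
    print *, T, t
end program case_clash

A conforming compiler refuses the second declaration as a duplicate, which is exactly the situation a naive line-by-line translation of the Python example would run into.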

ChatGPT is prone to generating invalid Fortran code with declarations interspersed with executable code (and not using the block construct), so I use it less now. Some tips for using LLMs for coding are universal (feed the LLM error messages to have it fix them), but some are Fortran-specific and should be gathered in one place. A few are

(1) Tell LLMs to declare real variables as real(kind=dp), with dp imported from module kind_mod. Otherwise they are prone to using default reals.
(2) Tell them to put procedures in modules.
(3) Tell them to import common procedures from specified modules. For example, if you have random_normal in module random_mod, say so, so that the LLM won't recreate it. (A sketch of what such modules might look like follows this list.)
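
As a rough sketch of what (1)-(3) might look like in practice (the module bodies below are just an illustration I am supplying here, not the actual kind_mod or random_mod referred to above):

module kind_mod
    implicit none
    private
    public :: dp
    integer, parameter :: dp = kind(1.0d0)   ! double precision kind used throughout
end module kind_mod

module random_mod
    use kind_mod, only: dp
    implicit none
    private
    public :: random_normal
contains
    ! Standard normal deviate via the Box-Muller transform (illustrative only)
    subroutine random_normal(x)
        real(kind=dp), intent(out) :: x
        real(kind=dp) :: u1, u2
        call random_number(u1)
        call random_number(u2)
        u1 = max(u1, tiny(1.0_dp))                        ! guard against log(0)
        x = sqrt(-2.0_dp*log(u1)) * cos(8.0_dp*atan(1.0_dp)*u2)
    end subroutine random_normal
end module random_mod

! Code requested from the LLM would then be written in this style,
! importing dp and random_normal instead of re-deriving them:
module sim_mod
    use kind_mod,   only: dp
    use random_mod, only: random_normal
    implicit none
contains
    subroutine simulate(z)
        real(kind=dp), intent(out) :: z(:)
        integer :: i
        do i = 1, size(z)
            call random_normal(z(i))
        end do
    end subroutine simulate
end module sim_mod

Point (2) is also what makes the assumed-shape dummy argument in simulate safe to call, since the module provides an explicit interface.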


I used "middle school" as a lower bound. 🙂 The worst mistakes I see in its reasoning are ones an average high school student wouldn't make: say, it makes a simple mistake in a small matrix multiplication, you correct it, and it does it again. But at the same time the reasoning is sometimes really advanced (even Ph.D. level for sure!). So I think of Grok-3 (arguably the best model right now) as a middle school student assistant with encyclopedic knowledge, and then I am never disappointed. It's wild, what a time we live in!


Indeed, now that you say that, I realize that even for state-of-the-art models there is high variability in the accuracy of answers given by the same model. A very advanced LLM will occasionally make and repeat a really silly mistake (more typical of young students) and then immediately go back to very advanced and insightful responses akin to a professional philosopher.


Indeed, exactly. The lower bound is middle school, the upper bound is the most educated and intelligent person I have ever met. I've seen LLMs hit both ends! And I think you are right that on average it actually is at grad-school level, at least in physics, which I can judge.

Pick your favourite Fortran-aware LLM and ask it to write a minimal test for calling a function that takes a deferred-length character array with an actual argument that is a deferred-length character array component of a derived type.
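
For reference, a minimal sketch of the kind of test being described might look like the following (the module, type, and procedure names here are made up; the key point is that a deferred-length character dummy must have the allocatable or pointer attribute, and the actual argument must match):

module labels_mod
    implicit none
    type :: holder_type
        character(len=:), allocatable :: labels(:)   ! deferred-length character array component
    end type holder_type
contains
    ! Function taking a deferred-length character array dummy argument
    function count_nonblank(labels) result(n)
        character(len=:), allocatable, intent(in) :: labels(:)
        integer :: n
        integer :: i
        n = 0
        do i = 1, size(labels)
            if (len_trim(labels(i)) > 0) n = n + 1
        end do
    end function count_nonblank
end module labels_mod

program test_deferred_char
    use labels_mod
    implicit none
    type(holder_type) :: h
    ! Automatic allocation on assignment sets both the length (5) and the shape (3)
    h%labels = [character(len=5) :: "alpha", "beta", ""]
    print *, count_nonblank(h%labels)   ! expected output: 2
end program test_deferred_char

Confusing this with an assumed-length dummy, character(len=*), is exactly the kind of mistake described in the next paragraph.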

My first experience with one, just now, is that it is incapable of saying "I don't know", cannot distinguish between assumed-length and deferred-length, or between scalars and arrays, lies about what is and is not allowed in Fortran, changes the subject, introduces DO CONCURRENT for no reason at all, etc.

I fully endorse Rocco Petrone's approach to this (see Rocco Petrone - Wikipedia).