Reading File knowing certain variables

jorvz · November 8, 2022, 7:49pm

Hey! I have a question about reading .dat ou .txt files. Let`s say i have a file containing a table just like the one that follows, in the code i define de temperature (column 0) and the chemical percentage (line 0), knowing that, for example, i define T=300 and C=0.40 so i need the code to read the line from 300 and column from 0.40 and return the value 23.4588. Can someone help me with that?

Arjen · November 8, 2022, 7:55pm

Do you know how many columns there are? That would make it a bit easier. Or a maximum number?
My plan would be:

Read the first line and identify the column that holds the percentage 0.40.
Then read the following lines at least up to and including that column, until the temperature, the first number on the line is the one I need.
That gives you exactly the entry.

However, another possibility is to read in the entire table. IIRC, the stdlib has some routines for that. That would be beneficial if you need more than one entry.

everythingfunctional · November 8, 2022, 7:57pm

This sounds like a job for pandas. Of course if you really need to do it in Fortran, it’s going to be tricky. The question you really have to ask is “what if you ask for the value at 299.9999 or 0.40001?” What about at 299.01? or 0.351? Do you interpolate? Take the nearest entry?

I’ve had delusions of implementing something like pandas in Fortran, but really don’t have the time to do it for free. If you’re interested, I do offer consulting/contract work to help with your specific problem.

JohnCampbell · November 9, 2022, 3:40am

It should not be too hard to read the table into an array to perform simple interpolation:

get estimates of number of C values (columns) and number of T values (rows)
read the table into an array.
use linear interpolation to generate a column for C = required value
use linear interpolation on this interpolated column for T = required value.

As I have no understanding of the values that are tabulated, so to better understand the deficiencies in a bilinear interpolation approach, I would also use Excel to plot the surface.
A first difference table (dv/dT or dv/dC ) would help to identify if linear interpolation is suitable if the first derivitives are uniform.

Based on 2D Finite Element derivations, for a rectangular grid, it can be fairly easy to do 4-point linear interpolation using lagrange shape functions or even 9-point quadratic interpolation. It is surprising how straight forward this approach can be using Fortran, providing a clear definition with minimal debugging.

jorvz · November 10, 2022, 1:16am

i’ve tried this
open(10,file='muHA.dat') \\ read(10,*)Tmu,(cm(i),i=1,11) \\ do 128 jt=1,269 \\ read(10,*)Tmu,(amur(i),i=1,11) \\ 128 if(abs(Tmu-T0)/T0 < 1.d-8) go to 111 \\ print*, 'wrong T0' \\ stop \\ 111 do 133 j=1,11 \\ 133 if(abs(cm(j)-c0)<1.d-8)amu=amur(j)*1.d-6 \\ close(10) \
But, it is returning the error:

`63 | read(10,*)Tmu,(cm(i),i=1,11)
| 1
Error: Expecting variable in READ statement at (1)
shockHAp.for:65:22:

65 | read(10,*)Tmu,(amur(i),i=1,11)
| 1
Error: Expecting variable in READ statement at (1)
`

JohnCampbell · November 10, 2022, 2:10am

It is difficult to understand your problem from the part of the code you posted.
As you have not provided muHA.dat, the alternative code below compiles with no compile errors, but has not been tested.

Rather than (as discussed in other posts) interpolate the best estimate of amur, the program intends to find the closest value for amur listed in the table provided, based on the values of cm and Tmu that are listed. This would also overcome any round-off problems with the required real values when you are testing for cm and Tmu to be read within 1.d-8 of the required values.
Your coded method of looking for exact T0,C0 to within 1.d-8 is problematic if 1) the values are not provided and 2) if the values provided are not exactly T0 and C0 as a real(wp) value.

  integer, parameter :: wp = selected_real_kind (10)
  real(wp), allocatable :: cm(:), Tmu(:), amur(:,:)
  real(wp) :: T0, C0
  integer :: ncm, ntmu, kc, kt, i, j, iostat

  ncm  = 11
  ntmu = 269
  allocate ( cm(ncm), Tmu(ntmu), amur(ncm,ntmu) )

  T0 = 300.0_wp
  c0 = 0.4_wp

  kc   = 1   !  closest cm(i)   for c0
  kt   = 1   !  closest Tmu(jt) for T0

  open(10,file='muHA.dat') 

  read (10,*, iostat = iostat) Tmu(1), (cm(i),i=1,ncm) 
  if ( iostat /= 0 ) then
    write (*,*) ' problem reading cm from muHA.dat'
    stop
  end if
  
  do j = 1,ntmu
    read (10,*, iostat = iostat) Tmu(j), (amur(i,j),i=1,ncm) 
     if ( iostat /= 0 ) then
       write (*,*) ' problem reading amur from muHA.dat at record',j
       stop
     end if
    if ( abs(Tmu(j)-T0) < abs(Tmu(kt)-T0) ) kt = j
  end do
  ntmu = j-1
  close (10)

  if ( j <= ntmu ) then
    print*, 'unable to read table' 
    stop 
  end if

  do i=1,ncm
    if ( abs(cm(i)-c0) < abs (cm(kc)-c0) ) kc = i
  end do

  write (*,*) 'Closest value for'
  write (*,*) ' C0 =',c0,' is', cm(kc)
  write (*,*) ' T0 =',t0,' is', Tmu(kt)
  write (*,*) ' amur =',amur(kc,kt)

  end

A next step could be to interpolate the best value about the point kc,kt in the supplied table.

urbanjost · November 10, 2022, 3:28am

Are you just going to read the one input file, or do you need a generic method of reading different input files? How big are the files, if so? Several posts have referenced how unreliable it is to look for an exact match with floating point values, but there are several procedures for reading basic tables from a file
available.

Is this something just quick and dirty? Do you use fpm?
If you do, enter

fpm new proto
cd proto

and put the following fortran file in app/main.f90
and the datafile into “inputfile” in the top of the project directory
and add the following to the fpm.toml file:

[dependencies]
M_io = { git = "https://github.com/urbanjost/M_io" }

program main
use M_io,  only : read_table
implicit none
real,allocatable :: array(:,:)
integer :: i,j, ierr
   ! read file as a table
   call read_table('inputfile',array,ierr)
   ! print values
   ! show some attributes of the array read in
   write(*,*)'size=       ',size(array)
   write(*,*)'size(dim=1)=',size(array,dim=1)
   write(*,*)'size(dim=2)=',size(array,dim=2)
   ! show the entire array
   do i=1,size(array,dim=1)
      write(*,*)array(i,:)
   enddo
   ! give nice names to the subarrays of interest
   associate (x=>array(1,2:),y=>array(2:,1),z=>array(2:,2:) )
      ! show your subarrays
      write(*,*)'X=',x
      write(*,*)'Y=',y
      write(*,*)'Z='
      do i=1,size(z,dim=1)
         write(*,*)z(i,:)
      enddo
      ! find value (REALLY do this more robustly)
      i=findloc(y,30.0,dim=1)
      j=findloc(x,0.40,dim=1)
      write(*,*)'location=',i,j
      write(*,*)z(i,j)
   end associate

end program main

sample “inputfile”

0   .1           .2           .3           .4           .5
5   20526.3716   14572.9521   27718.159    22593.21442  27145.10622
10  32401.4931   5641.11114   27078.18790  19103.3273   5604.23500
15  855.775      20600.1222   25916.23962  25393.20786  23221.16154
20  29004.23590  20097.9097   12671.16411  17876.26376  30291.635
25  8187.20373   22478.21666  1628.3514    19902.32148  8906.4292
30  16514.10185  32576.30782  23101.1582   21897.16568  8106.30035
35  6549.6536    26006.25288  19449.24701  14215.14241  21347.9012

then run

fpm run

you should get

 size=                 48
 size(dim=1)=           8
 size(dim=2)=           6
   0.00000000      0.100000001      0.200000003      0.300000012      0.400000006      0.500000000    
   5.00000000       20526.3711       14572.9521       27718.1582       22593.2148       27145.0996    
   10.0000000       32401.4922       5641.11133       27078.1875       19103.3281       5604.22998    
   15.0000000       855.775024       20600.1230       25916.2402       25393.2070       23221.0996    
   20.0000000       29004.2363       20097.9102       12671.1641       17876.2637       30291.5996    
   25.0000000       8187.20361       22478.2168       1628.35144       19902.3223       8906.41992    
   30.0000000       16514.1016       32576.3086       23101.1582       21897.1660       8106.29980    
   35.0000000       6549.65381       26006.2520       19449.2461       14215.1426       21347.9004    
 X=  0.100000001      0.200000003      0.300000012      0.400000006      0.500000000    
 Y=   5.00000000       10.0000000       15.0000000       20.0000000       25.0000000       30.0000000       35.0000000    
 Z=
   20526.3711       14572.9521       27718.1582       22593.2148       27145.0996    
   32401.4922       5641.11133       27078.1875       19103.3281       5604.22998    
   855.775024       20600.1230       25916.2402       25393.2070       23221.0996    
   29004.2363       20097.9102       12671.1641       17876.2637       30291.5996    
   8187.20361       22478.2168       1628.35144       19902.3223       8906.41992    
   16514.1016       32576.3086       23101.1582       21897.1660       8106.29980    
   6549.65381       26006.2520       19449.2461       14215.1426       21347.9004    
 location=           6           4
   21897.1660

The readtable() routine slurps a file into an array in memory.
It would not be a good idea to read very large files with it;
but it is a quick and dirty procedure for doing what you want, I think

then you would REALLY want to replace the FINDLOC() calls with
something that interpolates on your Z array, as FINDLOC() is
quite risky, as it would have to find exact values (not making
a check in the example it found a value which it should do, but you want to
do something fancier anyway).

If this is not something small you are doing quick and dirty
you want to do the reading of the file yourself. If the input
file is large, reading it all into memory has major issues; and
you would need to use something else, like an indexed direct
access file.

It would probably be a nice option on the intrinsic FINDLOC to have options to find a value within
a certain error range, or the nearest value, or to interpolate a value making some assumptions about
the array. Not too hard for a vector, and you could allow different interpolation options like linear, spline,
polynomial, and so on; gets tricky very quickly with generic multi-dimensional arrays though.

JohnCampbell · November 10, 2022, 5:53am

I took @urbanjost 's “inputfile”,
prefixed it with ncolumn and nrowdata ,
appended C0,T0 values list.
There is considerable non-linearity in the surface generated, especially near cm=0.1, tmu=15. which shows that linear interpolation is poor for the saddles (see below)

However, I then included a bi-linerar interpolation estimate to the program, based on the 4 surounding table values and provided an estimate.
Estimates near the identified saddle show significant variation are probably not adequate, but the program as presented does achieve values, hopefully better than the previous nearest table samples.

The file ‘muHA.dat’ is

 5 7
 0   .1           .2           .3           .4           .5
 5   20526.3716   14572.9521   27718.159    22593.21442  27145.10622
 10  32401.4931   5641.11114   27078.18790  19103.3273   5604.23500
 15  855.775      20600.1222   25916.23962  25393.20786  23221.16154
 20  29004.23590  20097.9097   12671.16411  17876.26376  30291.635
 25  8187.20373   22478.21666  1628.3514    19902.32148  8906.4292
 30  16514.10185  32576.30782  23101.1582   21897.16568  8106.30035
 35  6549.6536    26006.25288  19449.24701  14215.14241  21347.9012
 .2 25
 .25 25
 .25 27
 .12 14.
 .12 15.
 .12 16.
 -1 -1

The program, including a bilinear interpolation is now:

  implicit none
  integer, parameter :: wp = selected_real_kind (10)
  real(wp), external :: get_linear_value
  real(wp), allocatable :: cm(:), Tmu(:), amur(:,:)
  real(wp) :: T0, C0, val_amur
  integer :: ncm, ntmu, kc, kt, i, j, iostat

  ncm  = 11
  ntmu = 269

  T0 = 300.0_wp
  c0 = 0.4_wp

  kc   = 1   !  closest cm(i)   for c0
  kt   = 1   !  closest Tmu(jt) for T0

  open(10,file='muHA.dat') 

  read (10,*) ncm, ntmu
  write (*,*) ncm, ' columns of cm values'
  write (*,*) ntmu,' rows of Tmu values'
  
  allocate ( cm(ncm), Tmu(ntmu), amur(ncm,ntmu) )

  read (10,*, iostat = iostat) Tmu(1), (cm(i),i=1,ncm)
  if ( iostat /= 0 ) then
    write (*,*) ' problem reading cm from muHA.dat'
    stop
  end if
  
  do j = 1,ntmu
    read (10,*, iostat = iostat) Tmu(j), (amur(i,j),i=1,ncm) 
     if ( iostat /= 0 ) then
       write (*,*) ' problem reading amur from muHA.dat at record',j
       exit
     end if
  end do

  if ( j <= ntmu ) then
    print*, 'unable to read table' 
    stop 
  else
    write (*,*) ntmu,' rows of amur values read'
  end if

  do
     read (10,*,iostat=iostat ) C0,T0
      if ( iostat /= 0 ) exit
      write (*,*) ' '
      write (*,*) 'Testing for C0=',C0,'  T0=',T0
     
     do i=1,ncm
       if ( abs(cm(i)-c0) < abs (cm(kc)-c0) ) kc = i
     end do

     do j = 1,ntmu
       if ( abs(Tmu(j)-T0) < abs(Tmu(kt)-T0) ) kt = j
     end do

     write (*,*) 'Closest value for'
     write (*,*) ' C0 =',c0,' is', cm(kc)
     write (*,*) ' T0 =',t0,' is', Tmu(kt)
     write (*,*) ' amur in table =',amur(kc,kt)

     val_amur = get_linear_value ( C0, T0, cm, Tmu, amur, ncm, ntmu )

     write (*,*) ' linear interpolation amur =',val_amur
  end do   
  close (10)

  end

  function get_linear_value ( C0, T0, cm, Tmu, amur, ncm, ntm )

!  does bi-linear interpolation
!  assumes cm and Tmu are ordered; reports error if out of order
!  if C0 or T0 are out of range; limits to edge of table

  implicit none
  integer, parameter :: wp = selected_real_kind (10)
  real(wp) :: get_linear_value
  integer  :: ncm, ntm
  real(wp) :: cm(ncm), Tmu(ntm), amur(ncm, ntm)
  real(wp) :: T0, C0
!
  integer  :: kc, kt
  real(wp) :: fc, ft

  call get_linear_factor ( fc, kc, C0, cm,  ncm, 'cm' )
   if ( kc < 1 ) goto 901

  call get_linear_factor ( ft, kt, T0, Tmu, ntm, 'Tmu' )
   if ( kt < 1 ) goto 901

   write (*,*) 'interpolation factors :',kc,kt, fc,ft
!
    get_linear_value = amur(kc,  kt)   * (1-fc)*(1-ft)  &
                     + amur(kc+1,kt)   *    fc *(1-ft)  &
                     + amur(kc,  kt+1) * (1-fc)*   ft   &
                     + amur(kc+1,kt+1) *    fc *   ft  
    return

 901 get_linear_value = -999.
  end function get_linear_value

  subroutine get_linear_factor ( fc, kc, C0, cm, ncm, vn )

  implicit none
  integer, parameter :: wp = selected_real_kind (10)
  integer   :: kc, ncm
  real(wp)  :: fc, C0, cm(ncm), dx
  character :: vn*(*)
  
    do kc = 1,ncm-1
      dx = cm(kc+1)-cm(kc)
      if ( dx == 0 ) cycle     !  step in values so no inetrpolation
      fc = (C0-cm(kc)) / dx
      if ( dx <  0 ) goto 901  !  CM out of order
      if ( fc >= 0 .and. fc <= 1 ) exit  
    end do

    if ( kc < ncm ) return

    if ( C0 < cm(1) ) then
      kc = 1
      fc = 0
      return
    else if ( C0 > cm(ncm) ) then
      kc = ncm-1
      fc = 1
      return
    else
      goto 902
    end if

901 write (*,*) ' ERROR ',vn,' list is out of order'
    kc = -1
    return

902 write (*,*) ' ERROR ',vn,' target ',C0,' is out of range',cm(1),cm(ncm)
    kc = -1
    return
  
  end subroutine get_linear_factor

This bilinear interpolation is fairly straightfoward; identifying the 4-point enclosing cell and factors.
The next step could be to introduce 9-point quadratic or 16-point cubic interpolation.
The original table posted also had non uniform column and row spacing for cm and Tmu, which is another issue which quadratic or cubic interpolation might ignore.
At this stage, I think it may be preferable to better understand the field behaviour or increase the table resolution.

PierU · November 10, 2022, 9:15am

minloc(abs(array(:)-value)) (and possibly a tolerance test afterwards).

urbanjost · November 10, 2022, 1:34pm

Especially ssuming x and y are increasing monotonically and the requested value is close to a
table value the MINLOC call is much more robust. Using the M_io procedure the entire
program sans the sample output would then be

program main
use M_io,  only : read_table
implicit none
real,allocatable :: array(:,:), xp, yp
integer :: i,j, ierr
   xp=0.40
   yp=30.0
   call read_table('inputfile',array,ierr)
   associate (x=>array(1,2:),y=>array(2:,1),z=>array(2:,2:) )
      j=minloc(abs(x(:)-xp),dim=1) 
      i=minloc(abs(y(:)-yp),dim=1)
      write(*,*)z(i,j)
   end associate
end program main

That is so much nicer I should probably change the first example of read_table.

urbanjost · November 10, 2022, 1:38pm

Nice. I have some steam property tables I cannot unfortunately post I wish I had used as a sample just to see that refined a representation of.

msz59 · November 10, 2022, 4:20pm

For all that would like to experiment with the OP data, their table obtained from the image using tesseract image.png table --psm 6. No guarantee it has no errors, though (I have spotted one in the first line - 0.99 instead of 0.90). Also, it end on line starting with 400.
table.txt (3.6 KB)

JohnCampbell · November 11, 2022, 1:09am

@msz59
Thanks for the data. Plotting this via Excel, although a primitive graphics package, does show data that is more suitable for linear interpolation, although this presentation does not show the varying delta_Tmu or delta_cm. (Tmu = 298.15 does look unusual in the chart but ok in the data table)

Based on this, it still appears reasonable to use linear interpolation, rather than nearest to the pin.

Arjen · November 13, 2022, 4:16pm

I used this weekend to concoct the humble beginnings of something like that. The module I have now allows:

to read a CSV file (given the separation character) - just the file name is required. And the file should contain a header line for the column names and only real data.
A filter routine that allows you to select data that adhere to a simple condition, like the data in column 1 are between 1 and 2.

So nothing fancy or even likely to be useful at this point, but then including comments and a small test program it is 416 lines of code so far.

Topic		Replies	Views
Apply Excel LOOKUP in Fortran Help	20	503	February 23, 2025
Reading data into direct access files Help	39	2972	October 21, 2021
Reading a matrix of unknown size	14	3107	February 13, 2021
Best way to read only last two real values in a line using read in Fortran Help	26	4169	July 15, 2022
Advent of Code 2024 Announcements	37	1068	December 27, 2024

Reading File knowing certain variables

Related topics