Programming challenge (19th century writers)

Thanks @epagone for the interesting perspective. I didn’t notice it before, but indeed the table and the operations that can be performed are reminiscent of a spreadsheet or dataframe.

I must admit, I don’t really see how a Fortran dataframe could be superior to what is available in R, Pandas, or Excel. These tools allow interactive usage, i.e. you display some data, evaluate, modify your commands, and reprint. Essentially you work with a REPL (in case of Excel a GUI).

Currently, a Fortran program is compiled, meaning unless you write a full blown expression parser, you can only support a very limited set of operations. Sure it can be nice to take a peak at some data on the fly, but I think the other available tools are far superior to what can be done in Fortran. Undertaking a project like this feels to me like re-inventing the wheel. But I can imagine with LFortran providing a REPL, a dataframe will begin to make sense.

Does the behavior match in both cases, if you write the indexes to an array first, instead of providing them directly as an expression?

I have many Fortran programs that work with daily times series of stock prices or returns. Instead of passing around separately dates(:), returns(:,:), stock_symbols(:) to procedures and checking in each procedure whether they are dimensioned consistently I use a derived type

type :: date_frame_full
   character (len=1000)                 :: title = ""
   type(date_mdy)         , allocatable :: dates(:)  ! (nobs)
   character (len=len_sym), allocatable :: sym(:)    ! (nvar)
   real(kind=dp)          , allocatable :: xx(:,:)   ! (nobs,nvar)
end type date_frame_full

for which I have defined

many procedures
public :: date_frame_full,set_date_frame,read_date_frame,num_var,num_obs, &
          num_obs_total,first_date,last_date,compute_returns,write_date_frame,bad_real, &
          date_frame_of_file,first_year,resample_period, &
          weighted_sum,allocate_from_df,random_df,lag,operator(-),operator(+),mult,select_var, &
          conform,write_date_frame_stream,alloc_df,read_date_frame_stream, &
          serialize_date_frame,select_obs,shape_df,reading_time, &
          print_shapes,df_product,df_rows_sumproduct,add_df,df_skip_initial_obs,display, &
          display_tail,display_head_tail,compute_returns_with_dividends,dates_pos,var_pos,project_dates, &
          project_var,project_dates_var,obs_since,cumul_product,operator(/), &
          operator(*),project_date_frame,date_range_string,demean_by_var,first_few_var,pos_date,dates_df_vec, &
          subset,add_var_vec,add_dividend_adjusted_prices,add_returns,fill_data,subset_obs_bounds,operator(**), &
          abs,subset_sym

I have found it useful and could clean it up and publish it if there is interest. One thing I have wondered is how to create a similar derived type

type :: date_frame_seconds ! time measured to integer seconds
   character (len=1000)                 :: title = ""
   integer                , allocatable :: times(:)  ! (nobs) -- seconds after midnight
   character (len=len_sym), allocatable :: sym(:)    ! (nvar)
   real(kind=dp)          , allocatable :: xx(:,:)   ! (nobs,nvar)
end type date_frame_seconds

without excessive code duplication. Such derived types could also be used for time series in climatology and other domains.

1 Like

I reported the definite bugs; but the remaining one is just me, I think. Apparently something I thought was equivalent is not, as I get the same error with three compilers. I will figure it out when I have time. The exercise has been fun and productive, but I am apparently having a brain burp. I thought both of these sorts would work. I know it will be something simple but I am not seeing it:

program main
! bug: ifort calls sorti(1) twice for each invocation
implicit none
integer,parameter            :: isz=100
real                         :: rr(isz)
integer                      :: ii(isz), jj(isz)
integer                      :: i
character(len=*),parameter :: x='(*(g0,1x))'
   !write(*,*)'HUGE=',huge(0) !  HUGE=  2147483647
   call doboth( [100,100,500,300,100,400,500,200] )
   call doboth( [0,0,0,0,0] )
   call doboth( [100,500,300,400,200] )
   call doboth( [100,500,-300,0,400,200] )
   call doboth( [600,300,200,100,0,-100,-200,-300] )
   write(*,x)repeat('=',80)
   write(*,*)'initializing array with ',isz,' random numbers'
   CALL RANDOM_NUMBER(RR)
   jj=rr*450000.0
   ! use the index array to actually move the input array into a sorted order
   ii=jj
   ii=ii(sorti(ii))
   write(*,*)'checking if values are sorted(3f)'
   do i=1,isz-1
      if(ii(i).gt.ii(i+1))then
         write(*,*)'Error in sorting reals small to large ',i,ii(i),ii(i+1)
      endif
   enddo
   write(*,*)'test of sorti(3f) complete'

   ii=jj
   ii(sorti(ii))=ii
   write(*,*)'checking if values are sorted(3f)'
   do i=1,isz-1
      if(ii(i).gt.ii(i+1))then
         write(*,*)'Error in sorting reals small to large ',i,ii(i),ii(i+1)
      endif
   enddo
   write(*,*)'test of sorti(3f) complete'
contains
subroutine doboth(ints)
integer :: ints(:)
integer,allocatable :: iii(:)
   iii=ints
   write(*,x)repeat('=',80)
   write(*,x)'INPUT',iii
   iii(sorti(iii))=iii
   write(*,x)'LHS  ',iii
   iii=iii(sorti(iii))
   write(*,x)'RHS  ',iii
end subroutine doboth
function sorti(ints) result(counts)
integer :: ints(:)
integer :: counts(size(ints)), i
integer, save :: calls=0
   ! WARNING: this sort is slow as mollasses, but if fun to write in one line
   ! relative sort first            take account of duplicates
   counts=[(count(ints(i) > ints) + count(ints(i) == ints(:i)), i=1,size(ints) )]
   calls=calls+1
   write(*,x)'CALLS',CALLS,'VALUES',counts
end function sorti
end program

I added a few things that push modern features and even though I was totally content with using fpm(1) dependencies I made a stand-alone version in app/main.f90 if any non-fpm users want to start with my version (which does not use the logic above with the bug) in

if anyone else wants to play. Ignore the bug.f90 and blocky.f90 files. So good for now (unless someone wants to explain why my first sort in above does not work :>).

There are some interesting ideas discussed above that would be interesting to implement. I am curious to compare the app/main.f90 version to some of the C++ versions if any classmates are brazen enough to post in hostile terroritory :smile:

So several bug reports, and the M_random and chal github repositories and some fun posting later I am taking a break from worrying about it for now, but I learned I did not know something I thought I did, which is always useful. I originally just thought it was a good puzzle to play with while I drank a cup of coffee.

PS: concerning some of the comments regarding something like a spreadsheet or shell in Fortran mentioned above:

I have used a program for years (as have several other hundred people) that is a fortran shell with a full expression parser, graphics, steam tables, integration, differentiation, several command styles including a purely functional one (even IF is a function, but it can execute named blocks of code as well as expressions) that I still prefer over gnuplot, R, MATLAB, and so on. Even today a big advantage (it has not changed radically since the early 90’s) is that you can load Fortran functions into it directly without creating bindings; and the people that use it have a LOT of those, and the data formats (binary and text) that it can read are very simple for Fortran programs to write, so Fortran is a totally adequate language for that type of stuff, it’s just that most places have not maintained or developed them. Some other old tools are still around and have an impressive amount of functionality, I also used a nearly complete FORTRAN77 interpreter interactively for years, so I am not such a hater of implicit typing as some are. Interactively, it is quite handy. Unfortunately, that is gone (I think). I still don’t know anything else where I can say to take the output of two runs and produce deltas on temperature versus pressure tables, and anywhere where the delta is > a threshhold make automatically labeled plots showing the curves to the left axis and the delta to right axis with a single command, and play back everything including mouse selections from a journal file (which can also be used to verify exactly what operations were performed on the data) in any of those other tools (and pedigree is critical to me) so I don’t think I am going to switch anytime soon.

Sounds interesting – I assume it is proprietary. Can parts of it be recreated in LFortran? Even if the code or an executable cannot be released, releasing the documentation could be beneficial. Sometimes I look at the documentation Matlab, NAG, IMSL, and other commercial software for algorithms and references.

It is interesting, at least to me, but I and others have tried several times and been repeatedly told no. The company involved is not in it’s prime and there has not been a new release for nearly six years now, which is sad to me. The syntax is Unix-like (actually, AEGIS-like, but that is a long story – we had never used Unix when it was started) so a lot of people familiar with Unix/GNU-Linux find it easy to use, and like other shells any command not built in is passed to the OS so it feels like you are in a typical shell when using it. You can use “cd” and “ls” and so on, for example (well, cd is actually a built-in but a look-alike of other shells). It lets you build custom commands somewhat like bash functions but you define a command prototype so you get automatic command line parsing. The journaling is particularly handy as it includes all non-graphic output with screen messages recorded as comments (it ALWAYS journals) so you can replay anything you did. The functional mode is something that is an acquired taste so not everyone uses it, but
you can do interesting things that are somewhat like SQL in functionality with it. It is usually used for extracting data from tables, but it can be powerful.Something like

plot -f 3 -if eq(max(ge(temp,300) ge(pressure,400),primary(“HG1*”),0)

is one of the uglier but powerful commands you can use. That is not typical. Sort of like RPN.
Some people love it, others hate it and use a series of simpler commands instead. A main initial design feature was to make it feel like the sh(1) shell and to use Fortran syntax for expressions, which a lot of the targeted users were familiar with so you could start using it quickly. Never did add pipes to it though, which would have been a great addition. But you use files like tables so you don’t appear to be using files for data much except to load them and save them.

Because the primary file types are self-describing labeling and units are known so a lot of the users use just a few commands like plot, math, attach, save, and read; but others made some pretty impressive programs with it. I remember some that took projects projected to take several years and automated them and ran them in a few days.

Some of the code used for the prototype are publicly available but really dated. I could probably paste M_kracken, M_graph, M_calculator together and re-create the first production version (circa 1990-ish!) but that would not be very satisfying.

That was a fun code to work on, unfortunately since the Tsunami (long story why that is related) I don’t think anyone has worked on it except to port it to new machines.

Just one screen from the TOC for the help (there are about ten pages of main topics, some like steam(1) and matrix(1) go to many others:

paranoia (1ush)      - [USH-D:CMD] test floating point operations
parcel (1ush)        - [USH-D:CMD] create a new command
relate (1ush)        - [USH-D:CMD] unit conversion
setdash (1ush)       - [USH-D:CMD] create dash-code patterns for drawing lines
setmark (1ush)       - [USH-D:CMD] create or change geometric markers 1 thru 20
show (1ush)          - [USH-D:CMD] display command keywords and values
system (1ush)        - [USH-D:CMD] turn access to system commands on and off
units (1ush)         - [USH-D:CMD] control or display text associated with unit codes
wipe (1ush)          - [USH-D:CMD] empty the pseudo file (file 0)
zoom (1ush)          - [USH-D:CMD] select X11 keyboard zoom mode for plot(1) command
ansi (3ush)          - [USH-E:CALC] basic Fortran-intrinsic calculator functions
bessel (3ush)        - [USH-E:CALC] Bessel functions in the calculator
c (3ush)             - [USH-E:CALC] functions for mixing curves from multiple files
Calculator (3ush)    - [USH-E:CALC] using the calculator in USH
clcon (3ush)         - [USH-E:CALC] color model conversion calculator function
logical (3ush)       - [USH-E:CALC] NUMERIC LOGICAL FUNCTIONS
prompting_functions (3ush) - [USH-E:CALC] single numeric OR string prompting functions
steam (3ush)         - [USH-E:CALC] ASME steam functions
string (3ush)        - [USH-E:CALC] string functions
time (3ush)          - [USH-E:CALC] Time functions in jucalc(3f) expressions
wif97 (3ush)         - [USH-E:CALC] IF-97 Steam Routines
X11_functions (3ush) - [USH-E:CALC] X11 mouse functions
variables (7ush)     - [USH-E:CALC] miscellaneous functions and variables
convert (3ush)       - [USH-F:OPERATOR] A linear conversion operator for math(1)
decimate (3ush)      - [USH-F:OPERATOR] extract a portion of a curve
dif (3ush)           - [USH-F:OPERATOR] derivative of a curve
intg (3ush)          - [USH-F:OPERATOR] integrate curves
lfit (3ush)          - [USH-F:OPERATOR] linear fit of curves
lowess (3ush)        - [USH-F:OPERATOR] weighted regression smoothing operator for math(1)
map (3ush)           - [USH-F:OPERATOR] map curves to new X values
mask (3ush)          - [USH-F:OPERATOR] select curve points with a mask
operators (3ush)     - [USH-F:OPERATOR] OPERATORS for get(1) and math(1) (calculus, fitting, data reduction, ..
poly (3ush)          - [USH-F:OPERATOR] Polynomial curve fit of order 1 <= n <= 10.
page (1ush)          - [USH-G:LOWLEVEL] initializing the display for low-level graphics mode
parea (1ush)         - [USH-G:LOWLEVEL] creating subplot areas in low-level graphics mode
2 Likes

Actually, the pointer is not necessary. It was a relic from an earlier attempt. But unfortunately, using transfer() is not an option, as it runs into the same problem as attempting the switch directly.

Hello Ivan,

Did you ever get time to publish this? It will be extremely cool to see R kind of dataframe in Fortran.

Cheers,
Sumit

A currently developed project by @jaiken is

1 Like