About a Fortran Scientific Library

At the last monthly call we touched briefly upon the “goal of Fortran”. Perhaps I can use this thread to elaborate my view further.

In fact several mature scientific libraries exist in Fortran. The most notable examples are

Unfortunately, the ones that are actively maintained (NAG, IMSL) are not open source. HSL is free for non-commercial use, but obtaining and using the code is not very user-friendly. The SLATEC and NSWC on the other hand are Fortran 77 style codes with old-fashioned calling sequences and lacking modern documentation. The GNU Scientific Library you mention was in fact developed as a modern replacement to SLATEC.

There are hundreds more of specialist packages, if you think about all the codes available on netlib and GAMS.

Since Fortran users are scientists and engineers, most of them lack the training in software engineering and web development, which are necessary to set up an accesible library. Thinking of the era of old mainframe computers, not everyone had a course in numerical analysis or even the opportunity to work with a computer. It would have made sense to pay a professional provide you with the routines needed to solve your computational problem. At the same time new programs (programming languages) such as MATLAB and Octave were developed, to give people access to numerical tools without having to learn Fortran.

Cleve Moler wrote about this in his blog:

In the 1970s and early 1980s, while I was working on the LINPACK and EISPACK projects that I discussed in two previous posts, I was a Professor of Mathematics and then of Computer Science at the University of New Mexico in Albuquerque. I was teaching courses in Linear Algebra and Numerical Analysis. I wanted my students to have easy access to LINPACK and EISPACK without writing Fortran programs. By “easy access” I meant not going through the remote batch processing and the repeated edit-compile-link-load-execute process that was ordinarily required on the campus central mainframe computer.

(Thanks to @urbanjost, we still have a copy of an old version of MATLAB.)

A similar motivation can be found from the Octave developers:

Octave was originally conceived (in about 1988) to be companion software for an undergraduate-level textbook on chemical reactor design being written by James B. Rawlings of the University of Wisconsin-Madison and John G. Ekerdt of the University of Texas. We originally envisioned some very specialized tools for the solution of chemical reactor design problems. Later, after seeing the limitations of that approach, we opted to attempt to build a much more flexible tool.

There were still some people who said that we should just be using Fortran instead, because it is the computer language of engineering, but every time we had tried that, the students spent far too much time trying to figure out why their Fortran code failed and not enough time learning about chemical engineering. We believed that with an interactive environment like Octave, most students would be able to pick up the basics quickly, and begin using it confidently in just a few hours.

We could say that at this point Fortran was already losing users. For simple scientific calculations, interactive usage is much more straightforward than the Fortran development cycle. Thinking of F77, without modules, and without a good packaging solution, it is painstaking process to get all your calls correct. Especially, when you are doing all your work through a tiny terminal.

So at this point we could say Fortran was already becoming a niche language for HPC users, and falling out of favor for data analysis and everyday scientific calculations. One of the places were Fortran was in everyday use for statistics, data analysis, and plotting, was at CERN in the form of the the CERN Program Library. This was until it was finally superseded by ROOT (a program in C++) in 2003.


Concerning strategy, on top of what @certik wrote, with respect to stdlib, I think it is most important we define interfaces for the common scientific procedures we would like to see (essentially the stuff in NAG and IMSL).

Then it is just a matter of time to consolidate and refactor the publicly available Fortran codes which fit into such a scientific library (the Fortran codes which are wrapped in SciPy, SLATEC, other codes from netlib, etc.). This would be the free stdlib.

I used the word refactor, because I have the experience that it can be easier to generate a robust Fortran wrapper in Python, than it is to provide the same old Fortran package in modern Fortran. This is because Fortran, compared to Python, lacks assertions, exception handling, positional and keyword arguments, introspection, etc. Ironically, the old F77 style interfaces with explicit length array arguments and work arrays are easier to wrap, than some modern Fortran codes that utilize allocatable and assumed shape arrays, derived types, etc.

If we can bring stdlib far enough ahead, hopefully then organizations such as Intel, NAG, and IMSL would decide to offer versions of their own routine libraries, conforming to the stdlib interface. This can be done very elegantly using sub-modules. In fact we could even take a shortcut, and have back ends based upon GSL or Boost C++ libraries.

But agreeing upon on a sensible Fortran library API is the first and most crucial step.

11 Likes

My mistake. What I had in mind, was the Python way of receiving formal parameters **kwargs that becomes a dictionary.

Thanks a lot for the replies! Sorry I didn’t arrived to the ‘monthly call’ where you kind of address this point. Although not very deeply.

So, my concern about @certik first point is more like ‘What does “Make progress” really means?’

Thanks @ivanpribec for your nice and detailed reply. So, that’s what I was thinking to be one of the solutions, define interfaces from these already available code, the one in netlib for example.

On the other hand, another concern would be to rely on proprietary libraries and their kindness to share their work…

Liked the possible shortcut to have back ends based upon GSL, although that’ll be pretty much similar to that Fortran interface to GSL I mentioned before.

Maybe moving to another topic, what did you mean @ivanpribec with free stdlib? Is there any idea to have non-free parts of stdlib? For example, I have no much trouble on finding the code and create my library for my code, but sometime the issue is when I want to share with other people. Then I might have the problem of proprietary software or that people I share my code with has trouble with the non-standardized library.

P.S. I’m sorry because I know I’m pretty much obsessed with this point.

I meant simply the open-source version. What I had in mind was kind of how you have the standard NumPy, Scipy libraries, and then you have the Intel Distribution for Python which uses the specialized Intel libraries under the hood. The former are available to everyone, the latter only to those with access to Intel products. (Although I think MKL is available to everyone nowadays).

To be more specific, say we want to have a Fortran interface for cubic splines. We can write a new implementation like they did in SciPy (see CubicSpline). Or we can base our implementation upon pchip, or the pppack code by Carl de Boor.

On the other hand Intel could decide to use the data fitting functions from Intel MKL as the back end. IMSL could role out their version based on their spline_fitting routines. And so on.

1 Like

@iarbina please join us at the next Fortran call and we can discuss it more.

Progress means making stdlib usable like SciPy, and use it as a dependency with fpm or directly. If you have any questions or concerns, let’s discuss more.

1 Like

Ah, ok, I see. Yeah, of course I guess vendors will still create and own their libraries.

Yeah, that’s what I meant by roadmap, to discuss whether to create new code or interface older one. Which, for me, seems more realistic to interface older one, at least for things as special functions.

@certik That, of course, sounds great.

I see that probably there’s still too soon to discuss things like the ones mentioned by @ivanpribec and how to approach the creation of a mathematical/scientific library until other essential parts of the stdlib are developed.

Considering the very detailed and informative post by @ivanpribec on the available numerical methods code, I was wondering what considerations have gone into the statistics portion of the proposed Fortran standard library?

When you compare the statistical methods in GNU Scientific library to the work done on Stan-MC, I’d be interested to learn what methods developed in Stan might have more widespread utility.

My interest in this problem comes from my realization that the vast majority of scientific programming projects use:

  1. a compiled, unmanaged language for segments that efficiently use computational resources (in practice that means C, C++, and/or Fortran) and

  2. a garbage collected, interpreted scripting language for other parts, to efficiently use programmer resources (data cleaning, user interface, etc.)

I’d like a numerical methods package that is not tied to a particular scripting language, as a number of scientific libraries are tied to Python.

1 Like

A Fortran allocatable array is deallocated when it goes out of scope. Does that qualify the language as being garbage collected? One only needs to DEALLOCATE an array when it is already ALLOCATEd and one wants to use the ALLOCATE statement again.

I was thinking more in terms of data structures that use the heap (like trees, hash tables, etc), not the stack. If I understand you correctly, those type of Fortran arrays could be simulated in C using the stack, where you don’t have to worry about malloc and free, but there are limitations to that.

Clearly, C is not a garbage collected language as Python, Perl, or R is.

I would consider Fortran to be more like Rust in that regard. There is no garbage collector (typically understood to be a runtime process incurring some overhead). But a large part of the memory management is automated. The lifetime of an object in Fortran is quite strictly it’s enclosing scope (at least until we start talking about pointers), where in Rust they introduced a facility allowing objects to outlive their scope, but still keep track of when they’re lifetime ends to alleviate the programmer from needing to manage the memory manually.

TLDR; Fortran is a managed memory language, but that management is largely automated.

3 Likes

Has there been any discussion anywhere on how the standard library will be organized?

The NIST had developed what looks like a useful classification system for software packages related to applied mathematics. While the pages look ancient, the concept still appears valid. I suspect most here will be familiar with this, but I post it for those who might not know about it.

NIST Guide to Available Mathematical Software (link)

1 Like

Upon request of folks from the zoom call, I posted some suggestions on github that I report here:

There is a standard for the class of numerical functions that are needed in scientific computation. Traditionally, that was for a long time the “blue book” *Handbook of Mathematical Functions", by Abramowitz and Stegun. However, the American National Institute of Standard and Technology (NIST) has sing long ago taken the burden on its shoulder. After countless years of work and revision, the new
Digital Library of Mathematical Functions (DLMF)
has been published.
The DLMF is a the perfect starting point to broaden the standard library. All formulas are tested, some are implemented, and all rigorously referenced and documented. So there is no better blueprint.
Anywhere would be good to start, but I suggest

  • confluent hypergeometric functions
  • legendre and related functions
  • orthogonal polynomials
  • Coulomb functions
  • spherical-group coefficients (3j, 6j, 9j)

the Digital Library of Mathematical Functions (DLMF) is as authoritative and standard a reference as one can ever possibly get. My suggestion, therefore, would be to shape the special function libraries in terms of one module per DLMF chapter, and figure out which already existing libraries provide the functionalities.

That said, GAMS is a gem of a website, which I have used for over a decade :slight_smile:

4 Likes

Possible algorithms for a Fortran Scientific Library are those in the R fastmatrix package, a “small set of functions to fast computation of some matrices and operations useful in statistics.” I copied the source code, which is a mix of R, C, and Fortran 77 (with extensions such as enddo) to github. Here is a list of functions from the manual:

array.mult . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
bracket.prod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
comm.info . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
comm.prod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
commutation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
cov.MSSD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
cov.weighted . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
dupl.cross . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
dupl.info . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
dupl.prod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
duplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
equilibrate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
geomean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
hadamard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
is.lower.tri . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
kronecker.prod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
kurtosis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
lu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
lu-methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
lu2inv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Mahalanobis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
matrix.inner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
matrix.norm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
minkowski . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
ols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
ols.fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
ols.fit-methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
power.method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
ridge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
sherman.morrison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
sweep.operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
symm.info . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
symm.prod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
symmetrizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
vec . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
vech . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
wilson.hilferty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

1 Like

For the library portion on optimization methods, the COIN-OR site has a large collection of software worthy of consideration.

2 Likes

The NAG library is not open source in general, but “source code for over two hundred NAG library subroutines called by the Academic version of the Simfit package.” can be downloaded here. Simfit is open source and also calls Fortran libraries such as SLATEC and LAPACK. The functionality is as follows:

Statistics

All the usual descriptive statistics (bar charts, histograms, best-fit distributions on sample cdfs, dendrograms, box and whisker or cluster plots), multivariate statistics (distance matrices and dendrograms, principal components and scree plots), time series (ACF, PACF, ARIMA) and frequently used tests (mostly with exact p values not the normal approximations), such as:

  • chi-square (O/E vectors, m by n contingency tables and wssq/ndof)
  • McNemar test on n by n frequency tables
  • Cochran Q test
  • Fisher exact (2 by 2 contingency table) with all p values
  • Fisher exact Poisson distribution test
  • t (both equal and unequal variances, paired and unpaired)
  • variance ratio
  • F for model validation
  • Bartlett and Levene tests for homogeneity of variance
  • 1,2,3-way Anova (with automatic variance stabilizing transformations and nonparametric equivalents)
  • Tukey post-ANOVA Q test
  • Factorial ANOVA with marginal plots
  • Repeated measures ANOVA with Helmert matrix of orthonormal contrasts, Mauchly sphericity test and Greenhouse-Geisser/Huyn-Feldt epsilon corrections
  • MANOVA with Wilks lambda, Roy’s largest root, Lawley-Hotelling trace, and Pillai trace for equality of mean vectors, Box’s test for equality of covariance matrices, and profile analysis for repeated measurements.
  • Canonical variates and correlations for group comparisons
  • Mahalanobis distance estimation for allocation to groups using estimative or predictive Bayesian methods
  • Cochran-Mantel-Haenszel 2x2xk contingency table Meta Analysis test
  • Binomial test
  • Sign test
  • various Hotelling T-squared tests
  • Goodness of fit and non parametric tests:
    • runs (all, conditional, up and down)
    • signs
    • Wilcoxon-Mann-Whitney U
    • Wilcoxon paired-samples signed-ranks
    • Kolmogorov-Smirnov 1 and 2 sample
    • Kruskal-Wallis
    • Friedman
    • Median, Mood and David tests and Kendall coefficient of concordance
    • Mallows Cp, Akaike AIC, Schwarz SC, Durbin-Watson
    • tables and plots of residuals, weighted residuals, deviance residuals, Anscombe residuals, leverages and studentized residuals as appropriate
    • half normal and normal residuals plots
  • Multilinear regression by
    • L_1 norm
    • L_2 norm (weighted least squares)
    • L_infinity norm
    • Robust regression (M-estimates)
    • Also logistic, binary logistic, log-linear, orthogonal, reduced major axis, with interactive selection and transformation of variables in all cases.
  • Partial Least squares (PLS)
  • Survival analysis (Kaplan-Meier, ML-Weibull, Mantel-Haenszel) or, using generalized linear models with covariates (Exponential, Weibull, Extreme value, Cox)
  • Correlation analysis (Pearson product moment, Kendal Tau, Spearman Rank) on all possible pairs of columns in a matrix, and canonical correlations when data columns fall naturally into two groups. Partial correlation coefficients can be calculated for data sets with more than two variables.
  • Shapiro-Wilks normality test with the large sample correction
  • Normal scores plots
  • Binomial distribution, analysis of proportions, exact parameter confidence limits, likelihood ratio, odds, odds ratios and graphical tests such as log-odds-ratios plots with exact confidence limits for systematic variation in binomial p values
  • Trinomial distribution (and confidence contour plots)
  • Parameter estimates, confidence limits and goodness of fit for:
    • uniform
    • normal
    • binomial
    • Poisson
    • exponential
    • gamma
    • beta
    • lognormal
    • Weibull distributions
  • All possible pairwise comparisons between columns of data by KS-2, MWU and unpaired t tests using the Bonferroni principle.

Simfit menu, Simfit home page

Curve fitting

  • Sums of exponentials (and estimation of AUC)
  • Sums of Michaelis-Menten functions (and estimation of half saturation points and final asymptotes)
  • Sums of High/Low affinity binding sites
  • Cooperative order n saturation functions (and cooperativity analysis)
  • Positive n:n rational functions
  • Growth curves (derivative plots, comparison of models and estimation of min/max growth rates, half-times and final sizes)
  • Survival curves using several models
  • Polynomials (all degrees up to 6 then statistical tests for the best model and prediction of x with confidence limits given y)
  • Cubic splines (user-placed knots, automatic knots with variable tension, or crosss-validation data smoothing)
  • Systems of differential equations (Adams, Gear, phase portrait, orbits) using defined starting estimates and limits or random cycles to search for a global minimum
  • Calibration curves (polynomials, splines or user selected models)
  • Area under curve (AUC by choice from several methods)
  • Initial rates
  • Lag times
  • Horizontal and inclined asymptotes
  • Numerical deconvolution of sums of exponentials, Michaelis-Mentens, trigonometric functions and Gaussian densities
  • Fitting user supplied models
  • Analyzing flow cytometry profiles
  • After fitting functions of 1, 2 or 3 variables, parameters and objective functions can be stored for F, Akaike, Schwarz and Mallows Cp tests, and the wssq/ndof contours and 3D surface can be viewed as functions of any two chosen parameters. With all functions of 1 variable, calibration, evaluation, extrapolation, area calculations, derivative estimations and interactive error bar plots can be done.
  • Multi-function mode: simultaneous fitting of several functions of the same independent variables, linked by common model parameters
  • Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either identity, power, square root, reciprocal, log, logistic, probit or complementary log-log with canonical links as defaults and facility to supply fixed offsets. A simplified interface is included for logistic, binary logistic or polynomial logistic regression, bioassy, log-lin contingency analysis or survival analysis.
  • Stratified data sets can be analyzed by Cox regression and conditional logistic regression
  • Autoregressive integrated moving average models (ARIMA) to time series with forecasting
  • Facility to store parameter estimates and covariance matrices in order to compute Mahalanobis distances between fits of the same model to different data sets and test for significant differences in parameter estimates.

Simfit menu, Simfit home page

Graph plotting

  • Grouping of data into histograms (with error bars if appropriate) and cdfs
  • Calculation of means and error bars with arbitrary confidence limits from replicates can be done interactively or from data files
  • Error bars can be non-symmetrical or sloping if required and multiple non-orthogonal error bars can be plotted
  • Error bars can be added to 2D and 3D bar charts and 3D cylinder plots
  • Extrapolation of best-fit linear and nonlinear curves to arbitrary end points
  • Automatic transformation of error bars into various axes (Hill, Lineweaver-Burk, Scatchard, log-odds, etc.)
  • Immunoassay type dilution plots using logs to base 2, 3, 4, 5, 6, 7, 8, 9 as well as e and 10, and with labels as logs, powers of the base or fractions
  • Multiple axes plots
  • Pie charts with arbitrary displacements, fill-styles, colours
  • Bar charts with arbitrary positions, sizes, fill-styles, colours and error-bars
  • Presentation box and whisker plots, and pie or bar charts with 3D perspective effects
  • Orbits and vector field diagrams for systems of differential equations
  • Dendrograms and 3D-cluster plots for use in cluster analysis
  • Scree diagrams and score or loading scatter plots for principal components analysis (score plots can have Hotelling T^2 elliptical confidence regions)
  • 3D-surfaces and 2D-projections of contours
  • Curves in space and projections onto planes
  • vast array of plotting characters and maths symbols
  • Standard PostScript fonts, Symbol, ZapfDingbats and Isolatin1 encoding
  • Professional quality PostScript files that can easily be edited to change titles, legends, symbols, line-types and thicknesses, etc.
  • Interactive PostScript facility for arbitrarily stretching and clipping overcrowded plots such as dendrograms without changing aspect ratios of fonts or plotting symbols but by just changing white-space between graphical objects
  • A PostScript editor is supplied for scaling, rotating, shearing, translating, editing, making collages and inlays from .eps files
  • Transformation of .eps files into bit-map and compressed graphics formats (e.g. bmp, pcx, tif, jpg, png, pdf)
  • Plotting user defined parameteric equations such as r(theta), x(t), y(t) in 2-space and x(t),y(t),z(t) in 3-space
  • Facility to import PostScript specials automatically into the PostScript file creation stream in order to redefine fonts, colours, plotting symbols, add logos, etc.
  • Graphical deconvolution of summation models after fitting
  • 2D and 3D Biplots for multivariate data sets

Simfit menu, Simfit home page

Calculations

  • Zeros of polynomials
  • Zeros of a user-defined function
  • Zeros of n nonlinear functions in n variables
  • Integrals of n user-defined functions in m variables
  • Convolution integrals
  • Bound-constrained quasi-Newton optimization
  • Eigenvalues
  • Determinants
  • Inverses
  • Singular value decomposition with right and left singular vectors
  • LU factorisation as A = PLU with matrix 1 and infinity norms and corresponding condition numbers
  • QR factorisation as in A = QR
  • Cholesky factorisation as in Q = R(R^T)
  • Matrix multiplication C = AB, (A^T)B, A(B^T) or (A^T)(B^T)
  • Evaluation of quadratic forms (x^T)Ax or (x^T)(A^{-1})x
  • Solve full-rank matrix equations Ax = b
  • Solve over-determined linear systems Ax = b in the L1, L2 or L∞ norms
  • Solve the symmetric eigenvalue problem (A - lambda*B)x = 0
  • Areas, derivatives and arc lengths of user supplied functions
  • Analysis of cooperative ligand binding (zeros of binding polynomial, Hessian, minmax Hill slope, transformed binding constants, cooperativity indices, plotting species fractions)
  • Power and sample size calculations for statistical tests used in clinical trials, including plotting power as a function of sample size (1 or 2 binomial proportions; 1, 2 or k normal ANOVA samples; 1 or 2 correlation coefficients; 1 or 2 variances; chi-square test)
  • Probabilities and cdf plots for the non-central t, non-central chi-square, non-central beta or non-central F distributions
  • Estimation of exact parameter confidence limits for the binomial, normal, Poisson, etc. distributions and plotting confidence contours for the trinomial distribution.
  • Robust calculation of location parameter with confidence limits for one sample (median, trimmed and winsorized means, Hodges-Lehmann estimate, etc.).
  • Time series smoothing by moving averages, running medians, Hanning or the 4253H-twice smoother
  • Time series, sample autocorrelation functions and partial autocorrelation functions and plots for chosen numbers of lags and associated test statistics
  • Auto- and cross-correlation matrices for two time series
  • Distance matrices for use in cluster analysis with extensive choice of pre-conditioning transformations and alternative link functions, e.g. Canberra dissimilarity and Bray-Curtis similarity.
  • Neaest neighbours from a distance matrix
  • Classical-metric and non-metric scaling of distance matrices
  • Principal components with eigenvalues, scree diagrams, loadings and scores from multivariate data sets
  • Procrustes analysis to estimate the similarity between two matrices
  • Varimax or Quartimax rotation of a loading matrix
  • Canonical variates with eigenvalues, scree diagrams, loadings and scores from multivariate data sets. Group means can be plotted with confidence regions to assign comparison data to existing groups.
  • K-means cluster analysis with plots
  • Shannon, Brilloin, Pielou, and Simpson diversity indices.
  • Kernel density estimation
3 Likes

Are you sure the sources are available? I’ve spent half an hour on those pages, unable to find them. The package seems to be offered as a set of Windows executables, the Linux support being advertised as through the wine emulator.

At the link I provided (scroll to the bottom of the page) you can download the following files. The last one has NAG source code – I downloaded and unzipped it. It is fixed source form that uses a few F90 features such as ALLOCATE and ENDDO.

simzip7_8_2.zip Simfit package source code.
Contains all batch, link, html, and icon files.
demzip7_8_1.zip Simdem package source code.
Contains all batch, link, html, and icon files.
manzip7_8_2.zip Simfit reference manual LaTeX code.
Contains all auxiliary files and graphics files.
nagzip7_3_0 .zip Source code, icons, batch files and documentation to upgrade
Simfit for a new release of the NAG DLLs.
naglib6_9_8.zip Source code for over two hundred NAG library subroutines
called by the Academic version of the Simfit package.

That is certainly great. And it has GPLv3! :heart:
But I also struggled to see how to make it work in Linux natively…

I think it’s great people is sharing their knowledge on different numerical Fortran routines. However, I think the point is to think which is the best way to create an standard numerical library that every user can easily install (with the fpm) so that her/his code can easily be compiled and run by other users.

It is true that there’s a lot of good code that can (have to) be (re)used. I think the best way to, at least as starting point, is to try to interface this old well-known code and with modern fortran, and maybe first to define the interface rules we want to set. Once this is done, all the open source routines that has been posted here can be interfaced and added to the fpm.

Does that make sense?

1 Like

One that I noticed on GitHub today is the Library of Numerical Analysis of the Research Computing Center of the Moscow State University . It looks comprehensive. The code comments are in Russian, and procedure names such as ADB1C match the documentation but are not informative on their own. Default reals are used. C and Pascal versions are also available.

Using Google Translate, the catalog is

3 Likes