GSoC: improve nonlinear least squares in R

Beliavsky · August 21, 2021, 6:40pm

On GitHub I see the project

GSoC21-improveNLS

Efforts to improve the functioning of the R nls() function for nonlinear least squares estimation. This is part of the Google Summer of Code program for 2021.

The guts of the package is old FORTRAN and C code, and looking at the project write-up, it appears that neither the GSoC student or the original programmer completely understands how it works. Quoting the report RefactoringNLS:

As the nls() man page states, when the “port” algorithm is used with the trace argument TRUE, the
iterations display the objective function value which is 1/2 the sum of squares (or deviance). It is likely that the trace display is embedded in the Fortran of the nlminb routine that is called to execute the “port” algorithm, but the discrepancy is nonetheless unfortunate for users.

Issue: code structure
The nls() code is structured in a way that inhibits both maintenance and improvement. In particular, the iterative setup is such that introduction of Marquardt stabilization is not easily available.
To obtain performance, a lot of the code is in C with consequent calls and returns that complicate the code. Over time, R has become much more efficient on modern computers, and the need to use compiled C and Fortran is less critical. Moreover, the burden for maintenance could be much reduced by moving code entirely to R.

Issue: code documentation for maintenance
setPars() – explain weaknesses. Only used by profile.nls()
The paucity of documentation is exacerbated by the mixed R/C/Fortran code base.
Following is an email to John Nash from Doug Bates. This is NOT a criticism of Prof. Bates work, but reflection on how difficult it is to develop code in this subject area and to keep it maintainable. We have experienced similar loss of understanding for some of our own codes.

As stdlib matures maybe its code can be used not only directly in Fortran but in packages for Python and R.

Descriptions of GSoC R projects are here.

fortran4r · September 12, 2021, 8:33pm

R’s many basic stats functions are written in F77 and seem difficult to read. Hopefully, someone will modernize them otherwise they probably would be replaced by C codes by the core members.

Topic		Replies	Views
Fortran-lang accepted to Google Summer of Code 2024 GSoC-2024	13	902	March 22, 2024
GSoC 2024 Applicant: Linear algebra and sparse matrices (stdlib) GSoC-2024	3	421	March 1, 2024
GSoC Optimizers: Weekly Progress Reports GSoC-2023	15	1131	August 25, 2023
Google Summer of Code 2025 GSoC-2025	23	1259	April 22, 2025
GSoC 2023 Fortran-lang resources for prospective contributors GSoC-2023	4	689	March 2, 2023

GSoC: improve nonlinear least squares in R

GSoC21-improveNLS

Related topics