How to fix your scientific coding errors

Software bugs are frustrating. Adopting some simple strategies can help you to avoid them, and fix them when they occur.

by Jeffrey M. Perkel
Nature
31 January 2022

When it comes to software, bugs are inevitable, especially in academia, where code tends to be written by graduate students and postdocs who were never trained in software development. But simple strategies can minimize the likelihood of bugs, and ease the process of recovering from them.

To minimize delays, good documentation is crucial. Milan Curcic, an oceanographer at the University of Miami, Florida, co-authored a 2020 study [6] that investigated the impact of hurricane wind speed on ocean waves. As part of that work, Curcic and his colleagues repeated calculations that had been conducted in the same lab in 2004, only to discover that the original code was using the wrong data file to perform some of its calculations, producing an “offset” of about 30%.

According to Google Scholar, the 2004 study [7] has been cited more than 800 times, and its predictions inform hurricane forecasts today, Curcic says. Yet its code, written in the programming language MATLAB, was never placed online. And it was so poorly documented that Curcic had to work through it line by line to understand how it worked. When he found the error, he says, “The question was, am I not understanding this correctly, or is this indeed incorrect?”

The last three paragraphs of the article also describe how science progresses:

Bugs don’t necessarily mean retraction in any event. Barba, Brown and Weisberg’s errors had only minor impacts on their results, and none required changes to their publications. In 2016, Marcos Gallego Llorente, then a genetics graduate student at the University of Cambridge, UK, identified an error in the code he wrote to study human migratory patterns in Africa 4,500 years ago. When he reanalysed the data, the overall conclusion was unchanged, although the extent of its geographic impact was, and a correction sufficed.

Thomas Hoye, an organic chemist at the University of Minnesota at Minneapolis, co-authored a study that used the software in which Williams discovered a bug. When Williams contacted him, Hoye says, he didn’t have “any particular strong reaction”. He and his colleagues fixed their code, updated their online protocols, and moved on.

“I couldn’t help but at the end think, ‘this is the way science should work’,” he says. “You find a mistake, you go back, you improve, you correct, you advance.”

Congrats @milancurcic for being mentioned in the Nature article!

But that is only possible if someone has the means and the inclination to look critically at the results. In the case of software, that would require examining the original source code and input, or reconstructing both, doesn’t it? The software mentioned in the article was never published or documented.

The C++ code used circa February-March 2020 by the team at Imperial College (Ferguson et al.) for epidemiological calculations on the novel SARS-CoV-2 virus and its infectiousness among populations, in the report to the UK government that precipitated worldwide lockdowns as a response, is never to be seen by humanity again.

The original calculations are never to be reproduced, but those interested can peruse a highly refactored version of the code, modified by Microsoft and co., on GitHub: