Indexing source code

This is not Fortran-specific but may be of interest.

Sometimes I want to find Fortran code on my hard disk about a topic by searching for a string. Text indexing is a whole field, but it appears that the Windows findstr command (similar to grep) does something like it behind the scenes, making later searches much faster. For example, in a directory with 3200 .f90 files that take up 26 MB, findstr -i autoreg *.f90 takes 26 s to run the first time but only 0.4 s the second time, and searches for other strings also take about 0.4 s. For source code downloaded from GitHub, 20,000 files occupying 300 MB, searching in subdirectories with findstr -i -s autoreg *.f* took 98 s initially but only 3 s the second time, for the same or a different search string.
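For contrast, here is a minimal Python sketch of what an explicit text index could look like. It is only an illustration of the general idea, not a description of what findstr does internally. The names build_index and lookup are hypothetical, and the tokenizer assumes identifiers are plain ASCII, which is a simplification.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: a "word" is an ASCII identifier, roughly what Fortran names look like.
WORD = re.compile(rb"[a-z_][a-z0-9_]*")


def build_index(root, pattern="*.f90"):
    """Map each lower-cased word to the set of files (searched recursively) containing it."""
    index = defaultdict(set)
    for path in Path(root).rglob(pattern):
        if path.is_file():
            # Read bytes so unusual encodings in old source files do not raise errors.
            for word in set(WORD.findall(path.read_bytes().lower())):
                index[word.decode("ascii")].add(path)
    return index


def lookup(index, needle):
    """Return the files containing any word that has needle as a substring."""
    needle = needle.lower()
    hits = set()
    # Only the vocabulary is scanned here; the files themselves are not reopened.
    for word, paths in index.items():
        if needle in word:
            hits |= paths
    return sorted(hits)


if __name__ == "__main__":
    idx = build_index(".")
    for path in lookup(idx, "autoreg"):
        print(path)
```

The index has to be built once (and rebuilt when files change), after which each lookup scans only the list of distinct words. Unlike findstr, this sketch cannot match a search string that spans several words.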

Using grep on WSL, I find that the first search is faster than with Windows findstr, but subsequent grep searches take about the same time as the first and so are slower than subsequent searches with findstr.
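To repeat this kind of first-run versus second-run comparison without depending on a particular tool, something like the following Python sketch could be used. The function names search_files and time_search are made up for illustration, and the search is a plain case-insensitive substring match over files found recursively, roughly in the spirit of findstr -i -s.

```python
import time
from pathlib import Path


def search_files(root, pattern, needle):
    """Return files under root matching pattern whose contents contain needle (case-insensitive)."""
    needle = needle.lower().encode()
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        # Read raw bytes so odd encodings in old source files do not raise errors.
        if needle in path.read_bytes().lower():
            hits.append(path)
    return hits


def time_search(root, pattern, needle, repeats=2):
    """Run the same search several times; return the last hits and the elapsed seconds per run."""
    hits = []
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        hits = search_files(root, pattern, needle)
        timings.append(time.perf_counter() - start)
    return hits, timings


if __name__ == "__main__":
    hits, timings = time_search(".", "*.f90", "autoreg")
    print(len(hits), "matching files")
    for run, seconds in enumerate(timings, 1):
        print(f"run {run}: {seconds:.2f} s")
```

If the files have not been read recently, the first run would be expected to take longer than the second, since this script keeps no index of its own between runs and any speedup has to come from the operating system.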

ETA: Someone wrote

findstr will be noticeably faster the second time you search within the same directory since Windows caches opened files
