Hello Everyone,
Starting with why I was absent during the community bonding period, I had to go to my home town for health-related issues.
Now a brief about my past week’s work:
Changes to my already implemented linked list:
Deleting the whole list was Recursive earlier which led to stack overflow for more than 100000 nodes in a list. I have implemented an Iterative approach for the same which saved me from stack overflow.
I was facing memory leaks in my get function, whose return type was an allocatable which the compiler didn’t know when to deallocate. I changed the return type from allocatable to a pointer which solved this memory leak issue (Credits to @Arjen for the idea).
As I mentioned in my last blog, I will develop some stats related to the processing time of the current basic linked list.
Also, a brief of all the APIs that I believe are important for implementation are enlisted here.
My whole progress and work can be reviewed in this github repository.
Plans for the following week:
Implement a List of linked list module based on my current module.
This would hopefully reduce the traversal time of a List.
The parent list will be made many children lists that are of max size 10000 / 100000 nodes. (Not sure of what would be the optimum size of a children list)
I am open to feedback, being new to the language I am not sure if I am following the best practices or not.
Thank You
Could you describe the memory leak in some detail and possible solutions (you found one already , but it will be good to document this. for future reference)
Timings for algorithms are tricky business. You want to have data that are reproducible and most computers (or better OSes) are simply not made for that (that is why there are “real-time” OSes, IIUIC). So, you should revise your timing procedure to measure the overall time of a large enough number of get() calls, with an index randomly selected from the range. The overall time should be in the order of at least several seconds, otherwise the noise of the computer doing whatever it is computers do in addition to running your program, gets in the way.
It’s not my understanding that an allocatable function result leads to a memory leak (the compiler deallocates the result after it has been used) . I read somewhere (sorry can’t find the reference) that valgrind does not understand the semantics of compiled fortran with respect to allocatables. Is the reported amount of memory leaked proportional to the number of times the function is called?
Yes, the amount of memory leak linearly increased with respect to the number of calls of get().
Also, I did ask Arjen to check the same program on his system with his software.
He used the Intel Inspector and the Intel oneAPI toolkit to figure out the memory leak and got similar results.
I would mention that I used gfortran compiler.
If you use allocatables, then there cannot be a memory leak in the way the Fortran langauge is constructed. If it leaks, it’s a compiler bug.
A more reliable way to check is to keep your program running with a loop allocating what you suspect is leaking, and use htop to see if the program’s memory usage keeps increasing. It should not. If it is, then it should be reported to the compiler as a bug.
@certik is correct. Any memory leaks with regards to allocatable is a compiler bug. I have several bug reports open with gfortran with respect to allocatable function results.
I did not realize that valgrind could get confused by this, so it is possible that my bug reports are incorrect, as I used valgrind to demonstrate those memory leaks. Does anyone know of a better tool for finding memory leaks with Fortran?
I cannot recall where I read that about valgrind so it is perhaps not correct. There is also the address sanitizer built into gcc (-fsanitize=address).
Well, I used Intel Inspector, like Chetan mentioned, on the program compiled with Intel oneAPI Fortran. I was raher surprised myself to see this memory leak. But perhaps even Intel Inspector is getting confused. It does merit further investigation and a tool like htop (or Task Manager under Windows) applied to a long-running program might help to solve the riddle.
I tried the htop to check if the memory usage increases, and indeed it does increase over time.
went from 0% to 8%. So I guess this is an issue with my implementation or with the compiler.
Here is the complete get function:
function get_node_at_index( this_list, node_index ) result (return_item)
implicit none
class(list), intent(inout) :: this_list
integer, intent(in):: node_index
class(*), allocatable :: return_item
type(node), pointer:: current_node
integer:: count
!iterating through the list to reach the nth node
current_node => this_list%head
count = 1
do while ( associated(current_node) )
if (count==node_index) then
return_item = current_node%item
nullify(current_node)
return
end if
current_node => current_node%next
count = count+1
end do
nullify(current_node)
allocate(return_item,source = "Wrong Input")
end function get_node_at_index
The NAG Compiler with -mtrace=all will produce a log of all memory allocation/deallocations. The NAG-supplied filter program nagfmcheck will then report any mismatches as memory leaks. I could have a go with it and report back.
NAG Fortran Compiler Release 7.0(Yurakucho) Build 7048
fortran_linked_list.f90:
Error: fortran_linked_list.f90, line 167: DEALLOCATE of polymorphic THIS_NODE%ITEM in pure procedure NODE_DESTROYED
Error: fortran_linked_list.f90, line 181: Intrinsic assignment of type NODE might deallocate polymorphic allocatable subcomponent ITEM in pure procedure ALL_NODES_DETROYED
[NAG Fortran Compiler error termination, 2 errors]
Ah, useful report! The deallocations occur in routines that are meant to clean things up, but these should therefore not be pure.
The memory leak we encountered is elsewhere, though. The current workaround/solution is not to use an allocatable result, but a pointer (get_node_at_index, lines 77 and 89)
Eliminating PURE attributes so that the only ones left are
fortran_linked_list.f90: pure function get_length (this_list) result (length)
fortran_linked_list.f90: pure function initialise_node( item ) result( new_node )
fortran_linked_list.f90: pure subroutine append_at_tail( this_list, item )
List_of_Lists.f90: pure function child_length(this_parent_node) result(size)
I get
nagfor -C=all -C=undefined fortran_linked_list.f90 List_of_Lists.f90 Test_Program.f90
NAG Fortran Compiler Release 7.0(Yurakucho) Build 7048
fortran_linked_list.f90:
[NAG Fortran Compiler normal termination]
List_of_Lists.f90:
[NAG Fortran Compiler normal termination]
Test_Program.f90:
Warning: Test_Program.f90, line 52: Unused local variable DATA
Warning: Test_Program.f90, line 52: Unused local variable F
Warning: Test_Program.f90, line 52: Unused local variable J
Warning: Test_Program.f90, line 52: Local variable VEL2 is default initialised but never used
[NAG Fortran Compiler normal termination, 4 warnings]
Loading…
./a.out
Runtime Error: List_of_Lists.f90, line 42: Reference to undefined variable THIS_PARENT_LIST%NUM_PARENT_NODES
Program terminated by fatal error
Aborted (core dumped)
Without the -C checks, I get an executable that completes, but with -mtrace I get an infinite loop. Best to fix the undefined reference.
My mistake, with -mtrace and nagfmcheck, I get the following memeory leak summary
600063 allocations
***MEMORY LEAK:
LEAK: Allocation 8 (size 24) = Z'14EBC0B3E050' at line 67 of List_of_Lists.f90
LEAK: Allocation 30011 (size 24) = Z'14EBC0BDA480' at line 70 of List_of_Lists.f90
LEAK: Allocation 60014 (size 24) = Z'14EBC09768D0' at line 70 of List_of_Lists.f90
LEAK: Allocation 90017 (size 24) = Z'14EBC0A12D00' at line 70 of List_of_Lists.f90
LEAK: Allocation 120020 (size 24) = Z'14EBC07AF150' at line 70 of List_of_Lists.f90
LEAK: Allocation 150023 (size 24) = Z'14EBC064B590' at line 70 of List_of_Lists.f90
LEAK: Allocation 180026 (size 24) = Z'14EBC06E79C0' at line 70 of List_of_Lists.f90
LEAK: Allocation 210029 (size 24) = Z'14EBC0483E10' at line 70 of List_of_Lists.f90
LEAK: Allocation 240032 (size 24) = Z'14EBC0520240' at line 70 of List_of_Lists.f90
LEAK: Allocation 270035 (size 24) = Z'14EBC03BC690' at line 70 of List_of_Lists.f90
LEAK: Allocation 300038 (size 24) = Z'14EBC0158AD0' at line 70 of List_of_Lists.f90
LEAK: Allocation 330041 (size 24) = Z'14EBC01F4F00' at line 70 of List_of_Lists.f90
LEAK: Allocation 360044 (size 24) = Z'14EBC0091350' at line 70 of List_of_Lists.f90
LEAK: Allocation 390047 (size 24) = Z'14EBC012D780' at line 70 of List_of_Lists.f90
LEAK: Allocation 420050 (size 24) = Z'14EBBFEC9BD0' at line 70 of List_of_Lists.f90
LEAK: Allocation 450053 (size 24) = Z'14EBBFD66010' at line 70 of List_of_Lists.f90
LEAK: Allocation 480056 (size 24) = Z'14EBBFE02440' at line 70 of List_of_Lists.f90
LEAK: Allocation 510059 (size 24) = Z'14EBBFB9E890' at line 70 of List_of_Lists.f90
LEAK: Allocation 540062 (size 24) = Z'14EBBFC3ACC0' at line 70 of List_of_Lists.f90
LEAK: Allocation 570065 (size 24) = Z'14EBBFAD7110' at line 70 of List_of_Lists.f90
I think it may be useful to create a dedicated thread about known memory leaks and problematic code patterns so that we can be aware of those very quickly (I saw internet Q/As that report similar issues for >10 times, but do not remember where each page is…)