FortranFan Thank you for a great suggestion.
I have made the changes that you propose, but in order not to make too many changes, instead of printing the memory location inside the function I return the position of memory LOC(LST(I))
instead of the index I
in the array:
In the VSRCH function:
DO I=1,L
IF (LST(I) == STR) THEN
VSRCH = LOC(LST(I))
EXIT
END IF
END DO
Then in the maing program I do as you suggest to get the initial memory possition that would be found in the function and here I print them both to compare them:
O = LOC(list(5)%A)
call cpu_time(T1)
DO I=1,N
IDX = VSRCH("A", list%A, SIZ)
END DO
call cpu_time(T2)
PRINT *, "Found ", I, SIZ, O, IDX, T2-T1
The result shows that indeed, in the original solution, inside the function the address of the memory location of the string found in the array is different than the corresponding memory location in the main program.
Also, it proves that when using the proposed solution, the position in memory in the function is the exact same as the one in the main program, proving that there is no copy made at all.
>problem.exe 100000
Found 100001 100000 -1758418752 -1084180200 24.82812
>solution.exe 100000
Found 100001 100000 -1967675200 -1967675200 1.5625000E-02
I then added a print statement inside the main program loop to print the memory location found inside the function at each iteration to see if the copy is made to different locations each time:
O = LOC(list(5)%A)
call cpu_time(T1)
DO I=1,N
IDX = VSRCH("A", list%A, SIZ)
PRINT *,I,O,IDX
END DO
call cpu_time(T2)
PRINT *, "Found ", I, SIZ, O, IDX, T2-T1
The results shows that in the original problem the memory location in the inside the method is different than the memory location in the main program, but that the memory location inside the method is the exact same every time in every run. This means that there is not a lot of memory trashing in this case as the same memory locations are used over and over again.
In the proposed solution, the memory location is the exact same in every single iteration as the one in the main program:
problem.exe 5
1 1282320576 1977662216
2 1282320576 1977662216
3 1282320576 1977662216
4 1282320576 1977662216
5 1282320576 1977662216
Found 6 100000 1282320576 1977662216 0.0000000E+00
solution.exe 5
1 1846782144 1846782144
2 1846782144 1846782144
3 1846782144 1846782144
4 1846782144 1846782144
5 1846782144 1846782144
Found 6 100000 1846782144 1846782144 0.0000000E+00
In order to verify this later observation, I then changed the main program to count how many times the memory location changes:
OL = LOC(list(5)%A)
NL = OL
CL = 0
call cpu_time(T1)
DO I=1,N
IDX = VSRCH("A", list%A, SIZ)
IF (IDX .NE. NL) THEN
CL = CL + 1
NL = IDX
END IF
END DO
call cpu_time(T2)
PRINT *, "Found ", I, SIZ, OL, IDX, CL, T2-T1
The result confirms this that in the original problem the Intel Fortran copies the data into the exact same memory location each one of the 10k times, whereas in the solution there is absolutely no copy made in any run and is the exact same as in the main program.
problem.exe 10000
Found 10001 100000 41068736 1847638984 1 2.468750
solution.exe 10000
Found 10001 100000 -1050892096 -1050892096 0 0.0000000E+00
So this result is quite conclusive of the current behaviour and definitely confirms that this is inded the correct solution for this scenario if you use Intel Fortran.
This is great, but it still it would be nice confirm how this whole mechanic actually works on the Intel Fortran, as non stride-1 arrays are used in this code set for many purposes to handle large complex data set, not used just for searching strings.