I had to write the gauss algorithm in Python last week and has to use Numpy arrays to get (a little bit) speed.
After doing so, I asked my self how much faster the same code would be in Fortran.
So I wrote it again, this time in Fortran.
I should explain that n
means the number of rows (and obviously columns) of the matrix a
that includes the coefficients.
I add the written code here.
def gauss(a: np.array, b: np.array) -> np.array:
for i in range(len(a)):
# Check if row order must be changed
if a[i, i] == 0:
# Search for first field != 0
for row_ind in range(i + 1, len(a)):
if a[row_ind, i] != 0:
# Swap
a[[i, row_ind]] = a[[row_ind, i]]
b[[i, row_ind]] = b[[row_ind, i]]
break
# Eliminate fields below
for row_ind in range(i + 1, len(a)):
b[row_ind] -= b[i] / a[i, i] * a[row_ind, i]
a[row_ind, i:] -= (a[i, i:] / a[i, i] * a[row_ind, i])
# a[row_ind] -= (a[i] / a[i, i] * a[row_ind, i])
# Figure out x
x = np.array([0 for _i in range(len(a))])
for i in range(len(a) - 1, -1, -1):
b[i] -= np.dot(a[i], x)
x[i] = b[i] / a[i, i]
return x
module gauss_m
contains
subroutine gauss(a, b, x)
real(8), intent(inout) :: a(:, :), b(:), x(:)
integer :: i, k, n
real(8), allocatable :: tmp_a(:)
real(8) :: tmp_b
n = size(a, 1)
do i=1, n
! Check if has to swap
if (a(i, i) == 0.0) then
do k=i+1, n
! Swap values and exit loop
if (a(k, i) /= 0.0) then
tmp_a = a(k, :); tmp_b = b(k)
a(k, :) = a(i, :); b(k) = b(i)
a(i, :) = tmp_a; b(i) = tmp_b
exit
end if
end do
end if
! Eliminate items in row below
do k=i+1, n
b(k) = b(k) - (b(i) / a(i, i) * a(k, i))
a(k, :) = a(k, :) - a(i, :) / a(i, i) * a(k, i)
end do
end do
! Solve
x = 0.0
do i=n, 1, -1
b(i) = b(i) - dot_product(a(i, :), x)
x(i) = b(i) / a(i, i)
end do
end subroutine gauss
end module gauss_m
To test how many time is needed to solve the problem I created a jupyter Notebook and created a cpython
module using f2py
to being able to import it into Python.
When I started to test how many time is needed, I was suprised how fast Fortran was able to solve the linear equations system.
It was up to 6 times faster than the Python code (although the conversion was written in Python and the code was wrapped using cpython).
But everytime I’m trying “big” n
it looks like Python is getting faster than Fortran.
In this graphs you can see time needed to solve the n
x n
system,
“Python” means my selfwritten Python program, “Numpy” the function numpy.linalg.solve
and “Fortran” my selfewritten Fortran code.
The whole Jupyter Notebook can be found here: JetBrains Datalore: A powerful environment for Jupyter notebooks.
It seems like Python is faster solving this system than Fortran if you choose a “big” n
.
I think that’s not right but I cannot find a reason for. I tried a fiew things like using compiler flags for optimization, but it’s not getting much better.
If you’ve got any suggestions about how to solve this issue or why it is getting that slow please let me know.