# Julia: Fast as Fortran, Beautiful as Python

The use of `.` for broadcasting is far from unique to Julia (it’s also used by numpy and matlab), and IMO it does by far the best job of it. Having an easy way of vectorizing code is way better than having to define scalar and vector (and matrix and tensor) versions of everything.

1 Like

Fortran has intrinsic and user-defined ELEMENTAL functions for that.

3 Likes

From what I’ve seen there’s a key difference which is that you have to decide whether a function is elemental or not when you write it. This means that for example, you can’t broadcast matrix multiplication without re-writing BLAS. (also, you need different names for the elemental and non-elemental versions)

1 Like

So, does Julia vectorize a compiled external library at the request of the user, or does it recompile the library every time to achieve the goal?

It’s somewhere in-between

``````using SpecialFunctions # for error function, erf

v = rand(10)
println(erf.(log.(v)))
``````

will be lowered to `Base.materialize(Base.broadcasted(erf, Base.broadcasted(log, v)))` which basically turns into

``````y = similar(v)
for i in 1:10
y[i] = erf(log(v[i]))
end
``````

If `log` and `erf` inline (in the particular they don’t since they are too expensive), Julia might make further optimizations like unrolling the loop, vectorizing the functions, or removing bounds checks.

3 Likes

Please see this example where the key computation, the sine of a radian, was moved to an external library, `cordic_sine`: Simple summation 8x slower than in Julia - #44 by FortranFan.

Once that was done, note in all the subsequent scenarios, the ratio of compute times with Fortran vs Julia were around unity, meaning no real difference i.e., the specific advantages with an enthusiast-driven `macro` implementation toward vectorization and specific “fast” trig functions leading to the original 8x difference became inapplicable.

2 Likes

Sometimes the same function or operator has a different meaning when acting on matrices than elementwise.

``````julia> A = rand(2,2)
2×2 Matrix{Float64}:
0.198773   0.747052
0.0135511  0.777073

julia> exp.(A)
2×2 Matrix{Float64}:
1.21991  2.11077
1.01364  2.1751

julia> exp(A)
2×2 Matrix{Float64}:
1.22747    1.23599
0.0224203  2.18427

julia> A * A
2×2 Matrix{Float64}:
0.0496342  0.729008
0.0132238  0.613966

julia> A .* A
2×2 Matrix{Float64}:
0.0395108    0.558087
0.000183634  0.603843

julia> A ^ 3
2×2 Matrix{Float64}:
0.0197449  0.603572
0.0109485  0.486976

julia> A .^ 3
2×2 Matrix{Float64}:
0.00785369  0.41692
2.48845e-6  0.46923
``````
1 Like

I’m not sure what you mean. `exp.(A)` and `exp(A)` do something different (elementwise vs matrix exponential). The dot is used to syntactically indicate whether the function is applied to the entire argument, or applied elementwise.

Just like having type information listed makes it easier to read code, this is also informative, i.e. we know that `foo.(A)` is elementwise and `foo(A)` probably isn’t (unless someone defined `foo(A)` to explicitly loop over `A`).

1 Like

In Julia, we can define `A::AbstractMatrix{Float64} * B::AbstractMatrix{Float64}` to be `dgemm` and `A::Float64 * B::Float64` to be scalar multiplication, and then automatically `A::AbstractMatrix{Float64} .* B::AbstractMatrix{Float64}` is elementwise multiplication.

Thus we do need to define two different functions, but the “ELEMENTAL” behavior comes for free by just defining the elemental (scalar) version of the function.
Then syntactically, someone reading our code knows whether our function is applying elementwise, or applying to the entire argument.
Similarly, one knows syntactically if it is fusing with other `.`, so one can look at code and tell how many array temporaries are being created.

Then, if the same operator has a different meaning when applied to a certain container type – like a matrix – we are still free to add a definition for that container type, without needing to use other operators/function names.

My argument would only be specious if we had to explicitly define `exp.()` and `.*`.
The `.` is syntax. All we have to define is `exp(::Float64)` to get both `exp(::Float64)` (no dot) and `exp.(::AbstractArray{Float64})` (elementwise on a matrix) or `exp.((A::AbstractArray .- B::AbstractArray) .^ 2)` (elementwise exponential of squared differences, without allocating any temporaries).

2 Likes

I wanted to see if I could implement what you are saying just to see how far Fortran is away from that. Take this with a grain of salt as a goofy example. You would typically use intrinsics and elemental functions as noted earlier.

``````program test
use, intrinsic :: iso_fortran_env

interface operator(.elem.)
procedure elem
end interface
interface exp
procedure exp_mat
procedure exp_elem
end interface
interface operator(.star.)
procedure star_mat
procedure star_elem
end interface

type elem_ptr
real(real64), pointer :: mat(:, :)
end type

real(real64) :: A(2, 2)
A = reshape([0.198773, 0.747052, &
0.0135511, 0.777073], shape=[2, 2], order=[2, 1])
call print_mat(A)
call print_mat(exp(.elem.A))
call print_mat(exp(A))
call print_mat(A.star.A)
call print_mat(A.star..elem.A)

contains

function elem(x) result(y)
real(real64), target, intent(in) :: x(:, :)
type(elem_ptr) :: y
y%mat => x
end function

function exp_mat(x) result(y)
real(real64), intent(in) :: x(:, :)
real(real64) :: y(size(x, 1), size(x, 2))
real(real64) :: x_k(size(x, 1), size(x, 2))
integer :: i, j, k, fact
y = 0.
do concurrent(i=1:size(x, 1), j=1:size(x, 2))
if(i == j) y(i, j) = 1.
enddo
fact = 1
x_k = x
do k = 1, 8
fact = fact*k
y = y + x_k/fact
x_k = matmul(x_k, x)
enddo
end function

function exp_elem(x) result(y)
type(elem_ptr), intent(in) :: x
real(real64) :: y(size(x%mat, 1), size(x%mat, 2))
integer :: i, j
do concurrent(i=1:size(x%mat, 1), j=1:size(x%mat, 2))
y(i, j) = exp(x%mat(i, j))
enddo
end function

function star_mat(a, b) result(c)
real(real64), intent(in) :: a(:, :), b(:, :)
real(real64) :: c(size(a, 1), size(b, 2))
c = matmul(a, b)
end function

function star_elem(a, b) result(c)
real(real64), intent(in) :: a(:, :)
type(elem_ptr), intent(in) :: b
real(real64) :: c(size(a, 1), size(a, 2))
c = a*b%mat
end function

subroutine print_mat(mat)
real(real64), intent(in) :: mat(:, :)
integer(int32) :: i, j, m, n
write(*, '(i0,"x",i0," Matrix{Float64}")') size(mat, 1), size(mat, 2)
do i = 1, size(mat, 1)
write(*, '(22f15.7)', advance='no')(mat(i, j), j=1, size(mat, 2))
write(*, *)
enddo
end subroutine

end program
``````
1 Like

Note that, in `exp` case (and in any other), one could do:

``````a = 1.0
A = ones(2,2)

exp(a) # scalar
exp(A) # exp of matrix

v = [ a, a, a ]
exp.(a) # scalar, element wise
exp.(A) # scalar, element wise

V = [ A, A, A ] # vector of matices
exp.(V) # Matrix exp on each element of V

``````

You can do this with essentially any function in Julia, even if it is function defined in a library. In this case what is needed is one `exp` method defined for scalars and other for matrices. (Because of generics, for many other functions only one method is needed at all)

4 Likes

Thanks for pointing out the last example of matrix exponential acting on each element of a vector of matrices. Fortran’s elemental functions can only be defined on scalars, so to get the same behavior, you need to define a new `matrix` type, and then define an elemental matrix-exponential function that operates on this type. Then you can extend the generic interface for `exp` to include matrix exponentials.

Julia seems to work in a similar way, just with different syntax. Under the hood, Julia is using type information to dispatch to different methods for scalar and matrix exponential. It’s the same principle in Fortran, with the difference being that the `Matrix` type and associated methods come “for free” in Julia, rather than being defined by the user (or a 3rd party library). That’s a substantive difference, of course, especially considering that this is just one example of Julia’s nice support for “abstract linear algebra” at the language level.

2 Likes

I understand that there is a wide range of methods for computing the matrix exponential (see What Is the Matrix Exponential? – Nick Higham). Until we know more, it seems that a single “exp” for matrices might create more trouble than it solves.

2 Likes

You would need something along the lines of this to achieve that. Perhaps there could be a nice library.

``````module my_library
use, intrinsic :: iso_fortran_env
contains

function my_exp_scalar(x) result(y)
real(real64), intent(in) :: x
real(real64) :: y
print *, 'function my_exp scalar'
y = exp(x)
end function

function my_exp_mat(x) result(y)
real(real64), intent(in) :: x(:, :)
real(real64) :: y(size(x, 1), size(x, 2))
real(real64) :: x_k(size(x, 1), size(x, 2))
integer :: i, j, k, fact
print *, 'function my_exp mat niave'
y = 0.
do concurrent(i=1:size(x, 1), j=1:size(x, 2))
if(i == j) y(i, j) = 1.
end do
fact = 1
x_k = x
do k = 1, 8
fact = fact*k
y = y + x_k/fact
x_k = matmul(x_k, x)
end do
end function
end module

module my_elemental
use, intrinsic :: iso_fortran_env
interface elementwise
module procedure elementwise_scalar
module procedure elementwise_mat
endinterface
type container
real(real64), allocatable :: mat(:, :)
endtype
contains

function elementwise_scalar(func, x) result(y)
interface
function func(x) result(y)
import
real(real64), intent(in) :: x
real(real64) :: y
end function
end interface
real(real64), intent(in) :: x(:)
real(real64) :: y(size(x))
integer :: i
do i = 1, size(x)
y(i) = func(x(i))
end do
end function

function elementwise_mat(func, x) result(y)
interface
function func(x) result(y)
import
real(real64), intent(in) :: x(:, :)
real(real64) :: y(size(x, 1), size(x, 2))
end function
end interface
type(container), intent(in) :: x(:)
type(container) :: y(size(x))
integer :: i
do i = 1, size(x)
y(i)%mat = func(x(i)%mat)
end do
end function
end module

module helpers
use, intrinsic :: iso_fortran_env
contains

function ones(dim1, dim2) result(mat)
integer, intent(in) :: dim1, dim2
real(real64), allocatable :: mat(:, :)
allocate(mat(dim1, dim2))
mat = 1.
end function

subroutine print_mat(mat)
real(real64), intent(in) :: mat(:, :)
integer(int32) :: i, j
write(*, '(i0,"x",i0," Matrix{Float64}")') size(mat, 1), size(mat, 2)
do i = 1, size(mat, 1)
write(*, '(22f15.7)', advance='no')(mat(i, j), j=1, size(mat, 2))
write(*, *)
end do
end subroutine
end module

program test
use, intrinsic :: iso_fortran_env
use my_library
use my_elemental
use helpers

real(real64) :: a
real(real64) :: A_mat(2, 2)
real(real64), allocatable :: v(:)
type(container), allocatable :: V_mat(:), results(:)
integer :: i

a = 1.0
A_mat = ones(2, 2)

print *, my_exp_scalar(a)
call print_mat(my_exp_mat(A_mat))

v = [a, a, a, a]
print *, elementwise(my_exp_scalar, v)

V_mat = [(container(A_mat), i=1, 4)]
results = elementwise(my_exp_mat, V_mat)
do i = 1, 4
call print_mat(results(i)%mat)
end do
end program
``````
1 Like

Would it be possible, in Fortran, to write a `map` function, such that

``````map(f,x)
``````

Computes `f` for every element of `x`, being `f` and `x` of any type? Maybe working only with pointers internally? (I guess there is no way to it to know the size of the element of `x`)

The other problem is that you need dependent types to statically type a map function.

There definitely are limits to the generic programming capabilities, but I think one of the issues that is different about Fortran is that you don’t need to use map at all when you use elemental. This is summed up in this post: Map, Filter, Reduce in Fortran 2018 | Milan Curcic. Milan says, "If you write your functions as `elemental` , what you’d write in some other language as `map(f, x)` in Fortran 2018 is just `f(x)`". See also elemental in Fortran Wiki.

I rewrote the code from above and removed my elementwise function and used elemental functions in the library. So now it doesn’t seem totally ridiculous at least.

``````module containers
use, intrinsic :: iso_fortran_env
type mat
real(real64), allocatable :: mat(:, :)
end type
end module

module my_library
use, intrinsic :: iso_fortran_env
use containers
interface my_exp
module procedure my_exp_scalar
module procedure my_exp_mat
end interface
contains

elemental function my_exp_scalar(x) result(y)
real(real64), intent(in) :: x
real(real64) :: y
y = exp(x)
end function

elemental function my_exp_mat(x) result(y)
type(mat), intent(in) :: x
type(mat) :: y
real(real64) :: x_k(size(x%mat, 1), size(x%mat, 2))
integer :: i, j, k, fact
allocate (y%mat, mold=x%mat)
y%mat = 0.
do concurrent(i=1:size(x%mat, 1), j=1:size(x%mat, 2))
if (i == j) y%mat(i, j) = 1.
end do
fact = 1
x_k = x%mat
do k = 1, 8
fact = fact*k
y%mat = y%mat + x_k/fact
x_k = matmul(x_k, x%mat)
end do
end function
end module

module helpers
use, intrinsic :: iso_fortran_env
contains

function ones(dim1, dim2) result(mat)
integer, intent(in) :: dim1, dim2
real(real64), allocatable :: mat(:, :)
allocate (mat(dim1, dim2))
mat = 1.
end function

subroutine print_mat(mat)
real(real64), intent(in) :: mat(:, :)
integer(int32) :: i, j
write (*, '(i0,"x",i0," Matrix{Float64}")') size(mat, 1), size(mat, 2)
do i = 1, size(mat, 1)
write (*, '(22f15.7)', advance='no') (mat(i, j), j=1, size(mat, 2))
write (*, *)
end do
end subroutine
end module

program test
use, intrinsic :: iso_fortran_env
use my_library
use containers
use helpers

real(real64) :: a
real(real64) :: A_mat(2, 2)
type(mat) :: result
real(real64) :: v(4)
type(mat) :: V_mat(4), results(4)
integer :: i

a = 1.0
A_mat = ones(2, 2)

print *, my_exp(a)
result = my_exp(mat(A_mat))
call print_mat(result%mat)

v = [a, a, a, a]
print *, my_exp(v)

V_mat = [(mat(A_mat), i=1, 4)]
results = my_exp(V_mat)
do i = 1, 4
call print_mat(results(i)%mat)
end do
end program
``````
4 Likes

@implicitall , yes looking at your penultimate post, it was unclear why you had felt the need to include the element-wise functions, particularly considering the post by @nshaffer that included the library “recipe” to be followed.

The motivation was “You can do this with essentially any function in Julia, even if it is function defined in a library”. So the constraint for that piece of code is a non-elemental function defined in a library. The final version is for a library for which you have access to the code or the functions are elemental.

Yes, I agree.

Fortran language with its ISO IEC standard, multiple vendors with different hardware platforms, diverse user community with such a long legacy of code bases is on a different evolutionary path than Julia.

With Fortran, a key question is whether the language is sufficiently flexible to develop the kind of solutions that help its practitioners achieve perfomant computing goals. In the case of `linear algebra` too, it is a case of so near, yet so far. Meaning one can achieve a lot right now but the limitations with language intrinsics as well as `generics` imply considerable code duplication and verbosity in library code and perhaps the use of features such as `INCLUDE` files which are not appealing to quite a few.

Hopefully Fortran 202Y and the Fortran `stdlib` will make it a much richer experience for Fortran practitioners.

Toward the `stdlib` effort, the modules being considered for `linear algebra` will hopefully include good options for matrix computations that investigate options in the language such as parameterized derived types, defined IO in conjunction with the kind of useful `ELEMENTAL` methods such as `exp`, `log`, etc. that Julia has taken care to offer its users and that it will also provide other performance options as applicable - vectorization, concurrent processing, parallelization, etc.:

``````module mat_m
! Say this module is part of some Linear Algebra library
type :: mat_t(n,m)
integer, len :: n, m
private
real, public :: mat(n,m)
real :: work(n,m)
contains
private
procedure, pass(dtv) :: write_mat_t
generic, public :: write(formatted) => write_mat_t
end type
generic :: exp => exp_mat_t   !<-- overload the intrinsic
contains
elemental function ones(n,m) result(r)
integer, intent(in) :: n,m
type(mat_t(n=n,m=m)) :: r
r%mat = 1.0
end function
elemental function exp_mat_t(x) result(y)
type(mat_t(n=*,m=*)), intent(in) :: x
type(mat_t(n=x%n,m=x%m)) :: y
integer :: i, j, k, fact
integer, parameter :: maxiter = 8
y%mat = 0.
do concurrent ( i=1:size(x%mat, 1), j=1:size(x%mat, 2))
if (i == j) y%mat(i, j) = 1.
end do
fact = 1
y%work = x%mat
do k = 1, maxiter
fact = fact*k
y%mat = y%mat + y%work/fact
y%work = matmul(y%work, x%mat)
end do
end function
subroutine write_mat_t (dtv, lun, iotype, vlist, istat, imsg)
class(mat_t(n=*,m=*)), intent(in) :: dtv
integer, intent(in)               :: lun
character (len=*), intent(in)     :: iotype
integer, intent(in)               :: vlist(:)
integer, intent(out)              :: istat
character (len=*), intent(inout)  :: imsg
! local variable
integer :: i, j

write (lun, '(i0,"x",i0," Matrix{Float64}")', iostat=istat, iomsg=imsg) dtv%n, dtv%m
if ( istat /= 0 ) return
write(lun,*) new_line("")
do i = 1, dtv%n
write(lun, '(22f15.7)', advance='no', iostat=istat, iomsg=imsg) (dtv%mat(i, j), j=1,dtv%m)
if ( istat /= 0 ) return
write(lun, *) new_line("")
end do
return
end subroutine
end module

program test

use mat_m

real :: a_scalar
real :: vector(4)
type(mat_t(n=2,m=2)) :: A
type(mat_t(n=2,m=2)) :: V(4)

a_scalar = 1.0
vector = [ a_scalar, a_scalar, a_scalar, a_scalar ]
print *, exp( vector )

A = ones( A%n, A%m )
print *, exp( A )

V = [ A, A, A, A ]
print *, exp( V )

end program
``````

Note the main program above in Fortran looks rather similar to the Julia snippet once some effort toward a higher level of abstraction is expended in a library.

But now I set up the above code to work with default `REAL` for the underlying representation. To extend it now to other supported floating-point kinds by a processor is explicit code duplication, once for each kind and which can involve added effort to make it portable across platforms. This is among the aspects we can hope the Generics facility in Fortran 202Y will offer a good solution.

C:\temp>ifort /standard-semantics p.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.2.0 Build 20210228_000000

Microsoft (R) Incremental Linker Version 14.27.29112.0

-out:p.exe
-subsystem:console
p.obj

C:\temp>p.exe
2.718282 2.718282 2.718282 2.718282
2x2 Matrix{Float64}
4.1936512 3.1936510
3.1936510 4.1936512

2x2 Matrix{Float64}
4.1936512 3.1936510
3.1936510 4.1936512

2x2 Matrix{Float64}
4.1936512 3.1936510
3.1936510 4.1936512

2x2 Matrix{Float64}
4.1936512 3.1936510
3.1936510 4.1936512

2x2 Matrix{Float64}
4.1936512 3.1936510
3.1936510 4.1936512

C:\temp>

4 Likes