Julia: Fast as Fortran, Beautiful as Python

The use of . for broadcasting is far from unique to Julia (it’s also used by numpy and matlab), and IMO it does by far the best job of it. Having an easy way of vectorizing code is way better than having to define scalar and vector (and matrix and tensor) versions of everything.

1 Like

Fortran has intrinsic and user-defined ELEMENTAL functions for that.

3 Likes

From what I’ve seen there’s a key difference which is that you have to decide whether a function is elemental or not when you write it. This means that for example, you can’t broadcast matrix multiplication without re-writing BLAS. (also, you need different names for the elemental and non-elemental versions)

1 Like

So, does Julia vectorize a compiled external library at the request of the user, or does it recompile the library every time to achieve the goal?

It’s somewhere in-between

using SpecialFunctions # for error function, erf

v = rand(10)
println(erf.(log.(v)))

will be lowered to Base.materialize(Base.broadcasted(erf, Base.broadcasted(log, v))) which basically turns into

y = similar(v)
for i in 1:10
    y[i] = erf(log(v[i]))
end

If log and erf inline (in the particular they don’t since they are too expensive), Julia might make further optimizations like unrolling the loop, vectorizing the functions, or removing bounds checks.

3 Likes

Please see this example where the key computation, the sine of a radian, was moved to an external library, cordic_sine: Simple summation 8x slower than in Julia - #44 by FortranFan.

Once that was done, note in all the subsequent scenarios, the ratio of compute times with Fortran vs Julia were around unity, meaning no real difference i.e., the specific advantages with an enthusiast-driven macro implementation toward vectorization and specific “fast” trig functions leading to the original 8x difference became inapplicable.

2 Likes

Sometimes the same function or operator has a different meaning when acting on matrices than elementwise.

julia> A = rand(2,2)
2×2 Matrix{Float64}:
 0.198773   0.747052
 0.0135511  0.777073

julia> exp.(A)
2×2 Matrix{Float64}:
 1.21991  2.11077
 1.01364  2.1751

julia> exp(A)
2×2 Matrix{Float64}:
 1.22747    1.23599
 0.0224203  2.18427

julia> A * A
2×2 Matrix{Float64}:
 0.0496342  0.729008
 0.0132238  0.613966

julia> A .* A
2×2 Matrix{Float64}:
 0.0395108    0.558087
 0.000183634  0.603843

julia> A ^ 3
2×2 Matrix{Float64}:
 0.0197449  0.603572
 0.0109485  0.486976

julia> A .^ 3
2×2 Matrix{Float64}:
 0.00785369  0.41692
 2.48845e-6  0.46923
1 Like

I’m not sure what you mean. exp.(A) and exp(A) do something different (elementwise vs matrix exponential). The dot is used to syntactically indicate whether the function is applied to the entire argument, or applied elementwise.

Just like having type information listed makes it easier to read code, this is also informative, i.e. we know that foo.(A) is elementwise and foo(A) probably isn’t (unless someone defined foo(A) to explicitly loop over A).

1 Like

I’m lost here by the combative tone your comments are striking.

In Julia, we can define A::AbstractMatrix{Float64} * B::AbstractMatrix{Float64} to be dgemm and A::Float64 * B::Float64 to be scalar multiplication, and then automatically A::AbstractMatrix{Float64} .* B::AbstractMatrix{Float64} is elementwise multiplication.

Thus we do need to define two different functions, but the “ELEMENTAL” behavior comes for free by just defining the elemental (scalar) version of the function.
Then syntactically, someone reading our code knows whether our function is applying elementwise, or applying to the entire argument.
Similarly, one knows syntactically if it is fusing with other ., so one can look at code and tell how many array temporaries are being created.

Then, if the same operator has a different meaning when applied to a certain container type – like a matrix – we are still free to add a definition for that container type, without needing to use other operators/function names.

My argument would only be specious if we had to explicitly define exp.() and .*.
The . is syntax. All we have to define is exp(::Float64) to get both exp(::Float64) (no dot) and exp.(::AbstractArray{Float64}) (elementwise on a matrix) or exp.((A::AbstractArray .- B::AbstractArray) .^ 2) (elementwise exponential of squared differences, without allocating any temporaries).

2 Likes

I wanted to see if I could implement what you are saying just to see how far Fortran is away from that. Take this with a grain of salt as a goofy example. You would typically use intrinsics and elemental functions as noted earlier.

program test
use, intrinsic :: iso_fortran_env

interface operator(.elem.)
    procedure elem
end interface
interface exp
    procedure exp_mat
    procedure exp_elem
end interface
interface operator(.star.)
    procedure star_mat
    procedure star_elem
end interface

type elem_ptr
    real(real64), pointer :: mat(:, :)
end type

real(real64) :: A(2, 2)
A = reshape([0.198773, 0.747052, &
             0.0135511, 0.777073], shape=[2, 2], order=[2, 1])
call print_mat(A)
call print_mat(exp(.elem.A))
call print_mat(exp(A))
call print_mat(A.star.A)
call print_mat(A.star..elem.A)

contains

function elem(x) result(y)
    real(real64), target, intent(in) :: x(:, :)
    type(elem_ptr) :: y
    y%mat => x
end function

function exp_mat(x) result(y)
    real(real64), intent(in) :: x(:, :)
    real(real64) :: y(size(x, 1), size(x, 2))
    real(real64) :: x_k(size(x, 1), size(x, 2))
    integer :: i, j, k, fact
    y = 0.
    do concurrent(i=1:size(x, 1), j=1:size(x, 2))
        if(i == j) y(i, j) = 1.
    enddo
    fact = 1
    x_k = x
    do k = 1, 8
        fact = fact*k
        y = y + x_k/fact
        x_k = matmul(x_k, x)
    enddo
end function

function exp_elem(x) result(y)
    type(elem_ptr), intent(in) :: x
    real(real64) :: y(size(x%mat, 1), size(x%mat, 2))
    integer :: i, j
    do concurrent(i=1:size(x%mat, 1), j=1:size(x%mat, 2))
        y(i, j) = exp(x%mat(i, j))
    enddo
end function

function star_mat(a, b) result(c)
    real(real64), intent(in) :: a(:, :), b(:, :)
    real(real64) :: c(size(a, 1), size(b, 2))
    c = matmul(a, b)
end function

function star_elem(a, b) result(c)
    real(real64), intent(in) :: a(:, :)
    type(elem_ptr), intent(in) :: b
    real(real64) :: c(size(a, 1), size(a, 2))
    c = a*b%mat
end function

subroutine print_mat(mat)
    real(real64), intent(in) :: mat(:, :)
    integer(int32) :: i, j, m, n
    write(*, '(i0,"x",i0," Matrix{Float64}")') size(mat, 1), size(mat, 2)
    do i = 1, size(mat, 1)
        write(*, '(22f15.7)', advance='no')(mat(i, j), j=1, size(mat, 2))
        write(*, *)
    enddo
end subroutine

end program
1 Like

Note that, in exp case (and in any other), one could do:

a = 1.0
A = ones(2,2)

exp(a) # scalar
exp(A) # exp of matrix

v = [ a, a, a ]
exp.(a) # scalar, element wise
exp.(A) # scalar, element wise

V = [ A, A, A ] # vector of matices
exp.(V) # Matrix exp on each element of V

You can do this with essentially any function in Julia, even if it is function defined in a library. In this case what is needed is one exp method defined for scalars and other for matrices. (Because of generics, for many other functions only one method is needed at all)

4 Likes

Thanks for pointing out the last example of matrix exponential acting on each element of a vector of matrices. Fortran’s elemental functions can only be defined on scalars, so to get the same behavior, you need to define a new matrix type, and then define an elemental matrix-exponential function that operates on this type. Then you can extend the generic interface for exp to include matrix exponentials.

Julia seems to work in a similar way, just with different syntax. Under the hood, Julia is using type information to dispatch to different methods for scalar and matrix exponential. It’s the same principle in Fortran, with the difference being that the Matrix type and associated methods come “for free” in Julia, rather than being defined by the user (or a 3rd party library). That’s a substantive difference, of course, especially considering that this is just one example of Julia’s nice support for “abstract linear algebra” at the language level.

2 Likes

I understand that there is a wide range of methods for computing the matrix exponential (see What Is the Matrix Exponential? – Nick Higham). Until we know more, it seems that a single “exp” for matrices might create more trouble than it solves.

2 Likes

You would need something along the lines of this to achieve that. Perhaps there could be a nice library.

module my_library
use, intrinsic :: iso_fortran_env
contains

function my_exp_scalar(x) result(y)
    real(real64), intent(in) :: x
    real(real64) :: y
    print *, 'function my_exp scalar'
    y = exp(x)
end function

function my_exp_mat(x) result(y)
    real(real64), intent(in) :: x(:, :)
    real(real64) :: y(size(x, 1), size(x, 2))
    real(real64) :: x_k(size(x, 1), size(x, 2))
    integer :: i, j, k, fact
    print *, 'function my_exp mat niave'
    y = 0.
    do concurrent(i=1:size(x, 1), j=1:size(x, 2))
        if(i == j) y(i, j) = 1.
    end do
    fact = 1
    x_k = x
    do k = 1, 8
        fact = fact*k
        y = y + x_k/fact
        x_k = matmul(x_k, x)
    end do
end function
end module

module my_elemental
use, intrinsic :: iso_fortran_env
interface elementwise
    module procedure elementwise_scalar
    module procedure elementwise_mat
endinterface
type container
    real(real64), allocatable :: mat(:, :)
endtype
contains

function elementwise_scalar(func, x) result(y)
    interface
        function func(x) result(y)
            import
            real(real64), intent(in) :: x
            real(real64) :: y
        end function
    end interface
    real(real64), intent(in) :: x(:)
    real(real64) :: y(size(x))
    integer :: i
    do i = 1, size(x)
        y(i) = func(x(i))
    end do
end function

function elementwise_mat(func, x) result(y)
    interface
        function func(x) result(y)
            import
            real(real64), intent(in) :: x(:, :)
            real(real64) :: y(size(x, 1), size(x, 2))
        end function
    end interface
    type(container), intent(in) :: x(:)
    type(container) :: y(size(x))
    integer :: i
    do i = 1, size(x)
        y(i)%mat = func(x(i)%mat)
    end do
end function
end module

module helpers
use, intrinsic :: iso_fortran_env
contains

function ones(dim1, dim2) result(mat)
    integer, intent(in) :: dim1, dim2
    real(real64), allocatable :: mat(:, :)
    allocate(mat(dim1, dim2))
    mat = 1.
end function

subroutine print_mat(mat)
    real(real64), intent(in) :: mat(:, :)
    integer(int32) :: i, j
    write(*, '(i0,"x",i0," Matrix{Float64}")') size(mat, 1), size(mat, 2)
    do i = 1, size(mat, 1)
        write(*, '(22f15.7)', advance='no')(mat(i, j), j=1, size(mat, 2))
        write(*, *)
    end do
end subroutine
end module

program test
use, intrinsic :: iso_fortran_env
use my_library
use my_elemental
use helpers

real(real64) :: a
real(real64) :: A_mat(2, 2)
real(real64), allocatable :: v(:)
type(container), allocatable :: V_mat(:), results(:)
integer :: i

a = 1.0
A_mat = ones(2, 2)

print *, my_exp_scalar(a)
call print_mat(my_exp_mat(A_mat))

v = [a, a, a, a]
print *, elementwise(my_exp_scalar, v)

V_mat = [(container(A_mat), i=1, 4)]
results = elementwise(my_exp_mat, V_mat)
do i = 1, 4
    call print_mat(results(i)%mat)
end do
end program
1 Like

Would it be possible, in Fortran, to write a map function, such that

map(f,x)

Computes f for every element of x, being f and x of any type? Maybe working only with pointers internally? (I guess there is no way to it to know the size of the element of x)

The other problem is that you need dependent types to statically type a map function.

There definitely are limits to the generic programming capabilities, but I think one of the issues that is different about Fortran is that you don’t need to use map at all when you use elemental. This is summed up in this post: Map, Filter, Reduce in Fortran 2018 | Milan Curcic. Milan says, "If you write your functions as elemental , what you’d write in some other language as map(f, x) in Fortran 2018 is just f(x)". See also elemental in Fortran Wiki.

I rewrote the code from above and removed my elementwise function and used elemental functions in the library. So now it doesn’t seem totally ridiculous at least.

module containers
use, intrinsic :: iso_fortran_env
type mat
    real(real64), allocatable :: mat(:, :)
end type
end module

module my_library
use, intrinsic :: iso_fortran_env
use containers
interface my_exp
    module procedure my_exp_scalar
    module procedure my_exp_mat
end interface
contains

elemental function my_exp_scalar(x) result(y)
    real(real64), intent(in) :: x
    real(real64) :: y
    y = exp(x)
end function

elemental function my_exp_mat(x) result(y)
    type(mat), intent(in) :: x
    type(mat) :: y
    real(real64) :: x_k(size(x%mat, 1), size(x%mat, 2))
    integer :: i, j, k, fact
    allocate (y%mat, mold=x%mat)
    y%mat = 0.
    do concurrent(i=1:size(x%mat, 1), j=1:size(x%mat, 2))
        if (i == j) y%mat(i, j) = 1.
    end do
    fact = 1
    x_k = x%mat
    do k = 1, 8
        fact = fact*k
        y%mat = y%mat + x_k/fact
        x_k = matmul(x_k, x%mat)
    end do
end function
end module

module helpers
use, intrinsic :: iso_fortran_env
contains

function ones(dim1, dim2) result(mat)
    integer, intent(in) :: dim1, dim2
    real(real64), allocatable :: mat(:, :)
    allocate (mat(dim1, dim2))
    mat = 1.
end function

subroutine print_mat(mat)
    real(real64), intent(in) :: mat(:, :)
    integer(int32) :: i, j
    write (*, '(i0,"x",i0," Matrix{Float64}")') size(mat, 1), size(mat, 2)
    do i = 1, size(mat, 1)
        write (*, '(22f15.7)', advance='no') (mat(i, j), j=1, size(mat, 2))
        write (*, *)
    end do
end subroutine
end module

program test
use, intrinsic :: iso_fortran_env
use my_library
use containers
use helpers

real(real64) :: a
real(real64) :: A_mat(2, 2)
type(mat) :: result
real(real64) :: v(4)
type(mat) :: V_mat(4), results(4)
integer :: i

a = 1.0
A_mat = ones(2, 2)

print *, my_exp(a)
result = my_exp(mat(A_mat))
call print_mat(result%mat)

v = [a, a, a, a]
print *, my_exp(v)

V_mat = [(mat(A_mat), i=1, 4)]
results = my_exp(V_mat)
do i = 1, 4
    call print_mat(results(i)%mat)
end do
end program
4 Likes

@implicitall , yes looking at your penultimate post, it was unclear why you had felt the need to include the element-wise functions, particularly considering the post by @nshaffer that included the library “recipe” to be followed.

The motivation was “You can do this with essentially any function in Julia, even if it is function defined in a library”. So the constraint for that piece of code is a non-elemental function defined in a library. The final version is for a library for which you have access to the code or the functions are elemental.

Yes, I agree.

Fortran language with its ISO IEC standard, multiple vendors with different hardware platforms, diverse user community with such a long legacy of code bases is on a different evolutionary path than Julia.

With Fortran, a key question is whether the language is sufficiently flexible to develop the kind of solutions that help its practitioners achieve perfomant computing goals. In the case of linear algebra too, it is a case of so near, yet so far. Meaning one can achieve a lot right now but the limitations with language intrinsics as well as generics imply considerable code duplication and verbosity in library code and perhaps the use of features such as INCLUDE files which are not appealing to quite a few.

Hopefully Fortran 202Y and the Fortran stdlib will make it a much richer experience for Fortran practitioners.

Toward the stdlib effort, the modules being considered for linear algebra will hopefully include good options for matrix computations that investigate options in the language such as parameterized derived types, defined IO in conjunction with the kind of useful ELEMENTAL methods such as exp, log, etc. that Julia has taken care to offer its users and that it will also provide other performance options as applicable - vectorization, concurrent processing, parallelization, etc.:

module mat_m
! Say this module is part of some Linear Algebra library
   type :: mat_t(n,m)
      integer, len :: n, m
      private
      real, public :: mat(n,m)
      real :: work(n,m)
   contains
      private
      procedure, pass(dtv) :: write_mat_t
      generic, public :: write(formatted) => write_mat_t
   end type
   generic :: exp => exp_mat_t   !<-- overload the intrinsic
contains
   elemental function ones(n,m) result(r)
      integer, intent(in) :: n,m
      type(mat_t(n=n,m=m)) :: r
      r%mat = 1.0
   end function
   elemental function exp_mat_t(x) result(y)
       type(mat_t(n=*,m=*)), intent(in) :: x
       type(mat_t(n=x%n,m=x%m)) :: y
       integer :: i, j, k, fact
       integer, parameter :: maxiter = 8
       y%mat = 0.
       do concurrent ( i=1:size(x%mat, 1), j=1:size(x%mat, 2))
           if (i == j) y%mat(i, j) = 1.
       end do
       fact = 1
       y%work = x%mat
       do k = 1, maxiter
           fact = fact*k
           y%mat = y%mat + y%work/fact
           y%work = matmul(y%work, x%mat)
       end do
   end function
   subroutine write_mat_t (dtv, lun, iotype, vlist, istat, imsg)
      class(mat_t(n=*,m=*)), intent(in) :: dtv
      integer, intent(in)               :: lun
      character (len=*), intent(in)     :: iotype
      integer, intent(in)               :: vlist(:)
      integer, intent(out)              :: istat
      character (len=*), intent(inout)  :: imsg
      ! local variable
      integer :: i, j

      write (lun, '(i0,"x",i0," Matrix{Float64}")', iostat=istat, iomsg=imsg) dtv%n, dtv%m
      if ( istat /= 0 ) return
      write(lun,*) new_line("")
      do i = 1, dtv%n
         write(lun, '(22f15.7)', advance='no', iostat=istat, iomsg=imsg) (dtv%mat(i, j), j=1,dtv%m)
         if ( istat /= 0 ) return
         write(lun, *) new_line("")
      end do
      return
   end subroutine
end module

program test

   use mat_m

   real :: a_scalar
   real :: vector(4)
   type(mat_t(n=2,m=2)) :: A
   type(mat_t(n=2,m=2)) :: V(4)

   a_scalar = 1.0
   vector = [ a_scalar, a_scalar, a_scalar, a_scalar ]
   print *, exp( vector )

   A = ones( A%n, A%m )
   print *, exp( A )

   V = [ A, A, A, A ]
   print *, exp( V )

end program

Note the main program above in Fortran looks rather similar to the Julia snippet once some effort toward a higher level of abstraction is expended in a library.

But now I set up the above code to work with default REAL for the underlying representation. To extend it now to other supported floating-point kinds by a processor is explicit code duplication, once for each kind and which can involve added effort to make it portable across platforms. This is among the aspects we can hope the Generics facility in Fortran 202Y will offer a good solution.

C:\temp>ifort /standard-semantics p.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.2.0 Build 20210228_000000
Copyright (C) 1985-2021 Intel Corporation. All rights reserved.

Microsoft (R) Incremental Linker Version 14.27.29112.0
Copyright (C) Microsoft Corporation. All rights reserved.

-out:p.exe
-subsystem:console
p.obj

C:\temp>p.exe
2.718282 2.718282 2.718282 2.718282
2x2 Matrix{Float64}
4.1936512 3.1936510
3.1936510 4.1936512

2x2 Matrix{Float64}
4.1936512 3.1936510
3.1936510 4.1936512

2x2 Matrix{Float64}
4.1936512 3.1936510
3.1936510 4.1936512

2x2 Matrix{Float64}
4.1936512 3.1936510
3.1936510 4.1936512

2x2 Matrix{Float64}
4.1936512 3.1936510
3.1936510 4.1936512

C:\temp>

4 Likes