Performance: Writing to stdout vs output_unit

The problem

Hi, as I am trying to improve the performance of the test Reverse Complement from the Benchmark Game, I can use two different methods to write to standard output.

As the first (the fastest) method spends less time I feel tented to use it, but, it is not so portable as the second one.

Do you know how to get the best of both methods. Thank you

Note: I am using gfortran 9.3.0 and Linux

My results

TIME fast procedure:    6.27586031    
TIME slow procedure:    15.6404552

The code

program write_test
    use, intrinsic :: iso_fortran_env
    implicit none

    integer:: file_id, i 
    real:: time_fast(0:1), time_slow(0:1)
    integer, parameter:: iterations = 5e7
    character(len=7), parameter:: s = "Fortran"

    !-----------------------------------------------
    ! Faster procedure
    !-----------------------------------------------
    call cpu_time(time_fast(0))
    open(newunit=file_id, file="/dev/stdout", form="unformatted", access="stream")
    do i = 1, iterations
        write (file_id) s
    end do
    close(file_id)
    call cpu_time(time_fast(1))

    ! ----------------------------------------------
    ! Slower procedure
    ! ----------------------------------------------
    call cpu_time(time_slow(0))
    open(unit=output_unit)
    do i = 1, iterations
        write (output_unit, "(a)") s
    end do
    call cpu_time(time_slow(1))

    ! ----------------------------------------------
    ! Show results
    ! ----------------------------------------------
    print *, "TIME fast procedure: ",time_fast(1)-time_fast(0)
    print *, "TIME slow procedure: ",time_slow(1)-time_slow(0)
end program write_test

I think that you’re comparing the performance of formatted output to the performance of unformatted output. Even in the easy case of CHARACTER data and an (A) format, there’s more overhead involved in the I/O runtime support library for formatted output.

2 Likes

I looked through the description for the Reverse Complement code (having written such code in Fortran in the past), and the requirements basically mandate poorly written code. I would ignore their suggestions and just write a code that accomplishes the task and makes sense in Fortran.

The unit OUTPUT_UNIT should be connected to the same file that C calls stdout.

My goal is to improve the Fortran results in that test, but I am not able to use output_unit as unformatted using gfortran, so I am looking for other solutions.

After working on some benchmarks I realized they are not useful tools, as the constraints are too artificial.

Note: usually I get better results with gfortran than with ifort, I am not sure why

In pure Fortran, you have no choice but to use write/print with the large associated overheads.

If you’re willing to use c interop for your benchmark and since you are only outputting a string in this example (which requires no parsing/conversion), you can call the c function puts:


program write_test
    use iso_c_binding, only: c_char, c_null_char
    implicit none

    integer:: i, r 
    real:: time_fast(0:1), time_slow(0:1)
    integer, parameter:: iterations = 5e7
    character(len=7), parameter:: s = "Fortran"
    
    interface
      function puts(str) result(r) bind(C,name="puts")
        import
        character(kind=c_char), dimension(*) :: str
        integer :: r
      end function puts
    end interface

    !-----------------------------------------------
    ! Faster procedure
    !-----------------------------------------------
    call cpu_time(time_fast(0))
    do i = 1, iterations
        r = puts(s//c_null_char)
    end do
    call cpu_time(time_fast(1))

    ! ----------------------------------------------
    ! Slower procedure
    ! ----------------------------------------------
    call cpu_time(time_slow(0))
    open(unit=output_unit)
    do i = 1, iterations
        write (output_unit, "(a)") s
    end do
    call cpu_time(time_slow(1))

    ! ----------------------------------------------
    ! Show results
    ! ----------------------------------------------
    print *, "TIME fast procedure: ",time_fast(1)-time_fast(0)
    print *, "TIME slow procedure: ",time_slow(1)-time_slow(0)
end program write_test

On my machine this has a similar performance to your original using unformatted output.
This is faster because, as @pmk pointed out, there’s a lot of overhead in the write routine, even compared to c printf.

I use c interop so ubiquitously in Fortran that I don’t consider it invalid for a benchmark, but this opinion is likely not shared by all.

1 Like

Indeed. OUTPUT_UNIT is the same unit as * in Fortran, and is explicitly for sequential formatted output. So gfortran is doing the right thing. OTOH, processing a FASTA file like this usually results in another FASTA file written to a normal disk file, not to stdout. I guess you could try writing the output unformatted to disk, including new_line characters at the end of each “line” and then ‘cat’ the result sending it to the screen.

I am afraid that one of the constraints of this benchmarksgame states that you have to read from stdin and write to stdout in this way:

$ ./test <input.fasta >output.fasta

There is not an inherent reason why formatted WRITE has to be slower than printf.