# The birthday problem

## Introduction to the problem to be solved

There is a planet whose years have 100 days. In this planet there are 5 people, so,
calculates in extent all possible cases. What is the probability that two people
were born on the same day?

## The data structure

This is the vector `v`. Its size is equal to the number of people in the planet.
Each item in the vector represents the day each person was born. So

``````    -------------------------------
v:  |  1  |  4  |  1  |  7  |  6  |
-------------------------------
``````

Person 1 was born the first day of the year, person 2 was born de fourth day, the third
person the first day of the year (same day as the first person) â€¦

The numbers in the vector are in the range `1..n_days`

## The process

It consists in testing all combinations in the vector and test if there are two or more
people who were born the same day.

## Results

Execution time using `\$ time ` from command line. Fortran code was compiled using gfortran.

• GNU Fortran recursive 39.48s
• GNU Fortran iterative 54.69s
• Intel Fortran recursive 2m 14s
• Intel Fortran iterative 2m 54s
• Julia recursive 4m 55.69s
• Julia iterative 7m 25.34s

## Source code

I assume that you are aware that this is a classic problem in probabilities that doesnâ€™t require more than five calculations to solve. https://en.wikipedia.org/wiki/Birthday_problem

1 Like

Yes, it is a well known problem, but I just wanted to test the code working with many loops and checks. The results are just to compare how fast are languages, not to solve the problem.

1 Like

@Samuel_GF thanks for this submission!

The classic problem only gives you probabilities, while it is my understanding that the program above gives the actual numbers of coincidences for a given set of birthdays. So I think this is a great candidate for our benchmark repository, I created an issue for this: https://github.com/fortran-lang/benchmarks/issues/8.

@kargl is right that you should specify the exact compiler versions, flags used and the hardware details. In our benchmarks repository we will eventually do this automatically.

1 Like

I havenâ€™t read the code carefully, but tried running the codes on my very old Mac
(2012) <-- because this is too old, I cannot install BigSur (the next macOS)â€¦ XD

1. Because the Julia version seems to be using an array of `Float64` for `v`, while Fortran uses an array of `integer(8)` for `v`, I first changed the former to an array of `Int64`, and also `total / 100` to `div(total, 100)` (because `/` is a floating-point division similar to Python3â€¦). Then the loop indices `i` and `j` become `Int64` (rather than `Float64`), and so the code becomes closer to Fortran. This change gives 10 % speed up on my mac, but the gap in timing still remainsâ€¦ (<-- `gfortran-10 -O2` is about 130 sec, `gfortran-10 -O3` 43 sec, while `julia-1.4 -O3` is about 300 sec.)

(Btw, I cannot understand why the difference of O2 and O3 for gfortran is so largeâ€¦)

1. The Fortran version is using a fixed-size array for `v` on the stack (with length = 5 specified as a parameter), while Julia is using a dynamic array allocated on the heap (with length = 5). To fill this gap, using StaticArrays etc might be useful, but not very sure atmâ€¦
1 Like

Thank you for your reply. I am going to I improve the benchmark with all your suggestions and ideas today.

1 Like

# Next iteration

## New results

``````- GNU Fortran 0m 42s
- Julia       1m 32s     !! Faster than before
``````

# Changes in Julia code

After receive help and tips from Julia discourse forum these changes have been implemented:

• Renamed function `add1_recursive()` to `next_iteration()`
• Vector `v` now is a static array
• No bounds checked for loops of `same_day()` function
• No bounds checked for loops of `next_iteration()` function
• Added a `goto` statement to take a shortcut
• Added `return nothing` at the end of `next_iteration()`
• Removed old progress system and added new @showprogress
• Removed parameter `n_people` in function `same_day()` to avoid mistakes

Here you have the new Julia code.

# Changes in Fortran code

• The subroutine `add1_recursive()` has been renamed as `next_iteration()`

This is the Fortran code.

• Julia 1.4.1
• GNU Fortran 9.3.0
• CPU Model: 6.78.3 â€śIntelÂ® Coreâ„˘ i3-6006U CPU @ 2.00GHzâ€ť
• Memory Size: 3 GB + 768 MB
• Ubuntu 20.04.1 LTS (Focal Fossa)
• Linux kernel 5.4.0-42-generic
4 Likes

Awesome, this is a great example that shows why I prefer Fortran: just straightforward code using arrays, bounds checked in Debug mode and unchecked/fast in Release mode. Compared to Julia you have to worry about stuff like `@inbounds` (from the documentation my understanding is that it is not actually checked anymore in Debug build, since there doesnâ€™t seem to be a way to trigger a Debug build that checks this).

4 Likes

A battle is taking place in the Julia discourse https://discourse.julialang.org/t/the-birthday-paradox-help-to-be-faster-than-fortran/44809/10

There are bright minds working in awesome solutions to get better results with Julia, but Fortran is still faster (20 seconds Julia vs 16 seconds Fortran).

3 Likes

@Samuel_GF, I presume you are @profesor_s at the Julialang discourse site? And that you may be conducting something like a â€śround-robinâ€ť effort and working up the â€śfaithfulâ€ť on both sides!!?

First, since your use of `\$ time` command option on Linux is not a portable approach to measure performance (e.g., there is no reasonable equivalent on Windows even if Windows Powershell has some similar functionality), it is unclear what, if any, general conclusions can be drawn from the effort.

More importantly though, itâ€™s unclear what is being attempted as measurement? Juila JIT compiler with LLVM dependency vs GCC backend? and how these two setup their IO instructions? It doesnâ€™t quite look like Julia compared to Fortran, particularly with the latter having multiple compiler implementations that are faster in certain instructions than gfortran.

Increasingly though, note language performance comparisons, when two language offer similar semantics and syntax toward something (say array containers for floating-point objects), may be meaningless. But now, if a language has consciously decided not to include certain safeguards with facilities (e.g., use of pointer architecture to mimic multidimensional arrays in C++), it may make sense to compare what the â€śworkaroundâ€ť costs relative to a native implementation like arrays in Fortran. Or, alternately since Fortran has postponed any native type system like bit strings to manage arbitrary sequences of bits, it may make sense to how the cost of workaround options in Fortran versus native support of bit arrays in some other language in, say, an image rasterization application example.

What often matters are the aspects that cannot be measured readily: how easy or productive can the effort be in terms of the most precious commodity i.e., developer time, how full-featured yet advancing is the ecosystem including libraries and tooling to support the developers, how maintainable and extensible can the effort be in terms of future generations, future application needs and future computing architectures. Fortran does well in some areas, not so in others - that is what matters.

4 Likes