Apologies for saying that JacobiConstant
was part of the code being timed (I misread the code slightly).
The main problem in the code being benchmarked is the callbacks that you added. Since they are type unstable, and refer to a non-constant global variable, they kill performance.
To diagnose this, I’ve simplified your benchmark a bunch (just to make things easier to read).
using OrdinaryDiffEq, StaticArrays;
function cr3bpeom(dX, X, p, t)
mu = 0.012277471
@inbounds begin
r1 = sqrt((X[1] + mu)^2 + X[2]^2)
r2 = sqrt((X[1] - 1 + mu)^2 + X[2]^2)
@views dX[1:3] = X[4:6]
dX[4] = X[1] + 2*X[5] - (1 - mu)*(X[1] + mu)/r1^3 - mu*(X[1] - 1 + mu)/r2^3
dX[5] = X[2] - 2*X[4] - (1 - mu)*X[2]/r1^3 - mu*X[2]/r2^3
dX[6] = 0.0
end
end
condition1 = function (u, t, integrator)
if t > 0.5 && t < 0.5005
1
else
-1
end
end
function affect1(integrator)
end
function affect1_neg(integrator)
end
cb1 = ContinuousCallback(condition1, affect1, affect1_neg;
rootfind=true,save_positions=(true,true),abstol=0.001)
Mask2 = false
function condition2(u, t, integrator)
if Mask2 == false
if u[1] < 0
-1
else
u[2]
end
else
-1
end
end
function affect2(integrator)
Mask2 = true
end
cb2 = ContinuousCallback(condition2, affect2; rootfind=true,save_positions=(true,true),abstol=1e-9)
Mask3 = false
function condition3(u, t, integrator)
if Mask3 == false
if u[1] > 0
-1
else
u[2]
end
else
-1
end
end
function affect3(integrator)
integrator.u .= X0
Mask3 = true
end
cb3 = ContinuousCallback(condition3, affect3; rootfind=true,save_positions=(true,true),abstol=1e-9)
# callback set
cb = CallbackSet(cb1,cb2,cb3)
const X0 = [0.994, 0.0, 0.0, 0.0, -2.00158510637908252240537862224, 0.0]
#compile everything
solve(ODEProblem(cr3bpeom, copy(X0), (0.0,.1)),Tsit5(),abstol=1e-14, reltol=1e-11,timeseries_errors=false, dense_errors=false, dense=false)
@time for i in 1:100
# randomsize initial conditions
X0[1] = X0[1] + .000000000001*X0[1].*rand()
prob = ODEProblem(cr3bpeom, copy(X0), (0.0,42.66304140039491));
solve(prob,Tsit5(),abstol=1e-14, reltol=1e-11, timeseries_errors=false, dense_errors=false, dense=false,callback=cb)
end
On my computer, this takes 15 seconds. If we remove callback=cb
, that drops to .2 seconds (or faster than FLINT).
We can then get another 10% performance improvement and a 20x drop in allocations by using the StaticVector
approach that was mentioned when you originally asked for advice for on this (the other reason to do this is that it provides a much bigger speedup once you start adding back more optimized versions of the callbacks).
To optimize the callbacks, we need to make sure the conditions always return the same datatype, and don’t rely on non-const globals. For example,
condition1(u, t, integrator) = (t-0.5) * (t-0.5005)
cb1 = ContinuousCallback(condition1, nothing; rootfind=true,save_positions=(true,false), abstol=0.001)
const Mask2 = falsefunction condition2(u, t, integrator)
if Mask2 == false
if u[1] < 0
-1.
else
u[2]
end
else
-1.
end
end
function affect2(integrator)
Mask2 = true
end
cb2 = ContinuousCallback(condition2, affect2; rootfind=true,save_positions=(true,false),abstol=1e-9)
const Mask3 = false
function condition3(u, t, integrator)
if Mask3 == false
if u[1] > 0
-1.
else
u[2]
end
else
-1.
end
end
function affect3(integrator)
integrator.u = X0
Mask3 = true
end
cb3 = ContinuousCallback(condition3, affect3; rootfind=true,save_positions=(true,true),abstol=1e-9)
cb = CallbackSet(cb1,cb2,cb3)
Using const
for Mask2
and Mask3
isn’t great practice, but const
variables allow you to change their value (but not type) for now, and it’s a good, easy way to show how much of a difference it makes. With these changes, the code runs in 3.5 seconds (or an over 4x speedup over the original code).