Thanks Ivan,
In the end, I kept it simple and asked an LLM to produce the unswitched loop versions (funny that @Beliavsky predicted this). It goes a bit against my wish to avoid repeating code, but since this is very stable code, I’d prefer that over a solution with metaprogramming at this point.
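
For anyone reading along, here is a minimal sketch of what "unswitched" means here. It is not the actual code: the flag `use_scale`, the arrays, and the loop bodies are hypothetical stand-ins. The loop-invariant `if` is hoisted out of the loop, at the cost of duplicating the loop body once per branch.

```fortran
program unswitch_demo
  implicit none
  integer, parameter :: n = 8
  real :: a(n), b(n), c_switched(n), c_unswitched(n)
  logical :: use_scale   ! hypothetical loop-invariant flag
  integer :: i

  a = [(real(i), i = 1, n)]
  b = 2.0
  use_scale = .true.

  ! Original form: the invariant condition is tested on every iteration.
  do i = 1, n
     if (use_scale) then
        c_switched(i) = a(i) * b(i)
     else
        c_switched(i) = a(i) + b(i)
     end if
  end do

  ! Unswitched form: the test is hoisted and the loop is duplicated per branch.
  if (use_scale) then
     do i = 1, n
        c_unswitched(i) = a(i) * b(i)
     end do
  else
     do i = 1, n
        c_unswitched(i) = a(i) + b(i)
     end do
  end if

  ! Both forms must give identical results.
  if (any(c_switched /= c_unswitched)) error stop "versions disagree"
  print *, "both versions agree"
end program unswitch_demo
```

The duplication is the price: every branch of the invariant condition gets its own copy of the loop, which is why this only seems acceptable to me for code that rarely changes.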
I also found some strange behavior with `ifx -fast` for this code; see "Re: Performance of `ifx -fast` when invariant if conditions in loops." on the Intel Community forum.