Performance drop when using Intel oneAPI with standard semantics option

I encountered something similar, and it turned out to be related to how gradual underflow was handled. With certain input data, my code generated a large number of denormalized (subnormal) numbers, and with gradual underflow enabled, processing them took a lot of time.
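If you want to check whether your own data drifts into that range, a quick test along these lines might help. This is only a rough sketch in Python, not what I actually ran: it assumes IEEE 754 double precision, and the helper names `is_subnormal` and `count_subnormals` are made up for illustration. The idea is simply that a value is denormalized when it is nonzero but smaller in magnitude than the smallest normal double.

```python
import sys

# Smallest positive *normal* double (about 2.2e-308 for IEEE 754 binary64).
SMALLEST_NORMAL = sys.float_info.min


def is_subnormal(x: float) -> bool:
    """Return True if x is nonzero but smaller in magnitude than the smallest normal double."""
    return x != 0.0 and abs(x) < SMALLEST_NORMAL


def count_subnormals(values) -> int:
    """Count the subnormal entries in an iterable of floats (hypothetical diagnostic helper)."""
    return sum(1 for v in values if is_subnormal(v))


if __name__ == "__main__":
    # Halving the smallest normal value underflows gradually to a subnormal, not to zero.
    tiny = SMALLEST_NORMAL / 2.0
    sample = [1.0, 0.0, tiny, -tiny, 1e-300]
    print(count_subnormals(sample))
```

If a dump of your intermediate arrays shows a high count, the slowdown is likely the same thing I ran into, and it would be worth comparing timings with gradual underflow turned off.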