Have you tried profiling the test code yet? It would be interesting to know where the ‘hot spots’ are. Might be something relatively simple to fix.
I suspect you are making a lot of procedure calls inside hot loops. If so, turning on compiler inlining might help.
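
To illustrate what I mean, here is a rough sketch in C. I don't know what language your test code is in, so treat this purely as an example; `square`, `sum_squares_call` and `sum_squares_inlined` are made-up names, not anything from your code. The first loop makes one procedure call per iteration unless the compiler inlines the helper. The second is roughly what the loop looks like after inlining.

```c
#include <stddef.h>

/* Hypothetical small helper, invoked once per element. */
static inline long square(long x)
{
    return x * x;
}

/* Hot loop: one call to square() per iteration unless the compiler inlines it. */
long sum_squares_call(const long *values, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += square(values[i]);
    }
    return total;
}

/* Same loop with the helper's body written in place,
   which is effectively what inlining produces. */
long sum_squares_inlined(const long *values, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += values[i] * values[i];
    }
    return total;
}
```

With GCC or Clang, building at an optimization level such as `-O2` normally lets the compiler inline small helpers like this on its own. Other compilers have their own switches, so check the documentation for yours. Either way, profile before and after so you know whether the calls were really the hot spot.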