Things that basically have to be handwritten in assembly (think OpenBLAS).
I would like resources on any of them, but especially 2 and 3. I tried search engines, but all the results are about software for mathematical optimization (operations research).
I was able to find 3 books, but they are all from around 2000. I guess the basics remain the same, but CPUs have changed a bit since then: multi-core is much more common, and there are now performance and efficiency cores, vectorization, etc.
Performance Optimization of Numerically Intensive Codes;
Software Optimization for High Performance Computing;
Software Optimization Cookbook: High-Performance Recipes for the Intel Architecture;
JuliaCon (starting tomorrow) will have a bunch of talks on language-level features, from high-performance garbage collection, to how to write a libm in a high-level language, to GPU optimization. (The schedule is here: https://pretalx.com/juliacon-2022/schedule/ and sign-up is free.)
Let me know if you have more specific questions.