Does LAPACK/BLAS automatically use multiple cores or threads?

Thanks all!

@RonShepard @FedericoPerini
I ran the kxk code using Intel Fortran with the flags -O3 -xHost on my laptop (Xeon 2186M, Windows 10). I believe I was already using MKL. I am currently running other programs, so the results are not accurate.

Even so, the speed I got seems to be far slower than the ones you listed:

 c11=  -2.31481358685203      cpu_time=  0.328125000000000       GFLOPS=  6.09523809523809
 c11=  -2.31481358685203      cpu_time=  0.390625000000000       GFLOPS=  5.12000000000000
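
As a sanity check on the arithmetic, the two GFLOPS values above are consistent with counting 2*k**3 floating-point operations for a k x k matrix product with k = 1000. Both the flop count and the value of k are my assumptions here (neither is stated in this post), and the helper name `gflops` is only for illustration. A minimal sketch:

```python
# Sanity check of the GFLOPS figures above.
# Assumption: the benchmark counts 2*k**3 floating-point operations for a
# k x k matrix product, with k = 1000 (neither value is stated in this post).

def gflops(k: int, seconds: float) -> float:
    """Return GFLOPS for a k x k matrix multiply that took `seconds`."""
    flops = 2.0 * k**3          # one multiply and one add per inner-loop step
    return flops / seconds / 1.0e9

if __name__ == "__main__":
    k = 1000  # assumed matrix dimension
    for t in (0.328125, 0.390625):
        print(f"cpu_time={t:.6f}  GFLOPS={gflops(k, t):.4f}")
```

Note that the time comes from cpu_time. With a threaded library, cpu_time typically reports CPU time summed over all threads, so a wall-clock timer such as system_clock may be the better basis for a GFLOPS figure.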

I am a little puzzled.
Has anyone gotten similar results on a machine other than an M1 Mac?
Are there ways to improve the result on such a machine?