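Can you try timing with `system_clock` or `omp_get_wtime` instead of `cpu_time`? With most compilers `cpu_time` reports processor time summed over all threads, so it can hide whatever speedup a multithreaded run actually gives you.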
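Something like the following is what I have in mind. It is only a rough sketch: the program name, array size, and loop body are made up for illustration and are not taken from your code. It times the same `do concurrent` loop with both `system_clock` (wall-clock) and `cpu_time`, so you can compare the two numbers directly.

```fortran
program wall_vs_cpu
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer, parameter :: n = 5000000      ! hypothetical problem size
  real(real64), allocatable :: a(:)
  integer(int64) :: c0, c1, rate
  real(real64) :: t0, t1
  integer :: i

  allocate(a(n))

  ! system_clock returns wall-clock ticks; rate is ticks per second
  call system_clock(c0, rate)
  ! cpu_time returns processor time, which may be summed over threads
  call cpu_time(t0)

  ! placeholder workload standing in for the real loop
  do concurrent (i = 1:n)
     a(i) = sqrt(real(i, real64))
  end do

  call cpu_time(t1)
  call system_clock(c1)

  print '(a, f10.6, a)', 'wall: ', real(c1 - c0, real64) / real(rate, real64), ' s'
  print '(a, f10.6, a)', 'cpu:  ', t1 - t0, ' s'
  ! use the result so the loop is not optimized away
  print *, 'checksum:', sum(a)
end program wall_vs_cpu
```

If the loop really runs on several threads, I would expect the wall time to drop while the CPU time stays about the same or grows. `omp_get_wtime` would work just as well, but it needs `use omp_lib` and the compiler's OpenMP flag.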
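For gfortran you will need an additional flag to generate multithreaded `do concurrent` (as far as I know it is `-ftree-parallelize-loops=N`, with N the number of threads). I'm not sure if flang has something similar (yet). You may also want to check the optimization reports to see whether the loop was actually parallelized.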
Some details can be found here: DO CONCURRENT: compiler flags to enable parallelization - #6 by ivanpribec