Dear all,
I noticed that in some cases Julia may be faster, as @certik showed below,
I also noticed @mohoree said Intel MKL's routines can be about 5x faster than code that does not use MKL,
I also noticed @gnikit mentioned MKL too,
So these three letters, MKL, keep appearing in my mind recently, and I am trying to see what Intel Fortran with MKL can do.
While looking at the MKL examples, I noticed that Intel's new Fortran compiler, IFX, seems able to offload OpenMP, at least to Intel GPUs.
For example, I have a Xeon E-2186M with an Intel P630 GPU on the chip, so it seems IFX should be able to offload OpenMP to that GPU,
https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-cpp-fortran-compiler-openmp/top.html
I use Intel oneAPI 2022.0.3 with Visual Studio 2019 on Windows, and I tried to compile and run the OpenMP offload examples that ship in the oneAPI MKL examples folder,
C:\Program Files (x86)\Intel\oneAPI\mkl\2022.0.3\examples
The example I am trying to run with OpenMP offload to the Intel GPU is vsinh.f90,
located at,
C:\Program Files (x86)\Intel\oneAPI\mkl\2022.0.3\examples\examples_offload_f\f_offload\vml\source
However, when I enable OpenMP offload as below,
it just gives errors at the linking stage. I am not sure whether what I set for the linking stage is correct; below is what I set,
The linking errors are below,
Build started...
1>------ Build started: Project: MKL_test (IFX), Configuration: Release x64 ------
1>Linking...
1>Intel(R) Fortran Compiler for applications running on Intel(R) 64, Version 2022.0.0 Build 20211123
1>Copyright (C) 1985-2021 Intel Corporation. All rights reserved.
1>2828013-vsinh.obj : warning LNK4078: multiple '__CLANG_OFFLOAD_BUNDLE__openmp-s' sections found with different attributes (40000800)
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VMLSETMODE_OMP_OFFLOAD_ILP64 referenced in function TEST_FLOAT
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VSSINH_OMP_OFFLOAD_ILP64 referenced in function TEST_FLOAT
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VMSSINH_OMP_OFFLOAD_ILP64 referenced in function TEST_FLOAT
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VSSINHI_OMP_OFFLOAD_ILP64 referenced in function TEST_FLOAT
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VMSSINHI_OMP_OFFLOAD_ILP64 referenced in function TEST_FLOAT
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VDSINH_OMP_OFFLOAD_ILP64 referenced in function TEST_DOUBLE
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VMDSINH_OMP_OFFLOAD_ILP64 referenced in function TEST_DOUBLE
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VDSINHI_OMP_OFFLOAD_ILP64 referenced in function TEST_DOUBLE
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VMDSINHI_OMP_OFFLOAD_ILP64 referenced in function TEST_DOUBLE
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VCSINH_OMP_OFFLOAD_ILP64 referenced in function TEST_FLOAT_COMPLEX
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VMCSINH_OMP_OFFLOAD_ILP64 referenced in function TEST_FLOAT_COMPLEX
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VCSINHI_OMP_OFFLOAD_ILP64 referenced in function TEST_FLOAT_COMPLEX
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VMCSINHI_OMP_OFFLOAD_ILP64 referenced in function TEST_FLOAT_COMPLEX
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VZSINH_OMP_OFFLOAD_ILP64 referenced in function TEST_DOUBLE_COMPLEX
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VMZSINH_OMP_OFFLOAD_ILP64 referenced in function TEST_DOUBLE_COMPLEX
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VZSINHI_OMP_OFFLOAD_ILP64 referenced in function TEST_DOUBLE_COMPLEX
1>2828013-vsinh.obj : error LNK2019: unresolved external symbol MKL_VM_VMZSINHI_OMP_OFFLOAD_ILP64 referenced in function TEST_DOUBLE_COMPLEX
1>x64\Release\MKL_test.exe : fatal error LNK1120: 17 unresolved externals
1>
However, if I disable offload as below, the code compiles and links (with some minor warnings) and runs fine.
But I just do not know how to set things up correctly, in Visual Studio or on the command line, so that IFX offloads to the GPU and the code runs correctly.
Just curious, does anyone know how to use IFX to offload OpenMP to a GPU?
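To separate the compiler question from the MKL linking question, a small test without MKL might help. Below is only a sketch, not something from the MKL examples: the program name `offload_check` and the file name are made up, and the build line in the comment is my reading of the Intel get-started guide linked above, so please treat the flags as an assumption. It uses the standard `omp_lib` routines `omp_get_num_devices` and `omp_is_initial_device` to report whether a target region really ran on a device or fell back to the host.

```fortran
! Minimal OpenMP offload check, independent of MKL (illustrative sketch).
! Assumed build line on Windows (an assumption, taken from Intel's guide):
!   ifx /Qiopenmp /Qopenmp-targets:spir64 offload_check.f90
program offload_check
  use omp_lib
  implicit none
  integer, parameter :: n = 1024
  real :: x(n), y(n)
  logical :: on_host
  integer :: i

  x = 1.0
  y = 0.0
  on_host = .true.

  ! Number of non-host devices the OpenMP runtime can see.
  print *, 'Devices visible to OpenMP: ', omp_get_num_devices()

  ! Ask, from inside a target region, where the region is executing.
  !$omp target map(from: on_host)
  on_host = omp_is_initial_device()
  !$omp end target

  ! A simple offloaded loop: y = 2*x.
  !$omp target teams distribute parallel do map(to: x) map(from: y)
  do i = 1, n
     y(i) = 2.0 * x(i)
  end do
  !$omp end target teams distribute parallel do

  if (on_host) then
     print *, 'Target region ran on the host (no offload).'
  else
     print *, 'Target region ran on a device.'
  end if
  print *, 'Result correct: ', all(y == 2.0)
end program offload_check
```

If this reports a device but vsinh.f90 still fails to link, then the problem is probably in the MKL library settings rather than in the offload switch itself.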
Thanks much in advance!
PS.
So as usual, I googled, and as usual, I noticed Dr. Fortran @sblionel had some comments in 2018 in the thread below,
Now in 2022, I guess IFX can offload to Intel GPUs, but I just do not know how to set it up in Visual Studio. Perhaps Dr. Fortran @sblionel may give some hints about how to offload OpenMP to an Intel GPU? Thank you, Dr. Fortran, in advance.
The example vsinh.f90 is below,
!===============================================================================
! Copyright 2020-2021 Intel Corporation.
!
! This software and the related documents are Intel copyrighted materials, and
! your use of them is governed by the express license under which they were
! provided to you (License). Unless the License provides otherwise, you may not
! use, modify, copy, publish, distribute, disclose or transmit this software or
! the related documents without Intel's prior written permission.
!
! This software and the related documents are provided as is, with no express
! or implied warranties, other than those that are expressly stated in the
! License.
!===============================================================================
!*
!
!* Content:
!* Sinh example program text (OpenMP offload interface)
!*
!*******************************************************************************/
include "mkl_omp_offload.f90"
include "_vml_common_functions.f90"
! @brief Real single precision function test begin
integer (kind=4) function test_float(funcname)
use onemkl_vml_omp_offload
implicit none
include "_vml_common_data.f90"
character (len = *) :: funcname
real (kind=4) :: as_float
integer (kind=4) :: check_result_float
real (kind=4),allocatable :: varg1(:), vres1(:), vmres1(:), vref1(:)
real (kind=4),allocatable :: vresi1(:), vmresi1(:), vrefi1(:)
integer (kind=4) i, a, errs
integer (kind=4) VLEN
parameter (VLEN = 4)
integer (kind=4) test_arg1(VLEN)
integer (kind=4) test_ref1(VLEN)
integer (kind=4) nan_value
integer (kind=8) vml_accuracy_mode(3)
data vml_accuracy_mode / VML_HA, VML_LA, VML_EP /
integer tmode
! NaN value to fill result vector
data nan_value /Z'FFFFFFFF'/
! Arguments and reference results begin
data test_arg1 / Z'40D9B85C', & ! 6.80375481
Z'C007309A', & ! -2.1123414
Z'40B52EFA', & ! 5.66198444
Z'40BF006A' / ! 5.96880054
data test_ref1 / Z'43E14E52', & ! 450.611877
Z'C0825890', & ! -4.07331085
Z'430FDB98', & ! 143.857788
Z'43438454' / ! 195.516907
! Arguments and reference results end
errs = 0
! Allocate vectors
allocate(varg1(VLEN))
allocate(vres1(VLEN))
allocate(vmres1(VLEN))
allocate(vref1(VLEN))
allocate(vresi1(VLEN))
allocate(vmresi1(VLEN))
allocate(vrefi1(VLEN))
! Fill vectors
do i = 1, VLEN
varg1(i) = as_float(test_arg1(i))
vref1(i) = as_float(test_ref1(i))
vres1(i) = as_float(nan_value)
vmres1(i) = as_float(nan_value)
! Fill even result values with 777 pads for strided indexing
if (and(i,1) .eq. 1) then
vrefi1(i) = as_float(test_ref1(i))
vresi1(i) = 999
vmresi1(i) = 999
else
vrefi1(i) = 777
vresi1(i) = 777
vmresi1(i) = 777
end if
enddo
! Loop by three accuracy flavors
do a = 1, 3
! Call VML function with specific accuracy flavor
!$omp target variant dispatch
tmode = vmlsetmode(vml_accuracy_mode(a))
!$omp end target variant dispatch
!$omp target data map(varg1,vres1)
!$omp target variant dispatch use_device_ptr(varg1,vres1)
call vssinh(VLEN, varg1, vres1)
!$omp end target variant dispatch
!$omp end target data
!$omp target data map(varg1,vmres1)
!$omp target variant dispatch use_device_ptr(varg1,vmres1)
call vmssinh(VLEN, varg1, vmres1, vml_accuracy_mode(a))
!$omp end target variant dispatch
!$omp end target data
!$omp target data map(varg1,vresi1)
!$omp target variant dispatch use_device_ptr(varg1,vresi1)
call vssinhi(VLEN/2, varg1, 2, vresi1, 2)
!$omp end target variant dispatch
!$omp end target data
!$omp target data map(varg1,vmresi1)
!$omp target variant dispatch use_device_ptr(varg1,vmresi1)
call vmssinhi(VLEN/2, varg1, 2, vmresi1, 2, vml_accuracy_mode(a))
!$omp end target variant dispatch
!$omp end target data
! Check results
do i = 1, VLEN
errs = errs + check_result_float(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vres1(i), vres1(i), vref1(i), vref1(i), "v"//funcname, a, ", simple")
errs = errs + check_result_float(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vmres1(i), vmres1(i), vref1(i), vref1(i), "vm"//funcname, a, ", simple")
errs = errs + check_result_float(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vresi1(i), vresi1(i), vrefi1(i), vrefi1(i), "v"//funcname//"i", a, ", strided")
errs = errs + check_result_float(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vmresi1(i), vmresi1(i), vrefi1(i), vrefi1(i), "vm"//funcname//"i", a, ", strided")
enddo
enddo
test_float = errs
end function
! @brief Real single precision function test end
! @brief Real double precision function test begin
integer (kind=4) function test_double(funcname)
use onemkl_vml_omp_offload
implicit none
include "_vml_common_data.f90"
character (len = *) :: funcname
real (kind=8) :: as_double
integer (kind=4) :: check_result_double
real (kind=8),allocatable :: varg1(:), vres1(:), vmres1(:), vref1(:)
real (kind=8),allocatable :: vresi1(:), vmresi1(:), vrefi1(:)
integer (kind=4) i, a, errs
integer (kind=8) VLEN
parameter (VLEN = 4)
integer (kind=8) test_arg1(VLEN)
integer (kind=8) test_ref1(VLEN)
integer (kind=8) nan_value
integer (kind=8) vml_accuracy_mode(3)
data vml_accuracy_mode / VML_HA, VML_LA, VML_EP /
integer tmode
! NaN value to fill result vector
data nan_value /Z'FFFFFFFFFFFFFFFF'/
! Arguments and reference results begin
data test_arg1 / Z'401B370B60E66E18', & ! 6.80375434309419092
Z'C000E6134801CC26', & ! -2.11234146361813924
Z'4016A5DF421D4BBE', & ! 5.66198447517211711
Z'4017E00D485FC01A' / ! 5.96880066952146571
data test_ref1 / Z'407C29C968C677F1', & ! 450.611672187106763
Z'C0104B1218DE4197', & ! -4.07331122261566403
Z'4061FB72FBB708AE', & ! 143.857786042678924
Z'4068708AA6866883' / ! 195.516925108448135
! Arguments and reference results end
errs = 0
! Allocate vectors
allocate(varg1(VLEN))
allocate(vres1(VLEN))
allocate(vmres1(VLEN))
allocate(vref1(VLEN))
allocate(vresi1(VLEN))
allocate(vmresi1(VLEN))
allocate(vrefi1(VLEN))
! Fill vectors
do i = 1, VLEN
varg1(i) = as_double(test_arg1(i))
vref1(i) = as_double(test_ref1(i))
vres1(i) = as_double(nan_value)
vmres1(i) = as_double(nan_value)
! Fill even result values with 777 pads for strided indexing
if (and(i,1) .eq. 1) then
vrefi1(i) = as_double(test_ref1(i))
vresi1(i) = 999
vmresi1(i) = 999
else
vrefi1(i) = 777
vresi1(i) = 777
vmresi1(i) = 777
end if
enddo
! Loop by three accuracy flavors
do a = 1, 3
! Call VML function with specific accuracy flavor
!$omp target variant dispatch
tmode = vmlsetmode(vml_accuracy_mode(a))
!$omp end target variant dispatch
!$omp target data map(varg1,vres1)
!$omp target variant dispatch use_device_ptr(varg1,vres1)
call vdsinh(VLEN, varg1, vres1)
!$omp end target variant dispatch
!$omp end target data
!$omp target data map(varg1,vmres1)
!$omp target variant dispatch use_device_ptr(varg1,vmres1)
call vmdsinh(VLEN, varg1, vmres1, vml_accuracy_mode(a))
!$omp end target variant dispatch
!$omp end target data
!$omp target data map(varg1,vresi1)
!$omp target variant dispatch use_device_ptr(varg1,vresi1)
call vdsinhi(VLEN/2, varg1, 2, vresi1, 2)
!$omp end target variant dispatch
!$omp end target data
!$omp target data map(varg1,vmresi1)
!$omp target variant dispatch use_device_ptr(varg1,vmresi1)
call vmdsinhi(VLEN/2, varg1, 2, vmresi1, 2, vml_accuracy_mode(a))
!$omp end target variant dispatch
!$omp end target data
! Check results
do i = 1, VLEN
errs = errs + check_result_double(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vres1(i), vres1(i), vref1(i), vref1(i), "v"//funcname, a, ", simple")
errs = errs + check_result_double(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vmres1(i), vmres1(i), vref1(i), vref1(i), "vm"//funcname, a, ", simple")
errs = errs + check_result_double(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vresi1(i), vresi1(i), vrefi1(i), vrefi1(i), "v"//funcname//"i", a, ", strided")
errs = errs + check_result_double(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vmresi1(i), vmresi1(i), vrefi1(i), vrefi1(i), "vm"//funcname//"i", a, ", strided")
enddo
enddo
test_double = errs
end function
! @brief Real double precision function test end
! @brief Complex single precision function test begin
integer (kind=4) function test_float_complex(funcname)
use onemkl_vml_omp_offload
implicit none
include "_vml_common_data.f90"
character (len = *) :: funcname
real (kind=4) :: as_float
integer (kind=4) :: check_result_float_complex
complex (kind=4),allocatable :: varg1(:), vres1(:), vmres1(:), vref1(:)
complex (kind=4),allocatable :: vresi1(:), vmresi1(:), vrefi1(:)
integer (kind=4) i, a, errs
integer (kind=4) VLEN
parameter (VLEN = 4)
integer (kind=4) test_arg1(2*VLEN)
integer (kind=4) test_ref1(2*VLEN)
integer (kind=4) nan_value
integer (kind=8) vml_accuracy_mode(3)
data vml_accuracy_mode / VML_HA, VML_LA, VML_EP /
integer tmode
! NaN value to fill result vector
data nan_value /Z'FFFFFFFF'/
! Arguments and reference results begin
data test_arg1 / Z'C007309A', Z'40D9B85C', & ! -2.1123414 + i * 6.80375481
Z'40BF006A', Z'40B52EFA', & ! 5.96880054 + i * 5.66198444
Z'C0C1912F', Z'4103BA28', & ! -6.04897261 + i * 8.2329483
Z'40ABAABC', Z'C052EA36' / ! 5.3645916 + i * -3.2955451
data test_ref1 / Z'C06228DD', Z'400582FC', & ! -3.5337441 + i * 2.08611965
Z'431EFD8F', Z'C2E396E2', & ! 158.990463 + i * -113.794693
Z'429CBE3E', Z'4344CF32', & ! 78.3715668 + i * 196.809357
Z'C2D32BFA', Z'418315A9' / ! -105.585892 + i * 16.3855762
! Arguments and reference results end
errs = 0
! Allocate vectors
allocate(varg1(VLEN))
allocate(vres1(VLEN))
allocate(vmres1(VLEN))
allocate(vref1(VLEN))
allocate(vresi1(VLEN))
allocate(vmresi1(VLEN))
allocate(vrefi1(VLEN))
! Fill vectors
do i = 1, VLEN
varg1(i) = CMPLX(as_float(test_arg1(2*i-1)), as_float(test_arg1(2*i)), 4)
vref1(i) = CMPLX(as_float(test_ref1(2*i-1)), as_float(test_ref1(2*i)), 4)
vres1(i) = as_float(nan_value)
vmres1(i) = as_float(nan_value)
! Fill even result values with 777 pads for strided indexing
if (and(i,1) .eq. 1) then
vrefi1(i) = CMPLX(as_float(test_ref1(2*i-1)), as_float(test_ref1(2*i)), 4)
vresi1(i) = CMPLX(999,999,4)
vmresi1(i) = CMPLX(999,999,4)
else
vrefi1(i) = CMPLX(777,777,4)
vresi1(i) = CMPLX(777,777,4)
vmresi1(i) = CMPLX(777,777,4)
end if
enddo
! Loop by three accuracy flavors
do a = 1, 3
! Call VML function with specific accuracy flavor
!$omp target variant dispatch
tmode = vmlsetmode(vml_accuracy_mode(a))
!$omp end target variant dispatch
!$omp target data map(varg1,vres1)
!$omp target variant dispatch use_device_ptr(varg1,vres1)
call vcsinh(VLEN, varg1, vres1)
!$omp end target variant dispatch
!$omp end target data
!$omp target data map(varg1,vmres1)
!$omp target variant dispatch use_device_ptr(varg1,vmres1)
call vmcsinh(VLEN, varg1, vmres1, vml_accuracy_mode(a))
!$omp end target variant dispatch
!$omp end target data
!$omp target data map(varg1,vresi1)
!$omp target variant dispatch use_device_ptr(varg1,vresi1)
call vcsinhi(VLEN/2, varg1, 2, vresi1, 2)
!$omp end target variant dispatch
!$omp end target data
!$omp target data map(varg1,vmresi1)
!$omp target variant dispatch use_device_ptr(varg1,vmresi1)
call vmcsinhi(VLEN/2, varg1, 2, vmresi1, 2, vml_accuracy_mode(a))
!$omp end target variant dispatch
!$omp end target data
! Check results
do i = 1, VLEN
errs = errs + check_result_float_complex(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vres1(i), vres1(i), vref1(i), vref1(i), "v"//funcname, a, ", simple")
errs = errs + check_result_float_complex(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vmres1(i), vmres1(i), vref1(i), vref1(i), "vm"//funcname, a, ", simple")
errs = errs + check_result_float_complex(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vresi1(i), vresi1(i), vrefi1(i), vrefi1(i), "v"//funcname//"i", a, ", strided")
errs = errs + check_result_float_complex(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vmresi1(i), vmresi1(i), vrefi1(i), vrefi1(i), "vm"//funcname//"i", a, ", strided")
enddo
enddo
test_float_complex = errs
end function
! @brief Complex single precision function test end
! @brief Complex double precision function test begin
integer (kind=4) function test_double_complex(funcname)
use onemkl_vml_omp_offload
implicit none
include "_vml_common_data.f90"
character (len = *) :: funcname
real (kind=8) :: as_double
integer (kind=4) :: check_result_double_complex
complex (kind=8),allocatable :: varg1(:), vres1(:), vmres1(:), vref1(:)
complex (kind=8),allocatable :: vresi1(:), vmresi1(:), vrefi1(:)
integer (kind=4) i, a, errs
integer (kind=8) VLEN
parameter (VLEN = 4)
integer (kind=8) test_arg1(2*VLEN)
integer (kind=8) test_ref1(2*VLEN)
integer (kind=8) nan_value
integer (kind=8) vml_accuracy_mode(3)
data vml_accuracy_mode / VML_HA, VML_LA, VML_EP /
integer tmode
! NaN value to fill result vector
data nan_value /Z'FFFFFFFFFFFFFFFF'/
! Arguments and reference results begin
data test_arg1 / Z'C000E6134801CC26', Z'401B370B60E66E18', & ! -2.11234146361813924 + i * 6.80375434309419092
Z'4017E00D485FC01A', Z'4016A5DF421D4BBE', & ! 5.96880066952146571 + i * 5.66198447517211711
Z'C0183225E080644C', Z'40207744D998EE8A', & ! -6.04897261413232101 + i * 8.23294715873568705
Z'4015755793FAEAB0', Z'C00A5D46A314BA8E' / ! 5.36459189623808186 + i * -3.2955448857022196
data test_ref1 / Z'C00C451C45E4AF59', Z'4000B05EB8F0615E', & ! -3.533745332757388 + i * 2.08611816867430289
Z'4063DFB206091F94', Z'C05C72DC5A12846B', & ! 158.990481393641517 + i * -113.794699209292659
Z'405397C41F9D3258', Z'406899E6F517D13C', & ! 78.3713454280017459 + i * 196.809443041341979
Z'C05A657FC5D4A250', Z'403062B3F5B1D862' / ! -105.585923631334708 + i * 16.3855584677879804
! Arguments and reference results end
errs = 0
! Allocate vectors
allocate(varg1(VLEN))
allocate(vres1(VLEN))
allocate(vmres1(VLEN))
allocate(vref1(VLEN))
allocate(vresi1(VLEN))
allocate(vmresi1(VLEN))
allocate(vrefi1(VLEN))
! Fill vectors
do i = 1, VLEN
varg1(i) = CMPLX(as_double(test_arg1(2*i-1)), as_double(test_arg1(2*i)), 8)
vref1(i) = CMPLX(as_double(test_ref1(2*i-1)), as_double(test_ref1(2*i)), 8)
vres1(i) = as_double(nan_value)
vmres1(i) = as_double(nan_value)
! Fill even result values with 777 pads for strided indexing
if (and(i,1) .eq. 1) then
vrefi1(i) = CMPLX(as_double(test_ref1(2*i-1)), as_double(test_ref1(2*i)), 8)
vresi1(i) = CMPLX(999,999,8)
vmresi1(i) = CMPLX(999,999,8)
else
vrefi1(i) = CMPLX(777,777,8)
vresi1(i) = CMPLX(777,777,8)
vmresi1(i) = CMPLX(777,777,8)
end if
enddo
! Loop by three accuracy flavors
do a = 1, 3
! Call VML function with specific accuracy flavor
!$omp target variant dispatch
tmode = vmlsetmode(vml_accuracy_mode(a))
!$omp end target variant dispatch
!$omp target data map(varg1,vres1)
!$omp target variant dispatch use_device_ptr(varg1,vres1)
call vzsinh(VLEN, varg1, vres1)
!$omp end target variant dispatch
!$omp end target data
!$omp target data map(varg1,vmres1)
!$omp target variant dispatch use_device_ptr(varg1,vmres1)
call vmzsinh(VLEN, varg1, vmres1, vml_accuracy_mode(a))
!$omp end target variant dispatch
!$omp end target data
!$omp target data map(varg1,vresi1)
!$omp target variant dispatch use_device_ptr(varg1,vresi1)
call vzsinhi(VLEN/2, varg1, 2, vresi1, 2)
!$omp end target variant dispatch
!$omp end target data
!$omp target data map(varg1,vmresi1)
!$omp target variant dispatch use_device_ptr(varg1,vmresi1)
call vmzsinhi(VLEN/2, varg1, 2, vmresi1, 2, vml_accuracy_mode(a))
!$omp end target variant dispatch
!$omp end target data
! Check results
do i = 1, VLEN
errs = errs + check_result_double_complex(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vres1(i), vres1(i), vref1(i), vref1(i), "v"//funcname, a, ", simple")
errs = errs + check_result_double_complex(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vmres1(i), vmres1(i), vref1(i), vref1(i), "vm"//funcname, a, ", simple")
errs = errs + check_result_double_complex(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vresi1(i), vresi1(i), vrefi1(i), vrefi1(i), "v"//funcname//"i", a, ", strided")
errs = errs + check_result_double_complex(i, VML_ARG1_RES1, varg1(i), varg1(i), &
vmresi1(i), vmresi1(i), vrefi1(i), vrefi1(i), "vm"//funcname//"i", a, ", strided")
enddo
enddo
test_double_complex = errs
end function
! @brief Complex double precision function test end
! @brief Main test program begin
program sinh_example
use onemkl_vml_omp_offload
implicit none
include "_vml_common_data.f90"
integer (kind=4) :: blend_int32
integer (kind=4) :: test_float
integer (kind=4) :: test_float_complex
integer (kind=4) :: test_double
integer (kind=4) :: test_double_complex
integer (kind=4) errs, total_errs
character (len = *), parameter :: funcname = "sinh"
total_errs = 0
data FLOAT_MAXULP /FLOAT_MAXULP_HA,FLOAT_MAXULP_LA,FLOAT_MAXULP_EP/
data COMPLEX_FLOAT_MAXULP /4.0,FLOAT_COMPLEX_MAXULP_LA,FLOAT_COMPLEX_MAXULP_EP/
data DOUBLE_MAXULP /DOUBLE_MAXULP_HA,DOUBLE_MAXULP_LA,DOUBLE_MAXULP_EP/
data COMPLEX_DOUBLE_MAXULP /4.0,DOUBLE_COMPLEX_MAXULP_LA,DOUBLE_COMPLEX_MAXULP_EP/
write (*, 111) funcname
111 format ('Running ', A, ' functions:')
! Single precision test run begin
write (*, 112) TAB, funcname
112 format(A, 'Running ', A, ' with single precision real data type:')
errs = test_float(funcname)
total_errs = total_errs + errs
write (*, 113) TAB, funcname, TEST_RESULT(blend_int32((errs>0),2,1))
113 format(A, A, ' single precision real result: ', A)
! Single precision test run end
! Real double precision test run begin
write (*, 117) TAB, funcname
117 format(A, 'Running ', A, ' with double precision real data type:')
errs = test_double(funcname)
total_errs = total_errs + errs
write (*, 118) TAB, funcname, TEST_RESULT(blend_int32((errs>0),2,1))
118 format(A, A, ' double precision real result: ', A)
! Real double precision test run end
! Single precision complex test run begin
write (*, 115) TAB, funcname
115 format(A, 'Running ', A, ' with single precision complex data type:')
errs = test_float_complex(funcname)
total_errs = total_errs + errs
write (*, 116) TAB, funcname, TEST_RESULT(blend_int32((errs>0),2,1))
116 format(A, A, ' single precision complex result: ', A)
! Single precision complex test run end
! Complex double precision test run begin
write (*, 119) TAB, funcname
119 format(A, 'Running ', A, ' with double precision complex data type:')
errs = test_double_complex(funcname)
total_errs = total_errs + errs
write (*, 120) TAB, funcname, TEST_RESULT(blend_int32((errs>0),2,1))
120 format(A, A, ' double precision complex result: ', A)
! Complex double precision test run end
write (*, 121) funcname, TEST_RESULT(blend_int32((total_errs>0),2,1))
121 format(A, ' function result: ', A)
end program
! @brief Main test program end