Subject: Re: Continuations
From: tkoenig (at) *nospam* netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Date: 18 Jul 2024, 08:54:23
Organization: A noiseless patient Spider
Message-ID: <v7ahnf$2an0d$1@dont-email.me>
References: 1 2 3 4 5 6 7 8 9 10 11 12
User-Agent: slrn/1.0.3 (Linux)

Stephen Fuld <SFuld@alumni.cmu.edu.invalid> wrote:
[Arrhenius]
> Good, I get that.  But Thomas' original discussion of the problem
> indicated that it was very parallel, so the question is, in your
> design, how many of those calculations can go on in parallel?

I ran a little Arrhenius benchmark on an i7-11700.  The main program was

  program main
    implicit none
    integer, parameter :: n = 1024
    double precision, dimension(n) :: k, a, ea, t
    integer :: i
    ! Random pre-exponential factors in [0,1)
    call random_number(a)
    ! Activation energies in [10000, 40000)
    call random_number(ea)
    ea = 10000 + ea*30000
    ! Temperatures in [400, 600)
    call random_number(t)
    t = 400 + 200*t
    ! 1024*1024 calls, each evaluating n = 1024 elements
    do i = 1, 1024*1024
      call arrhenius(k, a, ea, t, n)
    end do
  end program main

and the called routine was (in a separate file, so the compiler
could not notice that the results were actually never used)

  subroutine arrhenius(k, a, ea, t, n)
    implicit none
    integer, intent(in) :: n
    double precision, dimension(n), intent(out) :: k
    double precision, dimension(n), intent(in) :: a, ea, t
    double precision, parameter :: r = 8.314
    ! Elementwise k = a * exp(-ea/(r*t)) over the whole array
    k = a * exp(-ea/(r*t))
  end subroutine arrhenius

Timing results (wall-clock time only):

  -O0:                                 5.343 s
  -O2:                                 4.560 s
  -Ofast:                              2.237 s
  -Ofast -march=native -mtune=native:  2.154 s

Of course, you never know what speed your CPU is actually running
at these days, but if I assume 5 GHz, that would give around 10
cycles per Arrhenius evaluation, which is quite fast (IMHO).
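
Spelled out, the estimate above works as follows. This is only a sketch of the arithmetic: it takes the 5 GHz clock assumed in the text at face value and counts one evaluation per array element per call, i.e. 1024 elements times 1024*1024 calls. The symbol N is introduced here for illustration.

```latex
\[
N = 1024 \times (1024 \times 1024) = 2^{30} \approx 1.07 \times 10^{9}
\quad \text{evaluations}
\]
\[
\frac{2.154\,\mathrm{s} \times 5 \times 10^{9}\,\mathrm{cycles/s}}{2^{30}}
\approx 10.0 \ \text{cycles per evaluation}
\]
```

Under the same assumptions, the -O2 time of 4.560 s would correspond to roughly 21 cycles per evaluation.
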
It uses an AVX2 version of exp, or so I gather from the function
name, _ZGVdN4v_exp_avx2.