Re: else ladders practice

Liste des GroupesRevenir à cl c  
Sujet : Re: else ladders practice
De : antispam (at) *nospam* fricas.org (Waldek Hebisch)
Groupes : comp.lang.c
Date : 22. Nov 2024, 20:29:59
Autres entêtes
Organisation : To protect and to server
Message-ID : <vhqm3l$3ntp7$1@paganini.bofh.team>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13 14
User-Agent : tin/2.6.2-20221225 ("Pittyvaich") (Linux/6.1.0-9-amd64 (x86_64))
Bart <bc@freeuk.com> wrote:
On 22/11/2024 12:33, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
>
Sure. That's when you run a production build. I can even do that myself
on some programs (the ones where my C transpiler still works) and pass
it through gcc-O3. Then it might run 30% faster.
 
On fast machine running Dhrystone 2.2a I  get:
 
tcc-0.9.28rc    20000000
gcc-12.2 -O     64184852
gcc-12.2 -O2    83194672
clang-14 -O     83194672
clang-14 -O2    85763288
 
so with 02 this is more than 4 times faster.  Dhrystone correlated
resonably with runtime of tight compute-intensive programs.
Compiler started to cheat on original Dhrystone, so there are
bigger benchmarks like SPEC INT.  But Dhrystone 2 has modifications
to make cheating harder, so I think it is still reasonable
benchmark.  Actually, difference may be much bigger, for example
in image processing both clang and gcc can use vector intructions,
with may give additional speedup of order 8-16.
 
30% above means that you are much better than tcc or your program
is badly behaving (I have programs that make intensive use of
memory, here effect of optimization would be smaller, but still
of order 2).
 
The 30% applies to my typical programs, not benchmarks. Sure, gcc -O3
can do a lot of aggressive optimisations when everything is contained
within one short module and most runtime is spent in clear bottlenecks.
 
Real apps, like say my compilers, are different. They tend to use
globals more, program flow is more disseminated. The bottlenecks are
harder to pin down.
 
But, OK, here's the first sizeable benchmark that I thought of (I can't
find a reliable Dhrystone one; perhaps you can post a link).

First Google hit for Dhrystone 2.2a

https://homepages.cwi.nl/~steven/dry.chttps://homepages.cwi.nl/~steven/dry.c

(I used this one).

Compiled in two steps like:

gcc -c -O -o dry.o dry.c
gcc -o dry2 -DPASS2 -O dry.c dry.o

If you want something practical, I have the following C function:

#include <stdint.h>
void inner_mul(uint32_t * x, uint32_t * y, uint32_t * z,
               uint32_t xdeg, uint32_t ydeg, uint32_t zdeg, uint32_t p) {
    if (ydeg < xdeg) {
        uint32_t * tmpp = x;
        uint32_t tmp = xdeg;
        x = y;
        xdeg = ydeg;
        y = tmpp;
        ydeg = tmp;
    }
    if (zdeg < xdeg) {
        xdeg = zdeg;
    }
    if (zdeg < ydeg) {
        ydeg = zdeg;
    }
    uint64_t ss;
    long i;
    long j;
    for(i=0; i<=xdeg; i++) {
        ss = z[i];
        for(j=0; j<=i; j++) {
            ss += ((uint64_t)(x[i-j]))*((uint64_t)(y[j]));
        }
        z[i] = ss%p;
    }
    for(i=xdeg+1; i<=ydeg; i++) {
        ss = z[i];
        for(j=0; j<=xdeg; j++) {
            ss += ((uint64_t)(x[j]))*((uint64_t)(y[i-j]));
        }
        z[i] = ss%p;
    }
    for(i=ydeg+1; i<=zdeg; i++) {
        ss = z[i];
        for(j=i-xdeg; j<=ydeg; j++) {
            ss += ((uint64_t)(x[i-j]))*((uint64_t)(y[j]));
        }
        z[i] = ss%p;
    }
}

and the following test driver:

#include <stdio.h>
#include <stdint.h>
#include <sys/time.h>

extern void inner_mul(uint32_t * x, uint32_t * y, uint32_t * z,
               uint32_t xdeg, uint32_t ydeg, uint32_t zdeg, uint32_t p);

int main(void) {
    uint32_t x[85], y[85], z[169];
    int i;
    for(i=0;i<85;i++) {
        x[i] = 1;
        y[i] = 1;
    }

    struct timeval tv1, tv2;
    gettimeofday(&tv1, 0);
    int j;
    for(j=0; j < 100000; j++) {
        for(i=0;i<169; i++) {
            z[i] = 1;
        }
        inner_mul(x, y, z, 84, 84, 168, 1000003);
    }
    gettimeofday(&tv2, 0);
    for(i=0;i<12; i++) {
        printf(" %u,", z[i]);
    }
    putchar('\n');
    long tt = tv2.tv_sec - tv1.tv_sec;
    tt *= (1000*1000);
    tt += (tv2.tv_usec - tv1.tv_usec);
    printf("Time: %ld us\n", tt);
    return 0;   
}

At least for gcc and clang put them is separate files to avoid
simplifing the task too much ('inner_mul' is supposed to work
with variable data, here we feed it the same thing several times).
Of course, the test driver is silly, but 'inner_mul' is doing
important computation and, as long as 'inner_mul' is compiled
without knowledge of actual parameters, the test should be fair.
My results are:

clang -O3 -march=native       126112us
clang -O3                     222136us
clang -O                      225855us
gcc -O3 -march=native          82809us
gcc -O3                       114365us
gcc -O                        287786us
tcc                           757347us

There is some irregularity in timings, but this shows that
factor of order 9 is possible.

Notes:
- this code is somewhat hard to vectorize, but clang
  and gcc manage to do this,
- vectorized code is sensitive to alignment of the data, some
  variation may be due to this,
- modern processors dynamically change clock frequency, the
  times seem to be high enough to trigger switch to maximal
  frequency (initally I used smaller number of iterations
  but timing were less regular),
- most of code is portable, but for timing we need timer with
  sufficient resolution, so I use Unix 'gettimeofday'.

--
                              Waldek Hebisch

Date Sujet#  Auteur
31 Oct 24 * else ladders practice255fir
31 Oct 24 +* Re: else ladders practice9Anton Shepelev
31 Oct 24 i+- Re: else ladders practice1fir
31 Oct 24 i`* Re: else ladders practice7James Kuyper
1 Nov 24 i `* Re: else ladders practice6David Brown
2 Nov 24 i  +* Re: else ladders practice2James Kuyper
2 Nov 24 i  i`- Re: else ladders practice1David Brown
2 Nov 24 i  `* Re: else ladders practice3fir
2 Nov 24 i   +- Re: else ladders practice1David Brown
2 Nov 24 i   `- Re: else ladders practice1James Kuyper
31 Oct 24 +* Re: else ladders practice5Richard Harnden
31 Oct 24 i+* Re: else ladders practice3fir
31 Oct 24 ii`* Re: else ladders practice2fir
31 Oct 24 ii `- Re: else ladders practice1fir
31 Oct 24 i`- Re: else ladders practice1Bonita Montero
31 Oct 24 +* Re: else ladders practice22Dan Purgert
31 Oct 24 i+* Re: else ladders practice3fir
31 Oct 24 ii`* Re: else ladders practice2Dan Purgert
31 Oct 24 ii `- Re: else ladders practice1fir
16 Nov 24 i`* Re: else ladders practice18Stefan Ram
16 Nov 24 i +* Re: else ladders practice5Bart
16 Nov 24 i i`* Re: else ladders practice4David Brown
19 Nov 24 i i `* Re: else ladders practice3Janis Papanagnou
19 Nov 24 i i  +- Re: else ladders practice1David Brown
19 Nov 24 i i  `- Re: else ladders practice1Michael S
16 Nov 24 i +* Re: else ladders practice3James Kuyper
19 Nov 24 i i`* Re: else ladders practice2Janis Papanagnou
1 Dec 24 i i `- Re: else ladders practice1Tim Rentsch
16 Nov 24 i +* Re: else ladders practice2Lew Pitcher
17 Nov 24 i i`- Re: else ladders practice1Tim Rentsch
20 Nov 24 i +* Re: else ladders practice3Dan Purgert
30 Nov 24 i i`* Re: else ladders practice2Rosario19
5 Dec 24 i i `- Re: else ladders practice1Dan Purgert
1 Dec 24 i `* Re: else ladders practice4Waldek Hebisch
1 Dec 24 i  `* Re: else ladders practice3Janis Papanagnou
2 Dec 24 i   `* Re: else ladders practice2Waldek Hebisch
2 Dec 24 i    `- Re: else ladders practice1Janis Papanagnou
31 Oct 24 +- Re: else ladders practice1Janis Papanagnou
31 Oct 24 `* Re: else ladders practice217Bart
1 Nov 24  `* Re: else ladders practice216fir
1 Nov 24   +* Re: else ladders practice198Bart
1 Nov 24   i+* Re: else ladders practice196fir
1 Nov 24   ii`* Re: else ladders practice195Bart
1 Nov 24   ii `* Re: else ladders practice194fir
1 Nov 24   ii  `* Re: else ladders practice193fir
1 Nov 24   ii   `* Re: else ladders practice192Bart
1 Nov 24   ii    `* Re: else ladders practice191David Brown
1 Nov 24   ii     `* Re: else ladders practice190Bart
1 Nov 24   ii      `* Re: else ladders practice189David Brown
1 Nov 24   ii       `* Re: else ladders practice188Bart
2 Nov 24   ii        `* Re: else ladders practice187David Brown
2 Nov 24   ii         `* Re: else ladders practice186Bart
3 Nov 24   ii          +- Re: else ladders practice1Tim Rentsch
3 Nov 24   ii          +* Re: else ladders practice4fir
3 Nov 24   ii          i`* Re: else ladders practice3Bart
3 Nov 24   ii          i `* Re: else ladders practice2fir
3 Nov 24   ii          i  `- Re: else ladders practice1fir
3 Nov 24   ii          +* Re: else ladders practice4fir
3 Nov 24   ii          i`* Re: else ladders practice3Bart
3 Nov 24   ii          i `* Re: else ladders practice2fir
3 Nov 24   ii          i  `- Re: else ladders practice1fir
3 Nov 24   ii          +* Re: else ladders practice35David Brown
3 Nov 24   ii          i+- Re: else ladders practice1Kaz Kylheku
3 Nov 24   ii          i+* Re: else ladders practice23Bart
4 Nov 24   ii          ii+* Re: else ladders practice21David Brown
4 Nov 24   ii          iii`* Re: else ladders practice20Bart
4 Nov 24   ii          iii +* Re: else ladders practice2David Brown
5 Nov 24   ii          iii i`- Re: else ladders practice1Bart
5 Nov 24   ii          iii `* Re: else ladders practice17David Brown
5 Nov 24   ii          iii  +* Re: else ladders practice2Bart
5 Nov 24   ii          iii  i`- Re: else ladders practice1David Brown
6 Nov 24   ii          iii  +* Re: else ladders practice5Bart
6 Nov 24   ii          iii  i`* Re: else ladders practice4David Brown
6 Nov 24   ii          iii  i `* Re: else ladders practice3Bart
7 Nov 24   ii          iii  i  `* Re: else ladders practice2David Brown
7 Nov 24   ii          iii  i   `- Re: else ladders practice1Bart
9 Nov 24   ii          iii  `* Re: else ladders practice9Janis Papanagnou
9 Nov 24   ii          iii   `* Re: else ladders practice8David Brown
10 Nov 24   ii          iii    `* Re: else ladders practice7Janis Papanagnou
10 Nov 24   ii          iii     `* Re: else ladders practice6David Brown
19 Nov 24   ii          iii      `* Re: else ladders practice5Janis Papanagnou
19 Nov 24   ii          iii       `* Re: else ladders practice4David Brown
19 Nov 24   ii          iii        `* Re: else ladders practice3Janis Papanagnou
19 Nov 24   ii          iii         `* Re: else ladders practice2David Brown
20 Nov 24   ii          iii          `- Re: else ladders practice1Janis Papanagnou
9 Nov 24   ii          ii`- Re: else ladders practice1Janis Papanagnou
8 Nov 24   ii          i+* Re: else ladders practice9Janis Papanagnou
8 Nov 24   ii          ii+* Re: else ladders practice4David Brown
9 Nov 24   ii          iii`* Re: else ladders practice3Janis Papanagnou
9 Nov 24   ii          iii `* Re: else ladders practice2David Brown
10 Nov 24   ii          iii  `- Re: else ladders practice1Janis Papanagnou
9 Nov 24   ii          ii`* Re: else ladders practice4Bart
9 Nov 24   ii          ii `* Re: else ladders practice3Janis Papanagnou
9 Nov 24   ii          ii  `* Re: else ladders practice2Bart
10 Nov 24   ii          ii   `- Re: else ladders practice1Janis Papanagnou
8 Nov 24   ii          i`- Re: else ladders practice1Bart
5 Nov 24   ii          `* Re: else ladders practice141Waldek Hebisch
5 Nov 24   ii           +- Re: else ladders practice1fir
5 Nov 24   ii           +* Re: else ladders practice24David Brown
5 Nov 24   ii           i+* Re: else ladders practice17Waldek Hebisch
5 Nov 24   ii           ii`* Re: else ladders practice16David Brown
6 Nov 24   ii           i`* Re: else ladders practice6Bart
5 Nov 24   ii           `* Re: else ladders practice115Bart
1 Nov 24   i`- Re: else ladders practice1fir
2 Nov 24   `* Re: else ladders practice17Tim Rentsch

Haut de la page

Les messages affichés proviennent d'usenet.

NewsPortal