Re: Sieve of Erastosthenes optimized to the max

Liste des GroupesRevenir à cl c++ 
Sujet : Re: Sieve of Erastosthenes optimized to the max
De : vir.campestris (at) *nospam* invalid.invalid (Vir Campestris)
Groupes : comp.lang.c++
Date : 01. Jun 2024, 22:07:56
Autres entêtes
Organisation : A noiseless patient Spider
Message-ID : <v3fv2u$2ursd$1@dont-email.me>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
User-Agent : Mozilla Thunderbird
On 31/05/2024 06:17, Tim Rentsch wrote:
Vir Campestris <vir.campestris@invalid.invalid> writes:
 
On 22/05/2024 03:06, Tim Rentsch wrote:
>
Vir Campestris <vir.campestris@invalid.invalid> writes:
>
I've been playing with this.  Has it really been this long?  I
ought to have more time than this...  [...]
>
I missed Tim's Mod 30 trick, and that might well help.  But I
think I'm bored with this.  It would save a lot of memory, but
the extra computation might make it slower.
>
Two comments.  Using the mod 30 encoding can be speed competitive
with simpler encodings.  The second comment is, it isn't just
that less memory is used, but that more primes can be found.
Bonita brags about how fast some code is, but it's an apples
and oranges comparison - that code doesn't do the job because
it runs into a memory limit way sooner than using a mod 30
encoding, and trying to run past that limit would make the
code R E A L L Y  S L O W.
>
Maybe I'll try one day ;)
>
I hope you do.  If you have some difficulty seeing how to
avoid the seeming speed penalties, feel free to ask, I am
happy to help.
>
I think I've come to the end of what my code will do. [...]
 A few miscellaneous responses...
 
Firstly the C++ bit:
>
std::vector<bool> is much too slow to be any use.
[various alternatives]
 I simply used a static array, more accurately a union of
two arrays (code here is approximate only):
      #define LARGEST_N 2400000000000  // primes less than 2.4e12
      union {
         unsigned char       bytes[ LARGEST_N / 30 ];
         unsigned long long  ulls[  LARGEST_N / 30 / 8 ];
     } all;
 
That's the end of the on-topic part.
>
I don't store even numbers.  That's an optimisation, and is of course
equivalent to Tim's Mod 30 trick, but with mod 2 instead.  And it
doesn't require an divide which may be slow.
 Like I keep saying, keeping the bits in a mod 30 representation
doesn't need any divides (at least not in the main loop, where
multiples are being computed).
 
When I mask out just 3 I get a pattern like this:
>
2492492492492490
>
There is a single nybble at the end of that which is zero, but the
rest is a 3-nybble repeat.  [similarly for 3 and 5, etc...]
 In a mod 30 representation, the multiples of 7 repeat after 7
bytes.  If we replicate those 7 bytes 8 times, we get 7 unsigned
long longs.  Those 7 ULLs can be combined with logic operations into
the all.ulls[] array.  Similarly for 11, 13, etc, for as high as is
reasonable (I think I topped out at 29).
 
However I can build a mask from the ones for 3 and 5, which has a
repeat of 15 (3*5)
6692cd259a4b3490,
96692cd259a4b349,496692cd259a4b34,3496692cd259a4b3,b3496692cd259a4b,
4b3496692cd259a4,a4b3496692cd259a,9a4b3496692cd259,59a4b3496692cd25,
259a4b3496692cd2,d259a4b3496692cd,cd259a4b3496692c,2cd259a4b3496692,
92cd259a4b349669,692cd259a4b34966,6692cd259a4b3496
>
and copy that.
>
In fact I can do that with any number of single prime masks I want.
[...]
 I thought about doing something along these lines but decided it
was too much work.  So I just did the blocks of 7*8 bytes, 11*8
bytes, 13*8 bytes, etc, up to some fairly small prime, then
switched to an alternate method where individual bytes are
addressed and modified.
 
Note that although this has been quite a fun thing to play with it's
had almost no C++ content.  In fact I found that to get best
performance I had to drop down to C type code.
 Right.  I used straight C, and I don't see any part of the
program that would benefit from having C++.
 
I kept a class to
manage my buffers though, it's much easier to drop an object that
manages a malloc-ed buffer than to remember exactly when I need to
call free.  Correctness trumps speed.
 I used static allocation for the big bitmap, and small local
arrays (less than 50 * 8 bytes) for the small temporaries.
No malloc()s or free()s to worry about.  Initialized to zero
without any code needed.
 
Now Mod 30 - by storing the odd numbers only I'm halving the storage,
and I don't need to do an expensive divide.  It's only shifts and ANDs.
>
If I use mod 30 (30 = 2*3*5) conveniently there are 8 candidate
numbers I need to store a mask for.
 By special casing each of the possibilities (there are 8*8 == 64 of
them), only multiplies and adds are needed, and no shifts -- masks
are evaluated to compile-time constants.
 
Sadly my computer likes 64 bit
operations better.  But the required storage is now only 27% of the
original mask if all numbers are put in it.
 Once the primes get above a fairly small number, the density of
places to zap is low enough so that working in bytes is okay.
In particular once we are looking at primes above 240 there is
less than one bit per 64 bit ULL, which makes using bytes
rather than ULLs less painful
 
One C++ bit I did use was to template all my classes with the storetype, so I could run with 8, 32 or 64 bits. 64 was fastest. I didn't bother with 16. The store access is of course behind the hood a cache line.

I can carry on with this, but the storage doesn't have any more nice
sizes.  [...]
 Yeah.  It's a happy coincidence that mod 30 needs 8 bits
per block-of-30, and unfortunately no other nice size in
sight for larger prime-set mods.
 
Maybe I'll play with it.  But the rain stopped, and I must get on
with the weeding!
 Now you have something to look forward to the next time
it rains. :)
I'm missing something.
When setting the bits for a multiple of a reasonably large prime what I do is:
(1) Shift right 1 to drop the last bit.
(2) The remaining top bits tell me which word needs to be accessed. I shift the multiple right to get the index of the word.
(3) The remaining bottom bits tell me which bit in the word needs to be set, so I shift 1 left by those bottom bits to get a mask.
My operation (2) is a divide, by a power of 2.
Take 997 as an example:
multiple mod30 div30
997 7 33
1994 14 66
2991 21 99
3988 28 132
4985 5 166
5982 12 199
6979 19 232
7976 26 265
8973 3 299
9970 10 332
10967 17 365
11964 24 398
12961 1 432
13958 8 465
14955 15 498
15952 22 531
16949 29 564
17946 6 598
18943 13 631
19940 20 664
20937 27 697
21934 4 731
22931 11 764
23928 18 797
24925 25 830
25922 2 864
26919 9 897
27916 16 930
28913 23 963
29910 0 997
30907 7 1030
I don't see how you get that last column without a divide by 30.
Andy

Date Sujet#  Auteur
23 Mar 24 * Re: Sieve of Erastosthenes optimized to the max68Chris M. Thomasson
23 Mar 24 `* Re: Sieve of Erastosthenes optimized to the max67Bonita Montero
23 Mar 24  `* Re: Sieve of Erastosthenes optimized to the max66Chris M. Thomasson
24 Mar 24   `* Re: Sieve of Erastosthenes optimized to the max65Bonita Montero
24 Mar 24    `* Re: Sieve of Erastosthenes optimized to the max64Chris M. Thomasson
24 Mar 24     `* Re: Sieve of Erastosthenes optimized to the max63Bonita Montero
24 Mar 24      `* Re: Sieve of Erastosthenes optimized to the max62Chris M. Thomasson
16 May 24       `* Re: Sieve of Erastosthenes optimized to the max61Vir Campestris
16 May 24        +- Re: Sieve of Erastosthenes optimized to the max1Ben Bacarisse
22 May 24        `* Re: Sieve of Erastosthenes optimized to the max59Tim Rentsch
30 May 24         `* Re: Sieve of Erastosthenes optimized to the max58Vir Campestris
30 May 24          +- Re: Sieve of Erastosthenes optimized to the max1Bonita Montero
30 May 24          +* Re: Sieve of Erastosthenes optimized to the max3Paavo Helde
31 May 24          i`* Re: Sieve of Erastosthenes optimized to the max2Bonita Montero
31 May 24          i `- Re: Sieve of Erastosthenes optimized to the max1Paavo Helde
31 May 24          `* Re: Sieve of Erastosthenes optimized to the max53Tim Rentsch
1 Jun 24           `* Re: Sieve of Erastosthenes optimized to the max52Vir Campestris
2 Jun 24            +- Re: Sieve of Erastosthenes optimized to the max1Richard Damon
2 Jun 24            `* Re: Sieve of Erastosthenes optimized to the max50Tim Rentsch
3 Jun 24             `* Re: Sieve of Erastosthenes optimized to the max49Tim Rentsch
18 Jun 24              `* Re: Sieve of Erastosthenes optimized to the max48Vir Campestris
19 Jun 24               `* Re: Sieve of Erastosthenes optimized to the max47Tim Rentsch
30 Jun 24                `* Re: Sieve of Erastosthenes optimized to the max46Vir Campestris
2 Jul 24                 `* Re: Sieve of Erastosthenes optimized to the max45Tim Rentsch
2 Jul 24                  +- Re: Sieve of Erastosthenes optimized to the max1Vir Campestris
3 Jul 24                  `* Re: Sieve of Erastosthenes optimized to the max43Vir Campestris
15 Jul 24                   +- Re: Sieve of Erastosthenes optimized to the max1Tim Rentsch
20 Jul 24                   +* Re: Sieve of Erastosthenes optimized to the max40Tim Rentsch
25 Jul 24                   i`* OT: Re: Sieve of Erastosthenes optimized to the max39Vir Campestris
10 Aug 24                   i +* Re: OT: Re: Sieve of Erastosthenes optimized to the max36Tim Rentsch
12 Aug 24                   i i+* Re: OT: Re: Sieve of Erastosthenes optimized to the max2Vir Campestris
16 Aug 24                   i ii`- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Tim Rentsch
15 Aug 24                   i i`* Re: OT: Re: Sieve of Erastosthenes optimized to the max33Vir Campestris
16 Aug 24                   i i `* Re: OT: Re: Sieve of Erastosthenes optimized to the max32Tim Rentsch
16 Aug 24                   i i  +* Re: OT: Re: Sieve of Erastosthenes optimized to the max30Bonita Montero
16 Aug 24                   i i  i+- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Bonita Montero
19 Aug 24                   i i  i+* Re: OT: Re: Sieve of Erastosthenes optimized to the max12Vir Campestris
20 Aug 24                   i i  ii`* Re: OT: Re: Sieve of Erastosthenes optimized to the max11Bonita Montero
20 Aug 24                   i i  ii `* Re: OT: Re: Sieve of Erastosthenes optimized to the max10Bonita Montero
20 Aug 24                   i i  ii  +- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Bonita Montero
20 Aug 24                   i i  ii  `* Re: OT: Re: Sieve of Erastosthenes optimized to the max8Vir Campestris
20 Aug 24                   i i  ii   +- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Bonita Montero
26 Aug 24                   i i  ii   `* Re: OT: Re: Sieve of Erastosthenes optimized to the max6Tim Rentsch
27 Aug 24                   i i  ii    `* Re: OT: Re: Sieve of Erastosthenes optimized to the max5Bonita Montero
1 Sep 24                   i i  ii     `* Re: OT: Re: Sieve of Erastosthenes optimized to the max4Vir Campestris
2 Sep 24                   i i  ii      `* Re: OT: Re: Sieve of Erastosthenes optimized to the max3Tim Rentsch
2 Sep 24                   i i  ii       +- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Bonita Montero
3 Sep 24                   i i  ii       `- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Vir Campestris
19 Aug 24                   i i  i+* Re: OT: Re: Sieve of Erastosthenes optimized to the max12Vir Campestris
20 Aug 24                   i i  ii+* Re: OT: Re: Sieve of Erastosthenes optimized to the max4red floyd
20 Aug 24                   i i  iii+* Re: OT: Re: Sieve of Erastosthenes optimized to the max2Vir Campestris
26 Aug 24                   i i  iiii`- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Tim Rentsch
26 Aug 24                   i i  iii`- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Tim Rentsch
20 Aug 24                   i i  ii+* Re: OT: Re: Sieve of Erastosthenes optimized to the max3Bonita Montero
20 Aug 24                   i i  iii+- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Bonita Montero
20 Aug 24                   i i  iii`- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Bonita Montero
20 Aug 24                   i i  ii`* Re: OT: Re: Sieve of Erastosthenes optimized to the max4Bonita Montero
22 Aug 24                   i i  ii `* Re: OT: Re: Sieve of Erastosthenes optimized to the max3Vir Campestris
22 Aug 24                   i i  ii  `* Re: OT: Re: Sieve of Erastosthenes optimized to the max2Bonita Montero
22 Aug 24                   i i  ii   `- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Vir Campestris
22 Aug 24                   i i  i`* Re: OT: Re: Sieve of Erastosthenes optimized to the max4Vir Campestris
23 Aug 24                   i i  i +* Re: OT: Re: Sieve of Erastosthenes optimized to the max2red floyd
26 Aug 24                   i i  i i`- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Tim Rentsch
26 Aug 24                   i i  i `- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Tim Rentsch
19 Aug 24                   i i  `- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Tim Rentsch
11 Aug 24                   i +- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Tim Rentsch
11 Aug 24                   i `- Re: OT: Re: Sieve of Erastosthenes optimized to the max1Tim Rentsch
23 Jul 24                   `- Re: Sieve of Erastosthenes optimized to the max1Tim Rentsch

Haut de la page

Les messages affichés proviennent d'usenet.

NewsPortal