Re: Good hash for pointers

Liste des GroupesRevenir à l c 
Sujet : Re: Good hash for pointers
De : tr.17687 (at) *nospam* z991.linuxsc.com (Tim Rentsch)
Groupes : comp.lang.c
Date : 26. May 2024, 07:54:59
Autres entêtes
Organisation : A noiseless patient Spider
Message-ID : <86plt9f77w.fsf@linuxsc.com>
References : 1 2 3 4 5 6 7 8
User-Agent : Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
bart <bc@freeuk.com> writes:

On 25/05/2024 19:12, Tim Rentsch wrote:
>
bart <bc@freeuk.com> writes:
>
On 25/05/2024 10:12, Tim Rentsch wrote:
>
bart <bc@freeuk.com> writes:
[...]
It looks like your hash function was tuned for this testing
setup.  With different choices for testing it does much
worse.
>
It wasn't tuned at all;  it was a copy of what I've longed used
within my lexers.
>
It is designed to give a good spread when inputs are similar or
change sequentially.  (Well, not exactly designed;  more trial
and error.)
>
The trial and error process you mention had the effect of tuning
your function for the testing setup used, whether you meant it to
or not.
>
>
The testing is done with malloc() blocks all of the same
requested size, and that size is 1.  Typical workloads
are likely to be both larger and more variable.
>
When adding a new entry finds a collision with an old
entry, linear probing is used to find a free slot.  It's
well understood that linear probing suffers badly as the
load factor goes up.  Better to take a few high bits of
the hash value -- as few as five or six is fine -- to
have the reprobe sequences have different strides.
>
Your hash function is expensive to compute, moreso even
than the "FNV" function shown earlier.
>
You mean Malcolm's version?  That also uses a loop.  Neither
need to process all 8 bytes of a 64-bit pointer; 6 will do, and
probably just 4.
And such a loop can be trivially unrolled.
>
Both your function and the function posted by Malcolm are slow.
It's just that yours is slower.
>
With unoptimised code, mine is up to 50% faster (suggesting an
inherently faster method).
>
Optimised via gcc -O3, it's 17% slower.

On my test platform the Malcolm code runs between dead even up
to about 44% faster, depending on optimization level, except
for -O0 where it runs about 12% slower.  Note that these results
represent only one measurement of each case, run on only one
target environment.

If I duplicate the tests in my language, which has a meagre optimiser,
then mine is 3 times the speed (I haven't looked at why;  something
about the MM code that keeps locals out of registers I guess):
>
                Unopt     gcc-O3   My lang opt (uses 64 bits)
>
MM hash         2.7       0.34     1.9 seconds
BC hash         1.7       0.4      0.6
>
There are anyway any number of tweaks that can be made;  no need to do
all 64 bits for a start, and the loops can be manually
unrolled.  Suggesting it is slower is premature.

I am simply reporting results of empirical observations made on
my test system.  Obviously different environments might produce
different results.

It isn't hard to produce a
hash function of equal quality that runs more than twice as
fast.
>
With or without the aid of an optimising compiler?  You can't always
tell with the latter whether it's you who's clever, or the compiler.

As long as both functions are run at the same level of optimization
I don't think it matters much.  For the sake of concreteness we can
stipulate code quality comparable to gcc -O1 if that helps you.

Let me tell you a little secret.  Hash functions are like random
number generators:  good ones never do exceptionally well or
exceptionally poorly.  The reason for this is simple - whenever
there are scenarios where one does well then inevitably there are
other scenarios where it does poorly. >  The ones
that are left are all, with high probability, just fine no matter
what application they are used in.  That is not to say there is
no need to test within the intended application, it's always good
to do a sanity check, but the point of the sanity check is just
to make sure the earlier process didn't miss something;  it isn't
the right way to compare alternatives.
>
So what's a good one for use within the identifier lookup of a
compiler's lexer?

That depends a lot on what sort of lookup is done and on the
degree to which the hash computation is integrated into the
lexing code.  How many entries will the hash table have?
That also makes a difference.  If the hash table is never
more than 50% full then we can expect most lookups will
succeed on the first try and almost all will succeed in
no more than two tries.

Date Sujet#  Auteur
23 May 24 * Good hash for pointers111Malcolm McLean
23 May 24 +* Re: Good hash for pointers3Richard Harnden
24 May 24 i`* Re: Good hash for pointers2Keith Thompson
24 May 24 i `- Re: Good hash for pointers1Malcolm McLean
23 May 24 +- Re: Good hash for pointers1Kaz Kylheku
24 May 24 +* Re: Good hash for pointers16Tim Rentsch
24 May 24 i`* Re: Good hash for pointers15Malcolm McLean
24 May 24 i `* Re: Good hash for pointers14Tim Rentsch
24 May 24 i  `* Re: Good hash for pointers13Malcolm McLean
24 May 24 i   +* Re: Good hash for pointers11Tim Rentsch
24 May 24 i   i`* Re: Good hash for pointers10bart
24 May 24 i   i +* Re: Good hash for pointers3Malcolm McLean
24 May 24 i   i i`* Re: Good hash for pointers2Chris M. Thomasson
24 May 24 i   i i `- Re: Good hash for pointers1Chris M. Thomasson
24 May 24 i   i +* Re: Good hash for pointers3Tim Rentsch
24 May 24 i   i i`* Re: Good hash for pointers2Malcolm McLean
25 May 24 i   i i `- Re: Good hash for pointers1Tim Rentsch
24 May 24 i   i `* Re: Good hash for pointers3David Brown
24 May 24 i   i  `* Re: Good hash for pointers2Malcolm McLean
24 May 24 i   i   `- Re: Good hash for pointers1Michael S
24 May 24 i   `- Re: Good hash for pointers1David Brown
24 May 24 +* Re: Good hash for pointers3Chris M. Thomasson
24 May 24 i`* Re: Good hash for pointers2Chris M. Thomasson
24 May 24 i `- Re: Good hash for pointers1jak
24 May 24 +* Re: Good hash for pointers84Bonita Montero
24 May 24 i`* Re: Good hash for pointers83Malcolm McLean
25 May 24 i `* Re: Good hash for pointers82bart
25 May 24 i  `* Re: Good hash for pointers81Tim Rentsch
25 May 24 i   +* Re: Good hash for pointers4bart
25 May 24 i   i`* Re: Good hash for pointers3Tim Rentsch
25 May 24 i   i `* Re: Good hash for pointers2bart
26 May 24 i   i  `- Re: Good hash for pointers1Tim Rentsch
25 May 24 i   `* Re: Good hash for pointers76Bonita Montero
25 May 24 i    `* Re: Good hash for pointers75Tim Rentsch
25 May 24 i     +* Re: Good hash for pointers9Malcolm McLean
25 May 24 i     i+- Re: Good hash for pointers1Tim Rentsch
25 May 24 i     i+* Re: Good hash for pointers5Janis Papanagnou
26 May 24 i     ii`* Re: Good hash for pointers4Malcolm McLean
26 May 24 i     ii `* Re: Good hash for pointers3bart
26 May 24 i     ii  +- Re: Good hash for pointers1Richard Harnden
27 May 24 i     ii  `- Re: Good hash for pointers1Kaz Kylheku
26 May 24 i     i`* Re: Good hash for pointers2Bonita Montero
26 May 24 i     i `- Re: Good hash for pointers1Bonita Montero
26 May 24 i     `* Re: Good hash for pointers65Bonita Montero
26 May 24 i      `* Re: Good hash for pointers64Tim Rentsch
26 May 24 i       `* Re: Good hash for pointers63Bonita Montero
26 May 24 i        `* Re: Good hash for pointers62Tim Rentsch
26 May 24 i         +* Re: Good hash for pointers29Bonita Montero
26 May 24 i         i+- Re: Good hash for pointers1Bonita Montero
27 May 24 i         i+* Re: Good hash for pointers5Bonita Montero
28 May 24 i         ii`* Re: Good hash for pointers4Ben Bacarisse
30 May 24 i         ii `* Re: Good hash for pointers3Bonita Montero
30 May 24 i         ii  +- Re: Good hash for pointers1Bonita Montero
31 May 24 i         ii  `- Re: Good hash for pointers1Tim Rentsch
31 May 24 i         i`* Re: Good hash for pointers22Tim Rentsch
2 Jun 24 i         i `* Re: Good hash for pointers21Michael S
2 Jun 24 i         i  +* Re: Good hash for pointers2Chris M. Thomasson
3 Jun 24 i         i  i`- Re: Good hash for pointers1Chris M. Thomasson
3 Jun 24 i         i  +* Re: Good hash for pointers5Tim Rentsch
3 Jun 24 i         i  i`* Re: Good hash for pointers4Michael S
4 Jun 24 i         i  i `* Re: Good hash for pointers3Tim Rentsch
4 Jun 24 i         i  i  `* Re: Good hash for pointers2Michael S
5 Jun 24 i         i  i   `- Re: Good hash for pointers1Tim Rentsch
3 Jun 24 i         i  `* Re: Good hash for pointers13Bonita Montero
3 Jun 24 i         i   `* Re: Good hash for pointers12Michael S
3 Jun 24 i         i    `* Re: Good hash for pointers11Bonita Montero
3 Jun 24 i         i     +* Re: Good hash for pointers8bart
3 Jun 24 i         i     i+* Re: Good hash for pointers6Michael S
3 Jun 24 i         i     ii+* Re: Good hash for pointers4bart
3 Jun 24 i         i     iii+- Re: Good hash for pointers1Michael S
3 Jun 24 i         i     iii+- Re: Good hash for pointers1Michael S
4 Jun 24 i         i     iii`- Re: Good hash for pointers1Tim Rentsch
4 Jun 24 i         i     ii`- Re: Good hash for pointers1Tim Rentsch
3 Jun 24 i         i     i`- Re: Good hash for pointers1Bonita Montero
3 Jun 24 i         i     `* Re: Good hash for pointers2Michael S
3 Jun 24 i         i      `- Re: Good hash for pointers1Bonita Montero
4 Jun 24 i         `* Re: Good hash for pointers32Michael S
5 Jun 24 i          +* Re: Good hash for pointers4Bonita Montero
5 Jun 24 i          i`* Re: Good hash for pointers3Michael S
5 Jun 24 i          i `* Re: Good hash for pointers2Bonita Montero
5 Jun 24 i          i  `- Re: Good hash for pointers1Michael S
5 Jun 24 i          +* Re: Good hash for pointers17Tim Rentsch
5 Jun 24 i          i+* AES problem (was: Good hash for pointers)2Michael S
6 Jun 24 i          ii`- Re: AES problem (was: Good hash for pointers)1Tim Rentsch
5 Jun 24 i          i+* Re: Good hash for pointers11Michael S
6 Jun 24 i          ii`* Re: Good hash for pointers10Tim Rentsch
6 Jun 24 i          ii `* Re: Good hash for pointers9Michael S
17 Jun 24 i          ii  `* Re: Good hash for pointers8Tim Rentsch
17 Jun 24 i          ii   `* Re: Good hash for pointers7Michael S
18 Jun 24 i          ii    `* Re: Good hash for pointers6Tim Rentsch
19 Jun 24 i          ii     +* Re: Good hash for pointers2Keith Thompson
19 Jun 24 i          ii     i`- Re: Good hash for pointers1Tim Rentsch
19 Jun 24 i          ii     `* Re: Good hash for pointers3James Kuyper
19 Jun 24 i          ii      +- Re: Good hash for pointers1Keith Thompson
23 Jun 24 i          ii      `- Re: Good hash for pointers1Tim Rentsch
6 Jun 24 i          i`* Re: Good hash for pointers3Michael S
16 Jun 24 i          i `* Re: Good hash for pointers2Tim Rentsch
16 Jun 24 i          i  `- Re: Good hash for pointers1Chris M. Thomasson
7 Jun 24 i          `* Re: Good hash for pointers10Bonita Montero
9 Jun 24 i           `* Re: Good hash for pointers9Bonita Montero
9 Jun 24 i            +* Re: Good hash for pointers2Richard Harnden
10 Jun 24 i            `* Re: Good hash for pointers6Malcolm McLean
26 May 24 +* Re: Good hash for pointers2Bonita Montero
26 May 24 `- Re: Good hash for pointers1Bonita Montero

Haut de la page

Les messages affichés proviennent d'usenet.

NewsPortal