Re: Good hash for pointers

Liste des GroupesRevenir à cl c  
Sujet : Re: Good hash for pointers
De : tr.17687 (at) *nospam* z991.linuxsc.com (Tim Rentsch)
Groupes : comp.lang.c
Date : 25. May 2024, 10:12:28
Autres entêtes
Organisation : A noiseless patient Spider
Message-ID : <86fru6gsqr.fsf@linuxsc.com>
References : 1 2 3 4
User-Agent : Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
bart <bc@freeuk.com> writes:

On 24/05/2024 19:57, Malcolm McLean wrote:
>
On 24/05/2024 19:28, Bonita Montero wrote:
>
Am 23.05.2024 um 13:11 schrieb Malcolm McLean:
>
What is a good hash function for pointers to use in portable ANSI C?
>
The pointers are nodes of a tree, which are read only, and I want
to associate read/write data with them.  So potentially a lage
number of pointers,and they might be consecutively ordered if they
are taken from an array, or they might be returned from repeared
calls to malloc() with small allocations.  Obviously I have no
control over pointer size or internal representation.
>
Use FNV.
>
Here's an attempt.
>
/* FNV hash of a pointer */
static unsigned int hash(void *address)
{
  int i;
  unsigned long answer = 2166136261;
  unsigned char *byte = (unsigned char *) &address;
>
  for (i = 0; i < sizeof(void *); i++)
  {
  answer *= 16777619;
  answer ^= byte[i];
  }
  return (unsigned int) (answer & 0xFFFFFFFF);
}
>
Now what will compilers make of that?
>
Compiler, or performance?
>
I tried this function with the test program shown below.  I used it to
populate a hash table of 64K entries with pointers from successive
calls to malloc.
>
Results, in terms of clashes, for different numbers N of entries
populated out of 64K were:
>
 10000     1100
 30000    12000
 50000    67000
 60000   216000
 65535  5500000    (largest N possible)
>
Result were rather variable as malloc produces different patterns of
pointers on different runs.  These were simply the result from the
first run.
>
Was this good?  I'd no idea.  But as a comparison, I used my own hash
function, normally used to hash identifiers, shown below the main
program as the function 'bchash'.
>
If this is subsituted instead, the results were:
>
 10000      230
 30000     3800
 50000    10300
 60000    50300
 65535  2700000
>
Hash tables need a certain amount of free capacity to stay efficient,
so 3/4 full (about 50K/64K) is about right.
>
Again I don't know if these figures are good, they are just better
than from your hash() function, for the inputs I used, within this
test, and for this size of table.
>
No doubt there are much better ones.
>
>
>
------------------------------------------
#include <stdio.h>
#include <stdlib.h>
>
static unsigned int hash(void *address)
{
    int i;
    unsigned long answer = 2166136261;
    unsigned char *byte = (unsigned char *) &address;
>
    for (i = 0; i < sizeof(void *); i++)
    {
        answer *= 16777619;
        answer ^= byte[i];
    }
    return (unsigned int) (answer & 0xFFFFFFFF);
}
>
void* table[65536];
>
int main(void) {
    void* p;
>
    int clashes=0, wrapped;
    int j;
>
    for (int i=0; i<30000; ++i) {
        p = malloc(1);
        j = hash(p) & 65535;
>
        wrapped=0;
        while (table[j]) {
            ++clashes;
            ++j;
            if (j>65535) {
                if (wrapped) { puts("Table full"); exit(1);}
                wrapped=1;
                j=0;
            }
        }
        table[j] = p;
>
    }
    printf("Clashes %d\n", clashes);
}
>
>
------------------------------------------
>
static unsigned int bchash(void *address)
{
    int i;
    unsigned long hsum = 0;
    unsigned char *byte = (unsigned char *) &address;
>
    for (i = 0; i < sizeof(void *); i++)   {
        hsum = (hsum<<4) - hsum + byte[i];
    }
    return (hsum<<5) - hsum;
}

It looks like your hash function was tuned for this testing
setup.  With different choices for testing it does much
worse.

The testing is done with malloc() blocks all of the same
requested size, and that size is 1.  Typical workloads
are likely to be both larger and more variable.

When adding a new entry finds a collision with an old
entry, linear probing is used to find a free slot.  It's
well understood that linear probing suffers badly as the
load factor goes up.  Better to take a few high bits of
the hash value -- as few as five or six is fine -- to
have the reprobe sequences have different strides.

Your hash function is expensive to compute, moreso even
than the "FNV" function shown earlier.  In a case like
this one where the compares are cheap, it's better to
have a dumb-but-fast hash function that might need a
few more looks to find an open slot, because the cost
of looking is so cheap compared to computing the hash
function.

Date Sujet#  Auteur
23 May 24 * Good hash for pointers111Malcolm McLean
23 May 24 +* Re: Good hash for pointers3Richard Harnden
24 May 24 i`* Re: Good hash for pointers2Keith Thompson
24 May 24 i `- Re: Good hash for pointers1Malcolm McLean
23 May 24 +- Re: Good hash for pointers1Kaz Kylheku
24 May 24 +* Re: Good hash for pointers16Tim Rentsch
24 May 24 i`* Re: Good hash for pointers15Malcolm McLean
24 May 24 i `* Re: Good hash for pointers14Tim Rentsch
24 May 24 i  `* Re: Good hash for pointers13Malcolm McLean
24 May 24 i   +* Re: Good hash for pointers11Tim Rentsch
24 May 24 i   i`* Re: Good hash for pointers10bart
24 May 24 i   i +* Re: Good hash for pointers3Malcolm McLean
24 May 24 i   i i`* Re: Good hash for pointers2Chris M. Thomasson
24 May 24 i   i i `- Re: Good hash for pointers1Chris M. Thomasson
24 May 24 i   i +* Re: Good hash for pointers3Tim Rentsch
24 May 24 i   i i`* Re: Good hash for pointers2Malcolm McLean
25 May 24 i   i i `- Re: Good hash for pointers1Tim Rentsch
24 May 24 i   i `* Re: Good hash for pointers3David Brown
24 May 24 i   i  `* Re: Good hash for pointers2Malcolm McLean
24 May 24 i   i   `- Re: Good hash for pointers1Michael S
24 May 24 i   `- Re: Good hash for pointers1David Brown
24 May 24 +* Re: Good hash for pointers3Chris M. Thomasson
24 May 24 i`* Re: Good hash for pointers2Chris M. Thomasson
24 May 24 i `- Re: Good hash for pointers1jak
24 May 24 +* Re: Good hash for pointers84Bonita Montero
24 May 24 i`* Re: Good hash for pointers83Malcolm McLean
25 May 24 i `* Re: Good hash for pointers82bart
25 May 24 i  `* Re: Good hash for pointers81Tim Rentsch
25 May 24 i   +* Re: Good hash for pointers4bart
25 May 24 i   i`* Re: Good hash for pointers3Tim Rentsch
25 May 24 i   i `* Re: Good hash for pointers2bart
26 May 24 i   i  `- Re: Good hash for pointers1Tim Rentsch
25 May 24 i   `* Re: Good hash for pointers76Bonita Montero
25 May 24 i    `* Re: Good hash for pointers75Tim Rentsch
25 May 24 i     +* Re: Good hash for pointers9Malcolm McLean
25 May 24 i     i+- Re: Good hash for pointers1Tim Rentsch
25 May 24 i     i+* Re: Good hash for pointers5Janis Papanagnou
26 May 24 i     ii`* Re: Good hash for pointers4Malcolm McLean
26 May 24 i     ii `* Re: Good hash for pointers3bart
26 May 24 i     ii  +- Re: Good hash for pointers1Richard Harnden
27 May 24 i     ii  `- Re: Good hash for pointers1Kaz Kylheku
26 May 24 i     i`* Re: Good hash for pointers2Bonita Montero
26 May 24 i     i `- Re: Good hash for pointers1Bonita Montero
26 May 24 i     `* Re: Good hash for pointers65Bonita Montero
26 May 24 i      `* Re: Good hash for pointers64Tim Rentsch
26 May 24 i       `* Re: Good hash for pointers63Bonita Montero
26 May 24 i        `* Re: Good hash for pointers62Tim Rentsch
26 May 24 i         +* Re: Good hash for pointers29Bonita Montero
26 May 24 i         i+- Re: Good hash for pointers1Bonita Montero
27 May 24 i         i+* Re: Good hash for pointers5Bonita Montero
28 May 24 i         ii`* Re: Good hash for pointers4Ben Bacarisse
30 May 24 i         ii `* Re: Good hash for pointers3Bonita Montero
30 May 24 i         ii  +- Re: Good hash for pointers1Bonita Montero
31 May 24 i         ii  `- Re: Good hash for pointers1Tim Rentsch
31 May 24 i         i`* Re: Good hash for pointers22Tim Rentsch
2 Jun 24 i         i `* Re: Good hash for pointers21Michael S
2 Jun 24 i         i  +* Re: Good hash for pointers2Chris M. Thomasson
3 Jun 24 i         i  i`- Re: Good hash for pointers1Chris M. Thomasson
3 Jun 24 i         i  +* Re: Good hash for pointers5Tim Rentsch
3 Jun 24 i         i  i`* Re: Good hash for pointers4Michael S
4 Jun 24 i         i  i `* Re: Good hash for pointers3Tim Rentsch
4 Jun 24 i         i  i  `* Re: Good hash for pointers2Michael S
5 Jun 24 i         i  i   `- Re: Good hash for pointers1Tim Rentsch
3 Jun 24 i         i  `* Re: Good hash for pointers13Bonita Montero
3 Jun 24 i         i   `* Re: Good hash for pointers12Michael S
3 Jun 24 i         i    `* Re: Good hash for pointers11Bonita Montero
3 Jun 24 i         i     +* Re: Good hash for pointers8bart
3 Jun 24 i         i     i+* Re: Good hash for pointers6Michael S
3 Jun 24 i         i     ii+* Re: Good hash for pointers4bart
3 Jun 24 i         i     iii+- Re: Good hash for pointers1Michael S
3 Jun 24 i         i     iii+- Re: Good hash for pointers1Michael S
4 Jun 24 i         i     iii`- Re: Good hash for pointers1Tim Rentsch
4 Jun 24 i         i     ii`- Re: Good hash for pointers1Tim Rentsch
3 Jun 24 i         i     i`- Re: Good hash for pointers1Bonita Montero
3 Jun 24 i         i     `* Re: Good hash for pointers2Michael S
3 Jun 24 i         i      `- Re: Good hash for pointers1Bonita Montero
4 Jun 24 i         `* Re: Good hash for pointers32Michael S
5 Jun 24 i          +* Re: Good hash for pointers4Bonita Montero
5 Jun 24 i          i`* Re: Good hash for pointers3Michael S
5 Jun 24 i          i `* Re: Good hash for pointers2Bonita Montero
5 Jun 24 i          i  `- Re: Good hash for pointers1Michael S
5 Jun 24 i          +* Re: Good hash for pointers17Tim Rentsch
5 Jun 24 i          i+* AES problem (was: Good hash for pointers)2Michael S
6 Jun 24 i          ii`- Re: AES problem (was: Good hash for pointers)1Tim Rentsch
5 Jun 24 i          i+* Re: Good hash for pointers11Michael S
6 Jun 24 i          ii`* Re: Good hash for pointers10Tim Rentsch
6 Jun 24 i          ii `* Re: Good hash for pointers9Michael S
17 Jun 24 i          ii  `* Re: Good hash for pointers8Tim Rentsch
17 Jun 24 i          ii   `* Re: Good hash for pointers7Michael S
18 Jun 24 i          ii    `* Re: Good hash for pointers6Tim Rentsch
19 Jun 24 i          ii     +* Re: Good hash for pointers2Keith Thompson
19 Jun 24 i          ii     i`- Re: Good hash for pointers1Tim Rentsch
19 Jun 24 i          ii     `* Re: Good hash for pointers3James Kuyper
19 Jun 24 i          ii      +- Re: Good hash for pointers1Keith Thompson
23 Jun 24 i          ii      `- Re: Good hash for pointers1Tim Rentsch
6 Jun 24 i          i`* Re: Good hash for pointers3Michael S
16 Jun 24 i          i `* Re: Good hash for pointers2Tim Rentsch
16 Jun 24 i          i  `- Re: Good hash for pointers1Chris M. Thomasson
7 Jun 24 i          `* Re: Good hash for pointers10Bonita Montero
9 Jun 24 i           `* Re: Good hash for pointers9Bonita Montero
9 Jun 24 i            +* Re: Good hash for pointers2Richard Harnden
10 Jun 24 i            `* Re: Good hash for pointers6Malcolm McLean
26 May 24 +* Re: Good hash for pointers2Bonita Montero
26 May 24 `- Re: Good hash for pointers1Bonita Montero

Haut de la page

Les messages affichés proviennent d'usenet.

NewsPortal