Sujet : Re: Good hash for pointers
De : already5chosen (at) *nospam* yahoo.com (Michael S)
Groupes : comp.lang.cDate : 06. Jun 2024, 09:00:09
Autres entêtes
Organisation : A noiseless patient Spider
Message-ID : <20240606110009.00001096@yahoo.com>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
User-Agent : Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)
On Wed, 05 Jun 2024 21:40:27 -0700
Tim Rentsch <
tr.17687@z991.linuxsc.com> wrote:
Michael S <already5chosen@yahoo.com> writes:
On Wed, 05 Jun 2024 08:58:46 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
I did get your own hash function out and put it into my little
test rig. There may be summary comments below.
>
<snip>
>
As you probably already paid attention, I like bottom lines.
What is a bottom line here?
The bottom line is both the multiply-by-a-large-prime hashes
should be avoided.
>
I am not convinced.
So far I had seen no disadvantages for particular application mentioned
by Malcolm.
Of course, it is no good for Malcolm, because it can't be coded in what
Malcolm erroneously call "ANSI C". It is not easy to be coded even in
ISO C11. But in my own practice it does not matter. I happily use
language extensions, like gnu __int128 or Microsoft's __umulh() when
they are the best tool for a job.
Did you you encounter cases in which almost-bonita's-but-not-quite
hash function performed badly?
Yes. A clear example is using directly mapped hash table that
has a power of two elements, with inputs all from malloc( 48 )
calls. All keys map to one of only 1024 entries, in a hash
table that has 65536 slots.
I don't see it.
$ ./a 0x10000 0xC000 48
ref: 49152 keys, 65536 slots, 34590 occupied 11295 collisions. Worst
collision: 7
uut: 49152 keys, 65536 slots, 35791 occupied 11872 collisions. Worst
collision: 3
It sounds like in your tests you don't use 'linear scaling to table
size' for translation of hash value to table index. IMHO, you should.
May be, for some particular hash functions other methods of translation
also work, but I see no reasons to use other methods. This one is quite
fast, robust and looks to me like the most logical. For power-of-2 sized
tables it degenerates into simple right shift.