Subject : Re: Good hash for pointers
From : malcolm.arthur.mclean (at) *nospam* gmail.com (Malcolm McLean)
Newsgroups : comp.lang.c
Date : 26. May 2024, 20:10:00
Organisation : A noiseless patient Spider
Message-ID : <v301ec$3hcrr$1@dont-email.me>
References : 1 2
User-Agent : Mozilla Thunderbird
On 26/05/2024 19:06, Bonita Montero wrote:
> On 23.05.2024 at 13:11, Malcolm McLean wrote:
>>
>> What is a good hash function for pointers to use in portable ANSI C?
>>
>> The pointers are nodes of a tree, which are read only, and I want to associate read/write data with them. So potentially a large number of pointers, and they might be consecutively ordered if they are taken from an array, or they might be returned from repeated calls to malloc() with small allocations. Obviously I have no control over pointer size or internal representation.
>>
> Can you explain this? The only purpose I'm aware of for hashing
> pointers is taking them as keys for an API talking to external
> code. And an operating system kernel could take pointers as handles
> and verify that they're legal with a hashtable lookup.
>
Yes.
I have an XML parser, which creates a tree of XML data, with child and sibling nodes. XML documents can be very large, so this tree could contain a lot of nodes.
Now I am implementing an XPath query function. XPath takes an expression like "//book/author/name" and picks out all nodes with the element tag "name" which are children of a node with the element tag "author" and grandchildren of a node with the element tag "book".
Now the algorithm I have chosen requires me to associate some temporary data with each node as workspace (specifically, I mark a node as deleted). However, the nodes have no "deleted" member.
And we shouldn't have to change the nodes to write secondary code that works on the structures.
So the obvious thing to do is to just use a hash table with pointer values as keys. I can then associate as much data as I want with a node, without having to hack and change the parser. But I'll need that data for every node each time I traverse the tree. So the hash table lookup is going to be the most expensive, rate-limiting step.
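For what it's worth, one portable way to hash a pointer is to run FNV-1a over its object representation, byte by byte. This is just a sketch (the constants are the standard 32-bit FNV parameters, and I haven't benchmarked it against your workload); it assumes nothing about pointer size or layout beyond what ANSI C guarantees:

```c
#include <string.h>
#include <stddef.h>

/* Sketch: hash a pointer's object representation with FNV-1a.
   We copy the pointer value itself (not what it points to) into a
   byte array via memcpy, so no integer<->pointer conversion is needed. */
static unsigned long hash_pointer(const void *p)
{
    unsigned char bytes[sizeof p];
    unsigned long h = 2166136261UL;   /* 32-bit FNV offset basis */
    size_t i;

    memcpy(bytes, &p, sizeof p);      /* representation of the pointer value */
    for (i = 0; i < sizeof p; i++) {
        h ^= bytes[i];
        h *= 16777619UL;              /* 32-bit FNV prime */
    }
    return h;
}
```

You would then index the table with hash_pointer(node) % nbuckets. The byte-wise mixing matters because malloc'd nodes tend to share alignment, so the low bits of the raw pointer are poor on their own.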
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm