Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Hashtable/hashing-function question

Status
Not open for further replies.

DoraC

Programmer
May 7, 2002
98
US
Hi,

I've been forced to rely on HashMaps, Hashtables, etc. without, unfortunately, a good understanding of how they work... And, there's not much literature (either comprehensible or available) I've been able to find.

I understand a hash table to be a contiguous section of memory, apportioned into "bins". Provided with an object to be stored in the table, a hashing function calculates a key which determines where in the table the object is to be stored - the key calculated such that 1: collisions are minimized and 2: lookup times are small.

is this correct? And - how can I reconcile this with the fact that, in Java, one can provide keys - keys which can have no relation whatsoever to the value? Presumably, in the class HashMap, it is the "value" objects that are hashed but what, then, is the role of the hashing function if I can (and do!) specify my own key????? This is confusing..

Thank you! please forgive my idle curiosity...
dora
 
Java allows (or requires, depending on if you like it or not) you to provide the key values for the HashMaps and Hashtables. The hashCode() method may be used internally by these collections - I really never thought about it. I think HashMap and Hashtable are really misnomers.

You should basically not worry about hash values and think of the HashMaps and Hashtables as indexed collections that allow you to retrieve a value associated with a key quickly. These collections try to ensure that it takes about the same time to retrieve any value in the collection (as opposed to an array or ArrayList, which must be searched resulting in varied access times depending on the position of the desired element in the list).

Contiguous memory should not be assumed, and you shouldn't worry about it either - java manages memory for you. Specifying an initial size will improve performance if there are lots of elements to add, but in most cases even this is unnecessary.





"When you have eliminated the impossible, whatever remains, however
improbable, must be the truth." ~ Arthur Conan Doyle
 
Thanks for your thoughts...

so let me "think out loud"... if Java tries to ensure that search times are roughly equivalent for all elements stored in a Hashtable/HashMap this would imply some sort of sorting. Thus, the values must have sortable keys - but not every key used is going to be sortable (they won't necessarily implement the Comparable interface, or have another means of being ordered). So - perhaps hashCode() is used on the key Object, and this value is associated with the Value object. Thus, since hashcodes are sortable (I assume), given a key one could: 1) calculate hashcode 2) find corresponding bin in associated array (should be fast, since hashcode corresponds to array index) 3) retrieve value. In this case, you get collisions on hashcodes generated on the keys, not the values.

It would be really interesting to see the function used to map hashcodes to array indices (assuming the hashcode isn't the index value itself - I assume it would be something modulus number-of-bins???)

Again, forgive my idle thoughts. I can't help my curiosity, this stuff is interesting... :)
dora
 
>> I can't help my curiosity, this stuff is interesting

On that note you might be interested in knowing that Hashtable and HashMap pre-date the Sortable interface in the Java API. So at least, they used to be implemented with out it. How they are implemented today? I don't know.

-pete

 
It's interesting to note that the HashMap API specifically warns that the elements are not in any order if you iterate through them. So that would imply an index tree, rather than a sorted array.

My jdk1.3 installation has source code with it, so I peeked into HashMap.java and you're right - they use hashCode() on the key object as an index.




"When you have eliminated the impossible, whatever remains, however
improbable, must be the truth." ~ Arthur Conan Doyle
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top