As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)
A hash code is useful for storing an object in a hash-based collection such as a HashSet. Letting a class define its own hash code, ideally one that distinguishes unequal objects well, allows the HashSet algorithm to work efficiently.
Object itself uses the object's identity (typically described as its address in memory), which distinguishes every instance. That is not very useful when two different objects (for example, two strings with the same characters) should be considered the same even though they are duplicated in memory.
The default hashCode implementation is typically described as returning the internal address of the object in the JVM, as a 32-bit integer. Thus, two different (in memory) objects will usually have different hash codes.
This is consistent with the default implementation of equals. If you want to override equals for your objects, you will have to adapt hashCode so that they are consistent.
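As a minimal sketch of keeping the two methods consistent, the hypothetical Point class below (not from the original answers) derives both equals and hashCode from the same fields, so equal points always produce the same hash code:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class used only for illustration.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        // Uses exactly the fields that equals compares.
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        Set<Point> points = new HashSet<>();
        points.add(new Point(1, 2));
        // A different instance with the same coordinates is found.
        System.out.println(points.contains(new Point(1, 2))); // true
    }
}
```

Objects.hash is only one convenient way to combine fields; any function that depends solely on the fields used by equals would satisfy the contract.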
The implementation of hashCode() may differ from class to class but the contract for hashCode() is very specific and stated clearly and explicitly in the Javadocs:
Returns a hash code value for the object. This method is supported for the benefit of hashtables such as those provided by java.util.Hashtable.
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)
Two objects with different hash codes must not be equal with regard to equals():
a.hashCode() != b.hashCode() must imply !a.equals(b)
However, two objects that are not equal with regard to equals() can have the same hash code. Storing these objects in a set or map will become less efficient if many objects have the same hash code.
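To illustrate that unequal objects may share a hash code, here is a small sketch using two strings whose hash codes happen to collide under the algorithm documented for String.hashCode. The class and method names are made up for this example:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; only String.hashCode and HashSet are real APIs.
final class CollisionDemo {
    // Returns how many distinct elements a HashSet keeps for the given keys.
    static int distinctCount(String... keys) {
        Set<String> set = new HashSet<>(Arrays.asList(keys));
        return set.size();
    }

    public static void main(String[] args) {
        String a = "Aa";
        String b = "BB";
        // 'A'*31 + 'a' = 65*31 + 97 = 2112 and 'B'*31 + 'B' = 66*31 + 66 = 2112
        System.out.println(a.hashCode() == b.hashCode()); // true
        System.out.println(a.equals(b));                  // false
        // Both strings are kept: the set falls back to equals() within a bucket.
        System.out.println(distinctCount(a, b));          // 2
    }
}
```

The collision is harmless for correctness; it only means both keys land in the same bucket and have to be told apart with equals().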
If hashCode is not overridden, Object's hashCode is called. Here is an excerpt from its Javadoc:
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)
Not really an answer but adding to my earlier comment
The internal address of an object cannot be guaranteed to remain unchanged in the JVM, because the garbage collector might move the object during heap compaction.
I tried to do something like this:
import java.util.LinkedList;
import java.util.List;

public class HashCodeStability {
    public static void main(String[] args) {
        final Object object = new Object();
        while (true) {
            int hash = object.hashCode();
            int x = 0;
            Runtime r = Runtime.getRuntime();
            List<Object> list = new LinkedList<Object>();
            // Allocate until less than 30% of the heap is free, to put pressure on the GC.
            while (r.freeMemory() / (double) r.totalMemory() > 0.3) {
                Object p = new Object();
                list.add(p);
                x += object.hashCode(); // ensure the optimizer or JIT won't remove this
            }
            System.out.println(x);
            list.clear();
            r.gc(); // request a collection, which may move the object
            if (object.hashCode() != hash) {
                System.out.println("Voila!");
                break;
            }
        }
    }
}
But the hash code indeed doesn't change... Can someone tell me how Sun's JDK actually implements Object.hashCode?
It returns a number that is typically displayed as a 6-digit hexadecimal value. This is usually taken to be the memory location of the slot where the object is addressed. As for the algorithm itself, I guess the JDK uses double hashing (in a native implementation), which is one of the better hashing schemes for open addressing because it greatly reduces the likelihood of collisions.
You must override hashCode in every class that overrides equals. Failure to do so will result in a violation of the general contract for Object.hashCode, which will prevent your class from functioning properly in conjunction with all hash-based collections, including HashMap, HashSet, and Hashtable.
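A short sketch of what goes wrong when this rule is ignored: the hypothetical BrokenKey class below overrides equals but inherits Object.hashCode, so two equal keys will, in practice, almost always get different hash codes and a HashSet lookup misses:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class that deliberately violates the hashCode contract.
final class BrokenKey {
    private final String name;

    BrokenKey(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BrokenKey && name.equals(((BrokenKey) o).name);
    }

    // hashCode() is intentionally NOT overridden, so the identity-based
    // Object.hashCode() is used and differs per instance in practice.

    public static void main(String[] args) {
        BrokenKey first = new BrokenKey("a");
        BrokenKey second = new BrokenKey("a");
        Set<BrokenKey> set = new HashSet<>();
        set.add(first);

        System.out.println(first.equals(second)); // true
        System.out.println(set.contains(first));  // true: same instance, same hash
        // Very likely false: the equal key hashes to a different value.
        System.out.println(set.contains(second));
    }
}
```

Adding a hashCode override that returns name.hashCode() would make the last lookup succeed reliably.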
In the HotSpot JVM, by default, the first invocation of a non-overridden Object.hashCode or of System.identityHashCode generates a random number and stores it in the object header. Subsequent calls to Object.hashCode or System.identityHashCode just extract this value from the header. By default it has nothing in common with the object's content or location; it is just a random number. This behavior is controlled by the -XX:hashCode=n HotSpot JVM option, which has the following possible values:
0: use a global random generator. This is the default setting in Java 7. It has the disadvantage that concurrent calls from multiple threads may cause a race condition that results in the same hashCode being generated for different objects. Also, in a highly concurrent environment, delays are possible due to contention (different CPU cores using the same memory region).
5: use a thread-local xor-shift random generator, which is free from the previous disadvantages. This is the default setting in Java 8.
1: use the object pointer mixed with a random value that is changed on "stop-the-world" events, so between stop-the-world events (like garbage collection) generated hashCodes are stable (for testing/debugging purposes)
2: always use 1 (for testing/debugging purposes)
3: use auto-incrementing numbers (for testing/debugging purposes; a global counter is also used, so contention and race conditions are possible)
4: use the object pointer trimmed to 32 bits if necessary (for testing/debugging purposes)
Note that even if you set -XX:hashCode=4, the hashCode will not always point to the object address. The object may be moved later, but its hashCode will stay the same. Also, object addresses are poorly distributed (if your application uses little memory, most objects will be located close to each other), so you may end up with unbalanced hash tables if you use this option.
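The stability described above can be observed from plain Java with System.identityHashCode. The following is a minimal sketch with hypothetical class names; System.gc() is only a request, so it does not prove that the object was actually moved:

```java
// Hypothetical demo class; System.identityHashCode and System.gc are real APIs.
final class IdentityHashDemo {
    // A class that overrides hashCode based on content.
    static final class Named {
        private final String name;

        Named(String name) {
            this.name = name;
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    public static void main(String[] args) {
        Object plain = new Object();
        // For a class that does not override hashCode, both calls agree.
        System.out.println(plain.hashCode() == System.identityHashCode(plain)); // true

        int before = System.identityHashCode(plain);
        System.gc(); // a hint only; the collector may or may not relocate objects
        // The identity hash stays the same for the lifetime of the object.
        System.out.println(before == System.identityHashCode(plain)); // true

        Named n = new Named("x");
        // The override is content-based; identityHashCode bypasses it.
        System.out.println(n.hashCode() == "x".hashCode()); // true
        System.out.println(System.identityHashCode(n));
    }
}
```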
The default hashCode() implementation has nothing to do with the object's memory address.
In OpenJDK versions 6 and 7 it is a randomly generated number. In 8 and 9, it is a number based on the thread state.
The result of identity hash generation (the value returned by the default implementation of the hashCode() method) is generated once and cached in the object's header.
If you want to learn more about this you can go through OpenJDK which defines entry points for hashCode() at
src/share/vm/prims/jvm.h
and
src/share/vm/prims/jvm.cpp
If you go through the files above, you will find hundreds of lines of functions that are fairly complicated to follow. To simplify, a naive way to represent the default hashCode implementation is something like this:
if (obj.hash() == 0) {
    obj.set_hash(generate_new_hash());
}
return obj.hash();
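The same generate-once-then-cache idea can be sketched in Java. This is only an illustration with made-up names (LazyIdentityHash, identityHash), not the JVM's actual code: the real value lives in the object header and is produced by native code, while here a plain field and ThreadLocalRandom stand in for both.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical model of lazy identity-hash assignment; not HotSpot code.
final class LazyIdentityHash {
    // Stands in for the hash bits of the object header; 0 means "not assigned yet".
    private int hash;

    int identityHash() {
        if (hash == 0) {
            int h;
            do {
                // Assumption: any non-zero, non-negative random value will do here.
                h = ThreadLocalRandom.current().nextInt() & 0x7fffffff;
            } while (h == 0);
            hash = h; // cached so every later call returns the same value
        }
        return hash;
    }

    public static void main(String[] args) {
        LazyIdentityHash obj = new LazyIdentityHash();
        int first = obj.identityHash();
        // Later calls return the cached value, wherever the object lives in memory.
        System.out.println(first == obj.identityHash()); // true
    }
}
```

This sketch is not thread-safe; it only mirrors the control flow of the pseudocode above.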