Since String in Java (as in some other languages) consumes a lot of memory because each character takes two bytes, Java 8 introduced a new feature called String Deduplication. It takes advantage of the fact that the char arrays are internal to strings and final, so the JVM can safely swap them behind the scenes.
I have read this example so far, but since I am not a professional Java coder, I am having a hard time grasping the concept.
Here is what it says:
Various strategies for String Duplication have been considered, but the one implemented now follows the following approach: Whenever the garbage collector visits String objects it takes note of the char arrays. It takes their hash value and stores it alongside with a weak reference to the array. As soon as it finds another String which has the same hash code it compares them char by char. If they match as well, one String will be modified and point to the char array of the second String. The first char array then is no longer referenced anymore and can be garbage collected.
This whole process of course brings some overhead, but is controlled by tight limits. For example if a string is not found to have duplicates for a while it will be no longer checked.
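To check my understanding of the quoted description, here is a minimal sketch of how I picture the process in plain Java. It is only a model: the real work happens inside the garbage collector, and the class name `DedupTableSketch` and the method `deduplicate` are made up for illustration. It keeps a table from hash value to weak references of char arrays, and on a hash match compares the arrays char by char before sharing one of them.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical model of the deduplication table; not the JVM's actual code.
class DedupTableSketch {
    // Hash of a char array -> weak references to arrays already seen with that hash.
    private final Map<Integer, List<WeakReference<char[]>>> table = new HashMap<>();

    // Returns the array a "string" should point to after deduplication:
    // an equal array seen earlier, or the given array if none exists yet.
    char[] deduplicate(char[] value) {
        int hash = Arrays.hashCode(value);
        List<WeakReference<char[]>> bucket =
                table.computeIfAbsent(hash, k -> new ArrayList<>());
        for (Iterator<WeakReference<char[]>> it = bucket.iterator(); it.hasNext(); ) {
            char[] seen = it.next().get();
            if (seen == null) {
                it.remove();                      // referent was already collected
            } else if (Arrays.equals(seen, value)) {
                return seen;                      // same hash and same chars: share it
            }
        }
        bucket.add(new WeakReference<>(value));   // first array with this content
        return value;
    }

    public static void main(String[] args) {
        DedupTableSketch t = new DedupTableSketch();
        char[] a = "hello".toCharArray();
        char[] b = "hello".toCharArray();         // equal content, different array
        System.out.println(t.deduplicate(a) == a); // true: first occurrence is kept
        System.out.println(t.deduplicate(b) == a); // true: b would be replaced by a
    }
}
```

In this model, once `b` is replaced by `a`, nothing references the second array any more, which seems to match the "can be garbage collected" part of the quote.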
My first question:
There is still a lack of resources on this topic since it was only recently added in Java 8 update 20. Could anyone here share some practical examples of how it helps reduce the memory consumed by String in Java?
Edit:
The above link says:
As soon as it finds another String which has the same hash code it compares them char by char
My second question:
If the hash codes of two Strings are the same, then the Strings are already the same, so why compare them char by char once it is found that the two Strings have the same hash code?
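To test that premise I tried a small sketch with `String.hashCode()`; the class name `HashCollisionDemo` and its helper method are made up for illustration. With these two inputs the hash codes come out equal even though the contents differ, so an equal hash code on its own may not be enough to call two Strings the same.

```java
// Hypothetical demo class; only String.hashCode() and String.equals() are real API.
class HashCollisionDemo {
    // True when two strings share a hash code but differ in content.
    static boolean sameHashDifferentContent(String a, String b) {
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());    // 2112 (65 * 31 + 97)
        System.out.println("BB".hashCode());    // 2112 (66 * 31 + 66)
        System.out.println("Aa".equals("BB"));  // false
        System.out.println(sameHashDifferentContent("Aa", "BB")); // true
    }
}
```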