Java 的 String 常量池在哪里,堆还是堆?

我知道常量池和 JVM 用来处理字符串文字的 String 常量池的概念。但是我不知道 JVM 使用哪种类型的内存来存储 String 常量文本。堆栈还是堆?由于它是一个与任何实例都没有关联的文字,我假设它将被存储在堆栈中。但是如果它没有被任何实例引用,则文字必须由 GC run 收集(如果我错了请纠正我) ,那么如果它存储在堆栈中,该如何处理它呢?

107295 次浏览

The answer is technically neither. According to the Java Virtual Machine Specification, the area for storing string literals is in the runtime constant pool. The runtime constant pool memory area is allocated on a per-class or per-interface basis, so it's not tied to any object instances at all. The runtime constant pool is a subset of the method area which "stores per-class structures such as the runtime constant pool, field and method data, and the code for methods and constructors, including the special methods used in class and instance initialization and interface type initialization". The VM spec says that although the method area is logically part of the heap, it doesn't dictate that memory allocated in the method area be subject to garbage collection or other behaviors that would be associated with normal data structures allocated to the heap.

String literals are not stored on the stack. Never. In fact, no objects are stored on the stack.

String literals (or more accurately, the String objects that represent them) are were historically stored in a Heap called the "permgen" heap. (Permgen is short for permanent generation.)

Under normal circumstances, String literals and much of the other stuff in the permgen heap are "permanently" reachable, and are not garbage collected. (For instance, String literals are always reachable from the code objects that use them.) However, you can configure a JVM to attempt to find and collect dynamically loaded classes that are no longer needed, and this may cause String literals to be garbage collected.

CLARIFICATION #1 - I'm not saying that Permgen doesn't get GC'ed. It does, typically when the JVM decides to run a Full GC. My point is that String literals will be reachable as long as the code that uses them is reachable, and the code will be reachable as long as the code's classloader is reachable, and for the default classloaders, that means "for ever".

CLARIFICATION #2 - In fact, Java 7 and later uses the regular heap to hold the string pool. Thus, String objects that represent String literals and intern'd strings are actually in the regular heap. (See @assylias's Answer for details.)


But I am still trying to find out thin line between storage of string literal and string created with new.

There is no "thin line". It is really very simple:

  • String objects that represent / correspond to string literals are held in the string pool.
  • String objects that were created by a String::intern call are held in the string pool.
  • All other String objects are NOT held in the string pool.

Then there is the separate question of where the string pool is "stored". Prior to Java 7 it was the permgen heap. From Java 7 onwards it is the main heap.

As explained by this answer, the exact location of the string pool is not specified and can vary from one JVM implementation to another.

It is interesting to note that until Java 7, the pool was in the permgen space of the heap on hotspot JVM but it has been moved to the main part of the heap since Java 7:

Area: HotSpot
Synopsis: In JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change will result in more data residing in the main Java heap, and less data in the permanent generation, and thus may require heap sizes to be adjusted. Most applications will see only relatively small differences in heap usage due to this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences. RFE: 6962931

And in Java 8 Hotspot, Permanent Generation has been completely removed.

String pooling

String pooling (sometimes also called as string canonicalisation) is a process of replacing several String objects with equal value but different identity with a single shared String object. You can achieve this goal by keeping your own Map (with possibly soft or weak references depending on your requirements) and using map values as canonicalised values. Or you can use String.intern() method which is provided to you by JDK.

At times of Java 6 using String.intern() was forbidden by many standards due to a high possibility to get an OutOfMemoryException if pooling went out of control. Oracle Java 7 implementation of string pooling was changed considerably. You can look for details in http://bugs.sun.com/view_bug.do?bug_id=6962931 and http://bugs.sun.com/view_bug.do?bug_id=6962930.

String.intern() in Java 6

In those good old days all interned strings were stored in the PermGen – the fixed size part of heap mainly used for storing loaded classes and string pool. Besides explicitly interned strings, PermGen string pool also contained all literal strings earlier used in your program (the important word here is used – if a class or method was never loaded/called, any constants defined in it will not be loaded).

The biggest issue with such string pool in Java 6 was its location – the PermGen. PermGen has a fixed size and can not be expanded at runtime. You can set it using -XX:MaxPermSize=96m option. As far as I know, the default PermGen size varies between 32M and 96M depending on the platform. You can increase its size, but its size will still be fixed. Such limitation required very careful usage of String.intern – you’d better not intern any uncontrolled user input using this method. That’s why string pooling at times of Java 6 was mostly implemented in the manually managed maps.

String.intern() in Java 7

Oracle engineers made an extremely important change to the string pooling logic in Java 7 – the string pool was relocated to the heap. It means that you are no longer limited by a separate fixed size memory area. All strings are now located in the heap, as most of other ordinary objects, which allows you to manage only the heap size while tuning your application. Technically, this alone could be a sufficient reason to reconsider using String.intern() in your Java 7 programs. But there are other reasons.

String pool values are garbage collected

Yes, all strings in the JVM string pool are eligible for garbage collection if there are no references to them from your program roots. It applies to all discussed versions of Java. It means that if your interned string went out of scope and there are no other references to it – it will be garbage collected from the JVM string pool.

Being eligible for garbage collection and residing in the heap, a JVM string pool seems to be a right place for all your strings, isn’t it? In theory it is true – non-used strings will be garbage collected from the pool, used strings will allow you to save memory in case then you get an equal string from the input. Seems to be a perfect memory saving strategy? Nearly so. You must know how the string pool is implemented before making any decisions.

source.

To the great answers that already included here I want to add something that missing in my perspective - an illustration.

As you already JVM divides the allocated memory to a Java program into two parts. one is stack and another one is heap. Stack is used for execution purpose and heap is used for storage purpose. In that heap memory, JVM allocates some memory specially meant for string literals. This part of the heap memory is called string constants pool.

So for example, if you init the following objects:

String s1 = "abc";
String s2 = "123";
String obj1 = new String("abc");
String obj2 = new String("def");
String obj3 = new String("456);

String literals s1 and s2 will go to string constant pool, objects obj1, obj2, obj3 to the heap. All of them, will be referenced from the Stack.

Also, please note that "abc" will appear in heap and in string constant pool. Why is String s1 = "abc" and String obj1 = new String("abc") will be created this way? It's because String obj1 = new String("abc") explicitly creates a new and referentially distinct instance of a String object and String s1 = "abc" may reuse an instance from the string constant pool if one is available. For a more elaborate explanation: https://stackoverflow.com/a/3298542/2811258

enter image description here

As other answers explain Memory in Java is divided into two portions

1. Stack: One stack is created per thread and it stores stack frames which again stores local variables and if a variable is a reference type then that variable refers to a memory location in heap for the actual object.

2. Heap: All kinds of objects will be created in heap only.

Heap memory is again divided into 3 portions

1. Young Generation: Stores objects which have a short life, Young Generation itself can be divided into two categories Eden Space and Survivor Space.

2. Old Generation: Store objects which have survived many garbage collection cycles and still being referenced.

3. Permanent Generation: Stores metadata about the program e.g. runtime constant pool.

String constant pool belongs to the permanent generation area of Heap memory.

We can see the runtime constant pool for our code in the bytecode by using javap -verbose class_name which will show us method references (#Methodref), Class objects ( #Class ), string literals ( #String )

runtime-constant-pool

You can read more about it on my article How Does JVM Handle Method Overloading and Overriding Internally.