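Does assigning objects to null in Java impact garbage collection?

Does assigning an unused object reference to null in Java improve the garbage collection process in any measurable way?

My experience with Java (and C#) has taught me that trying to outsmart the virtual machine or JIT compiler is usually counterproductive, but I've seen co-workers use this method, and I am curious whether it is a good practice to pick up or one of those voodoo programming superstitions.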


Yes.

From "The Pragmatic Programmer" p.292:

By setting a reference to NULL you reduce the number of pointers to the object by one ... (which will allow the garbage collector to remove it)

At least in Java, it's not voodoo programming at all. When you create an object in Java using something like

    Foo bar = new Foo();

you do two things: first, you create a reference to an object, and second, you create the Foo object itself. So long as that reference or another exists, the specific object can't be gc'd. However, when you assign null to that reference...

    bar = null;

and assuming nothing else has a reference to the object, it becomes eligible for gc the next time the garbage collector passes by.
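One way to observe the difference between "eligible" and "collected" is to watch the object through a WeakReference, which does not keep it alive. The following is a minimal sketch, not part of the original answer: the ReachabilityDemo class and its stillAlive helper are hypothetical names, and System.gc() is only a hint, so the final line may print either value.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo; Foo stands in for the Foo class used in the answer above.
class ReachabilityDemo {
    static class Foo {}

    // True while the object behind the weak reference has not been cleared.
    static boolean stillAlive(WeakReference<?> probe) {
        return probe.get() != null;
    }

    public static void main(String[] args) {
        Foo bar = new Foo();                                  // strong reference
        WeakReference<Foo> probe = new WeakReference<>(bar);  // does not keep Foo alive

        System.out.println("while referenced: " + (probe.get() == bar));

        bar = null;   // drop the only strong reference: Foo is now merely eligible
        System.gc();  // a request, not a command; the JVM may ignore it

        // Frequently false on HotSpot after the hint above, but not guaranteed.
        System.out.println("after nulling, still alive: " + stillAlive(probe));
    }
}
```

Nulling the reference only changes reachability; when the memory is actually reclaimed is up to the collector.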

Typically, no.

But like all things: it depends. The GC in Java these days is VERY good, and everything should be cleaned up very shortly after it is no longer reachable. For local variables that is just after leaving the method; for fields, it is when the class instance is no longer referenced.

You only need to explicitly null a reference if you know it would otherwise remain referenced, for example an element of an array that is kept around. You may want to null the individual elements of the array when they are no longer needed.

For example, this code from ArrayList:

    public E remove(int index) {
        RangeCheck(index);

        modCount++;
        E oldValue = (E) elementData[index];

        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // Let gc do its work

        return oldValue;
    }
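The same pattern applies to any container you write yourself that manages its own backing array. Here is a minimal sketch under that assumption; SimpleStack and rawSlot are hypothetical names (not JDK classes), and rawSlot exists only so the effect can be inspected.

```java
import java.util.Arrays;

// Hypothetical array-backed stack, not a JDK class.
class SimpleStack<E> {
    private Object[] elements = new Object[4];
    private int size = 0;

    void push(E e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2); // grow when full
        }
        elements[size++] = e;
    }

    @SuppressWarnings("unchecked")
    E pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        E top = (E) elements[--size];
        elements[size] = null; // the array outlives the element, so clear the slot
        return top;
    }

    // Exposed only so a demo can inspect the backing array.
    Object rawSlot(int index) {
        return elements[index];
    }
}
```

Without the null assignment in pop(), the array would keep a stale reference to every popped element until that slot happened to be overwritten.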

Also, explicitly nulling an object will not cause an object to be collected any sooner than if it just went out of scope naturally as long as no references remain.

Both:

    void foo() {
        Object o = new Object();
        // do stuff with o
    }

and:

    void foo() {
        Object o = new Object();
        // do stuff with o
        o = null;
    }

are functionally equivalent.

A good article on this is today's Coding Horror post.

GCs work by looking for objects that nothing points to; the area they search is the heap, the stack, and any other spaces they track. So if you set a variable to null, and nothing else points to the object, the object can be GC'd.

But since the GC might not run at that exact instant, you might not actually be buying yourself anything. But if your method is fairly long (in terms of execution time) it might be worth it since you will be increasing your chances of GC collecting that object.

The problem is further complicated by code optimizations: if you never use the variable after you set it to null, removing the line that sets it to null is a safe optimization (one less instruction to execute). So you might not actually be getting any improvement.

So in summary, yes it can help, but it will not be deterministic.

"It depends"

I do not know about Java, but in .NET (C#, VB.NET...) it is usually not required to assign null when you no longer need an object.

However note that it is "usually not required".

By analyzing your code, the .NET compiler makes a good evaluation of a variable's lifetime, so it can accurately tell when the object is no longer being used. If you write obj = null, it might actually look as if obj is still being used, in which case assigning null is counterproductive.

There are a few cases where assigning null might actually help: for example, a large block of code that runs for a long time, a method running on a different thread, or a long loop. In such cases, assigning null can make it easier for the GC to know the object is no longer in use.

There is no hard and fast rule for this. Going by the above, place null assignments in your code and run a profiler to see whether they help in any way. Most probably you will not see a benefit.
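Before reaching for a full profiler, a very rough first check in Java is to compare used heap before and after the change. The sketch below assumes that approach; HeapProbe and usedHeapBytes are hypothetical names, the 16 MiB size is arbitrary, and because System.gc() is only a hint, the numbers are indicative at best.

```java
// Hypothetical helper for a rough before/after comparison; not a profiler.
class HeapProbe {
    // Heap currently in use, as reported by the runtime.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();

        byte[] big = new byte[16 * 1024 * 1024]; // 16 MiB, an arbitrary size
        big[0] = 1;                               // touch it so it is really used
        long during = usedHeapBytes();

        big = null;   // the variant under test; remove this line for the baseline
        System.gc();  // a hint only; the JVM may ignore it
        long after = usedHeapBytes();

        System.out.printf("before=%d during=%d after=%d%n", before, during, after);
    }
}
```

Run both variants several times; a difference that does not show up consistently is probably noise.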

If it is .NET code you are trying to optimize, my experience has been that taking good care with Dispose and Finalize methods is actually more beneficial than worrying about nulls.

Some references on the topic:

http://blogs.msdn.com/csharpfaq/archive/2004/03/26/97229.aspx

http://weblogs.asp.net/pwilson/archive/2004/02/20/77422.aspx

I assume the OP is referring to things like this:

    private void Blah()
    {
        MyObj a;
        MyObj b;

        try {
            a = new MyObj();
            b = new MyObj();

            // do real work
        } finally {
            a = null;
            b = null;
        }
    }

In this case, wouldn't the VM mark them for GC as soon as they leave scope anyway?

Or, from another perspective, would explicitly setting the items to null cause them to get GC'd sooner than if they just went out of scope? If so, the VM may spend time GC'ing the object when the memory isn't needed anyway, which would actually cause worse performance in terms of CPU usage, because it would be collecting more, and earlier.

It depends.

Generally speaking, the shorter you keep references to your objects, the faster they'll get collected.

If your method takes say 2 seconds to execute and you don't need an object anymore after one second of method execution, it makes sense to clear any references to it. If GC sees that after one second, your object is still referenced, next time it might check it in a minute or so.

Anyway, setting all references to null by default is, to me, premature optimization, and nobody should do it except in specific rare cases where it measurably decreases memory consumption.

In my experience, more often than not, people null out references out of paranoia not out of necessity. Here is a quick guideline:

  1. If object A references object B and you no longer need this reference and object A is not eligible for garbage collection then you should explicitly null out the field. There is no need to null out a field if the enclosing object is getting garbage collected anyway. Nulling out fields in a dispose() method is almost always useless.

  2. There is no need to null out object references created in a method. They will get cleared automatically once the method terminates. The exception to this rule is if you're running in a very long method or some massive loop and you need to ensure that some references get cleared before the end of the method. Again, these cases are extremely rare.
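The first guideline can be sketched as follows. This is an illustrative example only: DocumentEditor and its methods are hypothetical names, and the assumption is that the editor object lives far longer than the snapshot it holds.

```java
// Hypothetical long-lived object; the names are illustrative only.
class DocumentEditor {
    private byte[] undoSnapshot; // potentially large; only needed until the next save

    // Keep a copy of the previous content so the edit can be undone.
    void edit(byte[] previousContent) {
        undoSnapshot = previousContent.clone();
    }

    // The editor stays alive after saving, so the field would otherwise
    // pin the snapshot in memory; release it explicitly.
    void save() {
        undoSnapshot = null;
    }

    boolean canUndo() {
        return undoSnapshot != null;
    }
}
```

If DocumentEditor itself were about to become unreachable, the assignment in save() would buy nothing, which is the point of the guideline.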

I would say that the vast majority of the time you will not need to null out references. Trying to outsmart the garbage collector is useless. You will just end up with inefficient, unreadable code.

Explicitly setting a reference to null, instead of just letting the variable go out of scope, does not help the garbage collector unless the object held is very large, in which case setting it to null as soon as you are done with it is a good idea.

Generally, setting a reference to null tells the READER of the code that this object is completely done with and need not be a concern any more.

A similar effect can be achieved by introducing a narrower scope by putting in an extra set of braces

    {
        int l;
        {  // <- here
            String bigThing = ....;
            l = bigThing.length();
        }  // <- and here
    }

This allows bigThing to be garbage collected right after leaving the nested braces.

    public class JavaMemory {
        private final int dataSize = (int) (Runtime.getRuntime().maxMemory() * 0.6);

        public void f() {
            {
                byte[] data = new byte[dataSize];
                //data = null;
            }

            byte[] data2 = new byte[dataSize];
        }

        public static void main(String[] args) {
            JavaMemory jmp = new JavaMemory();
            jmp.f();
        }
    }

The program above throws an OutOfMemoryError. If you uncomment data = null;, the OutOfMemoryError goes away. It is always good practice to set unused variables to null.
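An alternative that avoids the explicit null is to move the first allocation into its own method, so the reference disappears together with that method's stack frame. The sketch below rests on an assumption: that the error in the example comes from the local variable slot for data still holding the array when data2 is allocated, which may depend on the JVM and on whether the code is interpreted or JIT-compiled. ScopedAllocation and its methods are hypothetical names, and sizes are passed in so it can be tried with small values.

```java
// Hypothetical variant of the JavaMemory example above.
class ScopedAllocation {
    // The first buffer is a local of this helper; once it returns,
    // no stack slot in the caller refers to the array.
    static int firstPhase(int size) {
        byte[] data = new byte[size];
        data[0] = 42; // touch the buffer (size must be at least 1)
        return data.length;
    }

    static int secondPhase(int size) {
        byte[] data2 = new byte[size];
        return data2.length;
    }

    // Runs both phases and returns the total number of bytes allocated.
    static int run(int size) {
        int a = firstPhase(size);  // helper frame is gone after this call
        int b = secondPhase(size);
        return a + b;
    }
}
```

Whether this or data = null; reads better is a matter of taste; both remove the last reference before the second allocation.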

Even if nullifying the reference were marginally more efficient, would it be worth the ugliness of having to pepper your code with these nullifications? They would only be clutter and obscure the intent of the code that contains them.

It's a rare codebase that has no better candidate for optimisation than trying to outsmart the garbage collector (rarer still are developers who succeed in outsmarting it). Your efforts will most likely be better spent elsewhere: ditching that crufty XML parser, or finding some opportunity to cache computation. These optimisations will be easier to quantify and don't require you to dirty up your codebase with noise.

I was working on a video conferencing application one time and noticed a huge huge huge difference in performance when I took the time to null references as soon as I didn't need the object anymore. This was in 2003-2004 and I can only imagine the GC has gotten even smarter since. In my case I had hundreds of objects coming and going out of scope every second, so I noticed the GC when it kicked in periodically. However after I made it a point to null objects the GC stopped pausing my application.

So it depends on what you're doing...

In the future execution of your program, the values of some data members will be used to compute an output visible outside the program. Others might or might not be used, depending on future (and impossible to predict) inputs to the program. Other data members might be guaranteed not to be used. All resources, including memory, allocated to that unused data are wasted. The job of the garbage collector (GC) is to eliminate that wasted memory. It would be disastrous for the GC to eliminate something that was needed, so the algorithm used might be conservative, retaining more than the strict minimum. It might use heuristic optimizations to improve its speed, at the cost of retaining some items that are not actually needed. There are many potential algorithms the GC might use. It is therefore possible that changes you make to your program, which do not affect its correctness, might nevertheless affect the operation of the GC, either making it run faster to do the same job or letting it identify unused items sooner. So this kind of change, setting an unused object reference to null, is in theory not always voodoo.

Is it voodoo? There are reportedly parts of the Java library code that do this. The writers of that code are much better than average programmers and either know, or cooperate with, programmers who know details of the garbage collector implementations. So that suggests there is sometimes a benefit.

As you said, there are optimizations: the JVM knows where a variable was last used, and the object it references can be GC'd right after that point (while still executing in the current scope). So nulling out references does not help the GC in most cases.

But it can be useful for avoiding the "nepotism" (or "floating garbage") problem. The problem exists because the heap is split into Old and Young generations, with different GC mechanisms applied to each: Minor GC (which is fast and runs often to clean the Young gen) and Major GC (which causes a longer pause to clean the Old gen). "Nepotism" prevents garbage in the Young gen from being collected if it is referenced by garbage that was already tenured to the Old gen.

This is 'pathological' because ANY promoted node will result in the promotion of ALL following nodes until a GC resolves the issue.

To avoid nepotism, it's a good idea to null out references from an object that is about to be removed. You can see this technique applied in JDK classes such as LinkedList and LinkedHashMap:

    private E unlinkFirst(Node<E> f) {
        final E element = f.item;
        final Node<E> next = f.next;
        f.item = null;
        f.next = null; // help GC
        // ...
    }
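The same idea can be applied to a hand-written linked structure. The following is a minimal sketch, loosely modeled on the excerpt above rather than copied from the JDK; TinyQueue, Node, and peekNode are hypothetical names, and peekNode exists only so the severed links can be inspected.

```java
// Hypothetical minimal FIFO queue; not a JDK class.
class TinyQueue<E> {
    static final class Node<E> {
        E item;
        Node<E> next;
        Node(E item) { this.item = item; }
    }

    private Node<E> head;
    private Node<E> tail;

    void add(E item) {
        Node<E> n = new Node<>(item);
        if (tail == null) {
            head = n;
        } else {
            tail.next = n;
        }
        tail = n;
    }

    E poll() {
        Node<E> f = head;
        if (f == null) {
            return null; // empty queue
        }
        E element = f.item;
        head = f.next;
        if (head == null) {
            tail = null;
        }
        // Sever the removed node's outgoing references so that, if it was
        // already tenured, it cannot keep younger nodes or items alive.
        f.item = null;
        f.next = null;
        return element;
    }

    // Exposed only so a demo can inspect a node before it is removed.
    Node<E> peekNode() {
        return head;
    }
}
```

The queue behaves identically without the two null assignments; they only affect what a removed node can keep reachable.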

The Oracle documentation advises "Assign null to Variables That Are No Longer Needed": https://docs.oracle.com/cd/E19159-01/819-3681/abebi/index.html