What's so wrong about using GC.Collect()?

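Although I do understand the serious implications of playing with this function (or at least I think I do), I fail to see why it has become one of those things that respectable programmers would never use, even those who don't know what it is for.

Let's say I'm developing an application where memory usage varies greatly depending on what the user is doing. The application life cycle can be divided into two main stages: editing and real-time processing. During the editing stage, suppose that billions or even trillions of objects are created; some of them are small and some are not, some may have finalizers and some may not, and suppose their lifetimes vary from a few milliseconds to a long time. Next, the user decides to switch to the real-time stage. At this point, suppose that performance plays a fundamental role and the slightest alteration in the program's flow could have catastrophic consequences. Object creation is then reduced to a minimum by using object pools and the like, but then the GC chimes in unexpectedly and throws it all away, and someone dies.

The question: in this case, wouldn't it be wise to call GC.Collect() before entering the second stage?

After all, these two stages never overlap in time, and all the optimizations and statistics the GC could have gathered would be of little use here...

Note: As some of you have pointed out, .NET might not be the best platform for an application like this, but that is beyond the scope of this question. The intent is to clarify whether a GC.Collect() call can improve an application's overall behaviour/performance. We all agree that the circumstances under which you would do something like this are extremely rare, but then again, the GC tries to guess and does it very well most of the time, but it is still guessing.

Thanks.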


I think you are right about the scenario, but I'm not sure about the API.

Microsoft says that in such cases you should add memory pressure as a hint to the GC that it should soon perform a collection.
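For reference, the memory-pressure hint is exposed as GC.AddMemoryPressure and GC.RemoveMemoryPressure. The documentation describes them for managed objects that hold unmanaged memory, so whether they fit the two-stage scenario in the question is an open point. Below is a minimal sketch; the NativeBuffer class and its size parameter are made up for illustration and do not come from the thread.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a native allocation; name and layout are illustrative only.
public sealed class NativeBuffer : IDisposable
{
    private readonly long _size;
    private IntPtr _ptr;

    public NativeBuffer(long size)
    {
        _size = size;
        _ptr = Marshal.AllocHGlobal(new IntPtr(size));
        // Tell the GC that this small managed object keeps `size` unmanaged bytes alive.
        GC.AddMemoryPressure(size);
    }

    public bool IsDisposed => _ptr == IntPtr.Zero;

    public void Dispose()
    {
        if (_ptr == IntPtr.Zero) return; // already released
        Marshal.FreeHGlobal(_ptr);
        _ptr = IntPtr.Zero;
        // Each AddMemoryPressure call should be balanced by an equal RemoveMemoryPressure.
        GC.RemoveMemoryPressure(_size);
        GC.SuppressFinalize(this);
    }

    // Safety net in case Dispose is never called.
    ~NativeBuffer() { Dispose(); }
}
```

Typical use would be `using (var buffer = new NativeBuffer(64 * 1024)) { ... }`. The pressure value only nudges the GC's scheduling; it does not force a collection at any particular moment.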

Well, obviously you should not write code with real-time requirements in languages with non-real-time garbage collection.

In a case with well-defined stages, there is no problem with triggering the garbage collector. But this case is extremely rare. The problem is that many developers are going to try to use this to paper over problems in a cargo-cult style, and adding it indiscriminately will cause performance problems.
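A rough sketch of what triggering the collector at a stage boundary could look like. The StageSwitch class and EnterRealTimeStage method are hypothetical names, and setting GCSettings.LatencyMode is an addition of mine rather than something suggested in this thread; it is a request to the runtime, not a guarantee.

```csharp
using System;
using System.Runtime;

public static class StageSwitch
{
    // Hypothetical hook, called once when the user leaves the editing stage.
    // Returns how many gen-2 collections happened during the call.
    public static int EnterRealTimeStage()
    {
        int before = GC.CollectionCount(GC.MaxGeneration);

        GC.Collect();                  // full blocking collection of all generations
        GC.WaitForPendingFinalizers(); // let finalizers of dead editing objects run
        GC.Collect();                  // reclaim objects that were only waiting on a finalizer

        // Ask the runtime to avoid blocking gen-2 collections where it can.
        // Whether this has any effect depends on the GC configuration.
        GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;

        return GC.CollectionCount(GC.MaxGeneration) - before;
    }
}
```

This only clears out garbage left by the first stage. Allocations made during the real-time stage can still trigger collections later, so object pooling remains necessary.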

There are situations where it's useful, but in general it should be avoided. You could compare it to GOTO, or riding a moped: you do it when you need to, but you don't tell your friends about it.

From my experience it has never been advisable to make a call to GC.Collect() in production code. In debugging, yes, it has its advantages in helping to clarify potential memory leaks. I guess my fundamental reason is that the GC has been written and optimized by programmers much smarter than I, and if I get to a point where I feel I need to call GC.Collect(), it is a clue that I have gone off the path somewhere. In your situation it doesn't sound like you actually have memory issues, just that you are concerned about what instability the collection will bring to your process. Seeing that it will not clean out objects still in use, and that it adapts very quickly to both rising and falling demands, I would think you will not have to worry about it.

Bottom line, you can profile the application and see how these additional collections affect things. I'd suggest staying away from it though unless you are going to profile. The GC is designed to take care of itself and as the runtime evolves, they may increase efficiency. You don't want a bunch of code hanging around that may muck up the works and not be able to take advantage of these improvements. There is a similar argument for using foreach instead of for, that being, that future improvements under the covers can be added to foreach and your code doesn't have to change to take advantage.

If you call GC.Collect() in production code you are essentially declaring that you know more than the authors of the GC. That may be the case. However it's usually not, and therefore strongly discouraged.

Calling GC.Collect() forces the CLR to do a stack walk to see if each object can truly be released by checking references. This will affect scalability if the number of objects is high, and has also been known to trigger garbage collection too often. Trust the CLR and let the garbage collector run itself when appropriate.

One of the biggest reasons to call GC.Collect() is when you have just performed a significant event which creates lots of garbage, such as what you describe. Calling GC.Collect() can be a good idea here; otherwise, the GC may not understand that it was a 'one time' event.

Of course, you should profile it, and see for yourself.
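One cheap way to "see for yourself" before reaching for a full profiler is to count collections and time a piece of work. This is a minimal sketch; GcProbe and Measure are hypothetical names, and the numbers it returns are only a first approximation of what a profiler would show.

```csharp
using System;
using System.Diagnostics;

public static class GcProbe
{
    // Runs `work`, then reports how many collections per generation occurred
    // while it ran and how long it took. Purely a measurement aid.
    public static (int Gen0, int Gen1, int Gen2, TimeSpan Elapsed) Measure(Action work)
    {
        int g0 = GC.CollectionCount(0);
        int g1 = GC.CollectionCount(1);
        int g2 = GC.CollectionCount(2);
        var sw = Stopwatch.StartNew();

        work();

        sw.Stop();
        return (GC.CollectionCount(0) - g0,
                GC.CollectionCount(1) - g1,
                GC.CollectionCount(2) - g2,
                sw.Elapsed);
    }
}
```

For example, `var r = GcProbe.Measure(() => GC.Collect());` shows what a forced full collection costs at that point in the program. Running the real-time stage with and without the forced collection and comparing `r.Gen2` and `r.Elapsed` gives a first hint of whether the call helps.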

From Rico's Blog...

Rule #1

Don't.

This is really the most important rule. It's fair to say that most usages of GC.Collect() are a bad idea and I went into that in some detail in the original posting so I won't repeat all that here. So let's move on to...

Rule #2

Consider calling GC.Collect() if some non-recurring event has just happened and this event is highly likely to have caused a lot of old objects to die.

A classic example of this is if you're writing a client application and you display a very large and complicated form that has a lot of data associated with it. Your user has just interacted with this form potentially creating some large objects... things like XML documents, or a large DataSet or two. When the form closes these objects are dead and so GC.Collect() will reclaim the memory associated with them...

So it sounds like this situation may fall under Rule #2, you know that there's a moment in time where a lot of old objects have died, and it's non-recurring. However, don't forget Rico's parting words.

Rule #1 should trump Rule #2 without strong evidence.

Measure, measure, measure.

Well, the GC is one of those things I have a love / hate relationship with. We have broken it in the past through VistaDB and blogged about it. They have fixed it, but it takes a LONG time to get fixes from them on things like this.

The GC is complex, and a one size fits all approach is very, very hard to pull off on something this large. MS has done a fairly good job of it, but it is possible to fool the GC at times.

In general you should not add a Collect unless you know for a fact that you just dumped a ton of memory and that it will hit a mid-life crisis if the GC doesn't clean it up now.

You can screw up the entire machine with a series of bad GC.Collect statements. The need for a collect statement almost always points to a larger underlying error. The memory leak usually has to do with references and a lack of understanding of how they work, or with using IDisposable on objects that don't need it, which puts a much higher load on the GC.

Watch closely the % of time spent in GC through the system performance counters. If you see your app using 20% or more of its time in the GC you have serious object management issues (or an abnormal usage pattern). You want to always minimize the time the GC spends because it will speed up your entire app.

It is also important to note that the GC is different on servers than on workstations. I have seen a number of small, difficult-to-track-down problems caused by people not testing both of them (or not even being aware that there are two of them).

And just to be as complete in my answer as possible, you should also test under Mono if you are targeting that platform as well. Since it is a totally different implementation, it may experience totally different problems than the MS implementation.
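To check which flavor a process is actually running with, something like the following can be logged at startup. GcInfo and Describe are hypothetical names; GCSettings.IsServerGC and GCSettings.LatencyMode are the runtime properties being read.

```csharp
using System;
using System.Runtime;

public static class GcInfo
{
    // Builds a one-line summary of the GC configuration of the current process.
    public static string Describe()
    {
        string flavor = GCSettings.IsServerGC ? "server" : "workstation";
        return $"GC flavor: {flavor}, latency mode: {GCSettings.LatencyMode}, " +
               $"max generation: {GC.MaxGeneration}";
    }
}
```

Logging `GcInfo.Describe()` on both the development machine and the deployment target makes it obvious when the two are not running the same collector.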

The .NET Framework itself was never designed to run in a realtime environment. If you truly need realtime processing you would either use an embedded realtime language that isn't based on .NET or use the .NET Compact Framework running on a Windows CE device.

What's wrong with it? The fact that you're second-guessing the garbage collector and memory allocator, which between them have a much greater idea about your application's actual memory usage at runtime than you do.

The desire to call GC.Collect() usually is trying to cover up for mistakes you made somewhere else!

It would be better if you find where you forgot to dispose stuff you didn't need anymore.

So how about when you are using COM objects like MS Word or MS Excel from .NET? Without calling GC.Collect after releasing the COM objects we have found that the Word or Excel application instances still exist.

In fact the code we use is:

Utils.ReleaseCOMObject(objExcel)


' Call the Garbage Collector twice. The GC needs to be called twice in order to get the
' Finalizers called - the first time in, it simply makes a list of what is to be finalized,
' the second time in, it actually does the finalizing. Only then will the object do its
' automatic ReleaseComObject. Note: Calling the GC is a time-consuming process,
' but one that may be necessary when automating Excel because it is the only way to
' release all the Excel COM objects referenced indirectly.
' Ref: http://www.informit.com/articles/article.aspx?p=1346865&seqNum=5
' Ref: http://support.microsoft.com/default.aspx?scid=KB;EN-US;q317109
GC.Collect()
GC.WaitForPendingFinalizers()
GC.Collect()
GC.WaitForPendingFinalizers()

So would that be an incorrect use of the garbage collector? If so how do we get the Interop objects to die? Also if it isn't meant to be used like this, why is the GC's Collect method even Public?

Under .NET, the time required to perform a garbage collection is much more strongly related to the amount of stuff that isn't garbage than to the amount of stuff that is. Indeed, unless an object overrides Finalize (either explicitly, or via a C# destructor), is the target of a WeakReference, sits on the Large Object Heap, or is special in some other GC-related way, the only thing identifying the memory in which it sits as being an object is the existence of rooted references to it. Otherwise, the GC's operation is analogous to taking everything of value from a building, dynamiting the building, building a new one on the site of the old one, and putting all the valuable items in it. The effort required to dynamite the building is totally independent of the amount of garbage within it.

Consequently, calling GC.Collect is apt to increase the overall amount of work the system has to do. It will delay the occurrence of the next collection, but will probably do just as much work immediately as the next collection would have required when it occurred; at the point when the next collection would have occurred, the total amount of time spent collecting will have been about the same as had GC.Collect not been called, but the system will have accumulated some garbage, causing the succeeding collection to be required sooner than had GC.Collect not been called.

The times I can see GC.Collect really being useful are when one needs to either measure the memory usage of some code (since memory usage figures are only really meaningful following a collection), or profile which of several algorithms is better (calling GC.Collect() before running each of several pieces of code can help ensure a consistent baseline state). There are a few other cases where one might know things the GC doesn't, but unless one is writing a single-threaded program, there's no way one can know that a GC.Collect call which would help one thread's data structures avoid a "mid-life crisis" wouldn't cause other threads' data to have a "mid-life crisis" which would otherwise have been avoided.
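The memory-measurement case can be sketched as follows. MemoryMeter and MeasureRetainedBytes are hypothetical names; GC.GetTotalMemory(true) forces a collection before reading, which is what makes the two readings comparable. The result is approximate, and other threads allocating at the same time will skew it.

```csharp
using System;

public static class MemoryMeter
{
    // Returns the approximate growth of the managed heap caused by keeping
    // the object produced by `allocate` alive.
    public static long MeasureRetainedBytes<T>(Func<T> allocate) where T : class
    {
        long before = GC.GetTotalMemory(forceFullCollection: true);
        T result = allocate();
        long after = GC.GetTotalMemory(forceFullCollection: true);
        GC.KeepAlive(result); // keep the result reachable until after the second reading
        return after - before;
    }
}
```

For instance, `MemoryMeter.MeasureRetainedBytes(() => new byte[10000000])` should report roughly ten million bytes plus a small object header.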

The worst it will do is make your program freeze for a bit. So if that's OK with you, do it. Usually it's not needed for thick client or web apps with mostly user interaction.

I have found that sometimes programs with long-running threads, or batch programs, will get an OutOfMemory exception even though they are disposing objects properly. One I recall was a line-of-business database transaction processing job; the other was an indexing routine on a background thread in a thick client app.

In both cases, the result was simple: No GC.Collect, out of memory, consistently; GC.Collect, flawless performance.

I've tried it to solve memory problems several other times, to no avail. I took it out.

In short, don't put it in unless you're getting errors. If you put it in and it doesn't fix the memory problem, take it back out. Remember to test in Release mode and compare apples to apples.

The only time things can go wrong with this is when you get moralistic about it. It's not a values issue; many programmers have died and gone straight to heaven with many unnecessary GC.Collects in their code, which outlives them.

Creating images in a loop: even if you call Dispose, the memory is not recovered. Garbage collect every time. I went from 1.7GB of memory in my photo processing app to 24MB, and performance is excellent.

There are absolutely times when you need to call GC.Collect.

In fact, I don't think it is a very bad practice to call GC.Collect.
There may be cases when we need that. For instance, I have a form which runs a thread, which in turn opens different tables in a database, extracts the contents of a BLOB field to a temp file, encrypts the file, then reads the file into a binary stream and back into a BLOB field in another table.

The whole operation takes quite a lot of memory, and the number of rows and the size of the file content in the tables are not known in advance.

I used to get an OutOfMemoryException often, and I thought it would be wise to periodically run GC.Collect based on a counter variable. I increment a counter and, when a specified level is reached, the GC is called to collect any garbage that may have formed and to reclaim any memory lost due to unforeseen memory leaks.

After this, I think it is working well, at least no exception!
I call it in the following way:

object obj = this; // the object utilizing the memory; in my case, the Form itself
GC.Collect(GC.GetGeneration(obj), GCCollectionMode.Optimized);

We had a similar issue with the garbage collector not collecting garbage and freeing up memory.

In our program, we were processing some modest-sized Excel spreadsheets with OpenXML. The spreadsheets contained anywhere from 5 to 10 "sheets" with about 1000 rows of 14 columns.

The program in a 32 bit environment (x86) would crash with an "out of memory" error. We did get it to run in an x64 environment, but we wanted a better solution.

We found one.

Here are some simplified code fragments of what didn't work and what did work when it comes to explicitly calling the Garbage Collector to free up memory from disposed objects.

Calling the GC from inside the subroutine didn't work. Memory was never reclaimed...

For Each Sheet In Spreadsheets
    ProcessSheet(FileName, Sheet)
Next


Private Sub ProcessSheet(ByVal Filename As String, ByVal Sheet As String)
    ' open the spreadsheet
    Using SLDoc As SLDocument = New SLDocument(Filename, Sheet)
        ' do some work....
        SLDoc.Save()
    End Using
    GC.Collect()
    GC.WaitForPendingFinalizers()
    GC.Collect()
    GC.WaitForPendingFinalizers()
End Sub

By moving the GC call outside the scope of the subroutine, the garbage was collected and the memory was freed up.

For Each Sheet In Spreadsheets
    ProcessSheet(FileName, Sheet)
    GC.Collect()
    GC.WaitForPendingFinalizers()
    GC.Collect()
    GC.WaitForPendingFinalizers()
Next


Private Sub ProcessSheet(ByVal Filename As String, ByVal Sheet As String)
    ' open the spreadsheet
    Using SLDoc As SLDocument = New SLDocument(Filename, Sheet)
        ' do some work....
        SLDoc.Save()
    End Using
End Sub
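One plausible explanation, which the post itself does not confirm, is that in non-optimized (Debug) builds a method's locals can stay rooted until the method returns, so a collection requested inside the method cannot reclaim what they reference. The C# sketch below shows the general pattern of collecting from the caller; ScopeDemo and its methods are hypothetical names, and the result depends on the runtime and build settings.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ScopeDemo
{
    // Allocates inside its own stack frame and hands back only a weak reference,
    // so no strong reference survives once the method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference AllocateInCallee()
    {
        var data = new byte[1024];
        return new WeakReference(data);
    }

    // Collects from the caller, after the callee's locals are gone.
    public static bool IsCollectedAfterReturn()
    {
        WeakReference weak = AllocateInCallee();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return !weak.IsAlive;
    }
}
```

On the Microsoft runtimes, `ScopeDemo.IsCollectedAfterReturn()` usually returns true, which matches the observation that the memory was only released once the collection was requested outside the subroutine.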

I hope this helps others that are frustrated with the .NET garbage collection when it appears to ignore the calls to GC.Collect().

Paul Smith

Nothing is wrong with explicitly calling for a collection. Some people just really want to believe that if it is a service provided by the vendor, don't question it. Oh, and all of those random freezes at the wrong moments of your interactive application? The next version will make it better!

Letting a background process deal with memory manipulation means not having to deal with it ourselves, true. But this does not logically mean that it is best for us to not deal with it ourselves under all circumstances. The GC is optimized for most cases. But this does not logically mean that it is optimized in all cases.

Have you ever answered an open question such as 'which is the best sorting algorithm' with a definitive answer? If so, don't touch the GC. For those of you who asked for the conditions, or gave 'in this case' type answers, you may proceed to learn about the GC and when to activate it.

Gotta say, I've had application freezes in Chrome and Firefox that frustrate the hell out of me, and even then in some cases the memory grows unhindered. If only they'd learn to call the garbage collector, or gave me a button so that as I begin to read the text of a page I can hit it and thus be free of freezes for the next 20 minutes.