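Tool for analyzing large Java heap dumps

I have a HotSpot JVM heap dump that I would like to analyze. The VM ran with -Xmx31g, and the heap dump file is 48 GB.

  • I won't even try jhat, as it requires about five times the heap memory (that would be 240 GB in my case) and is awfully slow.
  • Eclipse MAT crashes with an ArrayIndexOutOfBoundsException after analyzing the heap dump for several hours.

What other tools are available for this task? A suite of command-line tools would be best: one program that transforms the heap dump into efficient data structures for analysis, plus several other tools that work on the pre-structured data.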


The accepted answer to this related question should provide a good start for you (if you have access to the running process, it generates live jmap histograms instead of heap dumps, which is very fast):

Method for finding memory leak in large Java heap dumps
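
The live-histogram approach could be sketched roughly as follows. This is only an illustration, assuming two outputs of jmap -histo:live <pid> have been saved to files some time apart; the function name histo_growth and the file names are made up for the example, not part of any tool.

```shell
#!/bin/sh
# histo_growth BEFORE AFTER (hypothetical helper)
# Prints per-class byte growth between two saved `jmap -histo` outputs,
# largest growth first. Classes that shrank or stayed equal are omitted.
histo_growth() {
  awk '
    # Data rows look like: "   1:   12345   6789012  java.lang.String"
    # Header, separator and "Total" rows do not match and are skipped.
    $1 ~ /^[0-9]+:$/ {
      if (FILENAME == ARGV[1]) before[$4] = $3   # bytes in first snapshot
      else                     after[$4]  = $3   # bytes in second snapshot
    }
    END {
      for (c in after) {
        delta = after[c] - before[c]             # missing class counts as 0
        if (delta > 0) printf "%.0f %s\n", delta, c
      }
    }
  ' "$1" "$2" | sort -rn
}

# Possible usage (the pid and file names are placeholders):
#   jmap -histo:live <pid> > before.txt
#   ... let the application run for a while ...
#   jmap -histo:live <pid> > after.txt
#   histo_growth before.txt after.txt | head -n 20
```

Classes that keep showing up at the top across several such comparisons are reasonable first suspects.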

Most other heap analysers (I use IBM HeapAnalyzer, http://www.alphaworks.ibm.com/tech/heapanalyzer) require RAM exceeding the heap size by at least some percentage if you're expecting a nice GUI tool.

Other than that, many developers use alternative approaches, like live stack analysis to get an idea of what's going on.

That said, I must question why your heap is so large. The effect on allocation and garbage collection must be massive. I'd bet a large percentage of what's in your heap should actually be stored in a database, a persistent cache, or something similar.

Normally, what I use is ParseHeapDump.sh, which is included with Eclipse Memory Analyzer and described here, and I run it on one of our more beefed-up servers (download and copy over the Linux .zip distro, then unzip it there). The shell script needs fewer resources than parsing the heap from the GUI, plus you can run it on your beefy server with more resources: you can allocate more by adding something like -vmargs -Xmx40g -XX:-UseGCOverheadLimit to the end of the last line of the script. For instance, the last line of that file might look like this after modification:

./MemoryAnalyzer -consolelog -application org.eclipse.mat.api.parse "$@" -vmargs -Xmx40g -XX:-UseGCOverheadLimit

Run it like ./path/to/ParseHeapDump.sh ../today_heap_dump/jvm.hprof

After that succeeds, it creates a number of "index" files next to the .hprof file.

After creating the indices, I try to generate reports from them, scp those reports to my local machine, and see whether I can find the culprit from the reports alone (just the reports, not the indices). Here's a tutorial on creating the reports.

Example report:

./ParseHeapDump.sh ../today_heap_dump/jvm.hprof org.eclipse.mat.api:suspects

Other report options:

org.eclipse.mat.api:overview and org.eclipse.mat.api:top_components
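
As a rough sketch, the three reports named above could be generated in one go with a small wrapper. The function name run_mat_reports and the MAT_PARSE variable are hypothetical conveniences for this example; only ParseHeapDump.sh and the report IDs come from the answer itself.

```shell
#!/bin/sh
# MAT_PARSE: path to MAT's ParseHeapDump.sh (hypothetical variable; adjust to
# wherever you unzipped MAT).
MAT_PARSE="${MAT_PARSE:-./ParseHeapDump.sh}"

# run_mat_reports DUMP (hypothetical helper)
# Runs ParseHeapDump.sh once per report, using the same invocation form as
# the example above. Stops at the first failing run.
run_mat_reports() {
  dump="$1"
  for report in suspects overview top_components; do
    "$MAT_PARSE" "$dump" "org.eclipse.mat.api:$report" || return 1
  done
}

# Possible usage:
#   run_mat_reports ../today_heap_dump/jvm.hprof
```

In my understanding, once the index files exist the later runs reuse them instead of re-parsing the dump, and each report ends up as a .zip next to the .hprof file; treat both points as assumptions to verify on your MAT version.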

If those reports are not enough and I need to dig further (e.g. via OQL), I scp the indices as well as the hprof file to my local machine, and then open the heap dump (with the indices in the same directory as the heap dump) in my Eclipse MAT GUI. From there, it does not need too much memory to run.

EDIT: I'd just like to add two notes:

  • As far as I know, only the generation of the indices is the memory intensive part of Eclipse MAT. After you have the indices, most of your processing from Eclipse MAT would not need that much memory.
  • Doing this with a shell script means I can do it on a headless server (and I normally do, because those tend to be the most powerful machines). And if you have a server that can generate a heap dump of that size, chances are you have another server out there that can process a heap dump of that size as well.

I suggest trying YourKit. It usually needs a little less memory than the heap dump size (it indexes the dump and uses that information to retrieve what you want).

A not-so-well-known tool, http://dr-brenschede.de/bheapsampler/, works well for large heaps. It works by sampling, so it doesn't have to read the entire dump, though it is a bit finicky.

First step: increase the amount of RAM you are allocating to MAT. By default it's not very much and it can't open large files.

If you use MAT on a Mac (OS X), you'll find the MemoryAnalyzer.ini file in MemoryAnalyzer.app/Contents/MacOS. Adjusting that file did not work for me; the changes would not "take". You can instead create a modified startup command or shell script based on the content of that file and run it from that directory. In my case I wanted a 20 GB heap:

./MemoryAnalyzer -vmargs -Xmx20g -XX:-UseGCOverheadLimit ... other params desired

Just run this command or script from the Contents/MacOS directory in a terminal to start the GUI with more RAM available.

This person (http://blog.ragozin.info/2015/02/programatic-heapdump-analysis.html) wrote a custom "heap analyzer" that just exposes a "query style" interface through the heap dump file, instead of actually loading the file into memory.

https://github.com/aragozin/heaplib

Though I don't know whether its "query language" is better than the Eclipse MAT OQL mentioned in the accepted answer here.

This is not a command-line solution, but I like the tools:

Copy the heap dump to a server large enough to host it. It is quite possible that the original server can be used.

Enter the server via ssh -X to run the graphical tool remotely and use jvisualvm from the Java binary directory to load the .hprof file of the heap dump.

The tool does not load the complete heap dump into memory at once, but loads parts when they are required. Of course, if you look around enough in the file the required memory will finally reach the size of the heap dump.

Try JProfiler. It works well for analyzing large .hprof files; I have tried it with a file of around 22 GB.

https://www.ej-technologies.com/products/jprofiler/overview.html

$499/dev license but has a free 10 day evaluation

I came across an interesting tool called JXray. It offers a limited evaluation trial license. I found it very useful for finding memory leaks. You may want to give it a shot.

The latest snapshot build of Eclipse Memory Analyzer has a facility to randomly discard a certain percentage of objects to reduce memory consumption and allow the remaining objects to be analyzed. See Bug 563960 and the nightly snapshot build to test this facility before it is included in the next release of MAT. Update: it is now included in released version 1.11.0.

When the problem can be "easily" reproduced, one unmentioned alternative is to take heap dumps before memory grows that big (e.g., jmap -dump:format=b,file=heap.bin <pid>).

In many cases you will already get an idea of what's going on without waiting for an OOM.
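
One way to take such early dumps at regular intervals is sketched below. It is an illustration only: the function name dump_series, the JMAP variable, and the heap-N.bin file naming are assumptions made for the example; the jmap options are the ones shown above.

```shell
#!/bin/sh
# JMAP: jmap binary to use (hypothetical variable, defaults to jmap on PATH).
JMAP="${JMAP:-jmap}"

# dump_series PID COUNT INTERVAL_SECONDS (hypothetical helper)
# Takes COUNT heap dumps of process PID, waiting INTERVAL_SECONDS between
# them, and writes them to heap-1.bin, heap-2.bin, ... in the current
# directory. Stops at the first failing dump.
dump_series() {
  pid="$1"; count="$2"; interval="$3"
  i=1
  while [ "$i" -le "$count" ]; do
    "$JMAP" "-dump:format=b,file=heap-$i.bin" "$pid" || return 1
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"          # no need to wait after the last dump
    fi
    i=$((i + 1))
  done
}

# Possible usage: three dumps, ten minutes apart
#   dump_series <pid> 3 600
```

Keep in mind that each dump pauses the application while it is written, so the interval should be chosen with that in mind.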

In addition, MAT provides a feature to compare different snapshots, which can come in handy (see https://stackoverflow.com/a/55926302/898154 for instructions and a description).