Debugging Java Memory issues: Thread and Heap Dump Analysis

Cross posted to sbvkrishna.me/debugging-java-memory-issues

Java's garbage collector (GC) handles memory management for us automatically: it determines which objects a Java application no longer uses, deallocates them, and reclaims their memory for reuse. Even so, we often fall prey to crippling memory leaks, which degrade software performance in unpredictable ways and can even bring down an entire system.

Monitoring the JVM's GC activity and heap usage, and collecting thread or heap dumps when required, helps identify and root-cause common Java memory issues. Memory leaks usually show up as an OutOfMemoryError (OOM), but not always. For example, if the application tries to allocate memory for new objects after the configured maximum heap size is exhausted (whether because of a leak or because the heap is simply too small), the error looks like:

java.lang.OutOfMemoryError: Java heap space
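Heap usage can also be sampled from inside the application before it reaches that point. The following is a minimal sketch using the standard java.lang.management API; the class name HeapUsageProbe and its helper methods are made up for illustration and are not part of any tool mentioned in this post.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, for illustration only.
class HeapUsageProbe {

    /** Returns the current heap usage as reported by the JVM. */
    static MemoryUsage heapUsage() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage();
    }

    /** Fraction of the maximum heap in use, or -1 if the JVM reports no maximum. */
    static double usedFraction(MemoryUsage usage) {
        long max = usage.getMax(); // -1 when the maximum is undefined
        return max <= 0 ? -1.0 : (double) usage.getUsed() / max;
    }

    public static void main(String[] args) {
        MemoryUsage usage = heapUsage();
        System.out.printf("heap used=%d MiB, committed=%d MiB, max=%d MiB%n",
                usage.getUsed() >> 20, usage.getCommitted() >> 20, usage.getMax() >> 20);
        System.out.printf("used fraction of max: %.2f%n", usedFraction(usage));
    }
}
```

A used fraction that keeps climbing across GC cycles and never drops back is a common hint that a leak may be present, although it is not proof on its own.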

Working with Java Heap Dumps

Java stores its objects on the heap. Collecting and analysing heap dumps is a great way to identify and root-cause many memory-related issues.

Collecting Heap Dumps

Java has a built-in feature for dumping heap snapshots to files in the HPROF binary format. We can create .hprof memory snapshots on demand, or configure the JVM to create heap dumps automatically, which helps with uncovering inefficient heap usage and debugging memory leaks.

We can collect heap dumps for a Java application in two ways:

  1. Automatically on OutOfMemoryError

    • If the Java application is started with the VM flag -XX:+HeapDumpOnOutOfMemoryError, a heap dump is written on the first OOM error. The flag adds no overhead unless an OOM actually occurs, so it is recommended to enable it on all production systems; the dump comes in handy if and when a memory issue arises.

    • Note that the dump file can be huge (up to gigabytes), so ensure that the target file system has enough space. If the application hits OOM errors frequently and creates heap dumps often, the disk may fill up quickly, causing further cascading issues!

  2. On-demand via jmap utility (OracleJDK/OpenJDK HotSpot)

    • The jmap utility, found in the JDK's bin directory, can take a heap dump of a running Java process on demand.

    • Find the ProcessId (PID) for the Java application by running ps aux | grep java.

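    • Run the jmap command below to write a heap dump of the process <PID> to <file_name>.hprof. The -F option forces the dump when the process does not respond; it is a JDK 8 option and is no longer accepted by jmap in later JDK releases, so drop it there.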

        jmap -F -dump:format=b,file=<file_name>.hprof <PID>
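A heap dump can also be triggered from inside a HotSpot JVM, which may be useful when shell access to the host is not available. The sketch below assumes a HotSpot-based JDK, since it relies on com.sun.management.HotSpotDiagnosticMXBean; the class name HeapDumper and the default output path are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, for illustration only (HotSpot JVMs).
class HeapDumper {

    /**
     * Writes an HPROF heap dump of the current JVM to the given path.
     * liveOnly=true dumps only reachable objects, similar to jmap's "live" suboption.
     * Use a file name ending in .hprof; recent JDKs reject other extensions.
     */
    static void dumpHeap(Path target, boolean liveOnly) throws IOException {
        // dumpHeap refuses to overwrite an existing file, so remove any stale dump first.
        Files.deleteIfExists(target);
        HotSpotDiagnosticMXBean diagnostics =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        diagnostics.dumpHeap(target.toString(), liveOnly);
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass a different path as the first argument.
        Path target = Paths.get(args.length > 0 ? args[0] : "/tmp/on_demand.hprof");
        dumpHeap(target, true);
        System.out.println("Heap dump written to " + target
                + " (" + Files.size(target) + " bytes)");
    }
}
```

As with jmap, the application is paused while the dump is written, so this is best kept behind an admin-only trigger rather than called on a regular code path.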
      

Analysing Heap Dumps

Heap dumps are binary files and are not directly readable. After collecting a heap dump, we need to parse it with an analyzer that produces a readable, easy-to-understand report. An analyzer can quickly calculate the retained sizes of objects, show who or what is preventing the garbage collector from collecting objects, and identify memory leak suspects. Multiple analyzer tools are available; Eclipse MAT is one of them.
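To make "what is preventing the garbage collector from collecting objects" concrete, here is a small, deliberately leaky sketch. The class LeakExample and its CACHE field are invented for this illustration; the pattern (a static collection that is only ever added to) is one common source of leaks, not the only one.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class showing a typical leak pattern.
class LeakExample {

    // Objects referenced from a static field stay reachable until they are
    // explicitly removed, whether or not the application still needs them.
    static final List<byte[]> CACHE = new ArrayList<>();

    /** Simulates handling a request that caches a buffer and never evicts it. */
    static WeakReference<byte[]> handleRequest(int bufferBytes) {
        byte[] buffer = new byte[bufferBytes];
        CACHE.add(buffer);                   // the leak: nothing ever removes this entry
        return new WeakReference<>(buffer);  // lets us observe whether GC could reclaim it
    }

    public static void main(String[] args) {
        WeakReference<byte[]> ref = handleRequest(1024 * 1024);
        System.gc(); // only a hint, but a strongly reachable object is never collected
        System.out.println("still reachable after GC: " + (ref.get() != null));
        System.out.println("buffers retained by CACHE: " + CACHE.size());
    }
}
```

In a heap dump of a program like this, an analyzer would be expected to attribute the retained size of the buffers to the list held by the static CACHE field, which is the kind of reference chain a leak report points at.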

Using Eclipse MAT

Eclipse Memory Analyzer (MAT) is a fast and feature-rich Java heap analyzer that helps you find memory leaks and reduce memory consumption. To analyse heap dumps (even very large ones) with Eclipse MAT, follow the steps below:

  1. Install JDK11 (or newer)

    • If you have Homebrew and would like to install Amazon Corretto (free OpenJDK distribution), you can run brew tap homebrew/cask-versions && brew install --cask corretto.

    • To verify the installation on macOS, run /usr/libexec/java_home -V. This prints the directory in which the JDK is installed.

  2. Install Eclipse MAT

    • Install Eclipse MAT by following the official installation guide.

    • Assuming Eclipse MAT is installed and we are inside the mat/ directory, edit MemoryAnalyzer.ini to give MAT a heap large enough for big dumps (-Xms/-Xmx) and to point the -vm option at the Java binary.

    -vm
    /usr/lib/jvm/java-11-amazon-corretto.x86_64/bin/java
    -startup
    plugins/org.eclipse.equinox.launcher_1.6.200.v20210416-2027.jar
    --launcher.library
    plugins/org.eclipse.equinox.launcher.gtk.linux.x86_64_1.2.200.v20210429-1609
    -vmargs
    -Xms24g
    -Xmx24g
  3. Run MAT against the heap dump.

     ./ParseHeapDump.sh /tmp/jvm.hprof
    

    This takes a while and generates index files and other artifacts that make repeated analysis faster.

  4. Use the index files created in the previous step to run a "Leak suspects" report on the heap dump. This creates a small output file, jvm_Leak_Suspects.zip, containing an easy-to-view HTML report of the memory leak suspects.

     ./ParseHeapDump.sh /tmp/jvm.hprof org.eclipse.mat.api:suspects
    

Other Alternative Heap Dump Analysis Tools

  • VisualVM: A GUI Java profiling and analysis tool for performance profiling (including per-thread analysis), thread and heap dumping, and monitoring.

    • With JDK 8, it comes bundled as jvisualvm. With JDK 9 and later, you need to download it separately.
  • JProfiler [Paid] : JProfiler is a Java profiler combining CPU, Memory and Thread profiling in one application.

Working with Thread Dumps

A thread dump is a snapshot of the state of all the threads of a Java application process. It is very useful for diagnosing thread-related problems such as high CPU usage.

Collecting Thread Dumps

The JDK ships several tools for capturing a thread dump of a Java application; they are located in the bin folder of the JDK home directory. One such tool is jstack, which can be used as follows:

  1. Find the process ID (PID) of the Java application by running ps aux | grep java.

  2. Run jstack with the PID and redirect its output to a file (such as /tmp/thread_dump_1.txt).

     jstack -l <PID> > /tmp/thread_dump_1.txt
    
  3. The output file is plain text and can be viewed in any text editor.
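When jstack is not available on the host, a comparable snapshot can be taken from inside the JVM. The following is a rough sketch using the standard ThreadMXBean API; the class name ThreadDumper and the output layout are made up here and do not match jstack's format. In particular, the id printed below is the Java thread ID, not the native thread ID (nid) that jstack reports.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper class, for illustration only.
class ThreadDumper {

    /** Builds a plain-text dump of all live threads with their stack traces. */
    static String dumpThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Only request lock details the JVM can actually provide.
        boolean monitors = threads.isObjectMonitorUsageSupported();
        boolean synchronizers = threads.isSynchronizerUsageSupported();
        String nl = System.lineSeparator();
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(monitors, synchronizers)) {
            out.append('"').append(info.getThreadName()).append('"')
               .append(" id=").append(info.getThreadId())
               .append(" state=").append(info.getThreadState())
               .append(nl);
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append(nl);
            }
            out.append(nl);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpThreads());
    }
}
```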

Identifying threads consuming high CPU

  1. Capture a thread dump of the Java application using the steps above (if not done already).

  2. Find the process ID (PID) of the application by running ps aux | grep java.

  3. Identify the IDs of the threads consuming the most CPU in that process. In this per-thread view, the PID column of top shows thread IDs.

     top -n 1 -H -p <PID>
    
  4. Convert the thread ID from decimal to hexadecimal, for example with printf '%x\n' <threadId>.

  5. Look up the hex value in the thread dump file, where it appears as nid=0x<hex>, to identify the name and stack trace of the thread.
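Steps 4 and 5 are easy to script. The sketch below converts a decimal thread ID from top into the nid=0x... form and picks the matching header lines out of a thread dump. The class NidLookup is hypothetical, and the sample dump lines in main are invented in the style of jstack output rather than taken from a real process.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
class NidLookup {

    /** Converts a decimal OS thread ID (as shown by top -H) to the nid notation of a thread dump. */
    static String toNid(long osThreadId) {
        return "nid=0x" + Long.toHexString(osThreadId);
    }

    /** Returns the thread header lines of a dump that belong to the given OS thread ID. */
    static List<String> findThreadHeaders(List<String> dumpLines, long osThreadId) {
        String nid = toNid(osThreadId);
        List<String> matches = new ArrayList<>();
        for (String line : dumpLines) {
            // Compare whole tokens so that nid=0x3039 does not also match nid=0x30391.
            if (Arrays.asList(line.split("\\s+")).contains(nid)) {
                matches.add(line);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample lines; a real dump could be loaded with
        // Files.readAllLines(Paths.get("/tmp/thread_dump_1.txt")).
        List<String> dump = Arrays.asList(
            "\"worker-1\" #12 prio=5 os_prio=0 tid=0x00007f0000000001 nid=0x3039 runnable [0x00007f0000001000]",
            "\"worker-2\" #13 prio=5 os_prio=0 tid=0x00007f0000000002 nid=0x303a waiting on condition [0x00007f0000002000]");
        System.out.println(findThreadHeaders(dump, 12345));
    }
}
```

The stack trace of the thread follows its header line in the dump, so once the header is found the lines right after it show what the thread was executing.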

Debugging Garbage Collection issues

Garbage Collection logs

Garbage collection logs record memory usage information and GC performance metrics (throughput, accumulated pauses, longest pause, etc.) that can be analyzed using tools like GCViewer. Having GC logs enabled in our software systems comes in handy when things go wrong, for example when investigating memory leaks.

Below are some common JVM flags used for garbage collection logging in Java 8. From Java 9 onwards, GC logging is configured through the unified logging option -Xlog:gc* instead. Refer to baeldung.com/java-gc-logging-to-file for more info.

"-XX:+PrintGCDetails"
"-XX:+PrintGCDateStamps"
"-XX:+PrintTenuringDistribution"
"-XX:+PrintClassHistogram"
"-Xloggc:gc_log_file.log"
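Alongside log files, basic GC counters can be read in-process through the standard GarbageCollectorMXBean API, for example to export them to a metrics system. This is only a sketch with a made-up class name, GcStats; it reports cumulative counts and times since JVM start, not the per-pause detail that GC logs contain.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, for illustration only.
class GcStats {

    /** Total collections across all collectors; collectors reporting -1 (undefined) count as 0. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds across all collectors. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) {
        // Collector names depend on the GC algorithm the JVM was started with.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.printf("total: count=%d, time=%d ms%n",
                totalCollections(), totalCollectionMillis());
    }
}
```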
