JVM - Memory Management

Reference-object classes

var referenceQueue = new ReferenceQueue<Person>();
var referent = new Person("John", "Doe");
var reference = new WeakReference<>(referent, referenceQueue);
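  • There are 3 ways to clear a Reference or get it onto its ReferenceQueue:

    • reference.enqueue()

      Enqueues the reference immediately, without waiting for a GC. Since Java 9 it also clears the referent field. The referent itself is not collected while strong references to it remain.

    • referent = null

      Drops the strong reference, so the referent becomes only weakly reachable (assuming no other strong references exist). At some later GC cycle the collector clears the reference and enqueues it, so this is not as immediate as reference.enqueue().

    • reference.clear()

      Only clears the referent field, without enqueuing the reference.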
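
The difference between clear() and enqueue() can be checked without relying on GC timing. The following is a minimal sketch; the Person class here is a hypothetical stand-in for the one used in the snippet above, and the helper method names are made up for illustration.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class ReferenceMarkingSketch {
    // Hypothetical stand-in for the Person class used in the notes.
    static final class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    /** clear() drops the referent but never puts the reference on its queue. */
    static boolean clearLeavesQueueEmpty() {
        ReferenceQueue<Person> queue = new ReferenceQueue<>();
        Person referent = new Person("John", "Doe");
        WeakReference<Person> reference = new WeakReference<>(referent, queue);

        reference.clear();

        // Using referent here keeps it strongly reachable until after the checks.
        return reference.get() == null
                && queue.poll() == null
                && referent.lastName.equals("Doe");
    }

    /** enqueue() puts the reference on its queue right away, no GC involved. */
    static boolean enqueueIsImmediate() {
        ReferenceQueue<Person> queue = new ReferenceQueue<>();
        Person referent = new Person("John", "Doe");
        WeakReference<Person> reference = new WeakReference<>(referent, queue);

        boolean enqueued = reference.enqueue();

        // The referent object is still alive: we hold a strong reference to it.
        return enqueued
                && queue.poll() == reference
                && referent.firstName.equals("John");
    }
}
```

Both methods return true on a standard JVM. The sketch deliberately does not assert what reference.get() returns after enqueue(), because that differs between Java 8 and Java 9+.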

java.lang.ref.ReferenceQueue

  • A ReferenceQueue is used to track Reference objects whose referent has been garbage collected. When the GC clears a Reference, it enqueues that Reference on the ReferenceQueue it was created with.
  • To be placed on a ReferenceQueue, a Reference object must be created with that ReferenceQueue via its constructor.
  • Soft and weak reference objects can be created with or without a reference queue. The PhantomReference constructor always takes a queue argument; passing null is allowed but makes the reference useless, because it can never be enqueued.
  • Use ReferenceQueue::poll (non-blocking) or ReferenceQueue::remove (blocking) to check whether any references have been enqueued.
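
As an illustration of the queue mechanics, here is a small sketch that drops the last strong reference and waits for the reference to show up on the queue. It assumes the JVM honours System.gc(), which is only a hint (for example, -XX:+DisableExplicitGC turns it off), so the method may legitimately return false. The class and method names are hypothetical.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class ReferenceQueuePollingSketch {
    /**
     * Drops the only strong reference to a referent, then requests GCs and
     * waits up to roughly maxAttempts * 100 ms for the reference to be enqueued.
     * Returns true if the reference we created was observed on the queue.
     */
    static boolean referenceIsEnqueuedAfterGc(int maxAttempts) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        WeakReference<Object> reference = new WeakReference<>(referent, queue);

        referent = null; // the referent is now only weakly reachable

        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            System.gc(); // a request, not a guarantee
            Reference<?> enqueued = queue.remove(100); // blocks for at most 100 ms
            if (enqueued != null) {
                // The GC cleared the reference before enqueuing it.
                return enqueued == reference && reference.get() == null;
            }
        }
        return false; // no collection observed within the time budget
    }
}
```

ReferenceQueue::remove with a timeout is used here instead of a poll-and-sleep loop; poll would work equally well when the caller has other work to interleave.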

java.lang.ref.SoftReference

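  • Keeps the referent alive as long as there is enough memory. All soft references to softly reachable objects are guaranteed to be cleared before the JVM throws an OutOfMemoryError.
  • Suitable for implementing a memory-sensitive cache, e.g. for external resources: entries stay cached until the JVM comes under memory pressure.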
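
A memory-sensitive cache along these lines could look like the sketch below. SoftCacheSketch and its loader parameter are hypothetical names; a production cache would also evict map entries whose soft reference has been cleared (for instance by registering the references with a ReferenceQueue), which is left out here for brevity.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class SoftCacheSketch<K, V> {
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // e.g. reads the value from an external resource

    SoftCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Either never loaded, or the GC cleared the soft reference
            // under memory pressure: load again and re-cache.
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

Two threads asking for the same missing key may both invoke the loader; that is acceptable for an idempotent loader and keeps the sketch short.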

java.lang.ref.WeakReference

  • Does not keep the referent alive: the object survives only while it is strongly or softly reachable. Once it is only weakly reachable, it is eligible for collection.
  • Suitable for implementing a cache for transient data, so the cache is cleared more eagerly.
  • When the GC finds that an object is weakly reachable (reachable only through weak references), it clears all weak references to that object.

java.lang.ref.PhantomReference

  • Phantom reference objects are placed on their reference queue after their referent becomes phantom reachable, i.e. after any finalization. In Java 8 and earlier they are enqueued before their referent field is cleared, so a program performs post-finalization cleanup and then clears the phantom reference itself; since Java 9 they are cleared automatically when enqueued.

  • Lets you clean up after finalization but before the space is reclaimed (replaces or augments the use of finalize())

  • It can be used instead of a finalize method, guaranteeing that the object is not resurrected during finalization. This allows the object to be garbage collected in a single cycle, rather than needing to wait for a second GC cycle to ensure that it has not been resurrected.

  • A second use is to detect exactly when an object has been removed from memory (by using in combination with a ReferenceQueue object), ensuring that its memory is available, for example deferring allocation of a large amount of memory (e.g., a large image) until previous memory is freed.

  • In Java 8 and earlier versions, the phantom reference needs to be cleared before the memory for a finalized referent can be reclaimed. Since Java 9, phantom references are cleared automatically when they are enqueued, so that memory is reclaimable immediately.
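
The cleanup pattern described above might be sketched as follows. ResourceRef, the numeric handle, and the release callback are hypothetical; the point is that the cleanup data lives in the reference object, because PhantomReference::get always returns null.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.function.LongConsumer;

class PhantomCleanupSketch {
    /** A phantom reference that remembers which (hypothetical) native handle to release. */
    static final class ResourceRef extends PhantomReference<Object> {
        final long handle;

        ResourceRef(Object owner, long handle, ReferenceQueue<Object> queue) {
            super(owner, queue);
            this.handle = handle;
        }
    }

    /**
     * Drains the queue and releases the handle carried by each enqueued
     * reference. Returns the number of handles released.
     */
    static int drain(ReferenceQueue<Object> queue, LongConsumer release) {
        int released = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            if (ref instanceof ResourceRef) {
                release.accept(((ResourceRef) ref).handle);
                released++;
            }
            ref.clear(); // required on Java 8, harmless on Java 9+
        }
        return released;
    }
}
```

The ResourceRef objects themselves must stay strongly reachable (for example in a set owned by the cleanup thread) until they have been processed; a reference object that is itself collected is never enqueued.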

Garbage Collectors

Parallel GC

  • Cons

    • Stop-the-world pauses

    • Old‑generation compaction cost

      Full-GC/major collection compaction can cause long pauses for large heaps.

    • Scaling limits

      While parallel, pause times still increase with heap size and live-set; less scalable than concurrent collectors (e.g., G1, ZGC).

    • Less focus on fragmentation

      May require full GCs to combat fragmentation, leading to long interruptions.

G1

ZGC

Shenandoah GC

GC Log

Note: GC logs are not obsolete, but JFR is the more modern approach going forward.

  • Java GC Log Analysis & Tuning Guide: From Diagnosis to Optimization

    • Key metrics

      • Pause time

        Stop-the-world impact on application latency

      • GC Frequency

        Indicates memory pressure and allocation rate

      • Young vs Full GC

        Balance between short and long-lived objects

      • Promotion Rate

        Object survival trend across generations

      • Heap Usage After GC

        Retained memory (potential memory leak flag)

      • Throughput

        Time spent in app vs time spent in GC

    • Common GC symptoms

      • Long pause times
      • Too frequent GCs
      • Promotion failures
      • Fragmentation
      • Low application throughput
    • GC tuning by use cases

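      • Low-latency applications

        Note: ZGC never uncommits memory below -Xms, so -XX:+ZUncommit only has an effect when -Xms is lower than -Xmx.

        -XX:+UseZGC
        -Xms4g -Xmx4g
        -XX:+UseLargePages
        -XX:SoftMaxHeapSize=3g
        -XX:+ZUncommit
        -Xlog:gc*:file=gc.log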
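      • High-throughput applications (e.g. ETL, Batch Jobs)

        Note: -XX:+UseParallelOldGC is omitted here. It was deprecated in JDK 14 and is obsolete since JDK 15; -XX:+UseParallelGC already collects the old generation in parallel.

        -XX:+UseParallelGC
        -Xms8g -Xmx8g
        -XX:ParallelGCThreads=8
        -XX:+UseAdaptiveSizePolicy
        -Xlog:gc*:file=gc.log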
      • Balanced workloads (e.g. Web Services)

        -XX:+UseG1GC
        -Xms4g -Xmx4g
        -XX:MaxGCPauseMillis=200
        -XX:InitiatingHeapOccupancyPercent=35
        -XX:+UseStringDeduplication
        -Xlog:gc*:file=gc.log
  • Azul GC Log Analyzer
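
Some of the key metrics above (GC frequency, accumulated GC time, throughput) can also be sampled from inside the application through the standard java.lang.management API. The sketch below is an approximation under stated assumptions: GcMetricsSketch and its method names are hypothetical, and for concurrent collectors the time reported by a GarbageCollectorMXBean is not purely stop-the-world pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcMetricsSketch {
    /** Fraction of elapsed time not spent in GC, e.g. 0.98 for 2% GC overhead. */
    static double throughput(long gcTimeMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 1.0;
        }
        return 1.0 - (double) gcTimeMillis / uptimeMillis;
    }

    /** Total number of collections across all collectors of this JVM. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount()); // -1 means "undefined"
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime()); // -1 means "undefined"
        }
        return total;
    }

    /** One-line summary suitable for periodic logging. */
    static String report() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcTime = totalGcTimeMillis();
        return String.format("collections=%d gcTimeMs=%d throughput=%.4f",
                totalGcCount(), gcTime, throughput(gcTime, uptime));
    }
}
```

Sampling report() at a fixed interval and diffing consecutive values gives a rough GC frequency and overhead trend; pause-time percentiles and promotion rates still require the GC log or JFR.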

Memory

  • To inspect the value of any object on the heap, taking a heap dump works as well as debugging. Primitive local variables are not captured in a heap dump and can only be examined via debugging.

Heap - retrieve heap histogram

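Shows the number of instances and bytes per class without taking a full heap dump.

jcmd GC.class_histogram counts only live objects by default, which triggers a full GC before the histogram is taken (pass -all to include unreachable objects and skip the GC). jmap -histo triggers a full GC only with the :live suboption.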

jcmd $PID GC.class_histogram

or

jmap -histo $PID

Take a heap dump - jcmd - interactive

Triggers a full GC before the heap dump is taken, because only live objects are dumped by default (pass -all to dump unreachable objects as well).

jcmd $PID GC.heap_dump ${file.hprof}
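
The same dump can be requested from inside the running application through the HotSpot-specific diagnostic MXBean. This is a sketch that assumes a HotSpot-based JDK (the com.sun.management package is not part of the Java SE specification); HeapDumpSketch is a hypothetical name.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;

class HeapDumpSketch {
    /**
     * Writes a heap dump in hprof format to the given file.
     * liveOnly = true dumps only reachable objects, like the jcmd default.
     * The target file must not exist yet; recent JDKs also require the
     * file name to end in ".hprof".
     */
    static void dumpHeap(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean diagnostics =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        diagnostics.dumpHeap(target.toAbsolutePath().toString(), liveOnly);
    }
}
```

A call such as HeapDumpSketch.dumpHeap(Path.of("heap.hprof"), true) could be wired to an admin endpoint or a signal handler; like the jcmd variant, it pauses the application while the dump is written.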

Take a heap dump - jmap - interactive

Usage:
    jmap [option] <pid>
        (to connect to running process)
    jmap [option] <executable <core>
        (to connect to a core file)
    jmap [option] [server_id@]<remote server IP or hostname>
        (to connect to remote debug server)

where <option> is one of:
    <none>               to print same info as Solaris pmap
    -heap                to print java heap summary
    -histo[:live]        to print histogram of java object heap; if the "live"
                         suboption is specified, only count live objects
    -clstats             to print class loader statistics
    -finalizerinfo       to print information on objects awaiting finalization
    -dump:<dump-options> to dump java heap in hprof binary format
                         dump-options:
                           live         dump only live objects; if not specified,
                                        all objects in the heap are dumped.
                           format=b     binary format
                           file=<file>  dump heap to <file>
                         Example: jmap -dump:live,format=b,file=heap.bin <pid>
    -F                   force. Use with -dump:<dump-options> <pid> or -histo
                         to force a heap dump or histogram when <pid> does not
                         respond. The "live" suboption is not supported
                         in this mode.
    -h | -help           to print this help message
    -J<flag>             to pass <flag> directly to the runtime system

Take a heap dump - JVM option - when out of memory

-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=${project.build.directory}/heapDump.log

Open a heap dump (.hprof)

Open a heap dump - Eclipse MAT (Preferred tool)

OQL (Object Query Language)

OQL is an SQL-like query language for querying the Java heap. It lets you filter and select the information you want from the heap. While pre-defined queries such as "show all instances of class X" are already supported by HAT, OQL adds more flexibility. OQL is based on the JavaScript expression language.

OQL - Select all Strings of length 100 or more
OQL - Select all int arrays of length 256 or more
OQL - Show content of Strings that match a regular expression
OQL - Show names of all ClassLoader classes
select classof(cl) from instanceof java.lang.ClassLoader cl
OQL - Show instances of the Class identified by given id string
OQL - Select all classes that have name pattern java.net.*
OQL - Show class name of each Reference type object
select classof(o) from instanceof java.lang.ref.Reference o
OQL - Show all subclasses of java.io.InputStream
OQL - Show all superclasses of java.io.BufferedInputStream
OQL - Returns whether two given Java objects are identical or not

MAT - Dominator Tree

List the biggest objects and what they keep alive.

From the dominator tree, you can organize by the retained heap size, and then begin drilling down the tree to see the contributors to the largest GC roots.

MAT - Top Consumers

Print the most expensive objects grouped by class and by package.

MAT - Duplicate Classes

Detect classes loaded by multiple class loaders.

MAT - Component Report

Analyze objects which belong to a common root package or class loader.

  • Open a Heap Dump

  • Run the Component Report

    Before starting, one has to decide what constitutes a component. Typically, a component is either a set of classes in a common root package or classes loaded by the same class loader.

    To select a set of packages:

    Choose Leak Identification > Component Report from the Query drop-down list in the toolbar.

    Provide a class name pattern (for example java.util.*) and hit Finish.

  • Getting an Overview

    The size of the component relative to the size of the whole heap.

    The Top Consumers section lists the biggest objects, classes, class loaders, and packages.

    The Retained Set is a histogram of all objects kept alive by the component.

  • Duplicate Strings

    Duplicate Strings are a prime example of memory waste: multiple char arrays with identical content.

    The content of the char arrays typically suggests how to reduce the duplicates.

  • Empty Collections

    Empty collections take up space without storing any data. If only a few collections actually hold data, think about lazy initialization, i.e. create the collection only when needed.

  • Collection Fill Ratios

    Collections are typically created with a default initial capacity. Many collections with low fill ratios indicate that the initial capacity could be reduced.

  • Soft Reference Statistics

    Softly referenced objects require more GC activity (possibly during a stop-the-world phase) and should be used in small doses only.

  • Finalizer Statistics

    Finalizers come with a lot of drawbacks. For example, they require multiple garbage collections to run before the objects are actually freed. Also, the time of execution is up to the VM, which makes finalizers unpredictable.

  • Map Collision Ratios

    Maps place the values in different buckets based on the hash code of the keys. High collision ratios can indicate sub-optimal hash codes. This is not a memory problem (a better hash code does not save space) but rather a performance problem because of the (usually) linear access inside the buckets.
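
The advice on empty collections and fill ratios can be illustrated with a small sketch. Order and its notes field are hypothetical; the assumption is that most instances never receive a note, so the list is only allocated on first use and with a small initial capacity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LazyCollectionSketch {
    // Hypothetical domain class: most orders carry no notes at all.
    static final class Order {
        private List<String> notes; // stays null until the first note is added

        void addNote(String note) {
            if (notes == null) {
                // Created only when needed, with a small explicit capacity
                // because the few orders that have notes rarely have many.
                notes = new ArrayList<>(2);
            }
            notes.add(note);
        }

        /** Never returns null; orders without notes share one immutable empty list. */
        List<String> notes() {
            return notes == null
                    ? Collections.emptyList()
                    : Collections.unmodifiableList(notes);
        }

        boolean hasAllocatedNotes() {
            return notes != null;
        }
    }
}
```

Whether this is worth the extra null check depends on what the Empty Collections and Collection Fill Ratios sections of the report actually show for the class in question.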

Open a heap dump - IntelliJ

Open a heap dump - VisualVM

Open a heap dump - HeapHero

Limited features in the free version

Open a heap dump - hprof-slurp

Enables the analysis of huge heap dumps that are much larger than the amount of RAM available on the host system.

hprof-slurp processes dump files in a streaming fashion in a single pass without storing intermediary results on the host.

This approach makes it possible to get an extremely fast overview of dump files without the need to spin up an expensive, beefy instance.