Before digging deep into Garbage Collection, there are a few terms that we should be familiar with.
Memory Management - Allocating and deallocating objects in the memory space. The memory space has several different components.
What is Garbage Collection and why is it useful?
It is the process of finding unused or unreferenced objects and deleting them to free up memory. In Java, this is an automated process: the JVM automatically determines which memory is no longer being used and recycles it for other uses. The GC runs on daemon threads in the background.
When is Object Eligible for Garbage Collection?
- If no live thread can access the object, it is eligible for garbage collection
- If two objects reference each other but have no references from outside, both are eligible for garbage collection
- If we explicitly assign null to the reference and no other reference to the object remains, it is eligible for garbage collection
- If the object is created inside a block, its reference goes out of scope once the program exits the block, and the object becomes eligible for garbage collection unless the reference was stored elsewhere
The collection itself is typically triggered in one of these situations:
- When the heap memory is running low or when there is idle time in the application
- When the Eden space is full
- When there is not enough free space to allocate a new object
- When we manually call System.gc() (this is only a request, which the JVM may ignore)
When any one of these trigger conditions is met at any point in the lifecycle of the application, GC is triggered, but we cannot predict exactly when this happens.
When the JVM initiates GC to free up memory, it stops the application from running for at least a short time while the GC process executes. This pause is called “stop-the-world”.
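The eligibility rules listed earlier in this section can be observed with a small experiment. The following is a minimal sketch, not a definitive test: the class `ReachabilityDemo` and its `Node` type are hypothetical names made up for illustration, and the result relies on the JVM honoring `System.gc()`, which is only a request and is not guaranteed.

```java
import java.lang.ref.WeakReference;

final class ReachabilityDemo {

    // Hypothetical node type used only for this illustration.
    static final class Node {
        Node other;
    }

    // Requests a collection a few times and reports whether the referent was cleared.
    static boolean collected(WeakReference<?> ref) {
        for (int attempt = 0; attempt < 10 && ref.get() != null; attempt++) {
            System.gc(); // only a hint; the JVM may ignore it
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    // Builds two objects that reference each other and returns a weak handle to one of them.
    static WeakReference<Node> isolatedCycle() {
        Node a = new Node();
        Node b = new Node();
        a.other = b;
        b.other = a;
        // Once this method returns, nothing outside the cycle refers to a or b.
        return new WeakReference<>(a);
    }

    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.out.println("while strongly referenced, cleared = " + (weak.get() == null));

        strong = null; // drop the only strong reference
        System.out.println("after assigning null, cleared = " + collected(weak));
        System.out.println("isolated cycle, cleared = " + collected(isolatedCycle()));
    }
}
```

A `WeakReference` is used here because it lets the program watch an object without keeping it alive. On a typical HotSpot JVM both of the last two lines usually report that the object was cleared, but that depends on the collector actually running.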
What is the Internal Process of GC?
GC is the process of looking at the heap memory, identifying the unreachable objects, destroying them, and compacting the objects that remain. As the number of objects on the heap grows, it takes longer for the GC to scan all of them. To improve the performance of the JVM, Generational Garbage Collection is adopted. Here the heap space is divided into generations:
- Young Generation - Holds newly created objects. Once it fills up, a minor GC is triggered and all the dead objects are destroyed.
- Old or Tenured Generation - Holds long-lived objects that survived the young generation. Each object's age is tracked in the young generation, and once it reaches the configured threshold the object is promoted to the old generation. Garbage collection here is called major GC; when it cleans both the young and old generation spaces it is usually referred to as a full GC.
- Permanent Generation - Until Java 7, it held all the metadata required by the JVM to describe classes and methods. It was removed in Java 8, where Metaspace takes over this role.
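To see how the generations listed above show up in a running JVM, the standard `java.lang.management` API can list the memory pools. This is a minimal sketch; the class name `HeapPools` is made up for illustration, and the pool names printed depend on the JVM and on the collector in use (for example, Eden, Survivor and Old Gen pools for the heap, and Metaspace as a non-heap pool on Java 8 and later).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class HeapPools {

    // Returns the names of the heap memory pools of the running JVM.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            if (usage == null) {
                continue;
            }
            System.out.printf("%-16s %-30s used=%d KiB%n",
                    pool.getType(), pool.getName(), usage.getUsed() / 1024);
        }
    }
}
```

Running it with different collector flags is a quick way to see how each collector names and organizes its generations.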
What are the types of GC?
There are 4 different garbage collectors. We can choose which GC to use, and each one makes different trade-offs between throughput and application pauses.
Serial GC - Designed for single-threaded systems and small heap sizes.
- Freezes all application threads while working
- Can be turned on using the -XX:+UseSerialGC JVM option
- java -Xmx12m -Xms3m -Xmn1m -XX:PermSize=20m -XX:MaxPermSize=20m -XX:+UseSerialGC -jar myjar.jar
- The -XX:PermSize and -XX:MaxPermSize options only apply up to Java 7, since the permanent generation was removed in Java 8
Parallel/Throughput GC - The default collector in JDK 8. It uses multiple threads to scan through the heap space and perform compaction.
- Pauses the application threads while performing a minor or full GC
- Best suited for applications that can tolerate such pauses
- Can be turned on using the -XX:+UseParallelGC JVM option. In older JDK releases, this option alone gives you a multithreaded young generation collector with a single-threaded old generation collector; newer releases also enable the multithreaded old generation collector by default.
- java -Xmx12m -Xms3m -Xmn1m -XX:PermSize=20m -XX:MaxPermSize=20m -XX:+UseParallelGC -jar myjar.jar
- With the -XX:+UseParallelOldGC option, both the young generation collector and the old generation collector are multithreaded, and the old generation collector also compacts the heap.
CMS Collector (Concurrent Mark Sweep) - Uses multiple threads to scan through the heap, mostly while the application keeps running
- It goes into stop-the-world mode in 2 cases:
  - Initial marking of the objects in the old generation that are directly reachable from roots such as static variables
  - Remarking, when the application has changed the state of the heap while the algorithm was running, forcing it to go back and check that the right objects are marked
- It may run into a promotion failure, which means objects are being moved from the young to the old generation but the GC has not had time to free up enough space in the old generation. To avoid this, we can increase the heap size or give the collector more background threads
- To enable the CMS Collector use -XX:+UseConcMarkSweepGC, and to set the number of threads use -XX:ParallelCMSThreads=<n> (CMS was deprecated in JDK 9 and removed in JDK 14)
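Leaving concurrency aside entirely, the mark and sweep steps that give CMS its name can be sketched on a toy object graph. This is an illustration of the general technique under simplifying assumptions, not how HotSpot implements it; `ToyMarkSweep` and `Obj` are hypothetical names, and the "heap" is just a list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ToyMarkSweep {

    // Hypothetical stand-in for a heap object: a name plus outgoing references.
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();

        Obj(String name) {
            this.name = name;
        }
    }

    // Mark phase: collect every object reachable from the roots.
    static Set<Obj> mark(List<Obj> roots) {
        Set<Obj> marked = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (marked.add(current)) { // true only on the first visit
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    // Sweep phase: drop every unmarked object and return the names of the survivors.
    static List<String> sweep(List<Obj> heap, Set<Obj> marked) {
        heap.removeIf(obj -> !marked.contains(obj));
        List<String> names = new ArrayList<>();
        for (Obj obj : heap) {
            names.add(obj.name);
        }
        return names;
    }

    // Builds a small heap: root -> a -> b, plus an unreachable cycle c <-> d.
    static List<String> demo() {
        Obj root = new Obj("root");
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        Obj c = new Obj("c");
        Obj d = new Obj("d");
        root.refs.add(a);
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        List<Obj> heap = new ArrayList<>(Arrays.asList(root, a, b, c, d));
        return sweep(heap, mark(Arrays.asList(root)));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [root, a, b]
    }
}
```

The cycle between c and d is swept even though each still has an incoming reference, because neither is reachable from a root. A real concurrent collector additionally has to cope with the application changing references while marking is in progress, which is why CMS needs its remark pause.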
G1 Collector - Designed for heap sizes greater than 4 GB
- Divides the heap into regions of 1 MB to 32 MB each, depending on the heap size
- Performs concurrent global marking to determine the liveness of objects, so that it knows which regions of the heap are mostly empty
- It collects the regions with the most reclaimable space first, hence the name Garbage First
- It selects the number of regions to collect based on the pause time target
- The process runs in multiple phases:
  - Young-only phase - Starts with young-only collections that promote objects from the young generation to the old generation.
  - Initial Marking - Starts the marking process for the space reclamation phase while objects are being promoted. Marking finishes with 2 special stop-the-world pauses: Remark and Cleanup.
  - Remark - This pause finalizes the marking and performs global reference processing and class unloading. Between Remark and Cleanup, G1 calculates a summary of the liveness information, which is finalized and used in the Cleanup pause to update internal data structures.
  - Cleanup - This pause also reclaims the completely empty regions and determines whether a space reclamation phase will actually follow. If it will, the young-only phase completes with a single young-only collection.
  - Space reclamation phase - Evacuates live objects from young generation regions as well as from old generation regions. It ends when G1 determines that evacuating more regions would not yield enough free space to be worth the effort.
- G1 can be enabled using the -XX:+UseG1GC flag
- Its incremental, region-based approach reduces the chance of the heap being depleted before the background threads have finished scanning for unreachable objects
- It compacts the heap on the go
- Java 8 (from update 20) added an optimization to the G1 Collector called String deduplication. The char arrays that back our strings occupy much of the heap space. The G1 collector identifies duplicate strings and points them to the same char array, avoiding multiple copies of the same string across the heap. It can be enabled with the -XX:+UseStringDeduplication JVM argument
- G1 is the default GC since JDK 9
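To check which of these collectors a JVM actually selected, one option is to look at the names of its `GarbageCollectorMXBean`s. The sketch below assumes the bean names that HotSpot builds commonly report ("Copy" and "MarkSweepCompact" for Serial, "PS Scavenge" and "PS MarkSweep" for Parallel, "ParNew" and "ConcurrentMarkSweep" for CMS, names starting with "G1" for G1). These names are not a specified API, so other JVMs or versions may report different ones; `ActiveCollector` is a hypothetical class name.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollector {

    // Maps collector bean names (assumed HotSpot naming) to the collector names used in this article.
    static String describe(List<String> beanNames) {
        for (String name : beanNames) {
            if (name.startsWith("G1")) return "G1";
            if (name.equals("ConcurrentMarkSweep")) return "CMS";
            if (name.startsWith("PS ")) return "Parallel";
        }
        // Checked last because CMS can be paired with the serial "Copy" young collector.
        for (String name : beanNames) {
            if (name.equals("Copy") || name.equals("MarkSweepCompact")) return "Serial";
        }
        return "Other: " + beanNames;
    }

    // Reads the collector bean names of the running JVM.
    static List<String> currentBeanNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = currentBeanNames();
        System.out.println("Collector beans: " + names);
        System.out.println("Looks like: " + describe(names));
    }
}
```

Starting the program with, for example, `java -XX:+UseSerialGC ActiveCollector` and then with `java -XX:+UseG1GC ActiveCollector` should show the difference, provided the JVM supports those flags.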
How to customize the GC Manually?
- Choose a GC algorithm
- Adjust the heap size
- Set the GC frequency
- Use JVM flags
- Monitor GC behavior
- Tune the garbage collector parameters
Here are some examples for these scenarios.
Adjusting the heap size:
You can adjust the heap size by using the -Xms and -Xmx JVM flags. The -Xms flag sets the initial heap size, and the -Xmx flag sets the maximum heap size. For example, to set the initial heap size to 1 GB and the maximum heap size to 2 GB, you can use the following command:
java -Xms1g -Xmx2g MyApp
Choosing the right garbage collection algorithm:
You can choose the right garbage collection algorithm by using the -XX:+Use<GCAlgorithm> flag, where <GCAlgorithm> is the name of the garbage collection algorithm. For example, to use the G1 garbage collector, you can use the following command:
java -XX:+UseG1GC MyApp
Adjusting the garbage collection frequency:
You can adjust the garbage collection frequency by using the -XX:GCTimeRatio and -XX:MaxGCPauseMillis flags. The -XX:GCTimeRatio=N flag sets a goal of 1:N for the ratio of time spent on garbage collection to time spent on application execution, so the collector aims to use at most 1/(1+N) of the total run time. The -XX:MaxGCPauseMillis flag sets a target for the maximum garbage collection pause time; it is a goal, not a guarantee. For example, to set the garbage collection time ratio to 1:4 and the maximum pause time target to 500 milliseconds, you can use the following command:
java -XX:GCTimeRatio=4 -XX:MaxGCPauseMillis=500 MyApp
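The arithmetic behind -XX:GCTimeRatio is easy to get wrong, so here is a small sketch of it. It assumes the HotSpot documentation's definition, where a value of N sets a goal of 1/(1+N) of the total time for garbage collection; `GcTimeGoal` and `gcShare` are hypothetical names used only for this illustration.

```java
final class GcTimeGoal {

    // Share of total run time the collector may use under -XX:GCTimeRatio=N, i.e. 1 / (1 + N).
    static double gcShare(int gcTimeRatio) {
        if (gcTimeRatio < 0) {
            throw new IllegalArgumentException("GCTimeRatio must not be negative");
        }
        return 1.0 / (1.0 + gcTimeRatio);
    }

    public static void main(String[] args) {
        int[] ratios = {1, 4, 19, 99};
        for (int n : ratios) {
            System.out.printf("-XX:GCTimeRatio=%d -> GC goal of %.1f%% of total time%n",
                    n, 100 * gcShare(n));
        }
    }
}
```

Under this definition a value of 1 would let the collector use half of the run time, while a value of 4 corresponds to one part GC for every four parts of application time.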
Tuning the garbage collector parameters:
You can tune the garbage collector parameters by using the -XX: prefix followed by the parameter name and value. For example, to set the initial size of the young generation to 512 MB, let objects survive up to 15 minor collections before they are promoted to the tenured generation, and cap the total heap size at 1 GB, you can use the following command:
java -XX:NewSize=512m -XX:MaxTenuringThreshold=15 -XX:MaxHeapSize=1g MyApp
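After changing flags like the ones above, it helps to confirm what the JVM actually picked up and how often it is collecting, which is the "monitor GC behavior" item from the list. The following is a minimal sketch using the standard `Runtime` and `java.lang.management` APIs; `GcMonitor` is a hypothetical class name, and the numbers it prints depend on the JVM, the flags, and the workload.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcMonitor {

    // Maximum heap the JVM will try to use, in MiB (reflects -Xmx / -XX:MaxHeapSize).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    // Total number of collections reported so far across all collectors.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.println("Collections so far: " + totalCollections());
    }
}
```

Running it as `java -Xms1g -Xmx2g GcMonitor` should report a maximum heap close to 2048 MiB, though the exact figure can differ slightly depending on the collector.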