Garbage Collector (GC) in Java
In a programming language like C, allocating and deallocating memory is a manual process. In Java, proccess of deallocating memory is handled automatically by the garbage collector. Garbage Collection (GC) is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program.
When Java programs run on the JVM, objects are created in the heap memory, which is a portion of memory dedicated to the program. Eventually, some objects will no longer be needed. GC is a process of looking at heap memory, identifying which objects are in use and which are not, and deleting the unused objects. An in-use object, or a referenced object, means that some part of your program still maintains a pointer to that object. An unused object, or unreferenced object, is no longer referenced by any part of your program. So the memory used by an unreferenced object can be reclaimed.
This process has the two parts.
- Marking
- Deletion
Marking: This is the first step of the process. In which GC try to identify which piece of memory are in use and which are not. And mark them accordingly.
Deletion: Deletion involves the removal of the unreferened object from the memory. So that other object can take place of it for better memory utilisation. Deletion is of two types
- a. Normal deletion: Normal deletion removes unreferenced objects leaving referenced objects and pointers to free space.
The memory allocator holds references to blocks of free space where the new object can be allocated.
- b. Deletion with Compacting: To further improve performance, in addition to deleting unreferenced objects, you can also compact the remaining referenced objects. By moving referenced object together, this makes new memory allocation much easier and faster.
Generational Garbage Collector
Marking and Compacting process all the objects in a JVM is inefficient. As more and more objects are allocated, the list of objects grows and grows leading to longer and longer garbage collection time. However, empirical analysis of applications has shown that most objects are short-lived.
JVM Generations
To resolve this issue heap is broken up into smaller parts or generations. The heap parts are: Young Generation, Old or Tenured Generation, and Permanent Generation
The Young Generation is where all new objects are allocated and aged. When the young generation fills up, this causes a minor garbage collection. Minor collections can be optimized assuming a high object mortality rate. A young generation full of dead objects is collected very quickly. Some surviving objects are aged and eventually move to the old generation.
Stop the World Event - All minor garbage collections are "Stop the World" events. This means that all application threads are stopped until the operation completes. Minor garbage collections are always Stop the World events.
The Old Generation is used to store long surviving objects. Typically, a threshold is set for young generation object and when that age is met, the object gets moved to the old generation. Eventually, the old generation needs to be collected. This event is called a major garbage collection.
Major garbage collections are also Stop the World events. Often a major collection is much slower because it involves all live objects. So for Responsive applications, major garbage collections should be minimized. Also note, that the length of the Stop the World event for a major garbage collection is affected by the kind of garbage collector that is used for the old generation space.
The Permanent generation contains metadata required by the JVM to describe the classes and methods used in the application. The permanent generation is populated by the JVM at runtime based on classes in use by the application. In addition, Java SE library classes and methods may be stored here.
Example
- First, any new objects are allocated to the eden space. Both survivor spaces start out empty.
2. When the eden space fills up, a minor garbage collection is triggered.
3. Referenced objects are moved to the first survivor space. Unreferenced objects are deleted when the eden space is cleared.
4. At the next minor GC, the same thing happens for the eden space. Unreferenced objects are deleted and referenced objects are moved to a survivor space. However, in this case, they are moved to the second survivor space (S1). In addition, objects from the last minor GC on the first survivor space (S0) have their age incremented and get moved to S1. Once all surviving objects have been moved to S1, both S0 and eden are cleared. Notice we now have differently aged object in the survivor space.
5. At the next minor GC, the same process repeats. However this time the survivor spaces switch. Referenced objects are moved to S0. Surviving objects are aged. Eden and S1 are cleared.
6. This slide demonstrates promotion. After a minor GC, when aged objects reach a certain age threshold (8 in this example) they are promoted from young generation to old generation.
7. As minor GCs continue to occure objects will continue to be promoted to the old generation space.
8. So that pretty much covers the entire process with the young generation. Eventually, a major GC will be performed on the old generation which cleans up and compacts that space.
Common Heap Related Switches
There are many different command line switches that can be used with Java. This section describes some of the more commonly used switches that are also used in this OBE.
Types of GC
The Serial GC
The serial collector is the default for client style machines in Java SE 5 and 6. With the serial collector, both minor and major garbage collections are done serially (using a single virtual CPU). In addition, it uses a mark-compact collection method. This method moves older memory to the beginning of the heap so that new memory allocations are made into a single continuous chunk of memory at the end of the heap. This compacting of memory makes it faster to allocate new chunks of memory to the heap.
The Parallel GC
The parallel garbage collector uses multiple threads to perform the young genertion garbage collection. By default on a host with N CPUs, the parallel garbage collector uses N garbage collector threads in the collection. The number of garbage collector threads can be controlled with command-line options:
-XX:ParallelGCThreads=<desired number>
The Concurrent Mark Sweep (CMS) Collector
The Concurrent Mark Sweep (CMS) collector (also referred to as the concurrent low pause collector) collects the tenured generation. It attempts to minimize the pauses due to garbage collection by doing most of the garbage collection work concurrently with the application threads. Normally the concurrent low pause collector does not copy or compact the live objects. A garbage collection is done without moving the live objects. If fragmentation becomes a problem, allocate a larger heap. CMS collector on young generation uses the same algorithm as that of the parallel collector.
The G1 Garbage Collector
The Garbage First or G1 garbage collector is available in Java 7 and is designed to be the long term replacement for the CMS collector. The G1 collector is a parallel, concurrent, and incrementally compacting low-pause garbage collector that has quite a different layout from the other garbage collectors described previously.
Wasnt aware of the complex behind the scenes working of the garbage collector. Great post 👍