Garbage Collection in Java
In the post Heap Memory Allocation in Java, I have already explained why the heap is divided into generations and how it helps in garbage collection, adopting an approach known as "generational collection".
In this post, we will look in more detail at garbage collection in Java: the basic garbage collection process, garbage collection tuning, and the types of garbage collectors available in Java.
What is garbage collection?
Garbage collection in Java is an automatic process that inspects heap memory, identifies which objects are in use and which are not, and deletes the unused objects.
An object that is in use (a referenced object) is one that some part of your program still holds a reference to. An unused object (an unreferenced object) is no longer referenced by any part of the program and can be safely deleted.
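The difference between a referenced and an unreferenced object can be observed with a weak reference, which lets you look at an object without keeping it alive. The following is only a small sketch: the class and method names are made up for this post, and System.gc() is merely a request, so the second line of output is not guaranteed.

```java
import java.lang.ref.WeakReference;

final class ReachabilityDemo {
    /** Wraps an object in a weak reference, which does not keep it alive. */
    static WeakReference<Object> weakProbe(Object target) {
        return new WeakReference<>(target);
    }

    public static void main(String[] args) {
        Object data = new byte[1024];
        WeakReference<Object> probe = weakProbe(data);

        // 'data' still refers to the array, so it is in use and cannot be collected.
        System.out.println("while referenced: " + (probe.get() == data));

        data = null;   // drop the only strong reference; the array is now unused
        System.gc();   // only a request; the JVM may or may not collect right away

        // Often prints false on HotSpot, but the specification does not guarantee it.
        System.out.println("still present after dropping the reference: " + (probe.get() != null));
    }
}
```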
Basic Process of Garbage Collection
The basic garbage collection process in Java has three stages:
1. Marking
2. Sweeping
3. Compaction
1. Marking
The first step in the garbage collection process is called marking. In this step, the GC goes through the memory and identifies which objects are still in use and which objects are not used.
2. Sweeping
In the sweeping step (normal deletion), objects that were identified as unused in the marking step are removed to free up space.
3. Compaction
Deleting unused objects fragments memory, so later allocations have to be served from free space that is not contiguous. That is why there is another step that moves the referenced objects together. This leaves the free space contiguous, which makes new memory allocation much easier and faster.
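The three steps can be sketched with a toy heap. This is only an illustration and not how the HotSpot VM is implemented: ToyHeap, Obj and the slot array are hypothetical names invented for this post, and each object occupies exactly one slot.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of mark, sweep and compact; an illustration only. */
final class ToyHeap {
    /** A toy object: a name, outgoing references, and a mark bit. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    final Obj[] slots;                          // null means a free slot
    final List<Obj> roots = new ArrayList<>();  // stands in for locals and statics

    ToyHeap(int size) { slots = new Obj[size]; }

    /** Step 1: mark every object reachable from the roots. */
    void mark() {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) continue;
            o.marked = true;
            pending.addAll(o.refs);
        }
    }

    /** Step 2: free every slot whose object was not marked. */
    void sweep() {
        for (int i = 0; i < slots.length; i++) {
            if (slots[i] != null && !slots[i].marked) slots[i] = null;
        }
    }

    /** Step 3: slide the survivors to the front so the free space is contiguous. */
    void compact() {
        int next = 0;
        for (int i = 0; i < slots.length; i++) {
            Obj o = slots[i];
            if (o == null) continue;
            slots[i] = null;
            o.marked = false;    // reset the mark bit for the next cycle
            slots[next++] = o;
        }
    }

    /** Renders the heap layout, using "." for a free slot. */
    String layout() {
        StringBuilder sb = new StringBuilder();
        for (Obj o : slots) sb.append(o == null ? "." : o.name);
        return sb.toString();
    }

    /** A is a root and refers to C; B refers to D, but nothing refers to B. */
    static ToyHeap sample() {
        ToyHeap h = new ToyHeap(6);
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C"), d = new Obj("D");
        h.slots[0] = a; h.slots[1] = b; h.slots[3] = c; h.slots[4] = d;
        a.refs.add(c);
        b.refs.add(d);
        h.roots.add(a);
        return h;
    }

    public static void main(String[] args) {
        ToyHeap h = sample();
        System.out.println(h.layout());   // AB.CD.
        h.mark();
        h.sweep();
        System.out.println(h.layout());   // A..C..  (fragmented)
        h.compact();
        System.out.println(h.layout());   // AC....  (contiguous free space)
    }
}
```

Note that D is deleted even though B still refers to it: what matters is whether an object can be reached from the roots, not whether any reference to it exists.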
Performance parameters for the GC
There are two objectives for any application in relation to garbage collection -
• Maximum pause time goal
• Throughput goal
Maximum Pause Time Goal
All minor garbage collections and major garbage collections are "Stop the World" events, which means that all the application threads are stopped until the garbage collection is completed. Therefore, the objective here is to minimize that pause time or to restrict it by placing an upper limit on it.
That is what this parameter does: the maximum pause time goal limits the longest of these pauses.
Note that the parallel collector provides a command line option to specify a maximum pause time goal; the G1 collector, covered later in this post, honors the same option.
Command line option
-XX:MaxGCPauseMillis=<nnn>
This option is a suggestion to the garbage collector that pause times of <nnn> milliseconds or less are desired.
Throughput Goal
Since garbage collection is a "Stop the World" event that stops all the application threads, the total time can be divided into -
• The time spent in garbage collection
• The application time
The throughput goal is measured in terms of the time spent collecting garbage versus the time spent outside of garbage collection.
Note that the parallel collector provides a command line option to specify a throughput goal.
Command line option
-XX:GCTimeRatio=<nnn>
The goal for the share of total time spent in garbage collection is 1 / (1 + <nnn>). For example, -XX:GCTimeRatio=9 sets a goal of 1/10 or 10% of the total time for garbage collection.
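The arithmetic behind the ratio, and a rough way to compare it with what a running JVM reports, can be sketched as follows. GcThroughput and its method names are made up for this post; the java.lang.management calls are the standard ones. The observed figure is only approximate, because the reported collection time is accumulated elapsed time per collector and, for concurrent collectors, is not purely pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcThroughput {
    /** Share of total time that -XX:GCTimeRatio=<n> allows for garbage collection. */
    static double gcTimeGoal(int gcTimeRatio) {
        return 1.0 / (1.0 + gcTimeRatio);
    }

    /** Approximate share of JVM uptime spent in garbage collection so far. */
    static double observedGcFraction() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();   // -1 when the collector does not report it
            if (t > 0) gcMillis += t;
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        System.out.println("goal for GCTimeRatio=9:  " + gcTimeGoal(9));
        System.out.println("goal for GCTimeRatio=19: " + gcTimeGoal(19));
        System.out.println("observed so far:         " + observedGcFraction());
    }
}
```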
Garbage collectors available in Java
The Java HotSpot VM has three different types of collectors -
1. Serial GC
2. Parallel GC also known as Throughput Collector.
3. Mostly Concurrent Collector – Java HotSpot offers two types of mostly concurrent collector.
o Concurrent Mark Sweep (CMS) Collector
o Garbage-First garbage collector (G1 collector)
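To see which of these collectors a running JVM actually selected, you can list the garbage collector MXBeans. This is a minimal sketch; ActiveCollectors is a name invented for this post, and the bean names it prints are implementation-specific. On HotSpot the serial collector usually shows up as "Copy" and "MarkSweepCompact", the parallel collector as "PS Scavenge" and "PS MarkSweep", and G1 as "G1 Young Generation" and "G1 Old Generation", but treat those names as typical rather than guaranteed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    /** Names of the collectors the running JVM exposes through JMX. */
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Usually one entry for the young generation and one for the old generation.
        System.out.println(names());
    }
}
```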
Serial Collector
The serial collector uses a single thread to perform minor and major collections. The serial collector is best suited to single-processor machines, because it cannot take advantage of multiprocessor hardware. As only a single thread performs garbage collection, the serial GC is more suitable for applications that do not have low pause time requirements.
Command Line Switches
The serial collector is selected by default on certain hardware (client machines) and operating system configurations. Serial collector can be explicitly enabled with the option
-XX:+UseSerialGC.
Parallel Collector
In the parallel collector (also known as the throughput collector), multiple threads are used for garbage collection.
The command line option to enable the parallel collector is -XX:+UseParallelGC.
By default, with this option, both minor and major collections run in parallel to further reduce garbage collection overhead.
The feature that allows the parallel collector to perform major collections in parallel is known as parallel compaction. If parallel compaction is not enabled, major collections are performed using a single thread.
The command line option to turn off parallel compaction is -XX:-UseParallelOldGC.
The parallel collector provides command line options to adjust the performance goals described above.
Command Line Switches
· Maximum Garbage Collection Pause Time: The maximum pause time goal is specified with the command-line option -XX:MaxGCPauseMillis=<N>.
· Throughput goal: The throughput goal is specified by the command-line option -XX:GCTimeRatio=<N>
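To confirm which goals a JVM was actually started with, you can read its input arguments at run time. The sketch below is one way to do that: GcFlags and numericFlag are hypothetical names, and keeping the last occurrence of a repeated flag is a choice made for this sketch rather than something taken from the JVM documentation.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.OptionalLong;

final class GcFlags {
    /** Finds -XX:<name>=<number> in a list of JVM arguments, if present. */
    static OptionalLong numericFlag(List<String> jvmArgs, String name) {
        String prefix = "-XX:" + name + "=";
        OptionalLong result = OptionalLong.empty();
        for (String arg : jvmArgs) {
            if (!arg.startsWith(prefix)) continue;
            try {
                // Keep the last occurrence seen (an assumption of this sketch).
                result = OptionalLong.of(Long.parseLong(arg.substring(prefix.length())));
            } catch (NumberFormatException e) {
                // Not a plain number, for example a value with a unit suffix; skip it.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("MaxGCPauseMillis: " + numericFlag(jvmArgs, "MaxGCPauseMillis"));
        System.out.println("GCTimeRatio:      " + numericFlag(jvmArgs, "GCTimeRatio"));
    }
}
```

Started with -XX:+UseParallelGC -XX:MaxGCPauseMillis=200 -XX:GCTimeRatio=19, for example, this reports 200 and 19; when a flag was not passed, the result is empty and the JVM uses its default.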
Concurrent Mark Sweep (CMS) Collector
The Concurrent Mark Sweep (CMS) collector, as the name suggests, performs most of its garbage collection work concurrently while the application is running. Because the application keeps running, pause times are low, but application throughput is affected because processor resources are shared with the collector.
This collector should be considered for any application with a low pause time requirement.
Like other available collectors the CMS collector is generational; thus both minor and major collections occur. The CMS collector attempts to reduce pause times due to major collections by using separate garbage collector threads to trace the reachable objects concurrently with the execution of the application threads. CMS (Concurrent Mark Sweep) garbage collection does not do compaction.
Pauses in CMS collector
The CMS collector pauses an application twice during a concurrent collection cycle. The first pause marks as live the objects that are directly reachable from the roots and from other parts of the heap. This first pause is known as the initial mark pause.
The second pause comes at the end of the concurrent tracing phase and finds objects that were missed by the concurrent tracing because application threads updated references in an object after the CMS collector had finished tracing that object. This second pause is known as the remark pause.
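Why the second pause is needed can be shown with a toy, single-threaded simulation in which the steps of the collector and the application are interleaved by hand. Everything here is hypothetical and simplified: RemarkSketch and Node are invented names, each node has a single reference, and the dirty flag stands in for whatever bookkeeping a real collector uses to remember writes to objects it has already traced.

```java
import java.util.Arrays;
import java.util.List;

final class RemarkSketch {
    static final class Node {
        final String name;
        Node ref;          // a single outgoing reference, to keep the toy small
        boolean marked;
        boolean dirty;     // set when 'ref' is written after this node was traced
        Node(String name) { this.name = name; }

        /** Application-side write; remembers the update if the node was already traced. */
        void setRef(Node target) {
            ref = target;
            if (marked) dirty = true;
        }
    }

    /** Marks a node and everything reachable from it that is not yet marked. */
    static void trace(Node n) {
        while (n != null && !n.marked) {
            n.marked = true;
            n = n.ref;
        }
    }

    /** Remark: trace again from the nodes that were written to after being traced. */
    static void remark(List<Node> all) {
        for (Node n : all) {
            if (n.dirty) {
                n.dirty = false;
                trace(n.ref);
            }
        }
    }

    /** Returns whether C is marked before and after the remark step. */
    static boolean[] run() {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C");
        b.setRef(c);                  // A and B are roots; initially B refers to C
        trace(a);                     // concurrent tracing visits A first
        a.setRef(c);                  // meanwhile the application moves the
        b.setRef(null);               // reference to C from B to A
        trace(b);                     // concurrent tracing then visits B
        boolean before = c.marked;    // C was missed although A refers to it
        remark(Arrays.asList(a, b, c));
        return new boolean[] { before, c.marked };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("C marked before remark: " + r[0]);   // false
        System.out.println("C marked after remark:  " + r[1]);   // true
    }
}
```

Without the remark step, C would be treated as garbage even though the application can still reach it through A.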
Command line options
• The command line option to enable the CMS collector is -XX:+UseConcMarkSweepGC.
• The command line option to set the number of threads is -XX:ParallelCMSThreads=<n>.
Garbage-First Garbage Collector
The Garbage-First (G1) garbage collector is a server-style garbage collector that is suited to multiprocessor machines with large memories. The G1 garbage collector tries to keep garbage collection (GC) pause times short while still achieving high throughput.
It does this by trying to adhere to a pause time goal, which is set using the flag MaxGCPauseMillis.
Technique used by G1 collector
The technique used by the G1 collector to meet its throughput and pause time goals is explained below.
The G1 collector partitions the heap into a set of equal-sized heap regions. Initially, G1 performs a concurrent global marking across the whole heap to determine which objects are still referenced and which are not. Once the marking is done, G1 knows which regions are mostly empty. It collects those mostly empty regions first, hence the name Garbage-First. By collecting this way, G1 frees a large amount of space while working on only a small part of the heap.
G1 tries to meet the specified pause time target (defined by the MaxGCPauseMillis flag) using a pause prediction model. It calculates how many regions can be collected within the pause time target and collects only those regions.
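The idea of collecting the regions with the most garbage first, within a pause budget, can be sketched as below. This is a deliberately simplified illustration and not G1's actual algorithm: GarbageFirstSketch, Region and the linear cost model (pause time proportional to the live data that has to be copied out of a region) are all assumptions made for this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class GarbageFirstSketch {
    static final class Region {
        final int id;
        final int liveKb;      // data that must be copied out if the region is collected
        final int garbageKb;   // space reclaimed by collecting the region
        Region(int id, int liveKb, int garbageKb) {
            this.id = id;
            this.liveKb = liveKb;
            this.garbageKb = garbageKb;
        }
    }

    /** Picks regions with the most garbage first while the predicted pause fits the goal. */
    static List<Region> chooseCollectionSet(List<Region> regions,
                                            double pauseGoalMs,
                                            double msPerLiveKb) {
        List<Region> candidates = new ArrayList<>(regions);
        candidates.sort(Comparator.comparingInt((Region r) -> r.garbageKb).reversed());

        List<Region> chosen = new ArrayList<>();
        double budgetMs = pauseGoalMs;
        for (Region r : candidates) {
            double predictedMs = r.liveKb * msPerLiveKb;   // hypothetical cost model
            if (predictedMs <= budgetMs) {
                chosen.add(r);
                budgetMs -= predictedMs;
            }
        }
        return chosen;
    }

    /** Four equal-sized regions of 1024 KB with different amounts of live data. */
    static List<Region> sampleRegions() {
        return Arrays.asList(
            new Region(0, 100, 924),
            new Region(1, 900, 124),
            new Region(2, 50, 974),
            new Region(3, 500, 524));
    }

    public static void main(String[] args) {
        // Pause goal of 2 ms, assuming 0.01 ms to copy each KB of live data.
        for (Region r : chooseCollectionSet(sampleRegions(), 2.0, 0.01)) {
            System.out.println("collect region " + r.id + ", reclaiming " + r.garbageKb + " KB");
        }
    }
}
```

With these numbers, regions 2 and 0 are chosen: they are cheap to collect and return most of their space, while regions 3 and 1 are left for a later collection because they would exceed the pause budget.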
G1 is generational in a logical sense
As already said, the heap is partitioned into a set of equal-sized heap regions. A set of empty regions is designated as the logical young generation. Objects are allocated from this logical young generation, and the young generation (that set of heap regions) is collected when it is full. In some cases, regions outside the set of young regions (old regions, that is, the tenured generation) can be collected at the same time. This is called a mixed collection.
The G1 collector also compacts memory by copying live objects into selected, initially empty regions.