Profile Image
rakesh

Performance tunning recommendation for Old GC

Hi Team,

As per the attached report full GC occuring very frequently and there is long pauses. In Old Gen interractive graph before GC and after GC line graph is overlapping. it means Old GC not clearing any heap memory. 

 

Can you please try to analyze this report and give your expert recommendation to mitigate this issue.

 

Regards

Rakesh



Report URL - https://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMjEvMDEvMjgvLS1nY19sb2dfMjhKYW4wOTA0XzEubG9nLS0yMy0zOC0yOA==&channel=WEB

    Please Sign In or to post your comment or answer

    Profile Image

    sainath

    Hi Rakesh,

     

    By analyzing the attached report I can say that your application is suffering from long-running GC events. 233 GC events took more than 5.0 seconds. Long-running GCs are unfavorable for the application's performance. 

     

    Below are my observations and recommendations:

     

    1. If you observe the Object stats section in the below image from the GC report, the avg object creation rate is 187.23 Mb/sec, This might be quite high. If it is very high, the garbage collection rate will also be very high. A high garbage collection rate will increase the GC pause time as well. Thus, optimizing the application to create fewer objects is THE EFFECTIVE strategy to reduce long GC pauses. This might be a time-consuming exercise, but it is 100% worth doing. In order to optimize the object creation rate in the application, you can consider using java profilers like JProfiler, YourKit, JVisualVM...). 

     

     

    2. You are setting -XX:MaxGCPauseMillis to 500 milliseconds. If you can reduce this number then JVM will try its best to meet it to reduce the pause time. 

     

    3. You have kept the JVM heap size above 16GB. Large heap size can also cause long GC pauses. If the heap size is quite high, then more garbage will be get accumulated in the heap. When Full GC is triggered to evict all the accumulated garbage in the heap, it will take a long time to complete. Logic is simple: If you have a small can full of trash, it's going to be quick and easy to dispose of them. On the other hand, if you have a truckload of trash, it's going to take more time to dispose of them.

     

    4. Your CPU is spending more time in system calls within the kernel, as opposed to library code, which is still running in user-space. In 4 GC event(s), 'sys' time is greater than 'usr' time. It's not a healthy sign.

     

     

     

     

    Profile Image

    rakesh

    Hi Team,

    Thanks for your quick investigation.

    It means we have to reduce the -XX:MaxGCPauseMillis from 500 to some lower value.

     

    and allocated heap memory should also need to be decreased from 16 GB to some lower value.

     

    I read one more question in ycrash:

    https://answers.ycrash.io/question/any-performance-tuning-recommendation-?q=249

     

    There was one remommendation of using XX:G1HeapRegionSize

     

    is this value will help here?

     

    Regards

    Rakesh

    Profile Image

    Ram Lakshmanan

    Hello Rakesh!

     

     You are experiencing long GC pauses because of 'Last ditch collection'. Below is the excerpt from your GC log report:

     

     

     

    You can see there are 140 'Last ditch collection' GC events and they are averaging pause time of '7 seconds 994 milliseconds'. 

     

    When JVM is unable to create objects in metaspace then it first runs 'Metadata GC Threshold' GC event. If it still doesn't help, then it will attempt to expand metaspace. If that also doesn't help then 'Last Ditch Collection' GC event is triggered. 

     

    So in your case problem is in metaspace and not in old gen. Below is your metaspace GC usage graph:

     

     

    You can see how frequently GC events are triggered in metaspace. Currently you are setting -XX:MaxMetaspaceSize as 512mb. May be if you can set '-XX:MetaspaceSize=256m', metaspace GC events might get triggerred much earlier. At that point, since there is going to be less objects in metaspace, reclamation will be quicker. Your metaspace GC pause times might get smaller than 4 seconds. If it didn't yield favourable results we can try '-XX:MinMetaspaceFreeRatio' property. For more details refer to this article. This is the exact feedback, I had given for your earlier question as well. 

    Profile Image

    rakesh

    Thanks Ram,

    For responding, even in my last question you recommended to reduce the metaspace size from 512 mb to 256 mb. that change we tried on 15th of january and we ran our system for 3 days but therafter GC report was not good. I am attaching that report here:

    https://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMjEvMDEvMjkvLS1nY19sb2dfMThKYW4xMTU2XzIubG9nLS00LTE2LTM3&channel=WEB

     

     

    In that report Last ditch collection and Metadata GC threshold was the main culprit of long pauses. 

     

    We are not able to understand what else property we have to change to fine tune the G1GC setting.

    Your help is much appreciated if you suggest any recommendation.

     

    Regards

    Rakesh

     

    Profile Image

    Ram Lakshmanan

    Rakesh - I recommended adding new property 'MetaspaceSize' with value 256MB, I didn't ask to lower 'MaxMetaspaceSize' from 512mb to 256MB. They are two different properties. Lowering 'MaxMetaspaceSize' in your case, will only degrade the performance.

    Profile Image

    rakesh

    Thanks Ram,

    We will try to add one more property 'MetaspaceSize' with value 256MB thereafter I will share the stats accordingly.

     

    Regards

    Rakeh

    Profile Image

    rakesh

    Hi Ram,

    One more query, Sorry to bother you, should we also need to change  -XX:MaxGCPauseMillis from 500 to 200 as suggested by sainath?

     

    Regards

    Rakesh

    Profile Image

    Ram Lakshmanan

    Hello Rakesh! '-XX:MaxGCPauseMillis' is a soft goal that you are asking the Garbage collector to meet. Garbage collector will only try to meet, but there is no guarantee it will be meet. Besides that even though you have configured 500ms as '-XX:MaxGCPauseMillis', even that goal isn't meet. So let's first play with the metaspace setting first and then make a decision. 

    Profile Image

    rakesh

    Thanks Ram.

    Got something else on mind? Post Your Question