Maksim Solovjov

ZGC: Allocation Stall issue

Hi there

 

We are using ZGC, and it is generally working great (compared to G1).

However, we sometimes experience Allocation Stall events.

 

They aren't too big, but they are still bothersome.

 

Could you take a look at the report and suggest something to use?

 

What especially bothers me is that there is no increase in Allocation Rate during the Allocation Stall events.

In fact, there seems to be a big decrease in Allocation Rate, but that is probably caused by the Allocation Stall itself.

(Please take a look at 01:25 and 01:55.)

 

The second thing that bothers me is the Heap Before GC and Heap After GC graphs.

They are also somewhat counterintuitive. At the times of the Allocation Stalls, the Heap After GC graph still shows a big spike, whereas the Heap Before GC graph shows a big drop.

 

 

P.S.

We already tried playing around with -XX:ZFragmentationLimit, but it doesn't seem to help.

 



Report URL - https://gceasy.io/my-gc-report.jsp?p=YXJjaGl2ZWQvMjAyMS8wOS8yNy8tLXByb2R1Y3Rpb24tc2VydmVyLTItOTMtMTBidGVzdC0yN1NlcDIwMjEtZ2MubG9nLS0xMi00OS00NQ==&channel=WEB

  • zgc

  • allocationstallissue

  • allocationrate

  • heapbeforegc

  • heapaftergc

  • cpuspike

  • fragmentationlimit


Ram Lakshmanan

Hello Maksim!

 

Greetings. Interesting data points. Here are my initial thoughts on them.

 

Allocation Stall vs Allocation Rate

As you have pointed out, allocation stalls are happening at 01:25am, 01:55am and 03:16pm, as indicated in the graph below. The red triangles indicate the clusters of allocation stalls happening around these time frames.

 

 

During these exact time frames, the allocation rate (in other words, the object creation rate) is degrading, as shown in the graph below:

 

 

An allocation stall is a mechanism employed by the JVM to stall the application so that new objects are not created until the GC has freed up memory. You might see great GC throughput and GC pause time behaviour because of allocation stalls; however, they will degrade the application's overall response time. Can you check what your application's response time was during the time frames in which allocation stalls were happening, i.e. around 01:25am, 01:55am and 03:16pm? Allocation stalls are more pervasive in ZGC than in other GC algorithms. This is the main side effect I am seeing with ZGC.
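To line the stalls up with response-time measurements, it can help to pull them out of the raw GC log. The following is only a minimal sketch: it assumes stall events appear on lines shaped like `Allocation Stall (<thread name>) <duration>ms`, which you should verify against your own log. The class name, method names and sample lines are made up for illustration.

```java
import java.util.List;
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AllocationStallScan {
    // Assumed line shape: "... Allocation Stall (<thread name>) <millis>ms".
    // Check this against your own GC log before relying on it.
    private static final Pattern STALL =
            Pattern.compile("Allocation Stall \\((.*)\\) (\\d+(?:\\.\\d+)?)ms");

    // Returns the stall duration in milliseconds, or empty if the line is not a stall event.
    public static OptionalDouble stallMillis(String line) {
        Matcher m = STALL.matcher(line);
        return m.find()
                ? OptionalDouble.of(Double.parseDouble(m.group(2)))
                : OptionalDouble.empty();
    }

    // Sums the duration of every stall event found in the given log lines.
    public static double totalStallMillis(List<String> lines) {
        return lines.stream()
                .map(AllocationStallScan::stallMillis)
                .filter(OptionalDouble::isPresent)
                .mapToDouble(OptionalDouble::getAsDouble)
                .sum();
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, not taken from the report above.
        List<String> sample = List.of(
                "[2021-09-27T01:25:28.100+0000][info][gc] Allocation Stall (worker-1) 12.500ms",
                "[2021-09-27T01:25:28.200+0000][info][gc] GC(42) Garbage Collection (Allocation Stall) 1024M(50%)->512M(25%)",
                "[2021-09-27T01:25:29.000+0000][info][gc] Allocation Stall (worker-2) 7.500ms");
        System.out.println(totalStallMillis(sample)); // prints 20.0
    }
}
```

The second sample line is a GC cycle whose cause is an allocation stall, not a stall event itself, so the pattern deliberately skips it. Summing per thread or per minute would be a small extension of the same idea.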

 

Heap usage (before GC) & Heap usage (after GC) graph disconnect

Actually, there is no disconnect. In fact, it makes perfect sense. If you look closely, the spike in the 'Heap usage (after GC)' graph happens a few seconds before the drop in the 'Heap usage (before GC)' graph. The spike is reported at 01:25:03am and 01:25:21am in the 'Heap usage (after GC)' graph, whereas the drop in the 'Heap usage (before GC)' graph happens from 01:25:28am to 01:26:03am. It is because of this spike in heap usage that allocation stalls are kicking in. You can zoom into the graph and study the time frame more closely.

 

 


Ram Lakshmanan

Hello Maksim!

 

Here are some general tips to reduce allocation stalls:

 

a) Increase the number of concurrent GC threads (-XX:ConcGCThreads). Absent other changes, it probably needs 10 or 12 concurrent GC threads.
b) Increase the size of the Java heap to give ZGC additional headroom.
c) Change the application to reduce either the amount of live data or the allocation rate.
d) Consider passing the '-XX:+UseLargePages' JVM argument.
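Before changing anything, it is worth confirming which of these settings the running JVM actually received. Here is a small sketch that reads the JVM's input arguments through the standard `RuntimeMXBean` and picks out the flags relevant to tips a), b) and d). The class and method names are hypothetical, and it only sees flags passed explicitly on the command line, not defaults the JVM chose on its own.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ZgcFlagCheck {
    // Extracts the explicitly passed tuning flags from a list of JVM arguments.
    public static Map<String, String> findTuningFlags(List<String> jvmArgs) {
        Map<String, String> found = new LinkedHashMap<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-XX:ConcGCThreads=")) {
                // Tip a): number of concurrent GC threads
                found.put("ConcGCThreads", arg.substring("-XX:ConcGCThreads=".length()));
            } else if (arg.startsWith("-Xmx")) {
                // Tip b): maximum heap size
                found.put("Xmx", arg.substring("-Xmx".length()));
            } else if (arg.equals("-XX:+UseLargePages")) {
                // Tip d): large pages enabled
                found.put("UseLargePages", "true");
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with (excluding main-method args).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println(findTuningFlags(jvmArgs));
    }
}
```

A flag missing from the result means it was not passed explicitly, so the JVM is using its own default for that setting.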

 

These tips are drawn from the following references:

 

https://bugs.openjdk.java.net/browse/JMC-6203
https://stackoverflow.com/questions/61923094/getting-allocation-stall-when-enabling-zgc
https://wiki.openjdk.java.net/display/zgc/Main#Main-EnablingLargePagesOnLinux

 


Kousika M

Hello Maksim Solovjov,

To better understand the concept of allocation stalls, I recommend checking out this blog post: "Allocation Stalls in Java ZGC: Causes and Solutions".

 

This article dives into the reasons behind allocation stalls in Java's Z Garbage Collector (ZGC), which is designed for low-latency garbage collection. It explains how allocation stalls happen when the JVM temporarily pauses the application due to memory allocation issues. The blog further outlines several causes and solutions, including memory tuning, JVM parameter adjustments, and code optimizations, to mitigate these stalls and improve application performance.

 

 
