David

How do I improve the GC performance of a Spark Streaming application?

I have a streaming job built on Spark Structured Streaming. The job ran normally, but memory usage on the workers kept increasing over time. While looking for a solution I came across G1GC and enabled it on the Spark workers, and memory usage is now stable. I am wondering what else I can do to improve GC performance, so I am sharing the GC logs here. I would appreciate any help interpreting them.



Report URL - https://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMjEvMTEvMTYvLS0xNnhteC0zNWlob3AtNGNvbmMtMzVtYXgtd29ya2VyLnppcC0tMy0yNy0z&channel=WEB

  • gc-log

  • spark-structured-streaming

  • g1gc

  • jvmarguments



Ankita

Hi David,

 

1) Poor throughput: Your application's GC throughput is 97.941%. Learn more about the consequences of poor GC throughput here.
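To put that percentage in concrete terms, here is a minimal Java sketch of the arithmetic. It assumes throughput means the share of elapsed time the application is not paused for GC. The class and method names are hypothetical, and the one-hour window is an arbitrary choice for illustration.

```java
// Hypothetical helper illustrating the arithmetic behind a GC throughput figure.
final class GcThroughput {

    /** Throughput (%) = share of elapsed time not spent in GC pauses. */
    static double throughputPercent(double totalPauseSeconds, double elapsedSeconds) {
        return 100.0 * (1.0 - totalPauseSeconds / elapsedSeconds);
    }

    /** Seconds spent in GC pauses over a window, given a throughput percentage. */
    static double pauseSeconds(double throughputPercent, double windowSeconds) {
        return windowSeconds * (1.0 - throughputPercent / 100.0);
    }

    public static void main(String[] args) {
        // 97.941 is the figure quoted from the report; 3600 s is an assumed window.
        double lost = pauseSeconds(97.941, 3600.0);
        System.out.printf("GC pause time per hour: %.1f s%n", lost);
        System.out.printf("Throughput recomputed: %.3f%%%n", throughputPercent(lost, 3600.0));
    }
}
```

Under that assumption, 97.941% throughput works out to roughly 74 seconds of GC pause time per hour of running.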

 

2) High object creation rate: In the Object Stats section of the GC report, the average object creation rate is 1.31 GB/sec, which is very high. A high object creation rate drives a high garbage collection rate, which in turn increases GC pause time. Optimizing the application to create fewer objects is therefore THE EFFECTIVE strategy to reduce long GC pauses. This can be a time-consuming exercise, but it is 100% worth doing. To find where objects are being created, consider using a Java profiler (JProfiler, YourKit, JVisualVM...).
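Before reaching for a full profiler, a rough check of how much a suspect code path allocates can be done from inside the JVM. The sketch below is not taken from the report. It assumes a HotSpot-based JVM, where the platform `ThreadMXBean` can be cast to `com.sun.management.ThreadMXBean`, and the class name `AllocationProbe` and the sample workload are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: measures bytes allocated by the calling thread while a task runs.
final class AllocationProbe {

    // HotSpot-specific extension of the standard ThreadMXBean (assumption: HotSpot JVM).
    private static final com.sun.management.ThreadMXBean THREADS =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();

    /** Returns the number of bytes the calling thread allocated while running the task. */
    static long allocatedBytes(Runnable task) {
        long threadId = Thread.currentThread().getId();
        long before = THREADS.getThreadAllocatedBytes(threadId);
        task.run();
        return THREADS.getThreadAllocatedBytes(threadId) - before;
    }

    public static void main(String[] args) {
        // The list keeps the buffers reachable so the allocations cannot be optimized away.
        List<byte[]> sink = new ArrayList<>();
        long bytes = allocatedBytes(() -> {
            for (int i = 0; i < 10; i++) {
                sink.add(new byte[1024 * 1024]); // 1 MiB per iteration
            }
        });
        System.out.printf("Allocated about %.1f MiB%n", bytes / (1024.0 * 1024.0));
    }
}
```

Wrapping a suspect transformation in `allocatedBytes` gives a per-call figure that can be compared before and after a change. It only counts allocations made on the calling thread, so a profiler is still the better tool for a whole-application view.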

 

 


Ram Lakshmanan

Hello David!

 

 Given your 16 GB heap size, your average GC pause time (45.5 ms) and max GC pause time (620 ms) are pretty good. Your application's GC throughput is 97.941%. I think there might be room to improve it.

 

 Here are the next steps:
 
 1. Can you share the JVM arguments that you are passing to your application? I can suggest whether any quick wins can be made here.
 
 2. Here is a video clip that gives a pretty good introduction to GC tuning. It might be of help to you.
 
 3. Here are a few good tips to tune G1 GC performance.
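
 For step 1, one way to collect the JVM arguments is to ask the running JVM itself. The following is a minimal sketch using the standard `java.lang.management` API. The class name `JvmArgs` is hypothetical, and it would need to run inside the process whose flags you want to see.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: lists the arguments the current JVM was started with.
final class JvmArgs {

    /** Returns the JVM input arguments (for example -Xmx or -XX flags), excluding main-method arguments. */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        // Print one argument per line so the output can be pasted into a reply.
        inputArguments().forEach(System.out::println);
    }
}
```

 If adding code to the job is not convenient, running `jcmd <pid> VM.command_line` against the worker process should give similar information.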
