Hi Paul,
Which Lettuce client version are you using?
We have a service built on Reactor that performs lookups against Redis through a Lettuce client. Once we get beyond a certain volume of requests per second, our response time spikes. We hit our connection timeouts, which is a very long time for a Redis call. We have metrics on the cache Get, which seems to be the origin of the problem. I'm trying to understand whether the Redis requests are waiting on each other. They should be non-blocking.
Hello Paul!
Greetings.
Did you capture this thread dump while your response time was spiking? I see only 90 threads, and none of them are in the BLOCKED state. You should capture at least 3 thread dump snapshots, 10 seconds apart, while the response time is spiking. Capturing the data at the right point in time is critical.
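If it helps, here is a minimal sketch of the "3 snapshots, 10 seconds apart" idea using only the JDK's `ThreadMXBean`. It runs inside the JVM it is sampling and prints a per-state thread count for each snapshot, so you can see quickly whether anything shows up as BLOCKED. The class name `ThreadDumpSampler` and its methods are made up for illustration, and this is not a replacement for a full dump from `jstack` or a similar tool, since it does not print stack traces.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper (name is an assumption): samples this JVM's own threads.
public class ThreadDumpSampler {

    // Take one in-process snapshot of all live threads.
    static ThreadInfo[] snapshot() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // false, false: skip locked monitor/synchronizer details to keep it cheap
        return mx.dumpAllThreads(false, false);
    }

    // Count how many threads are in each Thread.State (RUNNABLE, BLOCKED, ...).
    static Map<Thread.State, Integer> countByState(ThreadInfo[] infos) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // defensive: skip any missing entry
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        int snapshots = 3;
        // Default gap is 10 seconds; an optional first argument overrides it (in ms).
        long gapMillis = args.length > 0 ? Long.parseLong(args[0]) : 10_000L;
        for (int i = 1; i <= snapshots; i++) {
            ThreadInfo[] infos = snapshot();
            System.out.println("snapshot " + i + ": " + infos.length
                    + " threads, by state: " + countByState(infos));
            if (i < snapshots) {
                Thread.sleep(gapMillis);
            }
        }
    }
}
```

In a real service you would trigger the sampling from the running application (for example from a scheduled task or an admin endpoint) while the response time is high, rather than from a standalone `main`.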
There could be several reasons why the response time is spiking. Some of them are:
:
:
So a thread dump alone is not enough to diagnose the problem. Besides thread dumps, you may have to capture other logs and artifacts to do a thorough analysis.
You can use the open source yCrash script, which captures 360-degree application-level artifacts (like GC logs, 3 snapshots of thread dumps, heap dumps) and system-level artifacts (like top, top -H, netstat, vmstat, iostat, dmesg, disk usage, kernel parameters...). Once you have this data, you can either analyze it manually or upload it to the yCrash tool, which analyzes all these artifacts and generates one unified root cause analysis that correlates them. It can indicate the root cause of the problem.