stt123

My application stops responding. I captured this thread dump, but I'm not sure what the issue is.


  • threaddump

  • applicationstopresponding

  • threadsarestuckwaiting


Mahesh

Hi, 

 

Can you share your report link?



stt123

https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjMvMDQvMTgvdGhyZWFkZHVtcDIuemlwLS00LTQ2LTQw&;


Ankita

Hi stt123,

 

Your application has a total thread count of 960, which is very high.

 

1) A total of 446 threads are present in the EE thread group. If you check the stack traces of these threads, more than 90% of the threads in this pool are not doing any work. Consider resizing this thread pool.
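For reference, here is a minimal sketch of a bounded, self-shrinking pool using java.util.concurrent. All of the sizes are illustrative assumptions, not measured values, and if the EE thread group is managed by your application server, you would resize it through the server's configuration rather than in code:

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizingSketch {
    public static void main(String[] args) {
        // Illustrative sizes only: pick corePoolSize from observed
        // steady-state concurrency, not from the current 446-thread setup.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                50,                                 // corePoolSize: steady-state workers
                100,                                // maximumPoolSize: burst ceiling
                60L, TimeUnit.SECONDS,              // reclaim threads idle beyond 60s
                new LinkedBlockingQueue<>(1_000));  // bounded backlog instead of more threads
        pool.allowCoreThreadTimeOut(true);          // let idle core threads retire too

        pool.execute(() ->
                System.out.println("work ran on " + Thread.currentThread().getName()));
        pool.shutdown();
    }
}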

 

 

2) 12 threads are stuck waiting for a response from an external system. This can slow down transactions. Here is the stack trace of the threads.

stackTrace:
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(java.base@11.0.18/Native Method)
at java.net.SocketInputStream.socketRead(java.base@11.0.18/SocketInputStream.java:115)
at java.net.SocketInputStream.read(java.base@11.0.18/SocketInputStream.java:168)
at java.net.SocketInputStream.read(java.base@11.0.18/SocketInputStream.java:140)
at sun.security.ssl.SSLSocketInputRecord.read(java.base@11.0.18/SSLSocketInputRecord.java:484)
at sun.security.ssl.SSLSocketInputRecord.readHeader(java.base@11.0.18/SSLSocketInputRecord.java:478)
at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(java.base@11.0.18/SSLSocketInputRecord.java:70)
at sun.security.ssl.SSLSocketImpl.readApplicationRecord(java.base@11.0.18/SSLSocketImpl.java:1455)
at sun.security.ssl.SSLSocketImpl$AppInputStream.read(java.base@11.0.18/SSLSocketImpl.java:1066)
at com.ibm.mq.jmqi.remote.impl.RemoteTCPConnection.receive(RemoteTCPConnection.java:1667)
at com.ibm.mq.jmqi.remote.impl.RemoteRcvThread.receiveBuffer(RemoteRcvThread.java:733)
at com.ibm.mq.jmqi.remote.impl.RemoteRcvThread.receiveOneTSH(RemoteRcvThread.java:699)
at com.ibm.mq.jmqi.remote.impl.RemoteRcvThread.run(RemoteRcvThread.java:138)
at com.ibm.msg.client.commonservices.workqueue.WorkQueueItem.runTask(WorkQueueItem.java:314)
at com.ibm.msg.client.commonservices.workqueue.SimpleWorkQueueItem.runItem(SimpleWorkQueueItem.java:99)
at com.ibm.msg.client.commonservices.workqueue.WorkQueueItem.run(WorkQueueItem.java:338)
at com.ibm.msg.client.commonservices.workqueue.WorkQueueManager.runWorkQueueItem(WorkQueueManager.java:312)
at com.ibm.msg.client.commonservices.j2se.workqueue.WorkQueueManagerImplementation$ThreadPoolWorker.run(WorkQueueManagerImplementation.java:1227)
Locked ownable synchronizers:
- <0x00000007311d0ec0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

You can check the link below for tips on solving this issue.

https://blog.fastthread.io/2018/09/02/threads-stuck-in-java-net-socketinputstream-socketread0/
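The general remedy from that article is to set explicit connect and read timeouts so a thread can never block indefinitely inside socketRead0. Here is a minimal sketch; the hostname, port, and timeout values are illustrative assumptions, and for library-owned connections like the IBM MQ receiver threads above, the equivalent knobs (if any) live in the client library's configuration rather than in your code:

import java.net.InetSocketAddress;
import java.net.Socket;

public class SocketTimeoutSketch {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket()) {
            // Fail fast if the endpoint is unreachable instead of hanging on connect
            socket.connect(new InetSocketAddress("external-system.example.com", 443), 5_000);
            // Without a read timeout, a read parks forever in socketRead0 when the
            // remote side goes silent; with it, SocketTimeoutException is thrown
            // after 30 seconds and the thread is released.
            socket.setSoTimeout(30_000);
            int firstByte = socket.getInputStream().read(); // now bounded by the timeout
            System.out.println("first byte: " + firstByte);
        }
    }
}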

 

 

 


Ram Lakshmanan

Hello!

 

I agree with the points Ankita mentioned. Those two concerns need to be addressed. However, it's not clear to me whether addressing those two concerns alone will resolve the application unresponsiveness you are experiencing.

 

There could be several reasons why an application becomes unresponsive. Some of the reasons are:

 

  • Garbage collection pauses
  • Threads getting BLOCKED (see the sketch after this list)
  • Network connectivity
  • Load balancer routing issue
  • Heavy CPU consumption of threads
  • Operating System running with old patches
  • Memory Leak
  • DB not responding properly
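For the BLOCKED-threads case in particular, the JVM can be inspected in-process with ThreadMXBean. A minimal sketch, assuming you can run code inside the affected JVM:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;

public class BlockedThreadCheck {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();

        // Deadlock detection: returns null when no threads are deadlocked
        long[] deadlocked = mx.findDeadlockedThreads();
        if (deadlocked != null) {
            for (ThreadInfo info : mx.getThreadInfo(deadlocked, Integer.MAX_VALUE)) {
                System.out.println("Deadlocked: " + info.getThreadName()
                        + " waiting on " + info.getLockName());
            }
        }

        // Count threads currently BLOCKED waiting to enter a monitor
        long blocked = Arrays.stream(mx.dumpAllThreads(false, false))
                .filter(t -> t.getThreadState() == Thread.State.BLOCKED)
                .count();
        System.out.println("BLOCKED threads: " + blocked);
    }
}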


 

So a thread dump alone is not enough to diagnose the problem. You have captured only a thread dump, and only one snapshot of it. It's always a good practice to capture 3 thread dumps with a gap of 10 seconds between each one. Besides thread dumps, you might have to capture other logs/artifacts to do a thorough analysis.
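In practice you would take these snapshots from outside the process, e.g. by running jstack <pid> three times. As a self-contained illustration, here is a minimal sketch that takes 3 in-process snapshots 10 seconds apart:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpCapture {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 1; i <= 3; i++) {
            System.out.println("=== Thread dump " + i + " at " + java.time.Instant.now() + " ===");
            // true, true: include locked monitors and ownable synchronizers
            for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
                System.out.print(info); // note: toString() truncates very deep stacks
            }
            if (i < 3) {
                Thread.sleep(10_000); // 10-second gap between snapshots
            }
        }
    }
}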

                                                          

You can use the open source yCrash script, which will capture 360-degree application-level artifacts (like GC logs, 3 snapshots of thread dumps, heap dumps) and system-level artifacts (like top, top -H, netstat, vmstat, iostat, dmesg, disk usage, kernel parameters...). Once you have this data, you can either analyze it manually or upload it to the yCrash tool, which will analyze all these artifacts and generate a root cause analysis report. It has the potential to indicate the root cause of the problem.
