akashjain0802

RUNNABLE threads stack trace shows locked

Hello,
My client's Spring Kafka application often becomes unresponsive after running for a few hours. Restarting the application fixes the issue, but only temporarily; it reappears later. What is wrong here?
I also know that this application has been designed to have multiple Kafka consumers, a producer, a database JDBC client and a REST API client, and it exposes an API. All of this is housed in a single application, i.e. a single JVM at runtime. I feel that breaking this application down into a more modular architecture would be a better design approach.
I see that quite a lot of the KafkaConsumer threads are in the RUNNABLE state. However, their stack traces show them as 'locked' on some object. Does that mean those threads are not actually runnable?
As a quick fix, will it help if the number of threads is reduced, say from the current 800+ to 10?


Report URL - https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjQvMDcvMy90aHJlYWRfZHVtcF91c3BfMjM5MjhkZWQzNmE4XzJfMjgwNjIwMjRfMS50eHQtLTUtNDYtMzc=


    Kousika M

    Hello Akashjain,

    Greetings!

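    A thread reported as RUNNABLE is not waiting for a lock; a thread that is waiting to acquire a monitor held by another thread is reported as BLOCKED. The "- locked <0x...>" lines in a stack trace list the monitors that the thread itself currently holds, not monitors it is waiting for. In the traces below, each thread holds the internal locks of its own selector and is sitting in the native EPoll.wait call, which means it is waiting for network I/O. The JVM reports a thread inside a native call as RUNNABLE even when it is idle at the operating-system level, so most of these threads are most likely idle rather than doing work.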
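
    To see how a thread dump distinguishes holding a lock from waiting for one, a small experiment like the sketch below can help. It is not taken from the application in question, and the class, method and thread names are made up for illustration. One thread takes a monitor and keeps running, so it would show a "- locked <...>" line in a dump, while a second thread tries to enter the same monitor.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class, not part of the application discussed here.
class LockedVsBlockedDemo {

    // Returns { state of the thread holding the monitor, state of the thread waiting for it }.
    static Thread.State[] observeStates() {
        final Object monitor = new Object();
        final CountDownLatch holderOwnsMonitor = new CountDownLatch(1);
        final AtomicBoolean release = new AtomicBoolean(false);

        // Holds the monitor and keeps spinning until it is told to release it.
        Thread holder = new Thread(() -> {
            synchronized (monitor) {
                holderOwnsMonitor.countDown();
                while (!release.get()) {
                    Thread.onSpinWait();
                }
            }
        }, "demo-holder");

        // Tries to enter the same monitor and has to wait for the holder.
        Thread waiter = new Thread(() -> {
            synchronized (monitor) {
                // Nothing to do; entering the monitor is the point.
            }
        }, "demo-waiter");

        try {
            holder.start();
            holderOwnsMonitor.await();      // the holder now owns the monitor
            waiter.start();
            while (waiter.getState() != Thread.State.BLOCKED) {
                Thread.sleep(1);            // wait until the waiter is stuck on the monitor
            }
            Thread.State[] observed = { holder.getState(), waiter.getState() };
            release.set(true);              // let both threads finish
            holder.join();
            waiter.join();
            return observed;
        } catch (InterruptedException e) {
            release.set(true);
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Thread.State[] states = observeStates();
        System.out.println("holder=" + states[0] + ", waiter=" + states[1]);
    }
}
```

    The holder should be reported as RUNNABLE and the waiter as BLOCKED. In a thread dump the waiter would carry a "- waiting to lock <...>" line, which is the entry to look for when hunting for lock contention.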


    As you mentioned, reducing the thread count is likely to help relieve this issue. However, make sure that the reduced number of threads can still handle your expected load.


    Apart from the RUNNABLE state, the report also pointed out two problems. It shows 249 threads executing a native method, which the report flags as something that can slow down the process and make the application unresponsive. Please refer to the attached screenshot and the stack traces of a few of these threads below:

     

    stackTrace:
    java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPoll.wait(java.base@17-ea/Native Method)
    at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@17-ea/EPollSelectorImpl.java:118)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@17-ea/SelectorImpl.java:129)
    - locked <0x0000000742a8fba0> (a sun.nio.ch.Util$2)
    - locked <0x0000000742a8fb20> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(java.base@17-ea/SelectorImpl.java:141)
    at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:758)
    at java.lang.Thread.run(java.base@17-ea/Thread.java:831)

     

    stackTrace:
    java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPoll.wait(java.base@17-ea/Native Method)
    at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@17-ea/EPollSelectorImpl.java:118)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@17-ea/SelectorImpl.java:129)
    - locked <0x0000000741bed800> (a sun.nio.ch.Util$2)
    - locked <0x0000000741bed7b0> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(java.base@17-ea/SelectorImpl.java:141)
    at org.apache.kafka.common.network.Selector.select(Selector.java:874)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:465)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:560)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:280)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:251)
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1307)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1243)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1216)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollConsumer(KafkaMessageListenerContainer.java:1676)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doPoll(KafkaMessageListenerContainer.java:1651)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1452)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1344)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(java.base@17-ea/CompletableFuture.java:1804)
    at java.lang.Thread.run(java.base@17-ea/Thread.java:831)
    

     

    stackTrace:
    java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPoll.wait(java.base@17-ea/Native Method)
    at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@17-ea/EPollSelectorImpl.java:118)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@17-ea/SelectorImpl.java:129)
    - locked <0x0000000741bffc98> (a sun.nio.ch.Util$2)
    - locked <0x0000000741bffc48> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(java.base@17-ea/SelectorImpl.java:141)
    at org.apache.kafka.common.network.Selector.select(Selector.java:874)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:465)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:560)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:280)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:251)
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1307)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1243)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1216)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollConsumer(KafkaMessageListenerContainer.java:1676)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doPoll(KafkaMessageListenerContainer.java:1651)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1452)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1344)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(java.base@17-ea/CompletableFuture.java:1804)
    at java.lang.Thread.run(java.base@17-ea/Thread.java:831)
    

     

    stackTrace:
    java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPoll.wait(java.base@17-ea/Native Method)
    at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@17-ea/EPollSelectorImpl.java:118)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@17-ea/SelectorImpl.java:129)
    - locked <0x00000007454e1308> (a sun.nio.ch.Util$2)
    - locked <0x00000007454e12b8> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(java.base@17-ea/SelectorImpl.java:141)
    at org.apache.kafka.common.network.Selector.select(Selector.java:874)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:465)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:560)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:280)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:251)
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1307)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1243)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1216)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollConsumer(KafkaMessageListenerContainer.java:1676)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doPoll(KafkaMessageListenerContainer.java:1651)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1452)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1344)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(java.base@17-ea/CompletableFuture.java:1804)
    at java.lang.Thread.run(java.base@17-ea/Thread.java:831)
    

     

    stackTrace:
    java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPoll.wait(java.base@17-ea/Native Method)
    at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@17-ea/EPollSelectorImpl.java:118)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@17-ea/SelectorImpl.java:129)
    - locked <0x0000000745529a28> (a sun.nio.ch.Util$2)
    - locked <0x00000007455299d8> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(java.base@17-ea/SelectorImpl.java:141)
    at org.apache.kafka.common.network.Selector.select(Selector.java:874)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:465)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:560)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:280)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:251)
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1307)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1243)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1216)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollConsumer(KafkaMessageListenerContainer.java:1676)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doPoll(KafkaMessageListenerContainer.java:1651)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1452)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1344)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(java.base@17-ea/CompletableFuture.java:1804)
    at java.lang.Thread.run(java.base@17-ea/Thread.java:831)

     

    stackTrace:
    java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPoll.wait(java.base@17-ea/Native Method)
    at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@17-ea/EPollSelectorImpl.java:118)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@17-ea/SelectorImpl.java:129)
    - locked <0x000000074447a9c0> (a sun.nio.ch.Util$2)
    - locked <0x000000074447a970> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(java.base@17-ea/SelectorImpl.java:141)
    at org.apache.kafka.common.network.Selector.select(Selector.java:874)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:465)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:560)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:280)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:251)
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1307)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1243)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1216)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollConsumer(KafkaMessageListenerContainer.java:1676)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doPoll(KafkaMessageListenerContainer.java:1651)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1452)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1344)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(java.base@17-ea/CompletableFuture.java:1804)
    at java.lang.Thread.run(java.base@17-ea/Thread.java:831)
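
    All of the traces above end in a selector call (EPoll.wait on Linux). As a rough check of how the JVM reports such a thread, the sketch below parks a thread in Selector.select() with no channels registered, so it has nothing to do, and then reads its state. The class and thread names are hypothetical, and the 200 ms pause is just an assumed grace period for the thread to reach select().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

// Hypothetical demo class: an idle selector thread and the state the JVM reports for it.
class IdleSelectorDemo {

    static Thread.State stateWhileSelecting() {
        try (Selector selector = Selector.open()) {
            Thread poller = new Thread(() -> {
                try {
                    selector.select();      // blocks until wakeup(); no channels are registered
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }, "demo-poller");

            poller.start();
            Thread.sleep(200);              // assumed to be long enough to reach select()
            Thread.State observed = poller.getState();

            selector.wakeup();              // let select() return so the thread can exit
            poller.join();
            return observed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("state while idle in select(): " + stateWhileSelecting());
    }
}
```

    On HotSpot this should report RUNNABLE even though the thread is doing nothing, which is why a high count of RUNNABLE threads in EPoll.wait is not by itself a sign of CPU load or lock contention.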
    

     

    Since more than 90% of the threads in the ThreadPoolTaskScheduler pool are not doing any work, you can consider resizing that thread pool.
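
    Before resizing a pool, it can be useful to confirm the idle share from inside the running JVM. The sketch below is one possible way to do that with the standard library only: it counts live threads by state for a given thread-name prefix. The prefix "demo-scheduler-" and the three demo workers are assumptions for illustration; in the real application you would pass whatever prefix your scheduler threads actually use.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper: counts live threads per state for one thread-name prefix.
class PoolStateCount {

    static Map<Thread.State, Integer> countByState(String namePrefix) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().startsWith(namePrefix)) {
                counts.merge(t.getState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Starts three idle workers (a stand-in for an oversized pool) and counts them.
    static Map<Thread.State, Integer> demo() {
        CountDownLatch release = new CountDownLatch(1);
        Thread[] workers = new Thread[3];
        try {
            for (int i = 0; i < workers.length; i++) {
                workers[i] = new Thread(() -> {
                    try {
                        release.await();    // idle: parked until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }, "demo-scheduler-" + i);
                workers[i].start();
            }
            for (Thread w : workers) {
                while (w.getState() != Thread.State.WAITING) {
                    Thread.sleep(1);        // wait until every worker is parked
                }
            }
            Map<Thread.State, Integer> counts = countByState("demo-scheduler-");
            release.countDown();
            for (Thread w : workers) {
                w.join();
            }
            return counts;
        } catch (InterruptedException e) {
            release.countDown();
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    A pool in which nearly all threads sit in WAITING or TIMED_WAITING across several samples is a candidate for a smaller size. Take the samples under representative load before changing anything.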
