This might be obvious, but just checking anyway: did you confirm whether all of the messages have already been consumed by Spark? If that's the case, then I wouldn't expect much to happen until new data arrives in your Kafka topic.
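One quick way to check is to ask the query handle for its latest progress and look at the row counts and offsets. A minimal sketch, assuming "query" is the StreamingQuery returned by your writeStream...start() call (the variable name is just a placeholder):

    val p = query.lastProgress            // null until the first micro-batch finishes
    if (p != null) {
      println(s"rows in last micro-batch: ${p.numInputRows}")
      p.sources.foreach { s =>
        // JSON maps of topic -> partition -> offset
        println(s"start offsets: ${s.startOffset}")
        println(s"end offsets:   ${s.endOffset}")
      }
    }

If numInputRows stays at 0 and the end offsets stop advancing batch after batch, Spark has simply caught up with the topic.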
If you're a hundred percent sure that there's still plenty of data left for Spark to consume and it still isn't processing, then I would suggest generating Java thread dumps of your driver process (using the JDK's jstack command) to see what the driver is actually blocked on; there's a quick sketch below the quoted thread.

On Sat, Apr 18, 2020 at 2:43 PM Sean Owen <sro...@gmail.com> wrote:
> I don't think that means it's stuck on removing something; it was
> removed. Not sure what it is waiting on - more data perhaps?
>
> On Sat, Apr 18, 2020 at 2:22 PM Alchemist <alchemistsrivast...@gmail.com>
> wrote:
> >
> > I am running a simple Spark structured streaming application that is
> > pulling data from a Kafka topic. I have a Kafka topic with nearly 1000
> > partitions. I am running this app on a 6-node EMR cluster with 4 cores
> > and 16 GB RAM per node. I observed that Spark is trying to pull data
> > from all 1024 Kafka partitions, and after running successfully for a
> > few iterations it gets stuck, with the following as its last log output:
> >
> > 20/04/18 00:51:41 INFO ContextCleaner: Cleaned accumulator 101
> > 20/04/18 00:51:41 INFO ContextCleaner: Cleaned accumulator 66
> > 20/04/18 00:51:41 INFO ContextCleaner: Cleaned accumulator 77
> > 20/04/18 00:51:41 INFO ContextCleaner: Cleaned accumulator 78
> >
> > 20/04/18 00:51:41 INFO BlockManagerInfo: Removed broadcast_2_piece0 on
> > ip- in memory (size: 4.5 KB, free: 2.7 GB)
> > 20/04/18 00:51:41 INFO BlockManagerInfo: Removed broadcast_2_piece0 on
> > ip- in memory (size: 4.5 KB, free: 2.7 GB)
> > 20/04/18 00:51:41 INFO BlockManagerInfo: Removed broadcast_2_piece0 on
> > ip- in memory (size: 4.5 KB, free: 2.7 GB)
> >
> > Then Spark shows RUNNING but it is NOT processing any data.
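P.S. The thread-dump step I mentioned above, as a rough sketch (run it on the node hosting the driver; the PID and file name below are placeholders):

    jps -lm                                  # lists JVM processes; find the driver's PID
    jstack <driver-pid> > driver-threads.txt

Take a few dumps roughly ten seconds apart; threads that stay blocked on the same lock (or sit in a Kafka poll) across all of them are your prime suspects.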