This might be obvious, but just to check anyway: did you confirm whether
all of the messages have already been consumed by Spark? If so, I
wouldn't expect much to happen until new data arrives in your Kafka
topic.
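
One quick way to check that from the app itself (a minimal sketch,
assuming a Scala job where `query` is the StreamingQuery handle returned
by writeStream...start()):

    // Inspect the most recent micro-batch's progress. If numInputRows
    // stays at 0 and the per-source endOffset stops advancing, Spark
    // simply sees no new data on the topic.
    val progress = query.lastProgress
    if (progress != null) {
      println(progress.json)  // per-source startOffset/endOffset details
      println(s"rows in last batch: ${progress.numInputRows}")
    }

If the end offsets keep up with Kafka's latest offsets, the query is
idle rather than stuck.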

If you're absolutely sure there's still plenty of data left for Spark to
consume and it isn't being processed, then I would suggest generating
Java thread dumps from the driver process (using the JDK's jstack
command) to see where it is blocked.
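
For example (a sketch; the PID placeholder and output file name are
mine, and it assumes shell access to the node where the driver runs):

    jps -lm                                   # list running JVMs to find the driver's PID
    jstack <driver-pid> > driver-threads.txt  # dump all thread stacks to a file

Taking two or three dumps a few seconds apart makes it much easier to
tell whether a thread is genuinely blocked or just busy.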

On Sat, Apr 18, 2020 at 2:43 PM Sean Owen <sro...@gmail.com> wrote:

> I don't think that means it's stuck on removing something; it was
> removed. Not sure what it is waiting on - more data perhaps?
>
> On Sat, Apr 18, 2020 at 2:22 PM Alchemist <alchemistsrivast...@gmail.com>
> wrote:
> >
> > I am running a simple Spark structured streaming application that is
> > pulling data from a Kafka topic with 1024 partitions. I am running
> > this app on a 6-node EMR cluster with 4 cores and 16 GB RAM. I
> > observed that Spark is trying to pull data from all 1024 Kafka
> > partitions, and after running successfully for a few iterations it
> > gets stuck; the last log output is:
> >
> > 20/04/18 00:51:41 INFO ContextCleaner: Cleaned accumulator 101
> > 20/04/18 00:51:41 INFO ContextCleaner: Cleaned accumulator 66
> > 20/04/18 00:51:41 INFO ContextCleaner: Cleaned accumulator 77
> > 20/04/18 00:51:41 INFO ContextCleaner: Cleaned accumulator 78
> >
> > 20/04/18 00:51:41 INFO BlockManagerInfo: Removed broadcast_2_piece0 on in memory (size: 4.5 KB, free: 2.7 GB)
> > 20/04/18 00:51:41 INFO BlockManagerInfo: Removed broadcast_2_piece0 on ip- in memory (size: 4.5 KB, free: 2.7 GB)
> > 20/04/18 00:51:41 INFO BlockManagerInfo: Removed broadcast_2_piece0 on ip- in memory (size: 4.5 KB, free: 2.7 GB)
> > Then Spark shows RUNNING but it is NOT processing any data.
> >
>