I found something: https://issues.apache.org/jira/browse/SPARK-45101. I tried it in a lower
environment but couldn't reproduce the issue there, although there is a version mismatch: the
lower environment runs Spark 3.2.3 while production runs *3.2.0*.
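On the heap dump Ángel suggests below: a minimal sketch of one way to capture executor heap
dumps automatically on a Java OOM (an assumption on my part that this is workable here; the
dump directory is a placeholder and must exist and be writable on every executor host, and the
same JVM flags can equally be passed as --conf at submit time):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("solace-to-kafka")
  // Ask each executor JVM to write a heap dump if it ever dies with an OutOfMemoryError.
  // The dump path below is a placeholder.
  .config("spark.executor.extraJavaOptions",
    "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/tmp/spark-heapdumps")
  .getOrCreate()

For on-demand dumps while the two tasks are stuck (rather than waiting for an OOM), the plain
JDK tools against the executor PID should also work: jstack for a thread dump and jmap for a
heap dump.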
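And for the isolated reproduction Ángel asks about below, the job is shaped roughly like this
(heavily simplified sketch; the Solace source format and its option names are placeholders for
whatever the actual connector uses, and in practice the resource settings are passed via
spark-submit — driver memory in particular has to be set before the driver JVM starts):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

val spark = SparkSession.builder()
  .appName("solace-to-kafka")
  // Resources as currently deployed (normally set via spark-submit / --conf).
  .config("spark.executor.instances", "15")
  .config("spark.executor.cores", "5")
  .config("spark.executor.memory", "10g")
  .config("spark.driver.memory", "15g")
  .getOrCreate()

// Source: placeholder format/option names -- the real ones depend on the Solace connector in use.
val input = spark.readStream
  .format("solace")                                   // placeholder
  .option("host", "tcp://solace-host:55555")          // placeholder
  .option("queue", "input-queue")                     // placeholder
  .load()

// Sink: the standard Spark Kafka sink expects a string/binary "value" column;
// the payload column name on the Solace side is a placeholder as well.
val query = input
  .selectExpr("CAST(payload AS STRING) AS value")     // placeholder column name
  .writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker-1:9092") // placeholder
  .option("topic", "output-topic")                    // placeholder
  .option("checkpointLocation", "hdfs:///checkpoints/solace-to-kafka") // placeholder
  .trigger(Trigger.ProcessingTime("1 minute"))        // the 1-minute trigger mentioned below
  .start()

query.awaitTermination()

On backpressure: since the source is Solace rather than Kafka, any per-trigger rate limiting
would have to come from the connector's own options (the Kafka source's maxOffsetsPerTrigger
doesn't apply here), so I'd need to check what the connector exposes.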
Thanks & Regards,
Nayan Sharma
*+91-8095382952*

<https://www.linkedin.com/in/nayan-sharma>
<http://stackoverflow.com/users/3687426/nayan-sharma?tab=profile>

On Tue, Apr 8, 2025 at 5:13 PM Ángel Álvarez Pascua <angel.alvarez.pas...@gmail.com> wrote:

> Hi Nayan,
>
> Your issue is quite interesting indeed. I'm not a super expert on Spark Structured Streaming,
> but I faced a similar issue last year, and I still have it pending to look into on Databricks
> and write an article about. Do you, by any chance, end up having native OOMs? Have you
> analyzed/played with the backpressure configuration?
>
> As you said, the thread dumps don't give much info, but a heap dump could be much more
> insightful.
>
> Have you tried to reproduce this issue in a simpler, isolated testing environment? I'd be
> super glad to help ... I'm passionate+stubborn about this kind of mystery ...
>
> Regards,
> Ángel.
>
> On Mon, 7 Apr 2025 at 11:26, nayan sharma (<nayansharm...@gmail.com>) wrote:
>
>> I took thread dumps from 3 nodes but couldn't figure out anything from them. It took almost
>> 25 hours before the job started showing the stuck-active-task behaviour.
>>
>> We have set the trigger to 1 minute. A batch that completes successfully has 32 tasks in it,
>> but the batch that is stuck has 34 tasks in total: 32 show as completed and 2 as active.
>>
>> Thanks & Regards,
>> Nayan Sharma
>> *+91-8095382952*
>>
>> <https://www.linkedin.com/in/nayan-sharma>
>> <http://stackoverflow.com/users/3687426/nayan-sharma?tab=profile>
>>
>> On Sun, Apr 6, 2025 at 2:21 PM Shay Elbaz <sel...@paypal.com> wrote:
>>
>>> Hi Nayan,
>>>
>>> Did you try to take a thread dump while the jobs were accumulating?
>>>
>>> Shay
>>>
>>> *From: *nayan sharma <nayansharm...@gmail.com>
>>> *Date: *Sunday, 6 April 2025 at 6:23
>>> *To: *user.spark <user@spark.apache.org>
>>> *Subject: *High count of Active Jobs
>>>
>>> Hi,
>>>
>>> I am facing an issue with a high number of active jobs showing in the UI. I am using Spark
>>> Structured Streaming to read data from Solace and write it back to Kafka.
>>>
>>> After 24-28 hours, active jobs start accumulating, and it happens at random intervals after
>>> that point. I can also see more than 107817 active tasks, and the job has been up for 58
>>> hours.
>>>
>>> Spark config:
>>> executors: 15
>>> executor cores: 5
>>> executor memory: 10g
>>> driver memory: 15g
>>>
>>> The main flow of the job is working absolutely fine: I can see data getting populated in
>>> Kafka, and every one-minute batch is stored on HDFS. There are no issues with the main flow.
>>>
>>> I did a lot of research but had no luck. Some say GC is not working, or that the Spark task
>>> scheduler is not in sync. Has anybody faced such an issue in the past, or can anyone guide
>>> me on where to look?
>>>
>>> Thanks & Regards,
>>> Nayan Sharma
>>> *+91-8095382952*
>>>
>>> <https://www.linkedin.com/in/nayan-sharma>
>>> <http://stackoverflow.com/users/3687426/nayan-sharma?tab=profile>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org