Hi Nayan,

No worries about the logs. If speculation is enabled, that could explain everything. I understand it's your production environment, but would it be possible to disable speculation temporarily and see whether the issue persists?
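For reference, one way to disable it for a single run without touching cluster defaults (a sketch; `spark.speculation` is the standard Spark property name, but adapt the launch command to however you submit jobs):

```shell
# Override speculation for this job only; cluster-wide defaults
# in spark-defaults.conf remain untouched.
spark-submit \
  --conf spark.speculation=false \
  your_job_arguments_here   # placeholder for your usual submit arguments
```

Setting it per-job keeps the change easy to revert if throughput suffers without speculative retries.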
Thanks,
Ángel

PS: I've started writing an article about this interesting issue. Thanks again for reporting it.

On Mon, Apr 14, 2025 at 2:55 AM, nayan sharma (<nayansharm...@gmail.com>) wrote:

> Hi Ángel,
> Yes, speculation is enabled. I will lower the Log4j logging level and share the logs. It will take at least 24 hours before we can capture anything.
>
> Thanks,
> Nayan
>
> Thanks & Regards,
> Nayan Sharma
> *+91-8095382952*
>
> <https://www.linkedin.com/in/nayan-sharma>
> <http://stackoverflow.com/users/3687426/nayan-sharma?tab=profile>
>
> On Mon, 14 Apr 2025 at 1:28 AM, Ángel Álvarez Pascua, <angel.alvarez.pas...@gmail.com> wrote:
>
>> Hi Nayan,
>>
>> Do you happen to have spark.speculation enabled?
>> Could you also lower the Log4j logging level to DEBUG and share the logs with me?
>>
>> I haven't been able to reproduce the issue—every time a task fails and is retried, Spark correctly marks the stage as complete.
>>
>> Thanks,
>> Ángel
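For the DEBUG-logging request in the quoted message, a minimal sketch (assuming Spark 3.3+, which reads `conf/log4j2.properties`; older releases use Log4j 1.x syntax in `conf/log4j.properties` instead):

```
# conf/log4j2.properties — raise the root logger to DEBUG so task/stage
# scheduling details (including speculative attempts) appear in the logs.
rootLogger.level = debug
rootLogger.appenderRef.stdout.ref = console
```

DEBUG output grows quickly on a busy production cluster, so it's worth confirming log rotation is in place before leaving this on for a 24-hour capture window.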