One of the reasons that jobs running on YARN (Spark, MR, Hive, etc.) can get stuck is a data unavailability issue with HDFS. This can arise if either the NameNode is not reachable or a particular data block is unavailable due to node failures.
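If the goal is to have the job fail fast instead of hanging, below is a rough sketch of the Spark and HDFS client settings I would look at first. Treat it as a sketch only: the property names are the standard Spark 2.x / Hadoop 2.x ones and the values are illustrative, so please verify them against your versions.

# Sketch: settings that tend to make a Spark-on-YARN job fail fast rather
# than hang when a DataNode (or the NameNode) becomes unreachable.
# Property names are standard Spark 2.x / Hadoop 2.x configs; values are
# illustrative, not recommendations. Verify against your cluster versions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("fail-fast-on-hdfs-issues")
    # Give up on a task after 4 attempts instead of retrying indefinitely.
    .config("spark.task.maxFailures", "4")
    # Upper bound on how long network operations (RPCs, shuffle fetches)
    # may block before they are treated as failed.
    .config("spark.network.timeout", "120s")
    # Stop scheduling work on executors/nodes that keep failing tasks.
    .config("spark.blacklist.enabled", "true")
    # HDFS client side: cap the DataNode read socket timeout (milliseconds)
    # so a dead DataNode is skipped in favour of another block replica.
    .config("spark.hadoop.dfs.client.socket-timeout", "60000")
    .getOrCreate()
)

Speculative execution (spark.speculation=true) can also help route around a slow DataNode, though it will not terminate the job on its own.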
Can you check if your YARN service can communicate with the NameNode service?

Akshay Bhardwaj
+91-97111-33849


On Thu, May 16, 2019 at 4:27 PM Rishi Shah <rishishah.s...@gmail.com> wrote:

> on yarn
>
> On Thu, May 16, 2019 at 1:36 AM Akshay Bhardwaj <
> akshay.bhardwaj1...@gmail.com> wrote:
>
>> Hi Rishi,
>>
>> Are you running Spark on YARN or Spark's master-slave cluster?
>>
>> Akshay Bhardwaj
>> +91-97111-33849
>>
>>
>> On Thu, May 16, 2019 at 7:15 AM Rishi Shah <rishishah.s...@gmail.com>
>> wrote:
>>
>>> Anyone, please?
>>>
>>> On Tue, May 14, 2019 at 11:51 PM Rishi Shah <rishishah.s...@gmail.com>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> At times when there is a DataNode failure, a running Spark job doesn't
>>>> fail - it gets stuck and doesn't return. Is there any setting that can
>>>> help here? I would ideally like the job to be terminated, or the
>>>> executors running on those DataNodes to fail...
>>>>
>>>> --
>>>> Regards,
>>>>
>>>> Rishi Shah
>>>
>>>
>>> --
>>> Regards,
>>>
>>> Rishi Shah
>>
>
> --
> Regards,
>
> Rishi Shah