Re: [DISCUSS] Spark cannot identify the problem executor

2021-02-11 Thread attilapiros
Hi, There is an existing way to handle this situation. Those tasks will become zombie tasks [1] and they should not be counted as task failures [2]. Even the shuffle blocks should be unregistered for the lost executor, although the lost executor might be already cached as a map output in th

Re: [DISCUSS] Spark cannot identify the problem executor

2020-09-11 Thread Sean Owen
-dev, +user Executors do not communicate directly, so I don't think that's quite what you are seeing. You'd have to clarify. On Fri, Sep 11, 2020 at 12:08 AM 陈晓宇 wrote: > > Hello all, > > We've been using spark 2.3 with blacklist enabled and often meet the problem > that when executor A has som

[DISCUSS] Spark cannot identify the problem executor

2020-09-10 Thread 陈晓宇
Hello all, We've been using Spark 2.3 with blacklist enabled and often hit the following problem: when executor A has some problem (like a connection issue), tasks on executor B and executor C fail, saying they cannot read from executor A. Eventually the job fails because a task on executor B has failed 4 times.
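
For readers following the thread, the settings involved can be written down as a small sketch. The three keys below are Spark configuration properties available in 2.3 (`spark.task.maxFailures` defaults to 4, which matches the "failed 4 times" above). The chosen values and the helper `to_submit_args` are illustrative assumptions, and nothing in the thread establishes that enabling fetch-failure blacklisting resolves the reported behavior.

```python
# Sketch only: the keys are real Spark 2.3 properties, the values are assumptions.
BLACKLIST_CONF = {
    # Turns on executor/node blacklisting for failed tasks.
    "spark.blacklist.enabled": "true",
    # Experimental in 2.3: blacklist an executor as soon as a fetch failure
    # is reported against it (i.e. the executor that could not be read from).
    "spark.blacklist.application.fetchFailure.enabled": "true",
    # Number of failures of a single task before the job is aborted (default 4).
    "spark.task.maxFailures": "4",
}


def to_submit_args(conf):
    """Hypothetical helper: render a dict as `--conf key=value` arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Prints the arguments one could append to a spark-submit command line.
    print(" ".join(to_submit_args(BLACKLIST_CONF)))
```

The resulting arguments would be appended to a `spark-submit` invocation; the same keys could equally be placed in `spark-defaults.conf`.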