Hi,
There is an existing way to handle this situation. Those tasks will become
zombie tasks [1] and they should not be counted toward the task failures
[2]. The shuffle blocks should also be unregistered for the lost executor,
although the lost executor might already be cached as a mapoutput in th
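
For illustration, here is a minimal Scala sketch of the relevant knobs (my
own example, not taken from [1] or [2]; the config keys are standard Spark
2.3 settings, the values and app name are placeholders, not a recommendation):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("fetch-failure-handling-sketch")
      // Blacklisting based on ordinary task failures (already on in the report below).
      .config("spark.blacklist.enabled", "true")
      // Blacklist the executor/host for the whole application when it causes fetch failures.
      .config("spark.blacklist.application.fetchFailure.enabled", "true")
      // On a fetch failure, unregister all shuffle output on that host, not just the one
      // executor, so the map output is recomputed instead of being re-read from a dead executor.
      .config("spark.files.fetchFailure.unRegisterOutputOnHost", "true")
      .getOrCreate()
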
-dev, +user
Executors do not communicate directly, so I don't think that's quite
what you are seeing. You'd have to clarify.
On Fri, Sep 11, 2020 at 12:08 AM 陈晓宇 wrote:
Hello all,
We've been using Spark 2.3 with the blacklist feature enabled and often run
into the problem that when executor A has some problem (like a connection
issue), tasks on executor B and executor C will fail saying they cannot read
from executor A. In the end the job fails because a task on executor B has
failed 4 times.
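
For context, "failed 4 times" matches the default task-failure limit. A
minimal sketch of the two limits at play here (my own illustration, not the
actual configuration from this job; the values shown are the Spark defaults):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("failure-limit-sketch") // placeholder name
      // A job is aborted once any single task has failed this many times (default 4),
      // which is what turns repeated read failures on executors B and C into a job failure.
      .set("spark.task.maxFailures", "4")
      // How many consecutive attempts of a stage are allowed (default 4) before it is
      // aborted, e.g. when fetch failures force the stage to be resubmitted.
      .set("spark.stage.maxConsecutiveAttempts", "4")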