The checkpoint was only acknowledged shortly after it was started.
On Thu, Mar 4, 2021 at 12:38 PM Dan Hill wrote:
I dove deeper into it and made a little more progress (by giving more
resources).
Here is a screenshot of one bottleneck:
https://drive.google.com/file/d/1CIatEuIJwmKjBE9__RihVlxSilchtKS1/view
My job isn't making any progress. It's checkpointing and failing. The
taskmaster text logs are empty d
Thanks! Yes, I've looked at these. My job is facing backpressure
starting at an early join step. I'm unclear whether the backfill just
needs more time or whether I need more resources.
On Tue, Mar 2, 2021 at 12:50 AM Yun Gao wrote:
Hi Dan,
I think you can see the details of the checkpoints in the checkpoint UI [1].
Also, if the pending checkpoints show that some tasks have not taken a snapshot,
you might check whether such a task is backpressuring the upstream tasks [2].
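As a rough illustration of that check, here is a minimal sketch that lists the tasks still holding up a pending checkpoint. It assumes the checkpoint details were already fetched as JSON from the Flink REST API (for example from `/jobs/<job-id>/checkpoints/details/<checkpoint-id>`) and that each task entry carries `num_subtasks` and `num_acknowledged_subtasks`. The function name, the task names, and the sample payload are hypothetical, so please verify the field names against your Flink version.

```python
from typing import Any, Dict, List


def lagging_tasks(checkpoint_details: Dict[str, Any]) -> List[str]:
    """Describe every task that has not acknowledged all of its subtasks."""
    lagging = []
    for task_id, stats in checkpoint_details.get("tasks", {}).items():
        acked = stats.get("num_acknowledged_subtasks", 0)
        total = stats.get("num_subtasks", 0)
        if acked < total:
            lagging.append(f"{task_id}: {acked}/{total} subtasks acknowledged")
    return sorted(lagging)


# Hypothetical sample payload, shaped like a checkpoint details response.
# Real responses key the tasks by job vertex ID, not by a readable name.
sample = {
    "id": 7,
    "status": "IN_PROGRESS",
    "tasks": {
        "source": {"num_subtasks": 4, "num_acknowledged_subtasks": 4},
        "join": {"num_subtasks": 4, "num_acknowledged_subtasks": 1},
    },
}

for line in lagging_tasks(sample):
    print(line)
```

Tasks that show up here are the ones to inspect in the backpressure tab, together with the operators directly upstream of them.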
Best,
Yun
[1]
https://ci.apache.org/projects/