Re: Slow restart from savepoint with large broadcast state when increasing parallelism

2022-12-16 Thread Jun Qin
ting? >> >> Thanks >> Jun >> >> Sent from my phone >> >> ---- Original message ---- >> From: Ken Krugler <mailto:kkrugler_li...@transpac.com> >> Date: Wed, 14 Dec 2022 19:32 >> To: User mailto:user@flin

Re: Slow restart from savepoint with large broadcast state when increasing parallelism

2022-12-16 Thread Ken Krugler
gler > Date: Wed, 14 Dec 2022 19:32 > To: User > Subject: Slow restart from savepoint with large broadcast state when increasing parallelism > Hi all, > > I have a job with a large amount of broadcast state (62MB). > > I took a savepoint when my workflow was running with

Re: Slow restart from savepoint with large broadcast state when increasing parallelism

2022-12-15 Thread Jun Qin
Hi Ken, Without knowing the details, the first thing I would suggest checking is whether you have reached a threshold configured in your state storage (e.g., S3), so that your further downloads were throttled. Checking your storage metrics or logs should help to confirm whether this is the

Slow restart from savepoint with large broadcast state when increasing parallelism

2022-12-14 Thread Ken Krugler
Hi all, I have a job with a large amount of broadcast state (62MB). I took a savepoint when my workflow was running with parallelism 300. I then restarted the workflow with parallelism 400. The first 297 sub-tasks restored their broadcast state fairly quickly, but after that it slowed to a cra