Re: Occasional broadcast timeout when dynamic allocation is on

2019-02-26 Thread Abdeali Kothari
I've been facing this issue for the past few months too. I always thought it was an infrastructure issue, but we were never able to figure out what the infra issue was. If others are facing it too, then maybe it's a valid bug. Does anyone have any ideas on how we can debug this? On Fri,

Occasional broadcast timeout when dynamic allocation is on

2019-02-22 Thread Artem P
Hi! We have dynamic allocation enabled for our regular jobs, and sometimes they fail with java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]. It seems the Spark driver starts the broadcast before the job has received any executors from YARN, and if it takes more than
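
One way to test the hypothesis above is to rerun a failing job with a longer broadcast timeout and a non-zero executor floor, then see whether the timeouts stop. The sketch below is only an assumption about what might help; nothing in the thread confirms it as a fix. The property names are standard Spark settings (spark.sql.broadcastTimeout defaults to 300 seconds, which matches the error message), but the values and the helper to_submit_args are hypothetical and chosen purely for illustration.

```python
# Illustrative settings only: the values are assumptions, not taken from the thread.
MITIGATION_CONF = {
    # Give the broadcast more time than the 300-second default.
    "spark.sql.broadcastTimeout": "900",
    # Ask dynamic allocation to keep a small floor of executors available.
    "spark.dynamicAllocation.minExecutors": "2",
    "spark.dynamicAllocation.initialExecutors": "2",
}


def to_submit_args(conf):
    """Turn a dict of Spark properties into spark-submit --conf arguments.

    Hypothetical helper; keys are sorted so the output is deterministic.
    """
    args = []
    for key in sorted(conf):
        args.extend(["--conf", f"{key}={conf[key]}"])
    return args


if __name__ == "__main__":
    # Prints the flags to paste into a spark-submit command line.
    print(" ".join(to_submit_args(MITIGATION_CONF)))
```

If the failures disappear with these settings, that would support the idea that the broadcast is waiting on executors that have not been allocated yet. Setting spark.sql.autoBroadcastJoinThreshold to -1 is another way to check, since it disables automatic broadcast joins entirely.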