At Netflix, we disable the broadcast timeout in our defaults. I found that it never helped catch problems. With lazy evaluation, I think it is reasonable for a table that should be broadcast to take a long time to build. Just because the broadcast side is a subset or aggregation of a large table, or itself requires a join, doesn't mean that broadcasting that data isn't better for the final plan.
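As a rough sketch of what "disabling" looks like when building a session (the setup below is illustrative, and whether a negative value turns the timeout off outright depends on the Spark version):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("broadcast-timeout-defaults")
      // Default is 300 seconds; raising it effectively removes the cap.
      // Depending on the Spark version, a negative value may disable it.
      .config("spark.sql.broadcastTimeout", "36000")
      .getOrCreate()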
I'm not sure that a timeout for `sparkContext.broadcast` would be helpful either. What bad behavior would this catch? Let's just remove the timeout entirely, or disable it by default.

On Wed, Jan 30, 2019 at 9:27 AM Justin Uang <justin.u...@gmail.com> wrote:

> Hi all,
>
> We have noticed a lot of broadcast timeouts in our pipelines, and from some
> inspection, it seems that they happen when I have two threads trying to save
> two different DataFrames. We use the FIFO scheduler, so if I launch a job
> that needs all the executors, the second DataFrame's collect on the broadcast
> side is guaranteed to take longer than 5 minutes, and will throw.
>
> My question is: why do we have a timeout on a collect when broadcasting? It
> seems silly that we have a small default timeout on something that is
> influenced by contention on the cluster. We are basically saying that all
> broadcast jobs need to finish in 5 minutes, regardless of our scheduling
> policy on the cluster.
>
> I'm curious about the original intention of the broadcast timeout. Is the
> broadcast timeout perhaps really meant to be a timeout on
> sparkContext.broadcast, rather than on child.executeCollectIterator()? In
> that case, would it make sense to move the timeout to wrap only
> sparkContext.broadcast?
>
> Best,
>
> Justin

--
Ryan Blue
Software Engineer
Netflix
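For reference, a minimal sketch of the scenario Justin describes (table names, join key, and output paths here are hypothetical): two threads each write a DataFrame whose plan contains a broadcast join, so under the FIFO scheduler the second broadcast build waits behind the first job and can blow past the 5-minute default.

    import scala.concurrent.{Await, ExecutionContext, Future}
    import scala.concurrent.duration.Duration
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.broadcast

    implicit val ec: ExecutionContext = ExecutionContext.global
    val spark = SparkSession.builder().getOrCreate()

    val small = spark.table("dim_small")   // small enough to broadcast
    val factA = spark.table("fact_a")
    val factB = spark.table("fact_b")

    // Two concurrent writes; each plan builds a broadcast of `small`.
    val writes = Seq(
      Future { factA.join(broadcast(small), "id").write.parquet("/tmp/out_a") },
      Future { factB.join(broadcast(small), "id").write.parquet("/tmp/out_b") }
    )

    // If the first job holds all executors longer than spark.sql.broadcastTimeout
    // (300s by default), the second job's broadcast build times out and throws.
    Await.result(Future.sequence(writes), Duration.Inf)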