Why join here - just add two columns to the DataFrame directly?

On Mon, May 17, 2021 at 1:04 PM Andrew Melo <andrew.m...@gmail.com> wrote:
> Anyone have ideas about the below Q?
>
> It seems to me that given that "diamond" DAG, that spark could see
> that the rows haven't been shuffled/filtered, it could do some type of
> "zip join" to push them together, but I've not been able to get a plan
> that doesn't do a hash/sort merge join
>
> Cheers
> Andrew