I have two DataFrames. DataFrame A contains one element per partition, and each element is gigabytes in size (an index).
DataFrame B is made up of millions of small rows. I want to join B to A, but I want all the work to be done on the executors that hold the partitions of DataFrame A. Is there a way to accomplish this without putting DataFrame B in a broadcast variable or doing a broadcast join?
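
To make the intent concrete, here is a rough plain-Python sketch (not Spark code) of the data movement I'm after: the large A partitions stay where they are, and only the small B rows are routed to the partition that owns their key. The names `a_partitions`, `b_rows`, `route_b_to_a`, and `join_in_place` are made up for illustration, and I'm assuming each B row carries a join key that belongs to exactly one A partition.

```python
from collections import defaultdict

# Hypothetical stand-in for DataFrame A: one large index object per partition.
# Each "index" here is just a dict from join key to a payload.
a_partitions = {
    0: {"k1": "index-entry-1", "k2": "index-entry-2"},
    1: {"k3": "index-entry-3"},
}

# Hypothetical stand-in for DataFrame B: many small (key, value) rows.
b_rows = [("k1", 10), ("k3", 30), ("k2", 20), ("k1", 11)]


def route_b_to_a(a_parts, rows):
    """Group B rows by the A partition that owns their key; A never moves."""
    # Assumption: every key lives in exactly one A partition.
    owner = {key: pid for pid, index in a_parts.items() for key in index}
    routed = defaultdict(list)
    for key, value in rows:
        if key in owner:  # rows without a match are dropped (inner join)
            routed[owner[key]].append((key, value))
    return routed


def join_in_place(a_parts, rows):
    """Do the lookup on each A partition using only the B rows sent to it."""
    routed = route_b_to_a(a_parts, rows)
    return {
        pid: [(key, value, a_parts[pid][key]) for key, value in routed.get(pid, [])]
        for pid in a_parts
    }


result = join_in_place(a_partitions, b_rows)
```

In this sketch only the B rows are moved; the question is whether Spark can be made to behave the same way without broadcasting B.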