I have seen similar behavior in my standalone cluster. When I increased the
number of partitions, at some point all the executors (worker nodes) seemed to
start making parallel connections to the remote data store. It would be nice
if someone could point us to some references on how to properly repartition
data that Spark SQL reads from a remote data store. Thanks a lot.
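
A minimal sketch of why more partitions can mean more parallel connections,
assuming the remote store is read through something like Spark's JDBC source
with its partitionColumn / lowerBound / upperBound / numPartitions options
(the thread does not say which store is involved, so that is an assumption).
The helper name partition_predicates and the column name "id" are hypothetical,
and this is a simplified plain-Python model, not Spark's actual implementation.
Each predicate corresponds to one partition, and each partition is read by its
own task over its own connection:

```python
def partition_predicates(column, lower, upper, num_partitions):
    """Split the key range [lower, upper) into one WHERE clause per partition.

    Simplified model of a range-partitioned read: the first and last
    clauses are open-ended so rows outside the bounds are still fetched.
    """
    # With a single partition (or an empty range) one task reads everything.
    if num_partitions <= 1 or lower >= upper:
        return ["1=1"]

    # Never create more partitions than there are distinct key values.
    n = min(num_partitions, upper - lower)
    stride = (upper - lower) // n

    predicates = []
    for i in range(n):
        lo = lower + i * stride
        hi = lo + stride
        if i == 0:
            # First slice also picks up NULL keys and anything below `lower`.
            predicates.append(f"{column} < {hi} OR {column} IS NULL")
        elif i == n - 1:
            # Last slice picks up everything from `lo` upward.
            predicates.append(f"{column} >= {lo}")
        else:
            predicates.append(f"{column} >= {lo} AND {column} < {hi}")
    return predicates


if __name__ == "__main__":
    # Four partitions means four independent queries against the remote store.
    for clause in partition_predicates("id", 0, 100, 4):
        print(clause)
```

Under this model the number of partitions is also an upper bound on the number
of simultaneous connections, so it is worth sizing it against what the remote
store can handle rather than only against the number of cores in the cluster.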

zhou




> On Jul 14, 2016, at 9:18 AM, Jakub Stransky <stransky...@gmail.com> wrote:
> 
> <image.png>


