Never mind - I think PySpark is already doing async socket read/write,
but on the Scala side, in PythonRDD.scala.
On Sat, Feb 6, 2016 at 6:27 PM, Renyi Xiong wrote:
> Hi,
>
> Is it a good idea to have 2 threads in the PySpark worker? - a main thread
> responsible for receiving and sending data over the socket, while the other
> thread calls user functions to process the data?
>
> Since the CPU is idle (?) during network I/O, this should improve
> concurrency quite a bit.
>
> Can an expert answer this?
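The two-thread idea in the question can be sketched in plain Python with a reader thread feeding a bounded queue that the main thread drains. This is only an illustration of the pattern, not PySpark's actual worker code; `run_worker` and its parameters are hypothetical names, and the reader loop stands in for the real socket reads. Note that under CPython's GIL this only helps when one thread is blocked on I/O (as socket reads are), which is exactly the scenario described.

```python
import threading
import queue

# Sentinel object used to signal end of stream from reader to consumer.
SENTINEL = object()

def run_worker(records, user_func, maxsize=16):
    """Hypothetical sketch: overlap 'receiving' records with applying
    user_func, using a reader thread and a bounded queue for handoff."""
    q = queue.Queue(maxsize=maxsize)  # bounded, so the reader can't run far ahead

    def reader():
        # Stands in for the socket-read loop of a real worker.
        for rec in records:
            q.put(rec)          # blocks when the queue is full (backpressure)
        q.put(SENTINEL)         # tell the consumer the stream is done

    t = threading.Thread(target=reader, daemon=True)
    t.start()

    results = []
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        results.append(user_func(item))  # "compute" overlaps with reading
    t.join()
    return results

# Example: square each record as it arrives.
out = run_worker(range(5), lambda x: x * x)
print(out)  # [0, 1, 4, 9, 16]
```

The bounded queue gives backpressure: if user functions are slow, the reader blocks on `q.put` instead of buffering unboundedly, which is the usual design choice for this producer/consumer split.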