Re: CallbackServer in PySpark Streaming

2015-02-11 Thread Todd Gao
Oh I see! Thank you very much, Davies. You corrected some of my misunderstandings. On Thu, Feb 12, 2015 at 9:50 AM, Davies Liu wrote: > Yes. > > On Wed, Feb 11, 2015 at 5:44 PM, Todd Gao > wrote: > > Thanks Davies. > > I am not quite familiar with Spark Streaming. Do you mean that the > comput

Re: CallbackServer in PySpark Streaming

2015-02-11 Thread Davies Liu
Yes. On Wed, Feb 11, 2015 at 5:44 PM, Todd Gao wrote: > Thanks Davies. > I am not quite familiar with Spark Streaming. Do you mean that the compute > routine of DStream is only invoked in the driver node, > while only the compute routines of RDD are distributed to the slaves? > > On Thu, Feb 12,

Re: CallbackServer in PySpark Streaming

2015-02-11 Thread Todd Gao
Thanks Davies. I am not quite familiar with Spark Streaming. Do you mean that the compute routine of DStream is only invoked in the driver node, while only the compute routines of RDD are distributed to the slaves? On Thu, Feb 12, 2015 at 2:38 AM, Davies Liu wrote: > The CallbackServer is part o

Re: CallbackServer in PySpark Streaming

2015-02-11 Thread Davies Liu
The CallbackServer is part of Py4J; it is only used in the driver, not in slaves or workers. On Wed, Feb 11, 2015 at 12:32 AM, Todd Gao wrote: > Hi all, > > I am reading the code of PySpark and its Streaming module. > > In PySpark Streaming, when the `compute` method of the instance of > PythonTr
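
The split described in this thread (the per-batch `compute` callback runs only in the driver, while per-record functions are shipped to the workers) can be pictured with a toy model. This is only a sketch and not PySpark or Py4J code: plain Python threads stand in for the workers, the calling thread stands in for the driver, and the names `per_batch`, `per_record`, `run_batches`, and `calls` are made up for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Toy model only: threads stand in for Spark workers. Nothing here is PySpark API.
calls = []                      # (kind, thread name) pairs, recorded for inspection
calls_lock = threading.Lock()


def record(kind):
    """Remember which thread executed a given kind of function."""
    with calls_lock:
        calls.append((kind, threading.current_thread().name))


def per_record(x):
    # Stands in for a function applied to each element of an RDD:
    # it is executed by a "worker" thread from the pool.
    record("record")
    return x * 2


def per_batch(batch, pool):
    # Stands in for the per-batch callback of a transformed DStream:
    # it is executed by the "driver" (the calling thread) and only
    # hands the per-record work to the pool.
    record("batch")
    return list(pool.map(per_record, batch))


def run_batches(batches):
    # The "driver" loop: one per-batch call for each incoming batch.
    with ThreadPoolExecutor(max_workers=2, thread_name_prefix="worker") as pool:
        return [per_batch(batch, pool) for batch in batches]


driver_name = threading.current_thread().name
results = run_batches([[1, 2], [3, 4, 5]])
```

After running this, every `"batch"` entry in `calls` carries the driver thread's name and every `"record"` entry carries a `worker_*` name. That mirrors why a callback channel back into Python is needed only on the driver side in this picture.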

CallbackServer in PySpark Streaming

2015-02-11 Thread Todd Gao
Hi all, I am reading the code of PySpark and its Streaming module. In PySpark Streaming, when the `compute` method of the instance of PythonTransformedDStream is invoked, a connection to the CallbackServer is created internally. I wonder where the CallbackServer for each PythonTransformedDStre