Re: CallbackServer in PySpark Streaming

Davies Liu Wed, 11 Feb 2015 17:52:17 -0800

Yes.

On Wed, Feb 11, 2015 at 5:44 PM, Todd Gao <[email protected]> wrote:
> Thanks Davies.
> I am not quite familiar with Spark Streaming. Do you mean that the compute
> routine of DStream is only invoked in the driver node,
> while only the compute routines of RDD are distributed to the slaves?
>
> On Thu, Feb 12, 2015 at 2:38 AM, Davies Liu <[email protected]> wrote:
>>
>> The CallbackServer is part of Py4J. It is used only in the driver, not in
>> slaves or workers.
>>
>> On Wed, Feb 11, 2015 at 12:32 AM, Todd Gao
>> <[email protected]> wrote:
>> > Hi all,
>> >
>> > I am reading the code of PySpark and its Streaming module.
>> >
>> > In PySpark Streaming, when the `compute` method of a
>> > PythonTransformedDStream instance is invoked, a connection to the
>> > CallbackServer is created internally.
>> > I wonder where the CallbackServer for each PythonTransformedDStream
>> > instance lives on the slave nodes in a distributed environment.
>> > Is there a CallbackServer running on every slave node?
>> >
>> > thanks
>> > Todd
>
>
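
To make the distinction in this thread concrete, here is a toy model in plain
Python. It does not use Spark or Py4J, and every name in it (ToyRDD, per_batch,
double, run_batch, calls) is hypothetical. It only sketches the idea confirmed
above: the per-batch function of a DStream runs once per batch in the driver,
while the per-record functions of the resulting RDD run on the workers.

```python
"""Toy model (plain Python, NOT Spark) of where PySpark Streaming code runs."""

calls = []  # log of (location, function name) pairs, for illustration only


class ToyRDD:
    """Hypothetical stand-in for an RDD: partitions plus per-record functions."""

    def __init__(self, partitions, funcs=()):
        self.partitions = partitions
        self.funcs = tuple(funcs)

    def map(self, func):
        # Lazy, in the spirit of RDD.map: only remembers the function.
        return ToyRDD(self.partitions, self.funcs + (func,))

    def collect(self):
        # Pretend each partition is processed by a separate worker.
        out = []
        for worker_id, part in enumerate(self.partitions):
            for record in part:
                for func in self.funcs:
                    calls.append(("worker-%d" % worker_id, func.__name__))
                    record = func(record)
                out.append(record)
        return out


def double(x):
    # Per-record function: in this model it runs on the "workers".
    return x * 2


def per_batch(rdd):
    # Per-batch function: in this model it runs once per batch in the
    # "driver", which is where a callback from the JVM would arrive.
    calls.append(("driver", "per_batch"))
    return rdd.map(double)


def run_batch(partitions):
    # Driver side: build the transformed RDD, then trigger the work.
    transformed = per_batch(ToyRDD(partitions))
    return transformed.collect()


result = run_batch([[1, 2], [3]])
```

After one batch with two partitions, `calls` holds a single "driver" entry for
`per_batch` and one "worker-N" entry per record for `double`, so in this model
nothing on the worker side ever needs to reach a callback server.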

