Hi Praveen,

Thank you for the answer. That's interesting, because when I bring up only
one executor for the Spark Streaming application, the log and UI show that
only the receiver is working and no other tasks are running. Maybe that's
just because the receiving task eats all the resources, not because one
executor can only run one receiver?
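For reference, here is a rough sketch (in Scala, untested, host/port and
batch interval purely illustrative) of the pattern I understand the docs to
suggest: run several input DStreams to get several receivers, and make sure
the application has more cores than receivers so processing tasks are not
starved:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch only: each receiver occupies one core for its whole lifetime, so
// the total cores given to the application must exceed the number of
// receivers, or no cores remain for map/reduce tasks (the symptom I saw
// with a single one-core executor).
val conf = new SparkConf().setAppName("receiver-parallelism-sketch")
val ssc = new StreamingContext(conf, Seconds(2))

// Several input DStreams -> several receivers, one core each.
val streams = (1 to 3).map(_ => ssc.socketTextStream("some-host", 9999))

// Union them so downstream processing sees a single DStream.
val unified = ssc.union(streams)
unified.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()

ssc.start()
ssc.awaitTermination()
```

(With 3 receivers here, the application would need at least 4 cores in
total, e.g. via `--num-executors` / `--executor-cores` on YARN.)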

Fang, Yan
yanfang...@gmail.com
+1 (206) 849-4108


On Fri, Jul 11, 2014 at 6:06 AM, Praveen Seluka <psel...@qubole.com> wrote:

> Here are my answers, but I'm just getting started with Spark Streaming, so
> please correct me if I'm wrong.
> 1) Yes.
> 2) Receivers run on executors. It's actually a job that's submitted, where
> the number of tasks equals the number of receivers. An executor can run
> more than one task at the same time, so you could have more receivers than
> executors, but I don't think it's recommended.
> 3) As said in 2, the executor where the receiver task is running can also
> be used for map/reduce tasks. In yarn-cluster mode, the driver program runs
> as the application master (it lives in the first container that's launched),
> and this is not an executor, so it is not used for other operations.
> 4) The driver runs in a separate container. I think the same executor can
> be used for both the receiver and the processing tasks (I'm not very sure
> about this part).
>
>
> On Fri, Jul 11, 2014 at 12:29 AM, Yan Fang <yanfang...@gmail.com> wrote:
>
>> Hi all,
>>
>> I am working to improve the parallelism of my Spark Streaming
>> application, but I have trouble understanding how the executors are used
>> and how the application is distributed.
>>
>> 1. In YARN, is one executor equal to one container?
>>
>> 2. I saw the statement that a streaming receiver runs on one worker
>> machine (*"note that each input DStream creates a single receiver (running
>> on a worker machine) that receives a single stream of data"*). Does the
>> "worker machine" mean an executor or a physical machine? If I have more
>> receivers than executors, will it still work?
>>
>> 3. Is the executor that holds the receiver also used for other
>> operations, such as map and reduce, or is it fully occupied by the
>> receiver? Similarly, if I run in yarn-cluster mode, is the executor
>> running the driver program used by other operations too?
>>
>> 4. So if I have a driver program (cluster mode) and a streaming receiver,
>> do I have to have at least 2 executors, because the driver program and the
>> streaming receiver have to be on different executors?
>>
>> Thank you. Sorry for having so many questions, but I do want to
>> understand how Spark Streaming is distributed in order to assign
>> reasonable resources. Thank you again.
>>
>> Best,
>>
>> Fang, Yan
>> yanfang...@gmail.com
>> +1 (206) 849-4108
>>
>
>