Hi Yi,

Does a single task consume from a single partition or it consumes from more/all 
partitions?

Thanks
Bruno

> On 14 Sep 2015, at 23:22, Yi Pan <nickpa...@gmail.com> wrote:
> 
> Hi, Bruno,
> 
> The number of containers are configurable in YarnJobFactory via
> yarn.container.count.
> Each container is a single threaded model and you can run multiple tasks in
> a single container.
> At maximum, you can have as many containers as the number of tasks in this
> config to achieve 1 task / thread.
> 
> Hope that clarifies the config a bit more for you.
> 
> Thanks!
> 
> -Yi
> 
> On Mon, Sep 14, 2015 at 3:16 PM, Bruno Bonacci <bruno.bona...@gmail.com>
> wrote:
> 
>> Thanks Yan for writing me back,
>> 
>> That's ok for ThreadJobFactory and ProcessJobFactory but what about the
>> YarnJobFactory?
>> How many task/executors will be spawning?
>> 
>> 
>> Bruno
>> 
>>> On Mon, Sep 14, 2015 at 7:08 PM, Yan Fang <yanfang...@gmail.com> wrote:
>>> 
>>> Hi Bruno,
>>> 
>>> AFAIK, there is no existing JobFactory that brings as many threads as the
>>> partition number. But I think nothing stops you to implement this: you
>> can
>>> get the partition information from the JobCoordinator, and then bring as
>>> many threads as the partition/task number.
>>> 
>>> Since the two local factories (ThreadJobFactory and ProcessJobFactory)
>> are
>>> mainly for development, there is no additional document. But most of the
>>> code here
>>> <
>> https://github.com/apache/samza/tree/master/samza-core/src/main/scala/org/apache/samza/job/local
>>> is
>>> self-explained.
>>> 
>>> Thanks,
>>> 
>>> Fang, Yan
>>> yanfang...@gmail.com
>>> 
>>> On Sat, Sep 12, 2015 at 1:47 PM, Bruno Bonacci <bruno.bona...@gmail.com>
>>> wrote:
>>> 
>>>> Hi,
>>>> I'm looking for additional documentation on the different RUNTIME
>>>> EXECUTION MODELS of the different `job.factory.class`.
>>>> 
>>>> I'm particularly interested on how each factory (ThreadJobFactory,
>>>> ProcessJobFactory and YarnJobFactory) will create tasks consume and
>>> process
>>>> messages out of Kafka and the thread model used.
>>>> 
>>>> I did a few tests with the ThreadJob factory consuming out of a kafka
>>>> topic with 5 partitions and I was expecting that it would use multiple
>>>> threads to consume/process the different partitions, however it is
>>>> using only one thread at runtime.
>>>> 
>>>> Is there any way to tell Samza to use multiple processing threads (1
>> per
>>>> partition)??
>>>> 
>>>> 
>>>> Thanks
>>>> Bruno
>> 

Reply via email to