Many thanks Yi

Bye
Bruno

> On 14 Sep 2015, at 23:49, Yi Pan <nickpa...@gmail.com> wrote:
> 
> Hi, Bruno,
> 
> The number of partitions consumed by a single task is also configurable via
> the partition assignment policies (job.systemstreampartition.grouper.factory).
> By default, there are two partition assignment policies implemented:
> org.apache.samza.container.grouper.stream.GroupByPartitionFactory and
> org.apache.samza.container.grouper.stream.GroupBySystemStreamPartitionFactory.
> The detailed explanation is available here:
> http://samza.apache.org/learn/documentation/0.9/jobs/configuration-table.html
> 
> Thanks!
> 
> -Yi
> 
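As a rough illustration of the difference between the two policies named above, here is a minimal Python sketch. It is not Samza code: the function names, the tuple representation of a system/stream/partition, the stream names and the task labels are all assumptions made for illustration. It assumes that GroupByPartition puts the same-numbered partitions of every input stream into one task, while GroupBySystemStreamPartition gives every stream partition its own task.

```python
from collections import defaultdict

# A system/stream/partition is modelled as a plain tuple: (system, stream, partition).
# All names below are hypothetical; they only mimic the grouping idea.

def group_by_partition(ssps):
    """One task per partition number, shared by all input streams."""
    tasks = defaultdict(list)
    for system, stream, partition in ssps:
        tasks[f"Partition {partition}"].append((system, stream, partition))
    return dict(tasks)

def group_by_system_stream_partition(ssps):
    """One task per individual stream partition."""
    return {
        f"{system}.{stream}.{partition}": [(system, stream, partition)]
        for system, stream, partition in ssps
    }

# Two hypothetical Kafka topics with two partitions each.
example_ssps = [
    ("kafka", stream, partition)
    for stream in ("orders", "payments")
    for partition in range(2)
]

by_partition = group_by_partition(example_ssps)
by_ssp = group_by_system_stream_partition(example_ssps)

print(len(by_partition))  # number of tasks when grouping by partition number
print(len(by_ssp))        # number of tasks when grouping by stream partition
```

With a single input topic both policies yield the same number of tasks; they only differ once a job reads from more than one stream.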
>> On Mon, Sep 14, 2015 at 3:37 PM, <bruno.bona...@gmail.com> wrote:
>> 
>> Hi Yi,
>> 
>> Does a single task consume from a single partition, or does it consume from
>> multiple/all partitions?
>> 
>> Thanks
>> Bruno
>> 
>>> On 14 Sep 2015, at 23:22, Yi Pan <nickpa...@gmail.com> wrote:
>>> 
>>> Hi, Bruno,
>>> 
>>> The number of containers is configurable in YarnJobFactory via
>>> yarn.container.count.
>>> Each container is single-threaded, and you can run multiple tasks in
>>> a single container.
>>> At most, you can have as many containers as there are tasks in the job,
>>> which gives you 1 task / thread.
>>> 
>>> Hope that clarifies the config a bit more for you.
>>> 
>>> Thanks!
>>> 
>>> -Yi
>>> 
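To make the container/task relationship described above concrete, here is a minimal Python sketch. It is not Samza code: the function name and the round-robin placement are assumptions for illustration, and the real placement logic may differ. It only shows that each single-threaded container runs one or more tasks, and that setting the container count equal to the task count leaves one task per container (and so one task per thread).

```python
def assign_tasks_to_containers(task_names, container_count):
    """Spread tasks over containers round-robin (illustrative assumption)."""
    if container_count < 1:
        raise ValueError("container_count must be at least 1")
    if container_count > len(task_names):
        # Mirrors the statement above: at most one container per task.
        raise ValueError("container_count cannot exceed the number of tasks")
    containers = [[] for _ in range(container_count)]
    for index, task in enumerate(sorted(task_names)):
        containers[index % container_count].append(task)
    return containers

# Five hypothetical tasks, e.g. one per partition of a 5-partition topic.
tasks = [f"Partition {i}" for i in range(5)]

# Two containers: each single-threaded container handles several tasks.
for container_id, assigned in enumerate(assign_tasks_to_containers(tasks, 2)):
    print(container_id, assigned)

# Five containers: one task per container, i.e. one task per thread.
for container_id, assigned in enumerate(assign_tasks_to_containers(tasks, 5)):
    print(container_id, assigned)
```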
>>> On Mon, Sep 14, 2015 at 3:16 PM, Bruno Bonacci <bruno.bona...@gmail.com>
>>> wrote:
>>> 
>>>> Thanks Yan for writing me back,
>>>> 
>>>> That's OK for ThreadJobFactory and ProcessJobFactory, but what about the
>>>> YarnJobFactory?
>>>> How many tasks/executors will it spawn?
>>>> 
>>>> 
>>>> Bruno
>>>> 
>>>>> On Mon, Sep 14, 2015 at 7:08 PM, Yan Fang <yanfang...@gmail.com> wrote:
>>>>> 
>>>>> Hi Bruno,
>>>>> 
>>>>> AFAIK, there is no existing JobFactory that starts as many threads as
>>>>> there are partitions. But I think nothing stops you from implementing
>>>>> this: you can get the partition information from the JobCoordinator, and
>>>>> then start as many threads as there are partitions/tasks.
>>>>> 
>>>>> Since the two local factories (ThreadJobFactory and ProcessJobFactory) are
>>>>> mainly for development, there is no additional documentation. But most of
>>>>> the code here
>>>>> <https://github.com/apache/samza/tree/master/samza-core/src/main/scala/org/apache/samza/job/local>
>>>>> is self-explanatory.
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Fang, Yan
>>>>> yanfang...@gmail.com
>>>>> 
>>>>> On Sat, Sep 12, 2015 at 1:47 PM, Bruno Bonacci <bruno.bona...@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> I'm looking for additional documentation on the RUNTIME EXECUTION
>>>>>> MODELS of the different `job.factory.class` implementations.
>>>>>> 
>>>>>> I'm particularly interested in how each factory (ThreadJobFactory,
>>>>>> ProcessJobFactory and YarnJobFactory) creates tasks to consume and
>>>>>> process messages from Kafka, and in the thread model used.
>>>>>> 
>>>>>> I did a few tests with the ThreadJobFactory consuming from a Kafka
>>>>>> topic with 5 partitions. I was expecting it to use multiple
>>>>>> threads to consume/process the different partitions; however, it
>>>>>> uses only one thread at runtime.
>>>>>> 
>>>>>> Is there any way to tell Samza to use multiple processing threads (1
>>>>>> per partition)?
>>>>>> 
>>>>>> 
>>>>>> Thanks
>>>>>> Bruno
>> 
