Many thanks, Yi.

Bye,
Bruno

> On 14 Sep 2015, at 23:49, Yi Pan <nickpa...@gmail.com> wrote:
>
> Hi, Bruno,
>
> The number of partitions consumed by a single task is also configurable via
> the partition assignment policies
> (job.systemstreampartition.grouper.factory). By default, there are two
> partition assignment policies implemented:
> org.apache.samza.container.grouper.stream.GroupByPartitionFactory
> and
> org.apache.samza.container.grouper.stream.GroupBySystemStreamPartitionFactory.
> The detailed explanation is available here:
> http://samza.apache.org/learn/documentation/0.9/jobs/configuration-table.html
>
> Thanks!
>
> -Yi
>
>> On Mon, Sep 14, 2015 at 3:37 PM, <bruno.bona...@gmail.com> wrote:
>>
>> Hi Yi,
>>
>> Does a single task consume from a single partition or it consumes from
>> more/all partitions?
>>
>> Thanks
>> Bruno
>>
>>> On 14 Sep 2015, at 23:22, Yi Pan <nickpa...@gmail.com> wrote:
>>>
>>> Hi, Bruno,
>>>
>>> The number of containers are configurable in YarnJobFactory via
>>> yarn.container.count.
>>> Each container is a single threaded model and you can run multiple tasks
>>> in a single container.
>>> At maximum, you can have as many containers as the number of tasks in
>>> this config to achieve 1 task / thread.
>>>
>>> Hope that clarifies the config a bit more for you.
>>>
>>> Thanks!
>>>
>>> -Yi
>>>
>>> On Mon, Sep 14, 2015 at 3:16 PM, Bruno Bonacci <bruno.bona...@gmail.com>
>>> wrote:
>>>
>>>> Thanks Yan for writing me back,
>>>>
>>>> That's ok for ThreadJobFactory and ProcessJobFactory but what about the
>>>> YarnJobFactory?
>>>> How many task/executors will be spawning?
>>>>
>>>>
>>>> Bruno
>>>>
>>>>> On Mon, Sep 14, 2015 at 7:08 PM, Yan Fang <yanfang...@gmail.com> wrote:
>>>>>
>>>>> Hi Bruno,
>>>>>
>>>>> AFAIK, there is no existing JobFactory that brings as many threads as
>>>>> the partition number. But I think nothing stops you to implement this:
>>>>> you can get the partition information from the JobCoordinator, and then
>>>>> bring as many threads as the partition/task number.
>>>>>
>>>>> Since the two local factories (ThreadJobFactory and ProcessJobFactory)
>>>>> are mainly for development, there is no additional document. But most
>>>>> of the code here
>>>>> <https://github.com/apache/samza/tree/master/samza-core/src/main/scala/org/apache/samza/job/local>
>>>>> is self-explained.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Fang, Yan
>>>>> yanfang...@gmail.com
>>>>>
>>>>> On Sat, Sep 12, 2015 at 1:47 PM, Bruno Bonacci <bruno.bona...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I'm looking for additional documentation on the different RUNTIME
>>>>>> EXECUTION MODELS of the different `job.factory.class`.
>>>>>>
>>>>>> I'm particularly interested on how each factory (ThreadJobFactory,
>>>>>> ProcessJobFactory and YarnJobFactory) will create tasks consume and
>>>>>> process messages out of Kafka and the thread model used.
>>>>>>
>>>>>> I did a few tests with the ThreadJob factory consuming out of a kafka
>>>>>> topic with 5 partitions and I was expecting that it would use multiple
>>>>>> threads to consume/process the different partitions, however it is
>>>>>> using only one thread at runtime.
>>>>>>
>>>>>> Is there any way to tell Samza to use multiple processing threads (1
>>>>>> per partition)??
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Bruno
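
For anyone finding this thread later, here is a rough sketch of what the settings discussed above might look like in a job properties file. It is only an illustration under assumptions, not a complete or tested job config: the topic name `my-topic` is hypothetical, the 5-partition count is taken from the original question, and the YarnJobFactory package name is my recollection of the 0.9 layout rather than something stated in the thread, so please check it against the configuration table linked above.

```properties
# Run on YARN instead of the local ThreadJobFactory/ProcessJobFactory.
# (Assumed fully qualified class name; verify for your Samza version.)
job.factory.class=org.apache.samza.job.yarn.YarnJobFactory

# Hypothetical input: a Kafka topic with 5 partitions.
task.inputs=kafka.my-topic

# Partition assignment policy named in the thread.
job.systemstreampartition.grouper.factory=org.apache.samza.container.grouper.stream.GroupByPartitionFactory

# Each container is single threaded, so requesting as many containers
# as there are tasks gives roughly one task per thread. This assumes
# the grouper above yields one task per partition, i.e. 5 tasks here.
yarn.container.count=5
```

With a smaller yarn.container.count, several tasks would share one container and hence one thread, which matches the single-threaded behaviour described earlier in the thread.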