Hi,

I support what Lukas is saying. Samza's packaging requirements are not friendly; I use the ThreadJobFactory for the same reason.
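
For reference, here is a minimal sketch of the one-thread-per-partition idea Yan describes in the quoted message. It is plain Java and does not use any Samza API: `PerPartitionRunner`, `PartitionWorker`, and `runAll` are hypothetical names, and a real JobFactory would presumably take the partition count from the JobCoordinator instead of the hard-coded 5 used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these names are Samza APIs.
public class PerPartitionRunner {

    // Stand-in for whatever consumes and processes one partition's messages.
    interface PartitionWorker {
        void run(int partition) throws Exception;
    }

    // Starts one thread per partition and waits for all workers to finish.
    // partitionCount must be at least 1.
    // Returns a map of partition -> name of the thread that handled it.
    static Map<Integer, String> runAll(int partitionCount, PartitionWorker worker)
            throws Exception {
        Map<Integer, String> handledBy = new ConcurrentHashMap<>();
        // Pool size equals the partition count, so every partition gets its own thread.
        ExecutorService pool = Executors.newFixedThreadPool(partitionCount);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int p = 0; p < partitionCount; p++) {
                final int partition = p;
                futures.add(pool.submit(() -> {
                    handledBy.put(partition, Thread.currentThread().getName());
                    worker.run(partition);
                    return null;
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // rethrows a worker failure as ExecutionException
            }
        } finally {
            pool.shutdown();
        }
        return handledBy;
    }

    public static void main(String[] args) throws Exception {
        // 5 partitions, matching the test topic mentioned in the thread.
        Map<Integer, String> result = runAll(5, partition ->
                System.out.println("partition " + partition + " on "
                        + Thread.currentThread().getName()));
        System.out.println(result.size() + " partitions processed");
    }
}
```

This only illustrates the threading layout. Checkpointing, offset management, and shutdown handling are left out and would need care in a real factory.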

Bruno

On Tue, Sep 15, 2015 at 5:39 PM, Lukas Steiblys <lu...@doubledutch.me> wrote:

> Hi Yan,
>
> We use Samza in a production environment using ProcessJobFactory in Docker
> containers because it greatly simplifies our deployment process and makes
> much better use of resources.
>
> Is there any plan to make the ThreadJobFactory or ProcessJobFactory
> multithreaded? I will look into doing that myself, but I think it might be
> useful to implement this for everyone. I am sure there are plenty of cases
> where people do not want to use YARN, but want more parallelism in their
> tasks.
>
> Lukas
>
> -----Original Message-----
> From: Yan Fang
> Sent: Monday, September 14, 2015 11:08 AM
> To: dev@samza.apache.org
> Subject: Re: Runtime Execution Model
>
> Hi Bruno,
>
> AFAIK, there is no existing JobFactory that brings as many threads as the
> partition number. But I think nothing stops you to implement this: you can
> get the partition information from the JobCoordinator, and then bring as
> many threads as the partition/task number.
>
> Since the two local factories (ThreadJobFactory and ProcessJobFactory) are
> mainly for development, there is no additional document. But most of the
> code here
> <https://github.com/apache/samza/tree/master/samza-core/src/main/scala/org/apache/samza/job/local>
> is self-explained.
>
> Thanks,
>
> Fang, Yan
> yanfang...@gmail.com
>
> On Sat, Sep 12, 2015 at 1:47 PM, Bruno Bonacci <bruno.bona...@gmail.com>
> wrote:
>
> Hi,
>> I'm looking for additional documentation on the different RUNTIME
>> EXECUTION MODELS of the different `job.factory.class`.
>>
>> I'm particularly interested on how each factory (ThreadJobFactory,
>> ProcessJobFactory and YarnJobFactory) will create tasks consume and
>> process messages out of Kafka and the thread model used.
>>
>> I did a few tests with the ThreadJob factory consuming out of a kafka
>> topic with 5 partitions and I was expecting that it would use multiple
>> threads to consume/process the different partitions, however it is
>> using only one thread at runtime.
>>
>> Is there any way to tell Samza to use multiple processing threads (1 per
>> partition)??
>>
>> Thanks
>> Bruno