Thanks for the clarification -- very helpful. I'll take a look at those tickets!
On Wed, Mar 2, 2016 at 2:11 PM, Yi Pan <nickpa...@gmail.com> wrote:

> Hi, Robert,
>
> The main reason that ThreadJobFactory and ProcessJobFactory are not
> considered "production-ready" is that there is only one container for the
> job and all tasks are assigned to that single container. Hence, it is not
> easy to scale out beyond a single host.
>
> As Rick mentioned, Netflix has put up a patch in SAMZA-41, based on 0.9.1,
> to allow static assignment of a subset of partitions to a single
> ProcessJob, which makes it possible to launch multiple ProcessJobs on
> different hosts. We planned to merge it into 0.10, but it turns out that
> too many changes have gone into 0.10 and it became difficult to merge the
> patch. At this point, we can still try the following two options:
> 1) We can attempt to merge SAMZA-41 into 0.10.1 again. It may take some
>    effort but would give a stop-gap solution.
> 2) We are working on a standalone Samza model (SAMZA-516, SAMZA-881) to
>    allow users to run Samza without depending on YarnJobFactory. This is a
>    long-term effort and will take some time to flesh out. Please join the
>    discussion there so that we can be more aligned in our effort.
>
> Hope the above gives you an overall picture of where we are going.
>
> Thanks a lot!
>
> -Yi
>
> On Wed, Mar 2, 2016 at 1:28 PM, Rick Mangi <r...@chartbeat.com> wrote:
>
> > There was an interesting thread a while back from, I believe, the
> > Netflix guys about running ThreadJobFactory in production.
> >
> > > On Mar 2, 2016, at 4:20 PM, Robert Crim <rjc...@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > We're currently working on a solution that allows us to run Samza
> > > jobs on Mesos. This seems to be going well, and it is something we'd
> > > like to move away from once native Mesos support is added to Samza.
> > >
> > > While we're developing and testing our scheduler, I'm wondering about
> > > the implications of running tasks with the ThreadJobFactory in
> > > "production". The documentation advises against this, but it's not
> > > clear why.
> > >
> > > If we were using the ThreadJobFactory inside of a Docker container on
> > > Mesos with Marathon for production, what would be our main problem?
> > > These are not particularly high-load tasks. Aside from not being able
> > > to get fine-grained resource scheduling per task, it seems like the
> > > main issue is not being able to easily tell when a job stops due to
> > > an error or exception.
> > >
> > > In other words, what would be show-stopping reasons to not use the
> > > ThreadJobFactory in production?
> > >
> > > Thanks,
> > > Rob
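
For readers following the thread, the static assignment idea in Yi's reply (each single-container process owning a fixed subset of partitions, so several processes can run on different hosts) can be sketched roughly as below. This is only an illustration: the round-robin scheme and the name `assign_partitions` are assumptions, not the SAMZA-41 patch or any Samza API.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, num_processes: int) -> Dict[int, List[int]]:
    """Statically map partition ids to process indexes, round-robin.

    Hypothetical helper for illustration; not part of Samza.
    """
    if num_partitions < 0:
        raise ValueError("num_partitions must be non-negative")
    if num_processes < 1:
        raise ValueError("num_processes must be at least 1")

    # Every process gets an entry, even if it ends up with no partitions.
    assignment: Dict[int, List[int]] = {p: [] for p in range(num_processes)}
    for partition in range(num_partitions):
        assignment[partition % num_processes].append(partition)
    return assignment


if __name__ == "__main__":
    # Example: 8 input partitions spread over 3 single-container processes.
    for process, partitions in assign_partitions(8, 3).items():
        print(f"process {process}: partitions {partitions}")
```

Because the mapping is fixed up front, nothing rebalances partitions if one process dies, which is part of why this kind of setup is described in the thread as a stop-gap.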