I have been thinking lately about the least invasive way to add 
multithreading capabilities to ThreadJobFactory, as that is the main way we 
run our jobs in production. Looking at the master branch code in Git, I have 
found the following:
  a. The best way would be to simply spin up a new thread for each container. 
  b. The number of containers can already be specified using the configuration 
property job.container.count. 
  c. I can construct a new SamzaContainer for each containerModel returned 
from coordinator.jobModel.getContainers in ThreadJobFactory. 
  d. I can pass a list of these containers into the ThreadJob constructor, 
modifying it to accept an array of Runnables. 
  e. In its submit method, ThreadJob would then create a new thread for each 
Runnable and start it.
This should start up a new thread for each container and group the tasks using 
the appropriate TaskNameGrouper.
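
To make steps d and e concrete, here is a minimal sketch in plain Java. It 
is not the actual Samza code: MultiThreadJobSketch is a hypothetical stand-in 
for ThreadJob, and the thread names and the waitForFinish helper are my own 
assumptions for illustration. It only shows the idea of starting one thread 
per Runnable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ThreadJob; NOT the real Samza class.
class MultiThreadJobSketch {
    private final List<Runnable> containers;
    private final List<Thread> threads = new ArrayList<>();

    // Takes one Runnable per container (step d).
    MultiThreadJobSketch(List<Runnable> containers) {
        this.containers = new ArrayList<>(containers);
    }

    // Starts one thread per container and returns immediately (step e).
    // The "container-N" thread names are an assumption for illustration.
    MultiThreadJobSketch submit() {
        for (int i = 0; i < containers.size(); i++) {
            Thread t = new Thread(containers.get(i), "container-" + i);
            threads.add(t);
            t.start();
        }
        return this;
    }

    // Blocks until every container thread has finished. If the caller is
    // interrupted, the interrupt flag is restored and the wait stops early.
    void waitForFinish() {
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

In the real patch, each Runnable would be a SamzaContainer built from one 
containerModel, so the list passed in would have job.container.count entries.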

Any ideas on what I might have missed? Are there any other potential solutions? 
Would this be a good patch for Samza in general?

Lukas
