Re: Maximum number of jobs

2015-04-15 Thread jeremy p
Thank you, Chris. I just wrote a separate question, "How to deal with bootstrapping" where I describe the problem in detail.
On Wed, Apr 15, 2015 at 1:35 PM, Chris Riccomini wrote:
> Hey Jeremy,
>
> Samza will be fine, but at this scale you need to start worrying about
> Kafka and YARN. 1 milli

Re: Maximum number of jobs

2015-04-15 Thread Chris Riccomini
Hey Jeremy,
Samza will be fine, but at this scale you need to start worrying about Kafka and YARN. 1 million jobs will likely start to put pressure on YARN's RM due to memory usage and CPU usage for the scheduler. With 1 million jobs, assuming 1 container each, you'll have over 1 million connectio

Maximum number of jobs

2015-04-15 Thread jeremy p
What's the maximum number of Samza jobs I can run simultaneously on a single cluster? Let's say these jobs are very lightweight -- they require little memory or processing power. However, I need a lot of them -- let's say I need to have 1,000,000 running at any given time. Is this reasonable or