Thanks, Bill. Am I correct in understanding that it is not possible to parameterize individual Jobs, only Tasks? If so, since I don't know the job definitions up front, I would keep parameterized Task templates and generate a new Task every time I need to run a Job?
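
For concreteness, here is roughly what I have in mind. This is untested,
and the names and flags are just my guesses -- in particular, I'm assuming
the binding option you linked in [1] can substitute mustache variables
like {{analysis_type}} at create time:

    # analysis.aurora -- hypothetical parameterized template (untested)
    analysis = Process(
      name = 'analysis',
      # {{analysis_type}} and {{payload_url}} would be bound per run, e.g.:
      #   aurora create cluster/analysis/prod/analysis-1234 analysis.aurora \
      #     -E analysis_type=style -E payload_url=http://example/work/1234
      cmdline = 'run_analysis --type={{analysis_type}} --payload={{payload_url}}'
    )

    analysis_task = Task(
      name = 'analysis',
      processes = [analysis],
      resources = Resources(cpu = 1.0, ram = 2*GB, disk = 2*GB)
    )

    jobs = [Job(
      # unique name per run so repeated creates don't collide -- this is
      # the part I'm least sure Aurora supports
      name = 'analysis-{{run_id}}',
      cluster = 'cluster',
      role = 'analysis',
      environment = 'prod',
      task = analysis_task
    )]

If binding the Job name like that isn't supported, I suppose I'd have to
generate the .aurora file itself for each run instead.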
Is that the recommended route? Our work is very non-uniform, so I don't
think work-stealing would be efficient for us.

-Bryan

On Wed, Feb 26, 2014 at 12:49 PM, Bill Farner <wfar...@apache.org> wrote:
> Thanks for checking out Aurora!
>
> My short answer is that Aurora should handle thousands of short-lived
> tasks/jobs per day without trouble. (If you proceed with this approach and
> encounter performance issues, feel free to file tickets!) The DSL does
> have some mechanisms for parameterization. In your case, since you
> probably don't know all the job definitions upfront, you'll likely want to
> parameterize with environment variables. I don't see this described in our
> docs, but there's a little detail at the option declaration [1].
>
> Another approach worth considering is work-stealing, using a single job as
> your pool of workers. I would find this easier to manage, but it would
> only be suitable if your work items are sufficiently uniform.
>
> Feel free to continue the discussion! We're also pretty active in our IRC
> channel if you'd prefer that medium.
>
> [1]
> https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/options.py#L170-L183
>
> -=Bill
>
> On Tue, Feb 25, 2014 at 10:11 PM, Bryan Helmkamp <br...@codeclimate.com> wrote:
>
>> Hello,
>>
>> I am considering Aurora for a key component of our infrastructure.
>> Awesome work being done here.
>>
>> My question is: How suitable is Aurora for running short-lived tasks?
>>
>> Background: We (Code Climate) do static analysis of tens of thousands
>> of repositories every day. We run a variety of forms of analysis, with
>> heterogeneous resource requirements, and thus our interest in Mesos.
>>
>> Looking at Aurora, a lot of the core features look very helpful to us.
>> Where I am getting hung up is figuring out how to model short-lived
>> work as tasks/jobs. Long-running resource allocations are not really
>> an option for us due to the variation in our workloads.
>>
>> My first thought was to create a Task for each type of analysis we
>> run, and then start a new Job with the appropriate Task every time we
>> want to run analysis (regulated by a queue). This doesn't seem to work,
>> though: I can't `aurora create` the same `.aurora` file multiple times
>> with different Job names (as far as I can tell). There is also the
>> problem of how to customize each Job slightly (e.g. with a payload).
>>
>> An obvious alternative is to create a unique Task every time we want
>> to run work. This would result in tens of thousands of tasks being
>> created every day, and from what I can tell, Aurora is not intended to
>> be used like that. (Please correct me if I am wrong.)
>>
>> Basically, I would like to hook my job queue up to Aurora to perform
>> the actual work. There are a dozen different types of jobs, each with
>> different performance requirements. Every time a job runs, it has a
>> unique payload containing the definition of the work to be performed.
>>
>> Can Aurora be used this way? If so, what is the proper way to model
>> this with respect to Jobs and Tasks?
>>
>> Any/all help is appreciated.
>>
>> Thanks!
>>
>> -Bryan
>>
>> --
>> Bryan Helmkamp, Founder, Code Climate
>> br...@codeclimate.com / 646-379-1810 / @brynary

--
Bryan Helmkamp, Founder, Code Climate
br...@codeclimate.com / 646-379-1810 / @brynary