Thanks for checking out Aurora! The short answer is that Aurora should handle thousands of short-lived tasks/jobs per day without trouble. (If you proceed with this approach and encounter performance issues, please file tickets!) The DSL does have some mechanisms for parameterization. Since you probably don't know all the job definitions upfront, you'll likely want to parameterize with environment variables. I don't see this described in our docs, but there's a little detail at the option declaration [1].
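To make the environment-variable approach concrete, here's a rough sketch of what a parameterized .aurora file could look like. This is untested and illustrative only: the ANALYSIS_PAYLOAD/ANALYSIS_ID variable names, the cmdline, and the cluster/role values are placeholders, not anything Aurora defines. The key point is just that .aurora files are evaluated as Python, so the client's environment is visible at config-parse time:

```
# analysis.aurora -- illustrative sketch; ANALYSIS_PAYLOAD, ANALYSIS_ID,
# and all cluster/role/name values below are made-up placeholders.
import os

# Read a per-run payload from the client environment when the config is parsed.
payload = os.environ.get('ANALYSIS_PAYLOAD', '{}')

run_analysis = Process(
  name = 'run_analysis',
  cmdline = "./analyze --payload '%s'" % payload)

analysis_task = Task(
  name = 'analysis',
  processes = [run_analysis],
  resources = Resources(cpu = 1.0, ram = 1*GB, disk = 1*GB))

jobs = [Job(
  cluster = 'example',
  environment = 'prod',
  role = 'analysis',
  name = 'analysis-%s' % os.environ.get('ANALYSIS_ID', 'adhoc'),
  task = analysis_task)]
```

You'd then export the variables before invoking the client, so each `aurora create` can produce a differently-named job with a different payload.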
Another approach worth considering is work-stealing, using a single job as your pool of workers. I would find this easier to manage, but it would only be suitable if your work items are sufficiently uniform.

Feel free to continue the discussion! We're also pretty active in our IRC channel if you'd prefer that medium.

[1] https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/options.py#L170-L183

-=Bill

On Tue, Feb 25, 2014 at 10:11 PM, Bryan Helmkamp <br...@codeclimate.com> wrote:
> Hello,
>
> I am considering Aurora for a key component of our infrastructure.
> Awesome work being done here.
>
> My question is: How suitable is Aurora for running short-lived tasks?
>
> Background: We (Code Climate) do static analysis of tens of thousands
> of repositories every day. We run a variety of forms of analysis, with
> heterogeneous resource requirements, and thus our interest in Mesos.
>
> Looking at Aurora, a lot of the core features look very helpful to us.
> Where I am getting hung up is figuring out how to model short-lived
> tasks as tasks/jobs. Long-running resource allocations are not really
> an option for us due to the variation in our workloads.
>
> My first thought was to create a Task for each type of analysis we
> run, and then start a new Job with the appropriate Task every time we
> want to run analysis (regulated by a queue). This doesn't seem to work
> though. I can't `aurora create` the same `.aurora` file multiple times
> with different Job names (as far as I can tell). Also there is the
> problem of how to customize each Job slightly (e.g. a payload).
>
> An obvious alternative is to create a unique Task every time we want
> to run work. This would result in tens of thousands of tasks being
> created every day, and from what I can tell Aurora does not intend to
> be used like that. (Please correct me if I am wrong.)
>
> Basically, I would like to hook my job queue up to Aurora to perform
> the actual work.
> There are a dozen different types of jobs, each with
> different performance requirements. Every time a job runs, it has a
> unique payload containing the definition of the work that should be
> performed.
>
> Can Aurora be used this way? If so, what is the proper way to model
> this with respect to Jobs and Tasks?
>
> Any/all help is appreciated.
>
> Thanks!
>
> -Bryan
>
> --
> Bryan Helmkamp, Founder, Code Climate
> br...@codeclimate.com / 646-379-1810 / @brynary
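P.S. To illustrate the work-stealing idea I mentioned above: a long-running Aurora job with N instances, where each instance just drains payloads from your existing queue. The sketch below stands in for that with Python threads and an in-process queue (your real broker and analysis code go where the comments indicate); `run_worker_pool` and the payload format are made up for this example.

```python
import queue
import threading

def run_worker_pool(payloads, num_workers=4):
    """Drain `payloads` with a fixed pool of workers, work-stealing style.

    Each worker plays the role of one instance of a single long-running
    Aurora job: it repeatedly pulls the next available payload until the
    queue is empty, so faster workers naturally take on more items.
    """
    work = queue.Queue()
    for p in payloads:
        work.put(p)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                payload = work.get_nowait()
            except queue.Empty:
                return  # queue drained; this worker is done
            # Replace this line with the actual analysis for the payload.
            outcome = "analyzed:%s" % payload
            with lock:
                results.append(outcome)
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The trade-off, as noted, is uniformity: since every instance runs the same code with the same resource allocation, this only works if one worker shape fits all of your job types.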