Thanks, Bill.

Am I correct in understanding that it is not possible to parameterize
individual Jobs, only Tasks? If so, since I don't know the job
definitions up front, I would keep parameterized Task templates and
generate a new Task every time I need to run a Job?
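
For example, I'm picturing a single template along these lines (just a
sketch; the names and the {{...}} placeholders are made up, and I'm not
sure how they'd actually get filled in):

  analysis_task = Task(
    name = 'analyze',
    processes = [Process(
      name = 'analyze',
      # {{payload_url}} would be supplied per run somehow
      cmdline = 'run_analysis --payload-url {{payload_url}}')],
    resources = Resources(cpu = 1.0, ram = 2*GB, disk = 2*GB))

  jobs = [Job(
    cluster = 'our-cluster',
    role = 'analysis',
    environment = 'prod',
    name = '{{job_name}}',  # unique per run
    task = analysis_task)]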

Is that the recommended route?

Our work is very non-uniform, so I don't think work-stealing would be
efficient for us.

-Bryan

On Wed, Feb 26, 2014 at 12:49 PM, Bill Farner <wfar...@apache.org> wrote:
> Thanks for checking out Aurora!
>
> My short answer is that Aurora should handle thousands of short-lived
> tasks/jobs per day without trouble.  (If you proceed with this approach and
> encounter performance issues, feel free to file tickets!)  The DSL does
> have some mechanisms for parameterization.  In your case, since you
> don't know all the job definitions upfront, you'll probably want to
> parameterize with environment variables.  I don't see this described in our
> docs, but there's a little detail at the option declaration [1].
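>
> Untested, and the exact flag is whatever is declared at [1], but I'd
> expect the invocation to look roughly like this (the job key, flag, and
> URL here are all made up):
>
>   $ aurora create our-cluster/analysis/prod/analysis_run_42 \
>       -E payload_url=http://example.com/payload.json \
>       analysis.aurora
>
> with the config referencing {{payload_url}} wherever the value is needed.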
>
> Another approach worth considering is work-stealing, using a single job as
> your pool of workers.  I would find this easier to manage, but it would
> only be suitable if your work items are sufficiently uniform.
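>
> As a very rough sketch of that shape (all names hypothetical; the
> workers would poll your existing queue for items):
>
>   jobs = [Job(
>     cluster = 'our-cluster',
>     role = 'analysis',
>     environment = 'prod',
>     name = 'analysis_workers',
>     instances = 50,  # fixed pool; each instance loops pulling work
>     task = Task(
>       name = 'worker',
>       processes = [Process(
>         name = 'worker',
>         cmdline = 'worker_loop --queue-url {{queue_url}}')],
>       resources = Resources(cpu = 1.0, ram = 2*GB, disk = 2*GB)))]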
>
> Feel free to continue the discussion!  We're also pretty active in our IRC
> channel if you'd prefer that medium.
>
>
> [1]
> https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/options.py#L170-L183
>
>
> -=Bill
>
>
> On Tue, Feb 25, 2014 at 10:11 PM, Bryan Helmkamp <br...@codeclimate.com> wrote:
>
>> Hello,
>>
>> I am considering Aurora for a key component of our infrastructure.
>> Awesome work being done here.
>>
>> My question is: How suitable is Aurora for running short-lived tasks?
>>
>> Background: We (Code Climate) do static analysis of tens of thousands
>> of repositories every day. We run many different forms of analysis with
>> heterogeneous resource requirements, hence our interest in Mesos.
>>
>> Looking at Aurora, I see that many of the core features would be very
>> helpful to us. Where I am getting hung up is figuring out how to model
>> our short-lived work as Tasks/Jobs. Long-running resource allocations
>> are not really an option for us due to the variation in our workloads.
>>
>> My first thought was to create a Task for each type of analysis we
>> run, and then start a new Job with the appropriate Task every time we
>> want to run analysis (regulated by a queue). This doesn't seem to work,
>> though: I can't `aurora create` the same `.aurora` file multiple times
>> with different Job names (as far as I can tell). There is also the
>> problem of how to customize each Job slightly (e.g., with a payload).
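>>
>> To make the first problem concrete (names are made up), the first of
>> these works, but I can't see how to make the second one go:
>>
>>   $ aurora create our-cluster/analysis/prod/run-1 analysis.aurora
>>   $ aurora create our-cluster/analysis/prod/run-2 analysis.aurora
>>
>> since the second job key doesn't match any Job defined in the file.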
>>
>> An obvious alternative is to create a unique Task every time we want
>> to run work. This would result in tens of thousands of tasks being
>> created every day, and from what I can tell Aurora does not intend to
>> be used like that. (Please correct me if I am wrong.)
>>
>> Basically, I would like to hook my job queue up to Aurora to perform
>> the actual work. There are a dozen different types of jobs, each with
>> different performance requirements. Every time a job runs, it has a
>> unique payload containing the definition of the work to be performed.
>>
>> Can Aurora be used this way? If so, what is the proper way to model
>> this with respect to Jobs and Tasks?
>>
>> Any/all help is appreciated.
>>
>> Thanks!
>>
>> -Bryan
>>
>> --
>> Bryan Helmkamp, Founder, Code Climate
>> br...@codeclimate.com / 646-379-1810 / @brynary
>>



-- 
Bryan Helmkamp, Founder, Code Climate
br...@codeclimate.com / 646-379-1810 / @brynary
