A job is a group of nearly-identical tasks plus some constraints, like rack diversity. The scheduler considers every task within a job equivalently schedulable, so you can't vary things like resource footprint between tasks in the same job. It's perfectly fine to have several jobs with just a single task each, as long as each has a distinct job key (which is (role, environment, name)).
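To make that concrete, here's an untested sketch of a template that yields a new single-task job per work item. The run_analysis.sh command, the job_name/payload refs, and the exact --bind invocation are illustrative assumptions on my part, not something lifted from our docs:

# analysis.aurora
run_analysis = Process(
  name = 'run_analysis',
  # {{job_name}} and {{payload}} are left unbound here and supplied at
  # create time, so each run gets its own job key and work definition.
  # run_analysis.sh is a placeholder for whatever does the actual work.
  cmdline = 'run_analysis.sh --payload "{{payload}}"',
)

task = Task(
  name = 'analysis',
  processes = [run_analysis],
  resources = Resources(cpu = 2.0, ram = 4 * GB, disk = 8 * GB),
)

jobs = [Job(
  task = task,
  cluster = 'west',
  role = 'service-account-name',
  environment = 'prod',
  name = '{{job_name}}',  # Distinct name per run => distinct job key.
)]

You'd then launch one job per work item with something along the lines of:

  aurora create west/service-account-name/prod/analysis_1234 analysis.aurora \
      --bind job_name=analysis_1234 --bind payload=work-item-1234

(Double-check the --bind flag against the client options; I'm going from memory here.)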
Another approach is to have a bunch of uniform always-on workers (in different sizes). This can be expressed as a Service like so:

# workers.aurora
class Profile(Struct):
  queue_name = Required(String)
  resources  = Required(Resources)
  instances  = Required(Integer)

HIGH_MEM = Resources(cpu = 8.0,  ram = 32 * GB, disk = 64 * GB)
HIGH_CPU = Resources(cpu = 16.0, ram = 4 * GB,  disk = 64 * GB)

work_forever = Process(
  name = 'work_forever',
  cmdline = '''
    # TODO: Replace this with something that isn't pseudo-bash
    while true; do
      work_item=`take_from_work_queue {{profile.queue_name}}`
      do_work "$work_item"
      tell_work_queue_finished "{{profile.queue_name}}" "$work_item"
    done
  ''')

task = Task(
  processes = [work_forever],
  resources = '{{profile.resources}}',  # Note this is static per queue name.
)

service = Service(
  task = task,
  cluster = 'west',
  role = 'service-account-name',
  environment = 'prod',
  name = '{{profile.queue_name}}_processor',
  instances = '{{profile.instances}}',  # Scale here.
)

jobs = [
  service.bind(profile = Profile(
    resources = HIGH_MEM,
    queue_name = 'graph_traversals',
    instances = 50,
  )),
  service.bind(profile = Profile(
    resources = HIGH_CPU,
    queue_name = 'compilations',
    instances = 200,
  )),
]

On Wed, Feb 26, 2014 at 11:46 AM, Bryan Helmkamp <br...@codeclimate.com> wrote:
> Thanks, Bill.
>
> Am I correct in understanding that it is not possible to parameterize
> individual Jobs, just Tasks? Therefore, since I don't know the job
> definitions up front, I will have parameterized Task templates, and
> generate a new Task every time I need to run a Job?
>
> Is that the recommended route?
>
> Our work is very non-uniform, so I don't think work-stealing would be
> efficient for us.
>
> -Bryan
>
> On Wed, Feb 26, 2014 at 12:49 PM, Bill Farner <wfar...@apache.org> wrote:
> > Thanks for checking out Aurora!
> >
> > My short answer is that Aurora should handle thousands of short-lived
> > tasks/jobs per day without trouble. (If you proceed with this approach
> > and encounter performance issues, feel free to file tickets!) The DSL
> > does have some mechanisms for parameterization. In your case, since you
> > probably don't know all the job definitions upfront, you'll probably
> > want to parameterize with environment variables. I don't see this
> > described in our docs, but there's a little detail at the option
> > declaration [1].
> >
> > Another approach worth considering is work-stealing, using a single job
> > as your pool of workers. I would find this easier to manage, but it
> > would only be suitable if your work items are sufficiently uniform.
> >
> > Feel free to continue the discussion! We're also pretty active in our
> > IRC channel if you'd prefer that medium.
> >
> > [1]
> > https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/options.py#L170-L183
> >
> > -=Bill
> >
> > On Tue, Feb 25, 2014 at 10:11 PM, Bryan Helmkamp <br...@codeclimate.com> wrote:
> >> Hello,
> >>
> >> I am considering Aurora for a key component of our infrastructure.
> >> Awesome work being done here.
> >>
> >> My question is: How suitable is Aurora for running short-lived tasks?
> >>
> >> Background: We (Code Climate) do static analysis of tens of thousands
> >> of repositories every day. We run a variety of forms of analysis, with
> >> heterogeneous resource requirements, and thus our interest in Mesos.
> >>
> >> Looking at Aurora, a lot of the core features look very helpful to us.
> >> Where I am getting hung up is figuring out how to model short-lived
> >> tasks as tasks/jobs. Long-running resource allocations are not really
> >> an option for us due to the variation in our workloads.
> >>
> >> My first thought was to create a Task for each type of analysis we
> >> run, and then start a new Job with the appropriate Task every time we
> >> want to run analysis (regulated by a queue). This doesn't seem to
> >> work, though. I can't `aurora create` the same `.aurora` file multiple
> >> times with different Job names (as far as I can tell). Also there is
> >> the problem of how to customize each Job slightly (e.g. a payload).
> >>
> >> An obvious alternative is to create a unique Task every time we want
> >> to run work. This would result in tens of thousands of tasks being
> >> created every day, and from what I can tell Aurora does not intend to
> >> be used like that. (Please correct me if I am wrong.)
> >>
> >> Basically, I would like to hook my job queue up to Aurora to perform
> >> the actual work. There are a dozen different types of jobs, each with
> >> different performance requirements. Every time a job runs, it has a
> >> unique payload containing the definition of the work to be performed.
> >>
> >> Can Aurora be used this way? If so, what is the proper way to model
> >> this with respect to Jobs and Tasks?
> >>
> >> Any/all help is appreciated.
> >>
> >> Thanks!
> >>
> >> -Bryan
> >>
> >> --
> >> Bryan Helmkamp, Founder, Code Climate
> >> br...@codeclimate.com / 646-379-1810 / @brynary
>
> --
> Bryan Helmkamp, Founder, Code Climate
> br...@codeclimate.com / 646-379-1810 / @brynary