For a more dynamic approach to resource utilization, you can use something like this:
# dynamic.aurora
# Enqueue each individual work-item with:
#   aurora create -E work_item=$work_item -E resource_profile=graph_traversals west/service-account-name/prod/process_$work_item

class Profile(Struct):
  queue_name = Required(String)
  resources = Required(Resources)

HIGH_MEM = Resources(cpu = 8.0, ram = 32 * GB, disk = 64 * GB)
HIGH_CPU = Resources(cpu = 16.0, ram = 4 * GB, disk = 64 * GB)

work_on_one_item = Process(
  name = 'work_on_one_item',
  cmdline = '''
    do_work "{{work_item}}"
  ''',
)

task = Task(
  processes = [work_on_one_item],
  resources = '{{resources[{{resource_profile}}]}}',
)

job = Job(
  task = task,
  cluster = 'west',
  role = 'service-account-name',
  environment = 'prod',
  name = 'process_{{work_item}}',
)

resources = {
  'graph_traversals': HIGH_MEM,
  'compilations': HIGH_CPU,
}

jobs = [job.bind(resources = resources)]


On Wed, Feb 26, 2014 at 1:08 PM, Bryan Helmkamp <br...@codeclimate.com> wrote:

> Sure. Yes, they are shell commands, and yes, they are provided different
> configuration on each run.
>
> In effect we have a number of different job types that are queued up,
> and we need to run them as quickly as possible. Each job type has
> different resource requirements. Every time we run a job, we provide
> different arguments (the "payload"). For example:
>
> $ ./do_something.sh SOME_ID            (Requires 1 CPU and 1GB RAM)
> $ ./do_something_else.sh SOME_OTHER_ID (Requires 4 CPU and 4GB RAM)
> [... there are about 12 of these ...]
>
> -Bryan
>
> On Wed, Feb 26, 2014 at 3:58 PM, Bill Farner <wfar...@apache.org> wrote:
> > Can you offer some more details on what the workload execution looks
> > like? Are these shell commands? An application that's provided
> > different configuration?
> >
> > -=Bill
> >
> > On Wed, Feb 26, 2014 at 12:45 PM, Bryan Helmkamp <br...@codeclimate.com> wrote:
> >
> >> Thanks, Kevin. The idea of always-on workers of varying sizes is
> >> effectively what we have right now in our non-Mesos world. The problem
> >> is that sometimes we end up with not enough workers for certain
> >> classes of jobs (e.g. High Memory), while part of the cluster sits
> >> idle.
> >>
> >> Conceptually, in my mind we would define approximately a dozen Tasks,
> >> one for each type of work we need to perform (with different resource
> >> requirements), and then run Jobs, each with a Task and a unique
> >> payload, but I don't think this model works with Mesos. It seems we'd
> >> need to create a unique Task for every Job.
> >>
> >> -Bryan
> >>
> >> On Wed, Feb 26, 2014 at 3:35 PM, Kevin Sweeney <kevi...@apache.org> wrote:
> >> > A job is a group of nearly-identical tasks plus some constraints like
> >> > rack diversity. The scheduler considers each task within a job
> >> > equivalently schedulable, so you can't vary things like resource
> >> > footprint. It's perfectly fine to have several jobs with just a
> >> > single task, as long as each has a different job key (which is
> >> > (role, environment, name)).
> >> >
> >> > Another approach is to have a bunch of uniform always-on workers
> >> > (in different sizes).
> >> > This can be expressed as a Service like so:
> >> >
> >> > # workers.aurora
> >> > class Profile(Struct):
> >> >   queue_name = Required(String)
> >> >   resources = Required(Resources)
> >> >   instances = Required(Integer)
> >> >
> >> > HIGH_MEM = Resources(cpu = 8.0, ram = 32 * GB, disk = 64 * GB)
> >> > HIGH_CPU = Resources(cpu = 16.0, ram = 4 * GB, disk = 64 * GB)
> >> >
> >> > work_forever = Process(
> >> >   name = 'work_forever',
> >> >   cmdline = '''
> >> >     # TODO: Replace this with something that isn't pseudo-bash
> >> >     while true; do
> >> >       work_item=`take_from_work_queue {{profile.queue_name}}`
> >> >       do_work "$work_item"
> >> >       tell_work_queue_finished "{{profile.queue_name}}" "$work_item"
> >> >     done
> >> >   ''')
> >> >
> >> > task = Task(
> >> >   processes = [work_forever],
> >> >   resources = '{{profile.resources}}',  # Note this is static per queue-name.
> >> > )
> >> >
> >> > service = Service(
> >> >   task = task,
> >> >   cluster = 'west',
> >> >   role = 'service-account-name',
> >> >   environment = 'prod',
> >> >   name = '{{profile.queue_name}}_processor',
> >> >   instances = '{{profile.instances}}',  # Scale here.
> >> > )
> >> >
> >> > jobs = [
> >> >   service.bind(profile = Profile(
> >> >     resources = HIGH_MEM,
> >> >     queue_name = 'graph_traversals',
> >> >     instances = 50,
> >> >   )),
> >> >   service.bind(profile = Profile(
> >> >     resources = HIGH_CPU,
> >> >     queue_name = 'compilations',
> >> >     instances = 200,
> >> >   )),
> >> > ]
> >> >
> >> >
> >> > On Wed, Feb 26, 2014 at 11:46 AM, Bryan Helmkamp <br...@codeclimate.com> wrote:
> >> >
> >> >> Thanks, Bill.
> >> >>
> >> >> Am I correct in understanding that it is not possible to parameterize
> >> >> individual Jobs, just Tasks? Therefore, since I don't know the job
> >> >> definitions up front, I will have parameterized Task templates, and
> >> >> generate a new Task every time I need to run a Job?
> >> >>
> >> >> Is that the recommended route?
> >> >>
> >> >> Our work is very non-uniform, so I don't think work-stealing would be
> >> >> efficient for us.
> >> >>
> >> >> -Bryan
> >> >>
> >> >> On Wed, Feb 26, 2014 at 12:49 PM, Bill Farner <wfar...@apache.org> wrote:
> >> >> > Thanks for checking out Aurora!
> >> >> >
> >> >> > My short answer is that Aurora should handle thousands of short-lived
> >> >> > tasks/jobs per day without trouble. (If you proceed with this approach
> >> >> > and encounter performance issues, feel free to file tickets!) The DSL
> >> >> > does have some mechanisms for parameterization. In your case, since
> >> >> > you probably don't know all the job definitions upfront, you'll
> >> >> > probably want to parameterize with environment variables. I don't see
> >> >> > this described in our docs, but there's a little detail at the option
> >> >> > declaration [1].
> >> >> >
> >> >> > Another approach worth considering is work-stealing, using a single
> >> >> > job as your pool of workers. I would find this easier to manage, but
> >> >> > it would only be suitable if your work items are sufficiently uniform.
> >> >> >
> >> >> > Feel free to continue the discussion! We're also pretty active in our
> >> >> > IRC channel if you'd prefer that medium.
> >> >> >
> >> >> > [1]
> >> >> > https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/options.py#L170-L183
> >> >> >
> >> >> > -=Bill
> >> >> >
> >> >> > On Tue, Feb 25, 2014 at 10:11 PM, Bryan Helmkamp <br...@codeclimate.com> wrote:
> >> >> >
> >> >> >> Hello,
> >> >> >>
> >> >> >> I am considering Aurora for a key component of our infrastructure.
> >> >> >> Awesome work being done here.
> >> >> >>
> >> >> >> My question is: How suitable is Aurora for running short-lived tasks?
> >> >> >>
> >> >> >> Background: We (Code Climate) do static analysis of tens of thousands
> >> >> >> of repositories every day. We run a variety of forms of analysis,
> >> >> >> with heterogeneous resource requirements, and thus our interest in
> >> >> >> Mesos.
> >> >> >>
> >> >> >> Looking at Aurora, a lot of the core features look very helpful to
> >> >> >> us. Where I am getting hung up is figuring out how to model
> >> >> >> short-lived tasks as tasks/jobs. Long-running resource allocations
> >> >> >> are not really an option for us due to the variation in our
> >> >> >> workloads.
> >> >> >>
> >> >> >> My first thought was to create a Task for each type of analysis we
> >> >> >> run, and then start a new Job with the appropriate Task every time we
> >> >> >> want to run analysis (regulated by a queue). This doesn't seem to
> >> >> >> work, though. I can't `aurora create` the same `.aurora` file
> >> >> >> multiple times with different Job names (as far as I can tell).
> >> >> >> Also, there is the problem of how to customize each Job slightly
> >> >> >> (e.g. a payload).
> >> >> >>
> >> >> >> An obvious alternative is to create a unique Task every time we want
> >> >> >> to run work. This would result in tens of thousands of tasks being
> >> >> >> created every day, and from what I can tell Aurora is not intended to
> >> >> >> be used like that. (Please correct me if I am wrong.)
> >> >> >>
> >> >> >> Basically, I would like to hook my job queue up to Aurora to perform
> >> >> >> the actual work. There are a dozen different types of jobs, each with
> >> >> >> different performance requirements. Every time a job runs, it has a
> >> >> >> unique payload containing the definition of the work to be performed.
> >> >> >>
> >> >> >> Can Aurora be used this way? If so, what is the proper way to model
> >> >> >> this with respect to Jobs and Tasks?
> >> >> >>
> >> >> >> Any/all help is appreciated.
> >> >> >>
> >> >> >> Thanks!
> >> >> >>
> >> >> >> -Bryan
> >> >> >>
> >> >> >> --
> >> >> >> Bryan Helmkamp, Founder, Code Climate
> >> >> >> br...@codeclimate.com / 646-379-1810 / @brynary
> >> >>
> >> >>
> >> >> --
> >> >> Bryan Helmkamp, Founder, Code Climate
> >> >> br...@codeclimate.com / 646-379-1810 / @brynary
> >>
> >>
> >> --
> >> Bryan Helmkamp, Founder, Code Climate
> >> br...@codeclimate.com / 646-379-1810 / @brynary
>
>
> --
> Bryan Helmkamp, Founder, Code Climate
> br...@codeclimate.com / 646-379-1810 / @brynary
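
To make the enqueue step from the first message concrete, here is a minimal
sketch of a per-item submission wrapper around the dynamic.aurora file above.
The script name and its argument layout are hypothetical, and the trailing
config-file argument to `aurora create` is an assumption about the client
invocation; the -E bindings and the job-key pattern come straight from the
comment in dynamic.aurora.

#!/usr/bin/env bash
# submit_work_item.sh (hypothetical) -- invoked once per queued work item.
# Usage: ./submit_work_item.sh graph_traversals SOME_ID
set -eu

resource_profile="$1"   # must match a key in the resources dict, e.g. graph_traversals
work_item="$2"          # the per-run payload handed to do_work via {{work_item}}

# Each invocation creates a separate single-task job under a unique job key
# (role, environment, name), so the resource profile can differ per item.
aurora create \
  -E work_item="$work_item" \
  -E resource_profile="$resource_profile" \
  "west/service-account-name/prod/process_${work_item}" \
  dynamic.aurora

Because each work item gets its own job name (process_<work_item>), the same
dynamic.aurora file can be submitted as many times as there are items, which
addresses the "same .aurora file, different Job names" problem from the
original question.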
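
The pseudo-bash loop in workers.aurora can likewise be written as real shell.
A minimal sketch, assuming take_from_work_queue, do_work, and
tell_work_queue_finished are placeholder commands supplied by your queueing
system (the same placeholders used in the example), and that
take_from_work_queue exits non-zero when no item is available:

#!/usr/bin/env bash
# work_forever.sh (hypothetical) -- a concrete version of the worker loop.
# Usage: ./work_forever.sh graph_traversals
set -u

queue_name="$1"

while true; do
  # Block until an item is available; back off briefly on transient failure.
  if work_item=$(take_from_work_queue "$queue_name"); then
    do_work "$work_item"
    tell_work_queue_finished "$queue_name" "$work_item"
  else
    sleep 5
  fi
done

In the Service version, the equivalent loop runs inline in the Process
cmdline, with {{profile.queue_name}} bound per job instead of passed as an
argument.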