And after a bit of code spelunking, the semantics you want already exist
(they're just undocumented). I've updated the ticket to cover the
documentation fix.
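In the meantime, here's roughly what the max_failures workaround mentioned
below looks like as a config. As before, this is coding via email, so treat
it as a sketch: do_work, the resource numbers, and the job key are all
placeholders, and I believe the job-level knob is spelled max_task_failures.

  # one_shot.aurora -- sketch; bind work_item at create-time with -E
  run_once = Process(name = 'run_once',
                     cmdline = 'do_work "{{work_item}}"')

  task = Task(processes = [run_once],
              resources = Resources(cpu = 1.0, ram = 1 * GB, disk = 1 * GB))

  jobs = [Job(
    task = task,
    cluster = 'west',
    role = 'service-account-name',
    environment = 'prod',
    name = 'process_{{work_item}}',
    # Reschedule the task on failure up to this many times; a very large
    # value approximates run-until-success.
    max_task_failures = 100000,
  )]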
On Wed, Feb 26, 2014 at 6:00 PM, Kevin Sweeney <kevi...@apache.org> wrote:
> The example I gave is somewhat syntactically invalid due to coding via
> email, but that's more or less what the interface will look like. I also
> filed https://issues.apache.org/jira/browse/AURORA-236 for more
> first-class support of the semantics I think you want (though currently
> you can fake it by setting max_failures to a very high number).
>
> On Wed, Feb 26, 2014 at 5:33 PM, Bryan Helmkamp <br...@codeclimate.com> wrote:
>> Thanks, Kevin. That pretty much looks like exactly what I need.
>>
>> -Bryan
>>
>> On Wed, Feb 26, 2014 at 8:16 PM, Kevin Sweeney <kevi...@apache.org> wrote:
>>> For a more dynamic approach to resource utilization you can use
>>> something like this:
>>>
>>>   # dynamic.aurora
>>>   # Enqueue each individual work-item with:
>>>   #   aurora create -E work_item=$work_item \
>>>   #     -E resource_profile=graph_traversals \
>>>   #     west/service-account-name/prod/process_$work_item
>>>   class Profile(Struct):
>>>     queue_name = Required(String)
>>>     resources = Required(Resources)
>>>
>>>   HIGH_MEM = Resources(cpu = 8.0, ram = 32 * GB, disk = 64 * GB)
>>>   HIGH_CPU = Resources(cpu = 16.0, ram = 4 * GB, disk = 64 * GB)
>>>
>>>   work_on_one_item = Process(
>>>     name = 'work_on_one_item',
>>>     cmdline = '''
>>>       do_work "{{work_item}}"
>>>     ''',
>>>   )
>>>
>>>   task = Task(processes = [work_on_one_item],
>>>               resources = '{{resources[{{resource_profile}}]}}')
>>>
>>>   job = Job(
>>>     task = task,
>>>     cluster = 'west',
>>>     role = 'service-account-name',
>>>     environment = 'prod',
>>>     name = 'process_{{work_item}}',
>>>   )
>>>
>>>   resources = {
>>>     'graph_traversals': HIGH_MEM,
>>>     'compilations': HIGH_CPU,
>>>   }
>>>
>>>   jobs = [job.bind(resources = resources)]
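>>>
>>> To kick one off from your queue consumer, the create line above expands
>>> to something like this (again a sketch -- modulo shell quoting and the
>>> exact client invocation):
>>>
>>>   work_item=some-repo-id
>>>   aurora create \
>>>     -E work_item="$work_item" \
>>>     -E resource_profile=graph_traversals \
>>>     west/service-account-name/prod/process_"$work_item" \
>>>     dynamic.aurora
>>>
>>> Since each work item gets its own job key, any number of them can be in
>>> flight at once.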
>>>
>>> On Wed, Feb 26, 2014 at 1:08 PM, Bryan Helmkamp <br...@codeclimate.com> wrote:
>>>> Sure. Yes, they are shell commands, and yes, they are provided
>>>> different configuration on each run.
>>>>
>>>> In effect we have a number of different job types that are queued up,
>>>> and we need to run them as quickly as possible. Each job type has
>>>> different resource requirements. Every time we run a job, we provide
>>>> different arguments (the "payload"). For example:
>>>>
>>>>   $ ./do_something.sh SOME_ID             (requires 1 CPU and 1 GB RAM)
>>>>   $ ./do_something_else.sh SOME_OTHER_ID  (requires 4 CPU and 4 GB RAM)
>>>>   [... there are about 12 of these ...]
>>>>
>>>> -Bryan
>>>>
>>>> On Wed, Feb 26, 2014 at 3:58 PM, Bill Farner <wfar...@apache.org> wrote:
>>>>> Can you offer some more details on what the workload execution looks
>>>>> like? Are these shell commands? An application that's provided
>>>>> different configuration?
>>>>>
>>>>> -=Bill
>>>>>
>>>>> On Wed, Feb 26, 2014 at 12:45 PM, Bryan Helmkamp <br...@codeclimate.com> wrote:
>>>>>> Thanks, Kevin. The idea of always-on workers of varying sizes is
>>>>>> effectively what we have right now in our non-Mesos world. The
>>>>>> problem is that sometimes we end up with not enough workers for
>>>>>> certain classes of jobs (e.g. High Memory) while part of the cluster
>>>>>> sits idle.
>>>>>>
>>>>>> Conceptually, in my mind we would define approximately a dozen
>>>>>> Tasks, one for each type of work we need to perform (with different
>>>>>> resource requirements), and then run Jobs, each with a Task and a
>>>>>> unique payload, but I don't think this model works with Mesos. It
>>>>>> seems we'd need to create a unique Task for every Job.
>>>>>>
>>>>>> -Bryan
>>>>>>
>>>>>> On Wed, Feb 26, 2014 at 3:35 PM, Kevin Sweeney <kevi...@apache.org> wrote:
>>>>>>> A job is a group of nearly-identical tasks plus some constraints,
>>>>>>> like rack diversity. The scheduler considers each task within a job
>>>>>>> equivalently schedulable, so you can't vary things like resource
>>>>>>> footprint. It's perfectly fine to have several jobs with just a
>>>>>>> single task each, as long as each has a different job key (which is
>>>>>>> (role, environment, name)).
>>>>>>>
>>>>>>> Another approach is to have a bunch of uniform always-on workers
>>>>>>> (in different sizes). This can be expressed as a Service like so:
>>>>>>>
>>>>>>>   # workers.aurora
>>>>>>>   class Profile(Struct):
>>>>>>>     queue_name = Required(String)
>>>>>>>     resources = Required(Resources)
>>>>>>>     instances = Required(Integer)
>>>>>>>
>>>>>>>   HIGH_MEM = Resources(cpu = 8.0, ram = 32 * GB, disk = 64 * GB)
>>>>>>>   HIGH_CPU = Resources(cpu = 16.0, ram = 4 * GB, disk = 64 * GB)
>>>>>>>
>>>>>>>   work_forever = Process(
>>>>>>>     name = 'work_forever',
>>>>>>>     cmdline = '''
>>>>>>>       # TODO: Replace this with something that isn't pseudo-bash
>>>>>>>       while true; do
>>>>>>>         work_item=`take_from_work_queue {{profile.queue_name}}`
>>>>>>>         do_work "$work_item"
>>>>>>>         tell_work_queue_finished "{{profile.queue_name}}" "$work_item"
>>>>>>>       done
>>>>>>>     ''')
>>>>>>>
>>>>>>>   task = Task(
>>>>>>>     processes = [work_forever],
>>>>>>>     resources = '{{profile.resources}}',  # Note: static per queue-name.
>>>>>>>   )
>>>>>>>
>>>>>>>   service = Service(
>>>>>>>     task = task,
>>>>>>>     cluster = 'west',
>>>>>>>     role = 'service-account-name',
>>>>>>>     environment = 'prod',
>>>>>>>     name = '{{profile.queue_name}}_processor',
>>>>>>>     instances = '{{profile.instances}}',  # Scale here.
>>>>>>>   )
>>>>>>>
>>>>>>>   jobs = [
>>>>>>>     service.bind(profile = Profile(
>>>>>>>       resources = HIGH_MEM,
>>>>>>>       queue_name = 'graph_traversals',
>>>>>>>       instances = 50,
>>>>>>>     )),
>>>>>>>     service.bind(profile = Profile(
>>>>>>>       resources = HIGH_CPU,
>>>>>>>       queue_name = 'compilations',
>>>>>>>       instances = 200,
>>>>>>>     )),
>>>>>>>   ]
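>>>>>>>
>>>>>>> In real bash, the worker loop might look something like this (still
>>>>>>> a sketch -- take_from_work_queue, do_work, and
>>>>>>> tell_work_queue_finished stand in for your queue client):
>>>>>>>
>>>>>>>   #!/usr/bin/env bash
>>>>>>>   # Exit on error; Aurora restarts the service instance if we die.
>>>>>>>   set -euo pipefail
>>>>>>>   queue_name="{{profile.queue_name}}"  # bound by Aurora at launch
>>>>>>>   while true; do
>>>>>>>     # Block until an item is available, process it, then ack it.
>>>>>>>     work_item="$(take_from_work_queue "$queue_name")"
>>>>>>>     do_work "$work_item"
>>>>>>>     tell_work_queue_finished "$queue_name" "$work_item"
>>>>>>>   done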
>>>>>>>
>>>>>>> On Wed, Feb 26, 2014 at 11:46 AM, Bryan Helmkamp <br...@codeclimate.com> wrote:
>>>>>>>> Thanks, Bill.
>>>>>>>>
>>>>>>>> Am I correct in understanding that it is not possible to
>>>>>>>> parameterize individual Jobs, just Tasks? Therefore, since I don't
>>>>>>>> know the job definitions up front, I will have parameterized Task
>>>>>>>> templates, and will generate a new Task every time I need to run a
>>>>>>>> Job?
>>>>>>>>
>>>>>>>> Is that the recommended route?
>>>>>>>>
>>>>>>>> Our work is very non-uniform, so I don't think work-stealing would
>>>>>>>> be efficient for us.
>>>>>>>>
>>>>>>>> -Bryan
>>>>>>>>
>>>>>>>> On Wed, Feb 26, 2014 at 12:49 PM, Bill Farner <wfar...@apache.org> wrote:
>>>>>>>>> Thanks for checking out Aurora!
>>>>>>>>>
>>>>>>>>> My short answer is that Aurora should handle thousands of
>>>>>>>>> short-lived tasks/jobs per day without trouble. (If you proceed
>>>>>>>>> with this approach and encounter performance issues, feel free to
>>>>>>>>> file tickets!) The DSL does have some mechanisms for
>>>>>>>>> parameterization. In your case, since you probably don't know all
>>>>>>>>> the job definitions upfront, you'll probably want to parameterize
>>>>>>>>> with environment variables. I don't see this described in our
>>>>>>>>> docs, but there's a little detail at the option declaration [1].
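>>>>>>>>>
>>>>>>>>> Concretely, binding a value at create-time looks something like
>>>>>>>>> this (a sketch; analysis.aurora and the payload name are
>>>>>>>>> placeholders, with {{payload}} referenced from the config):
>>>>>>>>>
>>>>>>>>>   aurora create -E payload=SOME_ID \
>>>>>>>>>       west/service-account-name/prod/analyze_SOME_ID \
>>>>>>>>>       analysis.aurora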
>>>>>>>>>
>>>>>>>>> Another approach worth considering is work-stealing, using a
>>>>>>>>> single job as your pool of workers. I would find this easier to
>>>>>>>>> manage, but it would only be suitable if your work items are
>>>>>>>>> sufficiently uniform.
>>>>>>>>>
>>>>>>>>> Feel free to continue the discussion! We're also pretty active in
>>>>>>>>> our IRC channel if you'd prefer that medium.
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/options.py#L170-L183
>>>>>>>>>
>>>>>>>>> -=Bill
>>>>>>>>>
>>>>>>>>> On Tue, Feb 25, 2014 at 10:11 PM, Bryan Helmkamp <br...@codeclimate.com> wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I am considering Aurora for a key component of our
>>>>>>>>>> infrastructure. Awesome work being done here.
>>>>>>>>>>
>>>>>>>>>> My question is: how suitable is Aurora for running short-lived
>>>>>>>>>> tasks?
>>>>>>>>>>
>>>>>>>>>> Background: we (Code Climate) do static analysis of tens of
>>>>>>>>>> thousands of repositories every day. We run a variety of forms
>>>>>>>>>> of analysis with heterogeneous resource requirements, and thus
>>>>>>>>>> our interest in Mesos.
>>>>>>>>>>
>>>>>>>>>> Looking at Aurora, a lot of the core features look very helpful
>>>>>>>>>> to us. Where I am getting hung up is figuring out how to model
>>>>>>>>>> short-lived work as tasks/jobs. Long-running resource
>>>>>>>>>> allocations are not really an option for us due to the variation
>>>>>>>>>> in our workloads.
>>>>>>>>>>
>>>>>>>>>> My first thought was to create a Task for each type of analysis
>>>>>>>>>> we run, and then start a new Job with the appropriate Task every
>>>>>>>>>> time we want to run analysis (regulated by a queue). This
>>>>>>>>>> doesn't seem to work, though. I can't `aurora create` the same
>>>>>>>>>> `.aurora` file multiple times with different Job names (as far
>>>>>>>>>> as I can tell). There is also the problem of how to customize
>>>>>>>>>> each Job slightly (e.g. with a payload).
>>>>>>>>>>
>>>>>>>>>> An obvious alternative is to create a unique Task every time we
>>>>>>>>>> want to run work. This would result in tens of thousands of
>>>>>>>>>> tasks being created every day, and from what I can tell Aurora
>>>>>>>>>> does not intend to be used like that. (Please correct me if I am
>>>>>>>>>> wrong.)
>>>>>>>>>>
>>>>>>>>>> Basically, I would like to hook my job queue up to Aurora to
>>>>>>>>>> perform the actual work. There are a dozen different types of
>>>>>>>>>> jobs, each with different performance requirements. Every time a
>>>>>>>>>> job runs, it has a unique payload containing the definition of
>>>>>>>>>> the work to be performed.
>>>>>>>>>>
>>>>>>>>>> Can Aurora be used this way? If so, what is the proper way to
>>>>>>>>>> model this with respect to Jobs and Tasks?
>>>>>>>>>>
>>>>>>>>>> Any/all help is appreciated.
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> -Bryan
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Bryan Helmkamp, Founder, Code Climate
>>>>>>>>>> br...@codeclimate.com / 646-379-1810 / @brynary