Lemur is a tool to launch Hadoop jobs locally or on EMR, based on a configuration file referred to as a jobdef. The jobdef file describes your EMR cluster, local environment, pre- and post-actions (aka hooks), and zero or more steps (jobs). Lemur reads your jobdef; at the end of the jobdef, you call (fire! ...) to make things happen. Lemur is implemented as an internal DSL.

What Lemur is Not:

Lemur is NOT a replacement for the Ruby elastic-mapreduce CLI. There is some overlap, but I did not attempt to reproduce functionality (like --list, --terminate, etc.) that works perfectly well in that tool. Also, Lemur is not a job scheduler (a la Oozie, Quartz or Azkaban). You might decide to use one of those tools to trigger Lemur, which knows how to run your job.

A simple jobdef:

    (add-validators
      (val-opts :required :numeric :num-days))

    (add-hooks
      (when-local-test) (diff-test-data ["RESULTS" "results"]))

    (defcluster marcs-cluster
      :num-instances 1
      :master-instance-type "m1.large"
      :my-root "/Users/mlimotte/projects/lemur/tmp"
      :upload ["./marcs-input.txt" :to "${data-uri}/input.txt"]
      :test-uri "${my-root}/work"
      :keypair "my-keypair"
      :jar-src-path "${my-root}/lemur-test-0.0.1-SNAPSHOT-standalone.jar")

    (defstep marcs-step
      :main-class "lemur_test.marcs"
      :args.days #(:num-days %)
      :args.data-uri true)

    (fire! marcs-cluster marcs-step)

This tool was developed at The Climate Corporation (TCC). TCC has chosen to release this project with an Apache 2.0 license.

Blog post: http://entxtech.blogspot.com/2012/05/lemur-declarative-launching-of-hadoop.html
GitHub project: https://github.com/TheClimateCorporation/lemur

Marc Limotte