Hey all,

I'd like to announce Pallet-Hadoop <https://github.com/pallet/pallet-hadoop/tree/master>, a layer built on top of Pallet <https://github.com/pallet/pallet> that allows users to describe a Hadoop cluster configuration as a nested Clojure map. Here's a cluster with one master node and two slave nodes with some custom properties, all 64-bit machines with at least 4 gigs of RAM, running Ubuntu 10.10:

    (def example-cluster
      (cluster-spec
       :private
       {:jobtracker (node-group [:jobtracker :namenode])
        :slaves     (slave-group 2)}
       :base-machine-spec {:os-family :ubuntu
                           :os-version-matches "10.10"
                           :os-64-bit true
                           :min-ram (* 4 1024)}
       :base-props {:hdfs-site {:dfs.data.dir "/mnt/dfs/data"
                                :dfs.name.dir "/mnt/dfs/name"}
                    :mapred-site {:mapred.task.timeout 300000
                                  :mapred.reduce.tasks 3}}))

Thanks to Pallet's flexibility and use of jclouds <https://github.com/jclouds/jclouds>, the cluster description can be written without reference to any specific cloud provider, and can be used to boot machines on any of the major cloud providers <https://github.com/jclouds/jclouds#readme> (or on local virtual machines!) with a simple change of credentials.

This example project <https://github.com/pallet/pallet-hadoop-example> contains everything you need to get started; it walks through all steps necessary to boot a cluster and run the canonical word count example on Amazon's EC2 platform. The project wiki <https://github.com/pallet/pallet-hadoop/wiki> contains a lot more detail on the design and flexibility of the data structures involved.

Future plans include intelligent default settings that adjust based on the specs of the cluster, and the ability to run Cascalog <https://github.com/nathanmarz/cascalog> queries on these distributed clusters from Cake and Leiningen.

I'd love to hear what you all think about this! Huge thanks to Hugo Duncan <http://hugoduncan.org/> for getting this started, and to Toni Batchelli <http://tbatchelli.org/> for his excellent work on this project and its foundation, Pallet's new Hadoop crate <https://github.com/pallet/pallet-apache-crates>.

Cheers,
Sam