>These are all good ideas. The other trick, which has been discussed
>recently in the context of the Platform Scheduler, is to run HDFS across
>all nodes, but switch the cluster's workload between Hadoop jobs
>(MR, Graph, Hamster) and other work (Grid jobs). That way the
>filesystem is just a very large FS for anything. Even if some grid jobs
>don't use HDFS, the nodes can still serve up their data.

This used to be called Hadoop On Demand (HoD), which deployed a
MapReduce cluster on demand, using Torque to allocate nodes. :-)

- milind

---
Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and
do not necessarily represent the views of any organization, past or
present, the author might be affiliated with.)
