Alec Ten Harmsel <alec <at> alectenharmsel.com> writes:

> As far as HDFS goes, I would only set that up if you will use it for
> Hadoop or related tools. It's highly specific, and the performance is
> not good unless you're doing a massively parallel read (what it was
> designed for). I can elaborate why if anyone is actually interested.

Actually, given my research and my goal (one really big scientific simulation
running constantly), many folks are recommending that I skip Hadoop/HDFS
altogether and go straight to mesos/spark. RDD (in-memory) cluster calculations
are at the heart of my needs. The opposite end of the spectrum, loads
of small files and small apps, I dunno about, but I'm all ears.
In the end, my (3) node scientific cluster will morph and support
the typical myriad of networked applications, but I can take
a few years to figure that out, or just copy what smart guys like
you and joost do.....


> We use Lustre for our high performance general storage. I don't have any
> numbers, but I'm pretty sure it is *really* fast (10Gbit/s over IB
> sounds familiar, but don't quote me on that).

At UMich, you guys should test the FhGFS/btrfs combo. The folks
at UCI swear by it, although they are only publishing a wee bit
(you know, water cooler gossip)...... Surely the Wolverines do not
want those Californians getting a leg up on them?

Are you guys planning a mesos/spark test? 

> > Personally, I would read up on these and see how they work. Then,
> > based on that, decide if they are likely to assist in the specific
> > situation you are interested in.

It's a ton of reading. It's not apples-to-apple_cider type of reading.
My head hurts.....


I'm leaning toward (2) DFS/LFS combos:

Lustre/btrfs   and   FhGFS/btrfs

Thoughts/comments?

James


