Ski> I am in the market for a new NAS system (30TB usable for end user
Ski> use - not virtual machines) and our 2 finalists (EMC and NetApp)
Ski> have taken very different approaches to solving the problem.  I
Ski> am wondering which solution you would be most comfortable with.

I'm curious about your needs here.  Do you need 30Tb in one volume?
Or are you going to carve it up into multiple volumes and then share
them to the customers?  And do you expect to have to grow/shrink
volumes?

I ask this because we're a Netapp shop, and the 16Tb limit on
aggregate size in OnTap 7.x, and the consequent limit on volume size,
really really really kills us in terms of managing our disk usage.
As projects grow, I'd love to just be able to allocate them more
space, and as a project shrinks, I'd just shrink its dedicated volume
in response.

But since volumes can't span aggregates, it's not a good way to work.

This has made me look quite closely at Isilon's (now EMC) product.  It
looks really good, but obviously it's not perfect.  But having a
single volume image that you can grow on demand just by tossing new
hardware into the mix, with the underlying OS moving the data around
for you... tempting.

And you can mix and match their SATA and faster storage arrays to
make a nice layered setup.  But!! I haven't a clue how good their
NDMP support for backups is.

Ski> $Vendor1(more expensive): Use 256GB flash cache and 72 600GB 15K
Ski> SAS disks (second option would have 96 450GB SAS disks if we felt
Ski> we needed more IOPs).  Minimal dependence on algorithms moving
Ski> data between storage mediums.

Ummm, just to make sure here: is the 30Tb you're asking for the RAW
storage count, or the usable storage count?  Netapp (and I assume
EMC) is terrible at telling you what the usable storage will be for a
system once you:

A) add disks into a raid group (dual-parity overhead)
B) add raid groups into an aggregate (5% aggregate snap reserve by default)
C) create a volume (20% volume snap reserve by default)

So very quickly that 16Tb you started out with ends up around 12Tb and
the boss is going WTF, where's all the disk space I paid for?  

So be careful to specify that they quote you 30Tb of NFS/CIFS free
space once a standard setup is complete.
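
To make the shrinkage concrete, here's a quick back-of-the-envelope
sketch in Python.  The raid group size, spare count, and 10% WAFL
reserve are just my assumptions; real usable space also depends on
disk right-sizing and whatever defaults your vendor actually ships.

# Rough raw-to-usable estimate for a Netapp-style box.  All the
# percentages below are assumptions/defaults, not gospel, and this
# ignores disk right-sizing entirely.
def usable_tb(num_disks, disk_gb, rg_size=16, spares=2,
              wafl_reserve=0.10, aggr_snap=0.05, vol_snap=0.20):
    disks = num_disks - spares
    n_groups = -(-disks // rg_size)        # ceiling division
    data_disks = disks - 2 * n_groups      # RAID-DP: 2 parity per group
    gb = data_disks * disk_gb
    gb *= (1 - wafl_reserve)               # WAFL reserve
    gb *= (1 - aggr_snap)                  # aggregate snap reserve
    gb *= (1 - vol_snap)                   # volume snap reserve
    return gb / 1000.0

# e.g. the 72 x 600GB SAS config from your $Vendor1 quote:
print("approx usable: %.1f Tb" % usable_tb(72, 600))

The point isn't the exact number, it's that the gap between raw and
usable is big enough that it needs to be in the quote up front, not
discovered after the PO goes out.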

Ski> $Vendor2 (much less expensive): Use 300GB flash cache, 8 - 100GB
Ski> SSDs, 16 - 600GB 15K SAS, and 16 - 2TB 7.2K SATA disks.  This
Ski> depends then a lot on their technology for moving hot blocks to
Ski> faster storage mediums.

Ski> My environment does have a lot of data (e.g. student portfolio
Ski> data) that is rarely touched so $Vendor2 may be a good fit.  My
Ski> concern is that a user will be working in a directory for 2 - 3
Ski> weeks, get used to a certain level of response and performance,
Ski> then go to a directory that is on the slow disks and see a huge
Ski> slowdown to the extent they think the system is broken.

It's not quite clear to me what your usage scenario is.  And possibly
the killer isn't so much the storage as the client OS issues, and the
need to train your users to split up their data a bit better so you
don't have 20,000 entries in a directory!

But from the sound of it, it's space, not performance, that you need
for a set amount of $$$.  Then there's the hidden cost of managing
it.  I like how Netapp does stuff, and I can't speak to Isilon or the
EMC Celerra product.

Ski> With our current NAS this is a big problem especially when the
Ski> Mac clients open up a directory that has 20000 folders in it as
Ski> Macs need all the meta-data on the 20000 folders before they will
Ski> display anything in the GUI other than a beach ball.

What is your current NAS?  I suspect most NAS boxes are going to have
trouble sending you the directory info for 20,000 items quickly, and
that it's the CLIENT which is showing the issue here, not the NAS
server.

Segmenting your data out better would help the users and your
performance quite a bit, I'm sure.
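
For example (only a sketch, with a made-up path, and you'd have to
fix up any shares or shortcuts afterwards), even a dumb first-letter
bucketing scheme keeps any single directory small enough that a Mac
client can stat its way through it without beachballing:

# Break a huge flat directory of per-student folders into one-letter
# buckets (a/, b/, ..., _other/).  Path is hypothetical; test this on
# a copy of the data first.
import os, shutil

flat = "/mnt/nas/portfolios"
for name in sorted(os.listdir(flat)):
    src = os.path.join(flat, name)
    bucket = name[0].lower() if name[:1].isalpha() else "_other"
    if name == bucket or not os.path.isdir(src):
        continue                           # skip bucket dirs and loose files
    dest_dir = os.path.join(flat, bucket)
    try:
        if not os.path.isdir(dest_dir):
            os.makedirs(dest_dir)
        shutil.move(src, os.path.join(dest_dir, name))
    except (OSError, shutil.Error):
        pass                               # leave anything odd where it is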

Ski> Has anyone had experiences with NAS systems that rely a lot on
Ski> different storage mediums and migrating data between them to get
Ski> performance?  Appreciate your thoughts and ideas on this.

We're currently using CommVault for backups and simple HSM of our
Netapps, which makes things decent, but not quite as simple as I'd
like.  And it's not as quick as I'd like either, since the scanning
process takes forever, but that's a function of the number of files we
have on our filers.  


I was actually just thinking about storage management issues the other
day because my users are, as always, filling up the disks, and I've
been hounding them to change their process to better reflect reality.
I.e. they should have a simple, repeatable way to clean up as they go
along, so they don't end up with terabytes of files they can't easily
clean up because there's no structure to them, etc.

Which leads me to tools.  I wish the storage vendors would add some
better reporting tools for helping manage disk space.  One nice tool
would be a simple list of the largest 1024 files in a volume.  Easy to
keep track of, not a lot of space in terms of memory or inodes, but
absolutely vital when you need to quickly do a cleanup.
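
For what it's worth, here's roughly what I mean, done the dumb way as
a client-side walk (the path is just an example; walking millions of
files over NFS is exactly the slow part I'd rather the filer handled
natively):

# Report the largest N files under a tree.
import os, heapq, sys

TOP_N = 1024
root = sys.argv[1] if len(sys.argv) > 1 else "."

heap = []                                  # min-heap of (size, path)
for dirpath, dirnames, filenames in os.walk(root):
    for name in filenames:
        path = os.path.join(dirpath, name)
        try:
            size = os.lstat(path).st_size
        except OSError:
            continue                       # vanished or unreadable
        if len(heap) < TOP_N:
            heapq.heappush(heap, (size, path))
        elif size > heap[0][0]:
            heapq.heapreplace(heap, (size, path))

for size, path in sorted(heap, reverse=True):
    print("%12d  %s" % (size, path))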

Per-user du tree reports would also be nice, but a pain to keep up to
date without impacting performance too much.  Especially once you get
into millions of files in a 20Tb volume.
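
Same caveat applies, but here's the crude client-side version of that
report, just summing file sizes by owner:

# Per-user space usage under a tree (sum of file sizes by uid).
import os, pwd, sys
from collections import defaultdict

root = sys.argv[1] if len(sys.argv) > 1 else "."
usage = defaultdict(int)                   # uid -> bytes

for dirpath, dirnames, filenames in os.walk(root):
    for name in filenames:
        try:
            st = os.lstat(os.path.join(dirpath, name))
        except OSError:
            continue
        usage[st.st_uid] += st.st_size

for uid, total in sorted(usage.items(), key=lambda x: -x[1]):
    try:
        owner = pwd.getpwuid(uid).pw_name
    except KeyError:
        owner = str(uid)
    print("%-16s %10.1f GB" % (owner, total / 1e9))

Running something like that nightly out of cron would at least give
me a report to wave at the worst offenders, even if it'll never be as
cheap as having the filer track it internally.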

Hmm... I smell a LISA paper here somewhere, so I'll shut up for now.
:]

John