We are looking for a replacement enterprise file system to handle storage
needs for our campus. For the past 10 years, we have been happily using DFS
(the distributed file system component of DCE), but unfortunately IBM
killed off that product and we have been running without support for over a
year now. We have looked at a variety of possible options, none of which
have proven fruitful. We are currently investigating the possibility of a
Solaris 10/ZFS implementation. I have done a fair amount of reading and
perused the mailing list archives, but I apologize in advance if I ask
anything I should already have found in a FAQ or other repository.

Basically, we are looking to provide 5 TB of usable storage initially,
potentially scaling up to 25-30 TB of usable storage after a successful
initial deployment. We would have approximately 50,000 user home
directories and perhaps 1,000 shared group storage directories. Access to
this storage would be via NFSv4 for our UNIX infrastructure, and CIFS for
those annoying Windows systems you just can't seem to get rid of ;).
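
For concreteness, here's roughly the layout I'm picturing: one pool with a
filesystem per user, so each user gets an individual quota and snapshots
(the pool and dataset names here are made up for illustration):

    # one pool, parent filesystems for home and group areas
    zpool create campus <devices>
    zfs create campus/home
    zfs create campus/group

    # one filesystem per user, each with its own quota
    zfs create campus/home/henson
    zfs set quota=2G campus/home/henson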

I read that initial versions of ZFS had scalability issues with such a
large number of file systems, resulting in extremely long boot times and
other problems. Supposedly many of those problems have been fixed in the
latest OpenSolaris builds, and many of the fixes have been backported to
the official Solaris 10 update 4. Will that version of Solaris reasonably
support 50-odd thousand ZFS file systems?
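
If nobody has hard numbers, I could presumably measure it myself on an
eval box with a crude test along these lines (pool/dataset names made up),
which should expose any per-filesystem boot/mount overhead:

    # create 50,000 empty filesystems, then time a full mount cycle
    i=0
    while [ $i -lt 50000 ]; do
        zfs create campus/home/test$i
        i=`expr $i + 1`
    done
    zfs unmount -a
    time zfs mount -a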

I saw a couple of threads in the mailing list archives regarding NFS not
crossing file system boundaries, requiring each and every ZFS filesystem
(some 50,000 in my case) to be exported and mounted separately on the
client. While that might be feasible with an automounter, it doesn't seem
desirable or efficient. It would be much nicer to simply have one mount
point on the client with all the home directories available underneath it.
I was wondering whether that would be possible with the NFSv4 pseudo-root
feature. I saw one posting indicating it might be, but it wasn't clear
whether that was a current feature or something yet to be implemented. I
have no requirements
to support legacy NFSv2/3 systems, so a solution only available via NFSv4
would be acceptable.
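
In other words, I'm hoping something along these lines would work, with
every home directory visible under a single client mount (names made up;
whether the client can actually descend into each child filesystem is
exactly what I'm asking):

    # server: sharenfs is inherited, so sharing the parent should
    # cover all of the per-user child filesystems
    zfs set sharenfs=rw campus/home

    # client (Solaris syntax): one NFSv4 mount of the parent
    mount -F nfs -o vers=4 server:/campus/home /home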

I was planning to provide CIFS services via Samba. I noticed a posting a
while back from a Sun engineer working on integrating NFSv4/ZFS ACL support
into Samba, but I'm not sure if that work was ever completed and shipped,
either in the Sun version or pending inclusion in the official version.
Does anyone happen to have an update on that? Also, I saw a patch proposing
a different implementation of shadow copies that better supports ZFS
snapshots; any thoughts on that would also be appreciated.
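
Assuming that ACL work did land, and that the shadow copy patch ships as a
loadable module, I'd expect the relevant smb.conf bits to look something
like this (the module names are my guess based on the postings I saw):

    [homes]
        path = /campus/home/%U
        vfs objects = zfsacl shadow_copy2
        # expose ZFS snapshots as Explorer "Previous Versions"
        shadow:snapdir = .zfs/snapshot
        shadow:format = %Y.%m.%d-%H.%M.%S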

Is there any facility for managing ZFS remotely? We have a central identity
management system that automatically provisions resources as necessary for
users, as well as providing an interface for helpdesk staff to modify
things such as quotas. I'd be willing to implement some type of web service
on the actual server if there is no native remote management; in that case,
is there any way to configure ZFS directly via a programmatic API, as
opposed to running binaries and parsing their output? Some type of Perl
module would be perfect.
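
Absent a real API, my fallback would be a small wrapper script on the
server that the identity management system drives over ssh with a
forced-command key, something like this rough sketch (paths, pool, and
command names all invented):

    #!/bin/sh
    # /usr/local/sbin/zfsadm -- restricted helper: "setquota <user> <size>"
    # run via an authorized_keys command= entry so only this can execute;
    # recover the operation the client actually requested
    set -- $SSH_ORIGINAL_COMMAND
    case "$1" in
    setquota)
        user="$2"; size="$3"
        # sanity-check the username before handing it to zfs
        echo "$user" | grep '^[a-z0-9_][a-z0-9_]*$' >/dev/null || exit 1
        /usr/sbin/zfs set quota="$size" campus/home/"$user"
        ;;
    *)
        exit 1
        ;;
    esac

The helpdesk tooling would then just run "ssh zfsadm@server setquota
henson 2G" and check the exit status, but that still seems clunky compared
to a proper module.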

We need high availability, so we are looking at Sun Cluster. That seems to
add an extra layer of complexity <sigh>, but there's no way I'll get
signoff on a solution without redundancy. It appears that ZFS failover is
supported with the latest version of Solaris/Sun Cluster? I was speaking
with a Sun SE who claimed that ZFS would actually operate active/active in
a cluster, simultaneously writable by both nodes. From what I had read, ZFS
is not a cluster file system and would only operate in an active/passive
failover capacity. Any comments?
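
My reading was that the pool would be a failover resource, imported on one
node at a time; in Sun Cluster 3.2 terms I'd expect something like the
following (resource and group names made up), which is active/passive, not
the active/active behavior the SE described:

    # manage the pool with HAStoragePlus so the cluster imports and
    # exports it as part of resource group failover
    clresourcetype register SUNW.HAStoragePlus
    clresourcegroup create home-rg
    clresource create -g home-rg -t SUNW.HAStoragePlus \
        -p Zpools=campus home-hasp-rs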

The SE also told me that Sun Cluster requires hardware RAID, which
conflicts with the general recommendation to feed ZFS raw disk. It seems
such a configuration would either require configuring vdevs directly on
the RAID LUNs, losing ZFS's self-healing and checksum-correction features,
or losing space not only to the hardware RAID level but also to a partially
redundant ZFS level. What is the general consensus on the best way to
deploy ZFS under a cluster using hardware RAID?
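
The least-bad compromise I've seen suggested is to give ZFS at least a
mirror across LUNs from separate arrays or controllers, so its checksums
can still drive self-healing, at the cost of halving usable space again;
e.g. (device names made up):

    # mirror pairs of hardware-RAID LUNs from two different arrays so
    # ZFS keeps redundant copies and can repair checksum errors itself
    zpool create campus mirror c2t0d0 c3t0d0 mirror c2t1d0 c3t1d0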

Any other thoughts/comments on the feasibility or practicality of a
large-scale ZFS deployment like this?

Thanks much...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768