I'm adding a new storage device to our SAN, and this is a good opportunity to
reexamine our config and make changes. I'm looking for recommendations
about storage configuration decisions.

Environment:

        RHCS-clustered Linux NFS servers with SAN attached storage, serving
        ~45 Linux clients (scientific computing cluster and stand-alone 
        computational servers)--mainly CentOS 5.4, some CentOS 4.8.

        Multi-vendor SAN (currently EMC and Nexsan, likely to include other
        vendors in the future)

        ~12TB (raw) in use, ~18TB (raw) being added. About 6TB from legacy
        devices (direct-attached SCSI, SOHO NAS, etc.) whose data will be
        migrated to the new 18TB SAN storage.

        Current LUNs are 168GB to 2.5TB in size, depending on the size of
        underlying RAID groups or individual disks.

        Current Linux filesystems use ext3fs. Ext4fs is being tested outside
        of production.

        4 major conceptual uses for storage: home directories, project space,
        scratch space and archival. Currently about 2-6 TB each, expected to
        grow by 1-3 TB/year.

        Capacity and management are much greater priorities than performance.

Goals:
        Provide end-users with increased space in the 4 categories (home,
        project, scratch, and archive).

        Select technology that allows for expansion of space in those categories
        as new physical storage is added in the future.

        Minimize future downtime due to regular maintenance (e.g., fsck proceeds
        at roughly 1.5 TB/hr, so an 8TB filesystem could mean 5+ hours offline).

Concerns:
        Within our environment, allocation has been a greater problem than
        performance--individual volumes fill up while others have
        lots of free space. A configuration where storage can be flexibly
        allocated to individuals or projects is desirable.

        A unified presentation to end-users is desirable. We're using automount
        extensively now, and this produces confusion among end-users
        (typically resulting in a desperate question about why "my project
        directory is missing").

        I've had very bad experience with automount in a mixed environment
        (Linux automount clients mounting from Linux, Solaris and Irix NFS
        servers) and would prefer to avoid automount...but the non-Linux
        NFS servers are going away soon, so this isn't an absolute requirement.

        As filesystems grow, I have concerns about management--fsck time becomes
        excessive, and I've heard anecdotes about filesystem stability problems
        above certain sizes (8TB, etc.). For this reason, a solution that virtually
        consolidates separate back-end filesystems of moderate sizes may be
        preferable (sounds like automount!).


Some possible choices for managing the environment (with pros and cons) are:
        
        * union mounts
             Virtually combine separate filesystems by overlaying directory
             trees. Data sources are ranked by priority, to determine which
             directory takes precedence. There are kernel and user-land (FUSE)
             implementations. (A rough setup sketch follows the list below.)

                - lack of personal experience
                - overhead (~12% for the kernel version, possibly more for the
                        FUSE version)
                - complexity of NFS client configuration, possible limitations
                        with NFS exports
                - not in current RHEL5/CentOS5 distributions
                + simple NFS server configuration
                + clean presentation to end-users
                + flexible use of back-end space
                + virtual consolidation of small back-end filesystems
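
             For illustration, a minimal sketch of how the FUSE implementation
             (unionfs-fuse) might look on the NFS server, assuming two
             hypothetical back-end filesystems already mounted at /back/proj1
             and /back/proj2:

                # merge the two back-ends; the first (RW) branch takes precedence
                unionfs-fuse -o cow,allow_other /back/proj1=RW:/back/proj2=RO /srv/project

                # exporting a FUSE mount over NFS typically needs an explicit fsid
                # (line in /etc/exports)
                /srv/project    *.example.com(rw,sync,fsid=101)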
        

        * LVM
            Logically group unformatted storage volumes (LUNs) into virtual
            LUNs, formatting those as filesystems. Logical volumes can grow
            in size (depending on available storage and the capability of
            the filesystem on the logical volume). Using LVM would likely
            mean having 4 large virtual volumes (home, project, scratch,
            and archive directories), each made up of LUNs from different
            storage devices. (A command sketch follows the list below.)

                - difficult to address performance issues when a filesystem is
                        made up of LUNs from different storage devices
                        potentially using different RAID levels, etc.
                - complexity on NFS server
                - potential filesystem issues at large sizes (fsck time,
                        ext3fs stability over 8TB, etc.)
                + existing familiarity with LVM
                + simple presentation to user
                + simple management on NFS clients (i.e., a single
                        mount point per data type)
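
            A rough sketch of the commands involved (device and volume names
            are hypothetical; ext3 on RHEL5 should allow growing the
            filesystem online with resize2fs):

                # group LUNs from different arrays into one volume group
                pvcreate /dev/mapper/emc_lun1 /dev/mapper/nexsan_lun1
                vgcreate vg_home /dev/mapper/emc_lun1 /dev/mapper/nexsan_lun1

                # carve out a logical volume and format it
                lvcreate -L 4T -n lv_home vg_home
                mkfs.ext3 /dev/vg_home/lv_home

                # later, when new storage arrives: add a LUN and grow online
                vgextend vg_home /dev/mapper/new_lun
                lvextend -L +2T /dev/vg_home/lv_home
                resize2fs /dev/vg_home/lv_home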
                

        * automount
             Similar to unionfs, individual filesystems would each provide
             separate directories, mounted on demand via automount maps
             (a sample map follows the list below).

                - historical poor experience with automount
                - management of automount maps
                - end-user confusion
                - smaller filesystems = allocation problems
                + simplicity on NFS server
                + widespread use
                + flexible use of back-end space
                + virtual consolidation of small back-end filesystems
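
             For reference, a sample of the maps involved (server and key
             names are hypothetical):

                # /etc/auto.master
                /proj    /etc/auto.proj    --timeout=300

                # /etc/auto.proj -- one key per back-end filesystem; the
                # wildcard entry catches anything not listed explicitly
                genomics    -rw,hard,intr    nfs1.example.com:/export/proj/genomics
                climate     -rw,hard,intr    nfs1.example.com:/export/proj/climate
                *           -rw,hard,intr    nfs1.example.com:/export/proj/&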


        * Hybrid
                The technologies can be combined--such as using LVM to manage
                smaller (~1TB) logical volumes, with data from those presented
                to NFS clients via automount. This would allow for filesystem
                growth and allocation from specific devices--at the expense of  
                increased complexity.
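
                For example (names hypothetical): one ~1TB logical volume per
                project on the server, each exported individually, with a
                wildcard automount map hiding the layout from clients:

                   # server: per-project logical volume
                   lvcreate -L 1T -n lv_genomics vg_project
                   mkfs.ext3 /dev/vg_project/lv_genomics
                   # /etc/exports
                   /export/proj/genomics    *.example.com(rw,sync)

                   # clients: wildcard entry in /etc/auto.proj
                   *    -rw,hard,intr    nfs1.example.com:/export/proj/&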

So, if you've managed to read this far...is there anything that I'm missing? 
Any suggested alternatives?

Thanks,

Mark

