On 9/7/2012 3:16 PM, Bob Proulx wrote:
> Agreed. But for me it isn't about the fsck time. It is about the
> size of the problem. If you have a full 100G filesystem and there is
> a problem then you have a 100G problem. It is painful. But you can
> handle it. If you have a full 10T filesystem and there is a problem
> then there is a *HUGE* problem. It is so much more than painful.

This depends entirely on the nature of the problem. Most filesystem
problems are relatively easy to fix even on 100TB+ filesystems,
sometimes with some data loss, often with only a file or two being
lost or put in lost+found. If a non-redundant hardware device failure
roasts your FS, you replace the hardware, make a new FS, and restore
from D2D or tape. That's not painful, that's procedure.

> Therefore when practical I like to compartmentalize things so that
> there is isolation between problems. Whether the problem is due to a
> hardware failure, a software failure or a human failure. All of which
> are possible. Having compartmentalization makes dealing with the
> problem easier and smaller.

Sounds like you're mostly trying to mitigate human error. When you
identify that solution, let me know, then patent it. ;)

>> What? Are you talking crash recovery boot time "fsck"? With any
>> modern journaled FS log recovery is instantaneous. If you're talking
>> about an actual structure check, XFS is pretty quick regardless of
>> inode count as the check is done in parallel. I can't speak to EXTx
>> as I don't use them.
>
> You should try an experiment and set up a terabyte ext3 and ext4
> filesystem and then perform a few crash recovery reboots of the
> system. It will change your mind. :-)

As I've never used EXT3/4 and thus have no opinion, it'd be a bit
difficult to change my mind. That said, putting root on a 1TB
filesystem is a brain-dead move, regardless of FS flavor. A Linux
server doesn't need more than 5GB of space for root. With /var, /home,
and /bigdata on other filesystems, crash recovery fsck should be quick.
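Something along these lines is what I mean, as a rough sketch of an
/etc/fstab layout (device names, sizes and the /bigdata mount point
are made up, adjust to your hardware):

  # small root FS, so boot-time fsck/recovery stays trivial
  /dev/sda1   /          ext4   errors=remount-ro   0  1
  # logs and spools isolated, a runaway daemon can't fill root
  /dev/sda2   /var       ext4   defaults            0  2
  # user data on its own FS
  /dev/sda3   /home      xfs    defaults            0  0
  # the big stuff on its own FS, typically XFS at these sizes
  /dev/sdb1   /bigdata   xfs    inode64,noatime     0  0

Lose or corrupt /bigdata and the box still boots; checking / and /var
after a crash is a matter of seconds, not hours.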
> XFS has one unfortunate missing feature. You can't resize a
> filesystem to be smaller. You can resize them larger. But not
> smaller. This is a missing feature that I miss as compared to other
> filesystems.

If you ever need to shrink a server filesystem: "you're doing IT wrong".
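Growing, on the other hand, is a one-liner and can be done with the
filesystem mounted and in use. A rough sketch, assuming the FS sits on
LVM (the VG, LV and mount point names are made up):

  # grow the block device underneath, here the LV holding the FS
  lvextend -L +500G /dev/vg0/bigdata
  # then grow the XFS data section to fill it; no unmount, no downtime
  xfs_growfs /bigdata

The only way to "shrink" XFS is xfsdump to somewhere else, mkfs.xfs a
smaller filesystem, and xfsrestore back into it, i.e. exactly the
backup/restore exercise you shouldn't need if the FS was sized sanely
in the first place.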
> Unfortunately I have some recent FUD concerning xfs. I have had some
> recent small idle xfs filesystems trigger kernel watchdog timer
> recoveries recently. Emphasis on idle.

If this is the bug I'm thinking of, "idle" has nothing to do with the
problem, which was fixed in 3.1 and backported to 3.0. The fix didn't
hit Debian's 2.6.32. I'm not a Debian kernel dev, ask them why--likely
too old. Upgrading to the BPO 3.2 kernel should fix this and give you
some nice additional performance enhancements. 2.6.32 is ancient BTW,
released almost 3 years ago. That's 51 in Linux development years. ;)
If you're going to recommend against XFS to someone, please qualify
that you're referring to 3-year-old XFS, not the current release.

> Definitely XFS can handle large filesystems. And definitely when
> there is a good version of everything all around it has been a very
> good and reliable performer for me. I wish my recent bad experiences
> were resolved.

The fix is quick and simple: install the BPO 3.2 kernel. Why haven't
you already?

> But for large filesystems such as that I think you need a very good
> and careful administrator to manage the disk farm. And that includes
> disk use policies as much as it includes managing kernel versions and
> disk hardware.

Huge problems of any sort need more careful management. Say I have a
1.7TB filesystem and a 30TB filesystem. How do you feel the two should
be managed differently, or that the 30TB filesystem needs kid gloves?

>> When using correctly architected reliable hardware there's no reason
>> one can't use a single 500TB XFS filesystem.
>
> Although I am sure it would work I would hate to have to deal with a
> problem that large when there is a need for disaster recovery. I
> guess that is why *I* don't manage storage farms that are that
> large. :-)

The only real difference at this scale is that your backup medium is
tape, not disk, and you have much phatter pipes to the storage host.
A 500TB filesystem will reside on over 1000 disk drives. It isn't
going to be transactional or primary storage, but nearline or archival
storage. It takes a tape silo and intelligent software to back it up,
but a full restore after a catastrophe doesn't have (many) angry users
breathing down your neck.

On the other hand, managing a 7TB transactional filesystem residing on
48x 300GB SAS drives in a concatenated RAID10 setup, housing, say,
corporate mailboxes for 10,000 employees, including the CxOs, is a
much trickier affair. If you wholesale lose this filesystem and must
do a full restore, you are red meat, and everyone is going to take a
bite out of your ass. And you very well may get a pink slip depending
on the employer.

Size may matter WRT storage/filesystem management, but it's the type
of data you're storing and the workload that's more meaningful.

-- 
Stan