----- Quote from Bron Gondwana (br...@fastmail.fm), on 10.10.2011 at 01:50 -----

> On Mon, Oct 10, 2011 at 01:33:31AM +0300, karave...@mail.bg wrote: 
>> Nice setup. And thanks for your work on Cyrus. We are
>> also looking to move the metadata to SSDs, but we have not
>> yet found cost-effective devices - we need at least a pair of
>> 250G disks for a 20-30T spool on a server.
> 
> You can move cyrus.cache to data now, that's the whole 
> point, because it doesn't need to be mmaped in so much. 
> 
Thanks for the info. 

>> Setting a higher number of allocation groups per XFS 
>> filesystem helps a lot for the concurrency. My rule of 
>> thumb (learnt from databases) is: 
>> number of spindles + 2 * number of CPUs. 
>> You have done the same with multiple filesystems. 
>> 
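
A minimal sketch of that rule of thumb in Python, in case it is useful. The function name, the 8-CPU figure, and the device path are made up for illustration; only the formula and the 12 disks come from this thread. `-d agcount=` is the mkfs.xfs option that sets the number of allocation groups.

```python
def suggested_agcount(spindles: int, cpus: int) -> int:
    """Rule of thumb from this thread: spindles + 2 * CPUs."""
    if spindles < 1 or cpus < 1:
        raise ValueError("spindles and cpus must both be at least 1")
    return spindles + 2 * cpus


# 12 SATA disks as in the array discussed in this thread;
# 8 CPUs is an assumed figure, not taken from the thread.
agcount = suggested_agcount(spindles=12, cpus=8)

# /dev/sdX is a placeholder device, not a real path from our setup.
print(f"mkfs.xfs -d agcount={agcount} /dev/sdX")
```

This only prints the command rather than running it, since mkfs.xfs destroys whatever is on the target device.
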
>> About the fsck times. We experienced a couple of power 
>> failures and XFS comes up in 30-45 minutes (30T in 
>> RAID5 of 12 SATA disks). If the server is shut down
>> cleanly, it comes up in a second.
> 
> Interesting - is that 30-45 minutes actually a proper 
> fsck, or just a log replay? 
> 

I think it is some kind of recovery procedure internal to XFS.
The XFS log is 2G, so I think it is not just a log replay.

> More interestingly, what's your disaster recovery plan 
> for when you lose multiple disks? Our design is 
> heavily influenced by having lost 3 disks in a RAID6 
> within 12 hours. It took a week to get everyone back 
> from backups, just because of the IO rate limits of 
> the backup server. 

Ouch! You had really bad luck. I do not know how long
it will take us to recover from backups. My estimate is
2-3 weeks if one server fails. We are looking for better
options. Your partitioning is a better plan here: a smaller
probability that two failing disks come from the same array,
faster recovery time, etc.
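
For a rough sense of what that estimate implies, here is a back-of-the-envelope calculation in Python. It assumes a full 30T spool and decimal units (1T = 10^12 bytes); the function name is only for illustration, and the result is an implied average rate, not a measurement.

```python
SECONDS_PER_DAY = 86_400


def implied_restore_rate_mb_s(size_tb: float, days: float) -> float:
    """Average rate (MB/s) needed to restore size_tb terabytes in the given days."""
    total_bytes = size_tb * 1e12
    return total_bytes / (days * SECONDS_PER_DAY) / 1e6


# 2-3 weeks for a 30T server, per the estimate in this mail.
for days in (14, 21):
    rate = implied_restore_rate_mb_s(30, days)
    print(f"{days} days: about {rate:.1f} MB/s sustained")
```

So the 2-3 week figure corresponds to a sustained restore rate of very roughly 17-25 MB/s for one server.
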

> 
>> We know that RAID5 is not the best option for write 
>> scalability, but the controller write cache helps a lot. 
> 
> Yeah, we did RAID5 for a while - but it turned out we 
> were still being write limited more than disk space 
> limited, so the last RAID5s are being phased out for 
> more RAID1. 
> 
> Bron. 
> 

-- 
Luben Karavelov

