On 23 Jul 2010, at 09:18, Andrew Gabriel <andrew.gabr...@oracle.com> wrote:

> Thomas Burgess wrote:
>> 
>> On Fri, Jul 23, 2010 at 3:11 AM, Sigbjorn Lie <sigbj...@nixtra.com 
>> <mailto:sigbj...@nixtra.com>> wrote:
>> 
>>    Hi,
>> 
>>    I've been searching around on the Internet to find some help with
>>    this, but have been unsuccessful so far.
>> 
>>    I have some performance issues with my file server. I have an
>>    OpenSolaris server with a Pentium D
>>    3GHz CPU, 4GB of memory, and a RAIDZ1 over 4 x Seagate
>>    (ST31500341AS) 1.5TB SATA drives.
>> 
>>    If I compile, or even just unpack, a tar.gz archive with source code
>>    (or any archive with lots of small files) from my Linux client onto a
>>    disk NFS-mounted from the OpenSolaris server, it's extremely slow
>>    compared to unpacking the same archive locally on the server. A 22MB
>>    .tar.gz file containing 7360 files takes 9 minutes and 12 seconds to
>>    unpack over NFS.
>> 
>>    Unpacking the same file locally on the server takes just under 2
>>    seconds. Between the server and
>>    client I have a gigabit network, which at the time of testing had
>>    no other significant load. My
>>    NFS mount options are: "rw,hard,intr,nfsvers=3,tcp,sec=sys".
>> 
>>    Any suggestions as to why this is?
>> 
>> 
>>    Regards,
>>    Sigbjorn
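
(For reference, a minimal sketch of how one might time this comparison; the
archive name, export path and mount point below are hypothetical:)

    # on the Linux client, using the mount options quoted above
    mount -t nfs -o rw,hard,intr,nfsvers=3,tcp,sec=sys server:/export/data /mnt/data
    time tar xzf src.tar.gz -C /mnt/data

    # locally on the OpenSolaris server, for comparison (Solaris tar has no -z)
    time sh -c 'gzcat src.tar.gz | tar xf -'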
>> 
>> 
>> As someone else said, adding an SSD log device can help hugely. I saw
>> about a 500% NFS write increase by doing this, and I've heard of people
>> getting even more.
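
(A minimal sketch of adding a dedicated log device; the pool name "tank" and
the device names here are hypothetical:)

    # add a single SSD as a dedicated ZIL log device
    zpool add tank log c1t2d0

    # or, for redundancy, add a mirrored pair of SSDs as the log
    zpool add tank log mirror c1t2d0 c1t3d0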
> 
> Another option, if you don't care quite so much about data security in the
> event of an unexpected system outage, would be to use Robert Milkowski and
> Neil Perrin's zil synchronicity [PSARC/2010/108] changes with sync=disabled,
> once the changes work their way into an available build. The risk is that if
> the file server goes down unexpectedly, it might come back up having lost
> some seconds' worth of changes which it told the client (lied) it had
> committed to disk when it hadn't; this violates the NFS protocol. That
> might be OK if you are using it to hold source that's being built, where you
> can kick off the build again if the server did go down in the middle of it.
> It wouldn't be a good idea for some other applications, though (although
> Linux ran this way for many years, seemingly without many complaints). Note
> that there's no increased risk of the zpool going bad - it's just that after
> the reboot, filesystems with sync=disabled will look like they were rewound
> by some seconds (possibly up to 30 seconds).
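
(Once those changes land, this would be a per-filesystem setting via the new
sync property; the dataset name below is made up:)

    # trade synchronous-write guarantees for speed on one filesystem
    zfs set sync=disabled tank/export/src

    # revert to the default, POSIX-compliant behaviour
    zfs set sync=standard tank/export/src

    # check the current value
    zfs get sync tank/export/src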

That's assuming you know it happened and that you need to restart the build
(ideally with a make clean). All the NFS client knows is that the NFS server
went away for some time; it still assumes nothing was lost. I can imagine cases
where the build might continue to completion but with partially corrupted
files. It's unlikely, but conceivable. Of course, databases like dbm, MySQL or
Oracle would go blithely on up the swanee with silent data corruption.

The fact that people have run unsafe systems for years seemingly without
complaint assumes that they would know silent data corruption when they
see^H^H^Hhear it ... which, of course, they don't ... because it is silent ...
or that, having encountered corrupted data, they have the faintest idea where
it came from. In my day-to-day work I still find many people who have been
(apparently) very lucky.

Feel free to play fast and loose with your own data, but I won't with mine, 
thanks! ;)
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
