On Aug 5, 2010, at 7:49 PM, Ross Walker wrote:

> On Aug 5, 2010, at 11:10 AM, Roch <roch.bourbonn...@sun.com> wrote:
> 
>> 
>> Ross Walker writes:
>>> On Aug 4, 2010, at 12:04 PM, Roch <roch.bourbonn...@sun.com> wrote:
>>> 
>>>> 
>>>> Ross Walker writes:
>>>>> On Aug 4, 2010, at 9:20 AM, Roch <roch.bourbonn...@sun.com> wrote:
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Ross Asks: 
>>>>>> So on that note, ZFS should disable the disks' write cache,
>>>>>> not enable them  despite ZFS's COW properties because it
>>>>>> should be resilient. 
>>>>>> 
>>>>>> No, because ZFS builds resiliency on top of unreliable parts. It's able
>>>>>> to deal with contained failures (lost state) of the disk write cache.
>>>>>> 
>>>>>> It can then export LUNS that have WC enabled or
>>>>>> disabled. But if we enable the WC on the exported LUNS, then
>>>>>> the consumer of these LUNS must be able to say the same.
>>>>>> The discussion at that level then needs to focus on failure groups.
>>>>>> 
>>>>>> 
>>>>>> Ross also Said :
>>>>>> I asked this question earlier, but got no answer: while an
>>>>>> iSCSI target is presented WCE does it respect the flush
>>>>>> command? 
>>>>>> 
>>>>>> Yes. I would like to say "obviously" but it's been anything
>>>>>> but.
>>>>> 
>>>>> Sorry to probe further, but can you expand on but...
>>>>> 
>>>>> Just if we had a bunch of zvols exported via iSCSI to another Solaris
>>>>> box which used them to form another zpool and had WCE turned on would
>>>>> it be reliable? 
>>>>> 
>>>> 
>>>> Nope. That's because all the iSCSI LUNs are in the same fault
>>>> domain, as they share a unified back-end cache. What works,
>>>> in principle, is mirroring SCSI channels hosted on
>>>> different storage controllers (or N SCSI channels on N
>>>> controllers in a raid group).
>>>> 
>>>> Which is why keeping the WC set to the default, is really
>>>> better in general.
>>> 
>>> Well I was actually talking about two backend Solaris storage servers 
>>> serving up storage over iSCSI to a front-end Solaris server serving ZFS 
>>> over NFS, so I have redundancy there, but want the storage to be 
>>> performant, so I want the iSCSI to have WCE, yet I want it to be reliable 
>>> and have it honor cache flush requests from the front-end NFS server.
>>> 
>>> Does that make sense? Is it possible?
>>> 
>> 
>> Well, in response to a commit (say, after a file creation), the
>> front-end server will end up sending write-cache flushes to
>> both sides of the iSCSI mirror. These reach the backend servers,
>> which flush their disk write caches. This will all work, but
>> probably not unleash performance the way you would like it
>> to.
> 
> 
> 
>> If you set up the backend servers not to flush the
>> backend disk write caches, then the 2 backend pools become at
>> risk of corruption, mostly because of the ordering of IOs
>> around the uberblock updates. If you have faith, then you
>> could consider that you won't hit corruption of both backend pools
>> together, and rely on the frontend resilvering to rebuild the
>> corrupted backend.
> 
> So you are saying setting WCE disables cache flush on the target and setting 
> WCD forces a flush for every WRITE?

Nope. Setting WC either way has no implication on the response to a flush
request. We flush the cache in response to a request to do so,
unless one sets the unsupported zfs_nocacheflush; if that is set, the pool
is at risk.
> 
> How about a way to enable WCE on the target, yet still perform cache flush 
> when the initiator requests one, like a real SCSI target should do, or is 
> that just not possible with ZVOLs today?
> 
I hope I've cleared that up. Not sure what I said that implied otherwise.
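
To make the distinction concrete, here is a toy Python model of a device with
a volatile write cache. It is only a sketch: ToyDisk, honor_flush and
survives are made-up names for illustration, not ZFS or SCSI interfaces, and
honor_flush=False merely stands in for the effect of ignoring flush requests
(as with zfs_nocacheflush). It shows that a flush is acted on whether the
write cache is enabled or disabled, and that data is only at risk when
flushes are ignored.

```python
class ToyDisk:
    """Toy model of a disk with a volatile write cache (not real ZFS/SCSI code)."""

    def __init__(self, wce, honor_flush=True):
        self.wce = wce                  # write cache enabled (WCE) or disabled (WCD)
        self.honor_flush = honor_flush  # False: flush requests are ignored (assumption)
        self.cache = {}                 # volatile, lost on power failure
        self.stable = {}                # survives power failure

    def write(self, block, data):
        if self.wce:
            self.cache[block] = data    # acknowledged before reaching stable media
        else:
            self.stable[block] = data   # write-through

    def flush(self):
        # The response to a flush does not depend on the WCE/WCD setting.
        if self.honor_flush:
            self.stable.update(self.cache)
            self.cache.clear()

    def power_loss(self):
        self.cache.clear()


def survives(disk):
    """Write, flush, lose power; report whether the write is still there."""
    disk.write(0, "data")
    disk.flush()
    disk.power_loss()
    return disk.stable.get(0) == "data"
```

In this model, survives(ToyDisk(wce=True)) and survives(ToyDisk(wce=False))
both hold, while survives(ToyDisk(wce=True, honor_flush=False)) does not.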

But if you honor the write-cache flush request all the way to the disk device,
then 1, 2 or 3 layers of ZFS won't make a dent in the performance of NFS tar x.
Only a device that accepts low-latency writes and survives a power outage can
do that.

-r


> -Ross
> 

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
