On Aug 5, 2010, at 2:24 PM, Roch Bourbonnais <roch.bourbonn...@sun.com> wrote:

> 
> On Aug 5, 2010, at 7:49 PM, Ross Walker wrote:
> 
>> On Aug 5, 2010, at 11:10 AM, Roch <roch.bourbonn...@sun.com> wrote:
>> 
>>> 
>>> Ross Walker writes:
>>>> On Aug 4, 2010, at 12:04 PM, Roch <roch.bourbonn...@sun.com> wrote:
>>>> 
>>>>> 
>>>>> Ross Walker writes:
>>>>>> On Aug 4, 2010, at 9:20 AM, Roch <roch.bourbonn...@sun.com> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Ross asks: 
>>>>>>> So on that note, ZFS should disable the disks' write caches,
>>>>>>> not enable them, despite ZFS's COW properties, because it
>>>>>>> should be resilient. 
>>>>>>> 
>>>>>>> No, because ZFS builds resiliency on top of unreliable parts. It's
>>>>>>> able to deal with contained failures (lost state) of the disk write
>>>>>>> cache. 
>>>>>>> 
>>>>>>> It can then export LUNs that have the WC enabled or
>>>>>>> disabled. But if we enable the WC on the exported LUNs, then
>>>>>>> the consumers of those LUNs must be able to cope the same way.
>>>>>>> The discussion at that level then needs to focus on failure groups.
>>>>>>> 
>>>>>>> 
>>>>>>> Ross also said:
>>>>>>> I asked this question earlier, but got no answer: while an
>>>>>>> iSCSI target is presented with WCE, does it respect the flush
>>>>>>> command? 
>>>>>>> 
>>>>>>> Yes. I would like to say "obviously" but it's been anything
>>>>>>> but.
>>>>>> 
>>>>>> Sorry to probe further, but can you expand on that "but"...
>>>>>> 
>>>>>> Just this: if we had a bunch of zvols exported via iSCSI to another
>>>>>> Solaris box, which used them to form another zpool, and had WCE turned
>>>>>> on, would it be reliable? 
>>>>>> 
>>>>> 
>>>>> Nope. That's because all the iSCSI LUNs are in the same fault
>>>>> domain, as they share a unified back-end cache. What works,
>>>>> in principle, is mirroring SCSI channels hosted on 
>>>>> different storage controllers (or N SCSI channels on N
>>>>> controllers in a RAID group).
>>>>> 
>>>>> Which is why keeping the WC set to the default is really
>>>>> better in general.
>>>> 
>>>> Well, I was actually talking about two backend Solaris storage servers 
>>>> serving up storage over iSCSI to a front-end Solaris server serving ZFS 
>>>> over NFS. So I have redundancy there, but I want the storage to be 
>>>> performant, so I want the iSCSI to have WCE, yet I want it to be reliable 
>>>> and honor cache flush requests from the front-end NFS server.
>>>> 
>>>> Does that make sense? Is it possible?
>>>> 
>>> 
>>> Well, in response to a commit (say, after a file creation), the
>>> front-end server will end up sending flush-write-cache commands on
>>> both sides of the iSCSI mirror, which will reach the backend servers,
>>> which will flush their disk write caches. This will all work, but
>>> probably not unleash performance the way you would like it
>>> to.
>> 
>> 
>> 
>>> If you set up the backend servers to not honor the
>>> disk flush-write-cache requests, then the 2 backend pools become at
>>> risk of corruption, mostly because of the ordering of IOs
>>> around the ueberblock updates. If you have faith, then you
>>> could consider that you won't hit corruption on both backend
>>> pools at once, and rely on the frontend resilvering to rebuild
>>> the corrupted backend.
>> 
>> So you are saying that setting WCE disables cache flushes on the target, and 
>> that setting WCD forces a flush for every WRITE?
> 
> Nope. Setting the WC either way has no implication on the response to a flush 
> request. We flush the cache in response to a request to do so,
> unless one sets the unsupported zfs_nocacheflush tunable; if set, the pool is 
> at risk.
>> 
>> How about a way to enable WCE on the target, yet still perform cache flush 
>> when the initiator requests one, like a real SCSI target should do, or is 
>> that just not possible with ZVOLs today?
>> 
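As an aside: on a COMSTAR target the write-cache behavior is a per-LU
property, so it can be toggled without touching the zvol itself. A sketch,
assuming COMSTAR's stmfadm and its wcd LU property; the GUID below is a
hypothetical placeholder:

```shell
# COMSTAR sbd LU write cache: "wcd" stands for "write cache disable",
# so wcd=false enables the write cache on the exported LU.
# The LU GUID is a hypothetical placeholder.
stmfadm modify-lu -p wcd=false 600144F0C73ABF0000004C5A00010001

# Verify the LU properties took effect:
stmfadm list-lu -v 600144F0C73ABF0000004C5A00010001
```

Whether the target still honors SYNCHRONIZE CACHE with the WC enabled is
exactly the question above.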
> I hope I've cleared that up. Not sure what I said that implied otherwise.
> 
> But if you honor the flush-write-cache request all the way to the disk 
> device, then 1, 2 or 3 layers of ZFS won't make a dent in the performance of 
> NFS "tar x". 
> Only a device accepting low-latency writes which survive a power outage can 
> do that.
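For reference, the unsupported tunable Roch mentions lives in /etc/system;
a sketch of the fragment, with the caveat from above that setting it puts
the pool at risk:

```
* /etc/system fragment (Solaris). Unsupported: this makes ZFS stop
* issuing cache-flush requests to devices, so a power outage can
* corrupt the pool.
set zfs:zfs_nocacheflush = 1
```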

Understood, and thanks for the clarification. If the NFS synchronicity has too 
much of a negative impact, it can be alleviated with an SSD or NVRAM 
slog device on the head server.
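Adding such a slog device is a one-liner with zpool; a sketch, where the pool
and device names are hypothetical placeholders:

```shell
# Add a single SSD as a dedicated log (slog) device; "tank" and the
# device names are hypothetical placeholders.
zpool add tank log c4t2d0

# Or mirror the slog so a single device failure can't lose committed
# synchronous writes:
zpool add tank log mirror c4t2d0 c4t3d0

# Confirm the log vdev shows up:
zpool status tank
```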

-Ross

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
