Take a look at this: http://monolight.cc/2011/06/barriers-caches-filesystems/
LSI's answer just makes no sense to me...

Jan

> On 07 Sep 2015, at 11:07, Jan Schermer <j...@schermer.cz> wrote:
>
> Are you absolutely sure there's nothing in dmesg before this? There seems
> to be something missing. Is this from dmesg or a different log? There
> should be something before that. Usually when a drive drops out there is
> an I/O error (itself caused by a timed-out SCSI command), and then error
> recovery kicks in and emits such messages. This message by itself just
> should not be there. Or is that with the debugging already enabled? In
> that case it's a red herring, not _the_ problem.
>
> Synchronize Cache is a completely ordinary command - you absolutely _want_
> it in there. The only case where you could avoid it is if you trust the
> capacitors on the drives _and_ the OS to order the requests right (a bold
> assumption IMO) by disabling barriers (btw that will not work for a
> journal on a block device).
>
> How often does this happen? You could try recording the events with
> "btrace" so you know what the block device is really doing from the
> kernel block layer's perspective.
>
> In any case, this command should be harmless and is expected to occur
> quite often, so LSI telling you "don't do it" is like Ford telling me
> "your brakes are broken, so don't use them when driving".
>
> I'm getting really angry at LSI. We have problems with them as well and
> their support is completely useless. And for the record, _ALL_ the drives
> I tested are faster on Intel SAS than on LSI (2308), and often faster on
> a regular SATA AHCI controller than on their "high throughput" HBAs.
>
> The drivers have barely documented parameters, and if you google a bit
> you'll find many people having problems with them (not limited to Linux).
>
> I'll definitely avoid LSI HBAs in the future if I can.
>
> Feel free to mail me off-list - I'm very interested in your issue because
> I have the same combination (LSI + Intels) in my cluster right now. It
> seems to work fine, though.
>
> Jan
>
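For reference, Jan's "btrace" suggestion boils down to something like the
following - a minimal sketch, assuming the disk from the logs below really
is /dev/sdf (btrace ships with the blktrace package):

    # btrace is a thin wrapper around "blktrace -d <dev> -o - | blkparse -i -"
    btrace /dev/sdf

    # or capture to files for offline analysis
    blktrace -d /dev/sdf -o sdf_trace
    blkparse -i sdf_trace | less

On reasonably recent kernels, flush (synchronize cache) requests are flagged
with an F in the RWBS column of the blkparse output, so this shows exactly
when and how often they reach the device.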
>> On 05 Sep 2015, at 01:04, Richard Bade <hitr...@gmail.com> wrote:
>>
>> Hi Jan,
>> Thanks for your response.
>>
>> How exactly do you know this is the cause? This is usually just an effect
>> of something going wrong and part of the error recovery process.
>> Preceding this event should be the real error/root cause...
>>
>> We have been working with LSI/Avago to resolve this. We get a bunch of
>> these log events:
>>
>> 2015-09-04T14:58:59.169677+12:00 <server_name> ceph-osd: - ceph-osd:
>> 2015-09-04 14:58:59.168444 7fbc5ec71700 0 log [WRN] : slow request
>> 30.894936 seconds old, received at 2015-09-04 14:58:28.272976:
>> osd_op(client.42319583.0:1185218039
>> rbd_data.1d8a5a92eb141f2.00000000000056a0 [read 3579392~8192] 4.f9f016cb
>> ack+read e66603) v4 currently no flag points reached
>>
>> Followed by the task abort I mentioned:
>>
>> sd 11:0:4:0: attempting task abort! scmd(ffff8804c07d0480)
>> sd 11:0:4:0: [sdf] CDB:
>> Write(10): 2a 00 24 6f 01 a8 00 00 08 00
>> scsi target11:0:4: handle(0x000d), sas_address(0x4433221104000000), phy(4)
>> scsi target11:0:4: enclosure_logical_id(0x5003048000000000), slot(4)
>> sd 11:0:4:0: task abort: SUCCESS scmd(ffff8804c07d0480)
>>
>> LSI had us enable debugging on our card and send them many logs and
>> debugging data. Their response was:
>>
>> Please do not send the Synchronize Cache command (35h). That's the one
>> preventing the drive from responding to read/write commands quickly
>> enough. A Synchronize Cache command instructs the ATA device to flush the
>> cache contents to the medium, so while the disk is in the process of
>> doing that, it's probably causing the read/write commands to take longer
>> to complete.
>>
>> LSI/Avago believe this to be the root cause of the IO delay based on the
>> debugging info.
>>
>> and from what I've seen it is not necessary with fast drives (such as
>> S3700).
>>
>> While I agree with you that it should not be necessary, as the S3700s
>> should be very fast, our current experience does not show this to be the
>> case.
>>
>> Just a little more about our setup: we're using Ceph Firefly (0.80.10) on
>> Ubuntu 14.04. We see this same thing on every S3700/S3710 across four
>> hosts. We do not see it happening on the spinning disks in the same
>> cluster, in a different pool on similar hardware.
>>
>> If you know of any other reason this may be happening, we would
>> appreciate hearing it. Otherwise we will need to continue investigating
>> the possibility of setting nobarriers.
>>
>> Regards,
>> Richard
>>
>> On 5 September 2015 at 09:32, Jan Schermer <j...@schermer.cz> wrote:
>>>> We are seeing some significant I/O delays on the disks causing a “SCSI
>>>> Task Abort” from the OS. This seems to be triggered by the drive
>>>> receiving a “Synchronize cache command”.
>>
>> How exactly do you know this is the cause? This is usually just an effect
>> of something going wrong and part of the error recovery process.
>> Preceding this event should be the real error/root cause...
>>
>> It is _supposedly_ safe to disable barriers in this scenario, but IMO the
>> assumptions behind that are deeply flawed, and from what I've seen it is
>> not necessary with fast drives (such as S3700).
>>
>> Take a look in the mailing list archives - I elaborated on this quite a
>> bit in the past, including my experience with Kingston drives + XFS + LSI
>> (the effect is present even on Intels, but because they are much faster
>> it shouldn't cause any real problems).
>>
>> Jan
>>
>>> On 04 Sep 2015, at 21:55, Richard Bade <hitr...@gmail.com> wrote:
>>>
>>> Hi Everyone,
>>>
>>> We have a Ceph pool that is entirely made up of Intel S3700/S3710
>>> enterprise SSDs.
>>>
>>> We are seeing some significant I/O delays on the disks causing a “SCSI
>>> Task Abort” from the OS. This seems to be triggered by the drive
>>> receiving a “Synchronize cache command”.
>>>
>>> My current thinking is that setting nobarriers in XFS will stop the
>>> drive receiving a sync command and therefore stop the I/O delay
>>> associated with it.
>>>
>>> In the XFS FAQ it looks like the recommendation is that if you have a
>>> battery-backed RAID controller you should set nobarriers for
>>> performance reasons.
>>>
>>> Our LSI card doesn't have battery-backed cache, as it's configured in
>>> HBA mode (IT) rather than RAID mode (IR). Our Intel S37xx SSDs do have
>>> a capacitor-backed cache, though.
>>>
>>> So is it recommended that barriers are turned off, given that the drive
>>> has a safe cache (I am confident that the cache will write out to disk
>>> on power failure)?
>>>
>>> Has anyone else encountered this issue?
>>>
>>> Any info or suggestions about this would be appreciated.
>>>
>>> Regards,
>>>
>>> Richard
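For completeness, "setting nobarriers" is just an XFS mount option on the
kernels Ubuntu 14.04 ships - a sketch, with an illustrative device and the
default Ceph OSD mount point, and only sensible if the whole write path is
power-loss safe:

    # remount a running OSD filesystem without write barriers
    mount -o remount,nobarrier /var/lib/ceph/osd/ceph-0

    # or persistently via /etc/fstab
    /dev/sdf1  /var/lib/ceph/osd/ceph-0  xfs  noatime,nobarrier  0 0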
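And before trusting the capacitors, it's worth confirming what the drives
themselves report - again a sketch, with an illustrative device name
(sdparm, hdparm and smartctl are separate packages):

    # SCSI/SAT view: is the volatile write cache enabled?
    sdparm --get=WCE /dev/sdf

    # ATA view of the same write-cache setting
    hdparm -W /dev/sdf

    # drive identity and firmware details
    smartctl -i /dev/sdf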