Take a look at this: http://monolight.cc/2011/06/barriers-caches-filesystems/
LSI's answer just makes no sense to me...

Jan

> On 07 Sep 2015, at 11:07, Jan Schermer <j...@schermer.cz> wrote:
>
> Are you absolutely sure there's nothing in dmesg before this? There seems
> to be something missing. Is this from dmesg or a different log? There
> should be something before that. Usually when a drive drops out there is
> an I/O error (itself caused by a timed-out SCSI command), and then error
> recovery kicks in and emits such messages. This message by itself just
> should not be there. Or is that with the debugging already enabled? In
> that case it's a red herring, not _the_ problem.
>
> Synchronize Cache is a completely ordinary command - you absolutely _want_
> it in there. The only case where you could avoid it is if you trust the
> capacitors on the drives _and_ the OS to order the requests right (a bold
> assumption IMO) by disabling barriers (btw that will not work for a
> journal on a block device).
>
> How often does this happen? You could try recording the events with
> "btrace" so you know what the block device is really doing from the
> kernel block layer's perspective.
>
> In any case, this command should be harmless and is expected to occur
> quite often, so LSI telling you "don't do it" is like Ford telling me
> "your brakes are broken, so don't use them when driving".
>
> I'm getting really angry at LSI. We have problems with them as well and
> their support is completely useless. And for the record, _ALL_ the drives
> I tested are faster on Intel SAS than on LSI (2308), and often faster on
> a regular SATA AHCI controller than on their "high throughput" HBAs.
>
> The drivers have barely documented parameters, and if you google a bit
> you'll find many people having problems with them (not limited to Linux).
>
> I'll definitely avoid LSI HBAs in the future if I can.
>
> Feel free to mail me off-list - I'm very interested in your issue because
> I have the same combination (LSI + Intels) in my cluster right now. It
> seems to work fine, though.
>
> Jan
>
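For reference, Jan's "btrace" suggestion boils down to something like the
following - a minimal sketch, assuming the disk from the logs below really
is /dev/sdf (btrace ships with the blktrace package):

    # btrace is a thin wrapper around "blktrace -d <dev> -o - | blkparse -i -"
    btrace /dev/sdf

    # or capture to files for offline analysis
    blktrace -d /dev/sdf -o sdf_trace
    blkparse -i sdf_trace | less

On reasonably recent kernels, flush (synchronize cache) requests are flagged
with an F in the RWBS column of the blkparse output, so this shows exactly
when and how often they reach the device.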
>> On 05 Sep 2015, at 01:04, Richard Bade <hitr...@gmail.com> wrote:
>>
>> Hi Jan,
>> Thanks for your response.
>>
>> How exactly do you know this is the cause? This is usually just an effect
>> of something going wrong and part of the error recovery process.
>> Preceding this event should be the real error/root cause...
>>
>> We have been working with LSI/Avago to resolve this. We get a bunch of
>> these log events:
>>
>> 2015-09-04T14:58:59.169677+12:00 <server_name> ceph-osd: - ceph-osd:
>> 2015-09-04 14:58:59.168444 7fbc5ec71700 0 log [WRN] : slow request
>> 30.894936 seconds old, received at 2015-09-04 14:58:28.272976:
>> osd_op(client.42319583.0:1185218039
>> rbd_data.1d8a5a92eb141f2.00000000000056a0 [read 3579392~8192] 4.f9f016cb
>> ack+read e66603) v4 currently no flag points reached
>>
>> Followed by the task abort I mentioned:
>>
>> sd 11:0:4:0: attempting task abort! scmd(ffff8804c07d0480)
>> sd 11:0:4:0: [sdf] CDB:
>> Write(10): 2a 00 24 6f 01 a8 00 00 08 00
>> scsi target11:0:4: handle(0x000d), sas_address(0x4433221104000000), phy(4)
>> scsi target11:0:4: enclosure_logical_id(0x5003048000000000), slot(4)
>> sd 11:0:4:0: task abort: SUCCESS scmd(ffff8804c07d0480)
>>
>> LSI had us enable debugging on our card and send them many logs and
>> debugging data. Their response was:
>>
>> Please do not send the Synchronize Cache command (35h). That's the one
>> preventing the drive from responding to read/write commands quickly
>> enough. A Synchronize Cache command instructs the ATA device to flush the
>> cache contents to the medium, so while the disk is in the process of
>> doing that, it's probably causing the read/write commands to take longer
>> to complete.
>>
>> LSI/Avago believe this to be the root cause of the IO delay based on the
>> debugging info.
>>
>> and from what I've seen it is not necessary with fast drives (such as
>> S3700).
>>
>> While I agree with you that it should not be necessary, as the S3700s
>> should be very fast, our current experience does not show this to be the
>> case.
>>
>> Just a little more about our setup: we're using Ceph Firefly (0.80.10) on
>> Ubuntu 14.04. We see this same thing on every S3700/S3710 across four
>> hosts. We do not see it happening on the spinning disks in the same
>> cluster, in a different pool on similar hardware.
>>
>> If you know of any other reason this may be happening, we would
>> appreciate hearing it. Otherwise we will need to continue investigating
>> the possibility of setting nobarriers.
>>
>> Regards,
>> Richard
>>
>> On 5 September 2015 at 09:32, Jan Schermer <j...@schermer.cz> wrote:
>>>> We are seeing some significant I/O delays on the disks causing a “SCSI
>>>> Task Abort” from the OS. This seems to be triggered by the drive
>>>> receiving a “Synchronize cache command”.
>>
>> How exactly do you know this is the cause? This is usually just an effect
>> of something going wrong and part of the error recovery process.
>> Preceding this event should be the real error/root cause...
>>
>> It is _supposedly_ safe to disable barriers in this scenario, but IMO the
>> assumptions behind that are deeply flawed, and from what I've seen it is
>> not necessary with fast drives (such as S3700).
>>
>> Take a look in the mailing list archives - I elaborated on this quite a
>> bit in the past, including my experience with Kingston drives + XFS + LSI
>> (the effect is present even on Intels, but because they are much faster
>> it shouldn't cause any real problems).
>>
>> Jan
>>
>>> On 04 Sep 2015, at 21:55, Richard Bade <hitr...@gmail.com> wrote:
>>>
>>> Hi Everyone,
>>>
>>> We have a Ceph pool that is entirely made up of Intel S3700/S3710
>>> enterprise SSDs.
>>>
>>> We are seeing some significant I/O delays on the disks causing a “SCSI
>>> Task Abort” from the OS. This seems to be triggered by the drive
>>> receiving a “Synchronize cache command”.
>>>
>>> My current thinking is that setting nobarriers in XFS will stop the
>>> drive receiving a sync command and therefore stop the I/O delay
>>> associated with it.
>>>
>>> In the XFS FAQ it looks like the recommendation is that if you have a
>>> battery-backed RAID controller you should set nobarriers for
>>> performance reasons.
>>>
>>> Our LSI card doesn't have battery-backed cache, as it's configured in
>>> HBA mode (IT) rather than RAID mode (IR). Our Intel S37xx SSDs do have
>>> a capacitor-backed cache, though.
>>>
>>> So is it recommended that barriers are turned off, given that the drive
>>> has a safe cache (I am confident that the cache will write out to disk
>>> on power failure)?
>>>
>>> Has anyone else encountered this issue?
>>>
>>> Any info or suggestions about this would be appreciated.
>>>
>>> Regards,
>>>
>>> Richard
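For completeness, "setting nobarriers" is just an XFS mount option on the
kernels Ubuntu 14.04 ships - a sketch, with an illustrative device and the
default Ceph OSD mount point, and only sensible if the whole write path is
power-loss safe:

    # remount a running OSD filesystem without write barriers
    mount -o remount,nobarrier /var/lib/ceph/osd/ceph-0

    # or persistently via /etc/fstab
    /dev/sdf1  /var/lib/ceph/osd/ceph-0  xfs  noatime,nobarrier  0 0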
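And before trusting the capacitors, it's worth confirming what the drives
themselves report - again a sketch, with an illustrative device name
(sdparm, hdparm and smartctl are separate packages):

    # SCSI/SAT view: is the volatile write cache enabled?
    sdparm --get=WCE /dev/sdf

    # ATA view of the same write-cache setting
    hdparm -W /dev/sdf

    # drive identity and firmware details
    smartctl -i /dev/sdf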