Re: [ceph-users] Read errors on OSD

2017-06-01 Thread Oliver Humpage
> On 1 Jun 2017, at 14:38, Steve Taylor wrote:
>
> I saw this on several servers, and it took a while to track down as you can
> imagine. Same symptoms you're reporting.

Thanks, that’s very useful info. We’re using separate Adaptec controllers, but will double check firmware on them. Who know

Re: [ceph-users] Read errors on OSD

2017-06-01 Thread Steve Taylor
I've seen similar issues in the past with 4U Supermicro servers populated with spinning disks. In my case it turned out to be a specific firmware+BIOS combination on the disk controller card that was buggy. I fixed it by updating the firmware and BIOS on the card to the latest versions. I saw t

Re: [ceph-users] Read errors on OSD

2017-06-01 Thread Oliver Humpage
> On 1 Jun 2017, at 11:55, Matthew Vernon wrote:
>
> You don't say what's in kern.log - we've had (rotating) disks that were
> throwing read errors but still saying they were OK on SMART.

Fair point. There was nothing correlating to the time that ceph logged an error this morning, which is wh

Re: [ceph-users] Read errors on OSD

2017-06-01 Thread Matthew Vernon
Hi,

On 01/06/17 10:38, Oliver Humpage wrote:
> These read errors are all on Samsung 850 Pro 2TB disks (journals are on separate enterprise SSDs). The SMART status on all of them is similar and shows nothing out of the ordinary. Has anyone else experienced anything similar? Is this just a curse o