Re: [ceph-users] PG inconsistency

2014-11-10 Thread Craig Lewis
For #1, it depends what you mean by fast. I wouldn't worry about it taking 15 minutes. If you mark the old OSD out, ceph will start remapping data immediately, including a bunch of PGs on unrelated OSDs. Once you replace the disk, and put the same OSDID back in the same host, the CRUSH map will

Re: [ceph-users] PG inconsistency

2014-11-09 Thread GuangYang
Thanks Sage!
> Date: Fri, 7 Nov 2014 02:19:06 -0800
> From: s...@newdream.net
> To: yguan...@outlook.com
> CC: ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com
> Subject: Re: PG inconsistency
>
> On Thu, 6 Nov 2014, GuangYang wrote:
>> Hello Cephers,
>

Re: [ceph-users] PG inconsistency

2014-11-07 Thread Sage Weil
On Thu, 6 Nov 2014, GuangYang wrote:
> Hello Cephers,
> Recently we observed a couple of inconsistencies in our Ceph cluster.
> As far as I observed, two major patterns led to the inconsistency:
> 1) EIO when reading the file, 2) the digest is inconsistent (for EC)
> even though there is no read error.

Re: [ceph-users] PG inconsistency

2014-11-06 Thread Dan van der Ster
IIRC, the EIO we had also correlated with a SMART status that showed the disk was bad enough for a warranty replacement -- so yes, I replaced the disk in these cases.
Cheers, Dan
On Thu Nov 06 2014 at 2:44:08 PM GuangYang wrote:
> Thanks Dan. By "killed/formatted/replaced the OSD", did you repl

Re: [ceph-users] PG inconsistency

2014-11-06 Thread Irek Fasikhov
Thu Nov 06 2014 at 16:44:09, GuangYang:
> Thanks Dan. By "killed/formatted/replaced the OSD", did you replace the
> disk? I'm not a filesystem expert, but I would like to understand what
> actually happened behind the EIO and whether it reveals something
> (e.g. a hardware issue).
>
> In our ca

Re: [ceph-users] PG inconsistency

2014-11-06 Thread GuangYang
We are using v0.80.4. I would just like to ask for general suggestions here :)
Thanks,
Guang
> From: malm...@gmail.com
> Date: Thu, 6 Nov 2014 13:46:12 +
> Subject: Re: [ceph-users] PG inconsistency
> To: yguan...@outlook.com; ceph-de...@vge

Re: [ceph-users] PG inconsistency

2014-11-06 Thread Irek Fasikhov
Which version of Ceph are you running? 0.80.0 - 0.80.3
https://github.com/ceph/ceph/commit/7557a8139425d1705b481d7f010683169fd5e49b
Thu Nov 06 2014 at 16:24:21, GuangYang:
> Hello Cephers,
> Recently we observed a couple of inconsistencies in our Ceph cluster,
> there were two major patterns leading

Re: [ceph-users] PG inconsistency

2014-11-06 Thread GuangYang
Thanks Dan. By "killed/formatted/replaced the OSD", did you replace the disk? I'm not a filesystem expert, but I would like to understand what actually happened behind the EIO and whether it reveals something (e.g. a hardware issue). In our case, we are using 6TB drives, so there are a lot of

Re: [ceph-users] PG inconsistency

2014-11-06 Thread Dan van der Ster
Hi, I've only ever seen (1), EIO to read a file. In this case I've always just killed / formatted / replaced that OSD completely -- that moves the PG to a new master and the new replication "fixes" the inconsistency. This way, I've never had to pg repair. I don't know if this is a best or even good