For #1, it depends on what you mean by fast. I wouldn't worry about it taking
15 minutes.
If you mark the old OSD out, Ceph will start remapping data immediately,
including a bunch of PGs on unrelated OSDs. Once you replace the disk and
put the same OSD ID back in the same host, the CRUSH map will
Thanks Sage!
> Date: Fri, 7 Nov 2014 02:19:06 -0800
> From: s...@newdream.net
> To: yguan...@outlook.com
> CC: ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com
> Subject: Re: PG inconsistency
>
> On Thu, 6 Nov 2014, GuangYang wrote:
>> Hello Cephers,
>
On Thu, 6 Nov 2014, GuangYang wrote:
> Hello Cephers,
> Recently we observed a couple of inconsistencies in our Ceph cluster.
> I saw two major patterns leading to inconsistency: 1) EIO when reading
> the file, and 2) an inconsistent digest (for EC) even though there is
> no read error.
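
To make the two patterns above concrete, here is a minimal, self-contained sketch of how a check might tell them apart. It is only an illustration, not how Ceph scrub is implemented: `check_copy`, `failing_read`, and the sample payloads are hypothetical names, and `zlib.crc32` stands in for whatever digest the cluster actually stores.

```python
import errno
import zlib


def check_copy(read, expected_crc):
    """Classify one object copy as 'eio', 'digest_mismatch', or 'ok'.

    `read` is a zero-argument callable returning the object's bytes;
    `expected_crc` is the digest recorded when the object was written.
    Both are hypothetical stand-ins for illustration only.
    """
    try:
        data = read()
    except OSError as exc:
        # Pattern 1: the read itself fails with EIO.
        if exc.errno == errno.EIO:
            return "eio"
        raise
    # Pattern 2: the read succeeds, but the content does not match
    # the digest that was stored for it.
    if zlib.crc32(data) != expected_crc:
        return "digest_mismatch"
    return "ok"


def failing_read():
    # Simulates a disk returning an I/O error on read.
    raise OSError(errno.EIO, "Input/output error")


good = b"object payload"
expected = zlib.crc32(good)

results = {
    "healthy": check_copy(lambda: good, expected),
    "bad_disk": check_copy(failing_read, expected),
    "bit_rot": check_copy(lambda: b"object pay1oad", expected),
}
print(results)
```

The distinction matters for triage: the first pattern surfaces as an error from the read path, while the second is only visible when the stored digest is compared against the data.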
IIRC, the EIO we had also correlated with a SMART status that showed the
disk was bad enough for a warranty replacement -- so yes, I replaced the
disk in these cases.
Cheers, Dan
On Thu Nov 06 2014 at 2:44:08 PM GuangYang wrote:
> Thanks Dan. By "killed/formatted/replaced the OSD", did you repl
Thu Nov 06 2014 at 16:44:09, GuangYang :
> Thanks Dan. By "killed/formatted/replaced the OSD", did you replace the
> disk? I am not a filesystem expert, but I would like to understand what
> actually happened behind the EIO and whether it reveals something
> (e.g. a hardware issue).
>
> In our ca
We are using v0.80.4. I would just like to ask for general suggestions here :)
Thanks,
Guang
> From: malm...@gmail.com
> Date: Thu, 6 Nov 2014 13:46:12 +
> Subject: Re: [ceph-users] PG inconsistency
> To: yguan...@outlook.com; ceph-de...@vge
What version of Ceph are you running?
0.80.0 - 0.80.3
https://github.com/ceph/ceph/commit/7557a8139425d1705b481d7f010683169fd5e49b
Thu Nov 06 2014 at 16:24:21, GuangYang :
> Hello Cephers,
> Recently we observed a couple of inconsistencies in our Ceph cluster,
> there were two major patterns leading
Thanks Dan. By "killed/formatted/replaced the OSD", did you replace the disk?
I am not a filesystem expert, but I would like to understand what actually
happened behind the EIO and whether it reveals something (e.g. a hardware issue).
In our case, we are using 6TB drives, so there are a lot of
Hi,
I've only ever seen (1), EIO when reading a file. In this case I've always just
killed / formatted / replaced that OSD completely -- that moves the PG to a
new master and the new replication "fixes" the inconsistency. This way,
I've never had to pg repair. I don't know if this is a best or even good