[ceph-users] How to recover from active+clean+inconsistent+failed_repair?

2020-11-01 Thread Sagara Wijetunga
Hi all, I have a Ceph cluster (Nautilus 14.2.11) with 3 Ceph nodes. A crash happened and all 3 Ceph nodes went down. One (1) PG turned "active+clean+inconsistent", and I tried to repair it. After the repair, the PG in question now shows "active+clean+inconsistent+failed_repair" and cannot bri…

[ceph-users] Re: How to recover from active+clean+inconsistent+failed_repair?

2020-11-01 Thread Frank Schilder
I think this happens when a PG has 3 different copies and cannot decide which one is correct. You might have hit a very rare case. You should start with the scrub errors, check which PGs and which copies (OSDs) are affected. It sounds almost like all 3 scrub errors are on the same PG. You might…
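The checks Frank describes can be run with standard Ceph tooling. A minimal sketch, assuming the pool name and PG id are placeholders you substitute from your own cluster:

```shell
# Overall health report naming the PGs with scrub errors
ceph health detail

# List the PGs currently flagged inconsistent in a pool
# (pool name is a placeholder)
rados list-inconsistent-pg <pool-name>

# For one inconsistent PG, show which objects and which
# OSD copies disagree (PG id is a placeholder, e.g. 2.1a)
rados list-inconsistent-obj <pg-id> --format=json-pretty
```

The JSON output from the last command lists, per object, the shards (OSD copies) and the errors found on each, which is what you need to tell whether all scrub errors sit on the same PG and the same OSD.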

[ceph-users] Re: How to recover from active+clean+inconsistent+failed_repair?

2020-11-01 Thread Sagara Wijetunga
Hi Frank, thanks for the reply. > I think this happens when a PG has 3 different copies and cannot decide which > one is correct. You might have hit a very rare case. You should start with > the scrub errors, check which PGs and which copies (OSDs) are affected. It > sounds almost like all 3 sc…

[ceph-users] Re: How to recover from active+clean+inconsistent+failed_repair?

2020-11-01 Thread Frank Schilder
Hi Sagara, looks like your situation is more complex. Before doing anything potentially destructive, you need to investigate some more. A possible interpretation (numbering just for the example): OSD 0: PG at version 1; OSD 1: PG at version 2; OSD 2: PG has scrub error. Depending on the version of t…
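Comparing the state of the copies before deciding anything can be done with standard Ceph commands. A sketch, assuming the PG id is a placeholder:

```shell
# Query the PG: the output includes per-OSD peer info such as
# last_update, which lets you compare the versions of the copies
ceph pg <pg-id> query

# A deep scrub re-reads all replicas and refreshes the error report
ceph pg deep-scrub <pg-id>

# Only once you know which copy is authoritative: repair tells the
# primary to overwrite the inconsistent replicas
ceph pg repair <pg-id>
```

The point of the ordering is that `repair` trusts the primary's copy; if the primary itself holds the bad data, repairing first can propagate the wrong version, which is why the investigation comes before anything destructive.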

[ceph-users] Re: How to recover from active+clean+inconsistent+failed_repair?

2020-11-01 Thread Frank Schilder
sorry: *badblocks* can force remappings of broken sectors (non-destructive read-write check) = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: 01 November 2020 14:35:35 To: Sagara Wijetunga; ceph-users@ceph.…
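A sketch of the badblocks invocation Frank is referring to; the device name is a placeholder, and the disk must not be in active use (stop and out the OSD first):

```shell
# -n  non-destructive read-write mode: reads each block, writes a
#     test pattern, then restores the original data
# -s  show progress
# -v  verbose: report bad blocks as they are found
badblocks -nsv /dev/sdX
```

Writing to a failing sector is what gives the drive firmware the chance to remap it to a spare sector; a read-only pass (`badblocks -sv`) finds bad sectors but does not trigger remapping.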

[ceph-users] read latency

2020-11-01 Thread Tony Liu
Hi, AFAIK, the read latency primarily depends on HW latency; not much can be tuned in SW. Is that right? I ran a fio random read with iodepth 1 within a VM backed by Ceph with HDD OSDs and here is what I got. = read: IOPS=282, BW=1130KiB/s (1157kB/s)(33.1MiB/30001msec) slat (…
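A run resembling the one described might look like the following; the filename, size, and engine are assumptions, chosen to match the quoted numbers (4k blocks, queue depth 1, 30 s). With iodepth=1 the IOs are strictly serial, so mean latency is roughly the inverse of the IOPS:

```shell
# Random 4k reads at queue depth 1 (all parameters are assumptions
# reconstructing the quoted run, not the poster's exact job file)
fio --name=randread --rw=randread --bs=4k --iodepth=1 \
    --ioengine=libaio --direct=1 --runtime=30 --time_based \
    --filename=/tmp/fio.test --size=1G

# At iodepth=1, mean latency (ms) ~ 1000 / IOPS; 282 IOPS implies:
awk 'BEGIN { printf "%.2f ms\n", 1000 / 282 }'
```

About 3.5 ms per IO is plausible for a read path that includes network round trips plus an HDD seek, which is the crux of the question: how much of that is hardware and how much is tunable.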

[ceph-users] Re: read latency

2020-11-01 Thread Vladimir Prokofev
Not exactly. You can also tune network/software. Network: go for lower-latency interfaces. If you have 10G, go to 25G or 100G. 40G will not do though; afaik it's just 4x10G, so its latency is the same as 10G. Software: it's closely tied to your network card queues and processor cores. In sh…
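A hedged sketch of the kind of NIC-queue and latency tuning being alluded to; the interface name is a placeholder, and whether each knob helps depends on the hardware and driver:

```shell
# Show how many RX/TX queues (channels) the NIC exposes and how
# many are in use; queues are typically spread across CPU cores
ethtool -l eth0

# Reduce interrupt coalescing: deliver interrupts immediately
# instead of batching them, trading CPU load for lower latency
ethtool -C eth0 rx-usecs 0 tx-usecs 0

# On distributions that ship tuned, a ready-made low-latency profile
tuned-adm profile network-latency
```

Pinning NIC queue interrupts to the same cores (or NUMA node) that run the latency-sensitive process is the usual companion step, which is presumably the queue/core tie-in the message goes on to describe.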

[ceph-users] Re: read latency

2020-11-01 Thread Tony Liu
Another confusion about read vs. random read. My understanding is that when fio does read, it reads from the test file sequentially; when it does random read, it reads from the test file randomly. That file read inside the VM comes down to a volume read handled by the RBD client, who distributes reads to PG an…
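The sequential/random distinction can be made concrete with two fio jobs that differ only in `--rw`; filenames and sizes are placeholders. A separate Ceph command shows the object-to-PG-to-OSD mapping the message describes:

```shell
# Sequential read: offsets advance linearly through the file
fio --name=seqread --rw=read --bs=4k --iodepth=1 \
    --direct=1 --filename=/tmp/fio.test --size=256M

# Random read: offsets are chosen randomly across the same file;
# on HDD OSDs every IO then pays an extra seek
fio --name=randread --rw=randread --bs=4k --iodepth=1 \
    --direct=1 --filename=/tmp/fio.test --size=256M

# How one RADOS object maps to a PG and its acting OSD set
# (pool and object name are placeholders)
ceph osd map <pool-name> <object-name>
```

Note that "sequential" inside the VM is only approximately sequential at the OSD: RBD stripes the volume across many objects, so even a linear file read fans out across PGs and OSDs.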