Hi,

I have a Mimic 13.2.6 cluster which is reporting that a PG is
inconsistent.

PG_DAMAGED Possible data damage: 1 pg inconsistent
    pg 21.e6d is active+clean+inconsistent, acting [988,508,825]

I checked 'list-inconsistent-obj' (See below) and it shows:

selected_object_info: "data_digest": "0xf4342d4a"
all OSDs:             "data_digest": "0x224d6ca6"
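(For context, and this is my understanding rather than anything from the
report itself: data_digest is a CRC-32C checksum over the object's
payload, stored in the object_info and recomputed during deep scrub.
"data_digest_mismatch_info" therefore means the on-disk data of the
replicas hashes to 0x224d6ca6 while the object_info still records
0xf4342d4a. A rough sketch of the checksum, assuming Ceph seeds CRC-32C
with -1 and skips the usual final inversion:)

```python
def ceph_style_crc32c(data: bytes, crc: int = 0xFFFFFFFF) -> int:
    """Bitwise CRC-32C (Castagnoli polynomial), seeded with -1 and without
    the final inversion -- my assumption of how Ceph derives data_digest.
    Slow and for illustration only."""
    poly = 0x82F63B78  # reflected Castagnoli polynomial
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc
```

(Note that an empty payload yields 0xffffffff, which matches the
"omap_digest": "0xffffffff" shown below for an object with no omap data.)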

This looks like issue 24994 [0], but this cluster runs Mimic 13.2.6, not
Luminous 12.2.7.

I also tried to download the object:

rados -p <pool> get rb.0.304abf.238e1f29.00000001cc48 rb.0.304abf.238e1f29.00000001cc48

That doesn't work: it blocks forever and causes osd.988 to report a
slow request.

I don't want to repair the PG at the moment as this might be a bug.

Right now I'm thinking of restarting all three OSDs of that PG as this
might get things moving again.

But how could this happen with all three OSDs reporting the same data
digest?

Ideas?

Wido


{
  "epoch": 1351206,
  "inconsistents": [
    {
      "object": {
        "name": "rb.0.304abf.238e1f29.00000001cc48",
        "nspace": "",
        "locator": "",
        "snap": "head",
        "version": 28429940
      },
      "errors": [],
      "union_shard_errors": [
        "data_digest_mismatch_info"
      ],
      "selected_object_info": {
        "oid": {
          "oid": "rb.0.304abf.238e1f29.00000001cc48",
          "key": "",
          "snapid": -2,
          "hash": 367865453,
          "max": 0,
          "pool": 21,
          "namespace": ""
        },
        "version": "1246902'28520530",
        "prior_version": "1240840'28429940",
        "last_reqid": "osd.736.0:3448586",
        "user_version": 28429940,
        "size": 4194304,
        "mtime": "2019-08-12 05:24:02.672911",
        "local_mtime": "2019-08-12 05:24:02.673401",
        "lost": 0,
        "flags": [
          "dirty",
          "data_digest",
          "omap_digest"
        ],
        "truncate_seq": 0,
        "truncate_size": 0,
        "data_digest": "0xf4342d4a",
        "omap_digest": "0xffffffff",
        "expected_object_size": 4194304,
        "expected_write_size": 4194304,
        "alloc_hint_flags": 0,
        "manifest": {
          "type": 0
        },
        "watchers": {}
      },
      "shards": [
        {
          "osd": 508,
          "primary": false,
          "errors": [
            "data_digest_mismatch_info"
          ],
          "size": 4194304,
          "omap_digest": "0xffffffff",
          "data_digest": "0x224d6ca6"
        },
        {
          "osd": 825,
          "primary": false,
          "errors": [
            "data_digest_mismatch_info"
          ],
          "size": 4194304,
          "omap_digest": "0xffffffff",
          "data_digest": "0x224d6ca6"
        },
        {
          "osd": 988,
          "primary": true,
          "errors": [
            "data_digest_mismatch_info"
          ],
          "size": 4194304,
          "omap_digest": "0xffffffff",
          "data_digest": "0x224d6ca6"
        }
      ]
    }
  ]
}
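(In case it's useful to anyone hitting the same thing: the mismatch can
be picked out of the list-inconsistent-obj JSON mechanically. A small
sketch, assuming output shaped like the dump above; the function name is
mine:)

```python
def digest_mismatches(report: dict) -> list:
    """Return (object name, osd, shard digest, info digest) tuples for
    every shard whose data_digest disagrees with the digest recorded in
    the selected object_info."""
    out = []
    for item in report.get("inconsistents", []):
        info_digest = item["selected_object_info"]["data_digest"]
        name = item["object"]["name"]
        for shard in item["shards"]:
            shard_digest = shard.get("data_digest")
            if shard_digest and shard_digest != info_digest:
                out.append((name, shard["osd"], shard_digest, info_digest))
    return out
```

In the dump above this flags all three shards, since they agree with each
other but not with the object_info.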

[0]: https://tracker.ceph.com/issues/24994
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
