Dear Frédéric,

Unfortunately, I am still running the *Octopus* release, and these commands
are reported as unrecognized.

Versioning is also not enabled on the bucket.

I tried running:
radosgw-admin bucket check --bucket=<bucket> --fix

which ran for a few minutes and produced a lot of output, containing lines
like the following for most of the objects:
WARNING: unable to find head object data pool for
"<bucket>:wp-content/uploads/sites/74/2025/03/mutation-house-no-629.pdf",
not updating version pool/epoch
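
For reference, I was planning to double-check the head object at the rados
layer along these lines (the pool name and the <bucket_marker>_ prefix are
assumptions based on my understanding of how RGW names head objects, so
please correct me if this is off):

radosgw-admin bucket stats --bucket=<bucket>   # note the "id" and "marker" fields
radosgw-admin zone get | grep data_pool        # find the data pool of the placement target
rados -p <data_pool> stat '<bucket_marker>_wp-content/uploads/sites/74/2025/03/mutation-house-no-629.pdf'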

Is this issue fixable in Octopus, or should I plan to upgrade the Ceph
cluster to Quincy?

Regards,
Danish


On Wed, Mar 26, 2025 at 2:41 AM Frédéric Nass <
frederic.n...@univ-lorraine.fr> wrote:

> Hi Danish,
>
> Can you specify the version of Ceph used and whether versioning is enabled
> on this bucket?
>
> There are two ways to clean up orphan entries in a bucket index that I'm
> aware of:
>
> - One (the preferable way) is to rely on the radosgw-admin command to check
> and hopefully fix the issue, cleaning orphan entries out of the index or
> even rebuilding the index entirely if necessary.
>
> There have been new radosgw-admin commands added recently [1] to clean up
> leftover OLH index entries and unlinked instance objects within versioned
> buckets.
>
> If this bucket is versioned, I would advise you to try the new check/fix
> commands mentioned in this release note [2]:
>
> radosgw-admin bucket check unlinked [--fix]
>
> radosgw-admin bucket check olh [--fix]
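>
> (If these subcommands are available on your version, my understanding from
> the release note is that they are run per bucket, along the lines of the
> following; the exact flags here are an assumption on my part:)
>
> radosgw-admin bucket check unlinked --bucket=<bucket> --fix
> radosgw-admin bucket check olh --bucket=<bucket> --fix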
>
> - Another one (as a fallback) is to act at the rados layer, identifying
> which index shard the orphan entry is listed in (listomapkeys) and removing
> it from that shard (rmomapkey). I could elaborate on that later if needed.
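>
> As a rough sketch of that second approach (the pool name and the
> .dir.<bucket_id>.<shard> index object name below are assumptions based on
> the usual RGW naming, so please double-check them on your cluster):
>
> # list, and keep a copy of, the omap keys of the suspect index shard
> rados -p default.rgw.buckets.index listomapkeys .dir.<bucket_id>.<shard>
> # then remove the single orphan entry from that shard
> rados -p default.rgw.buckets.index rmomapkey .dir.<bucket_id>.<shard> "<orphan_object_key>"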
>
> Regards,
> Frédéric.
>
> [1] https://tracker.ceph.com/issues/62075
> [2] https://ceph.io/en/news/blog/2023/v18-2-1-reef-released/
>
>
>
> ------------------------------
> *From:* Danish Khan <danish52....@gmail.com>
> *Sent:* Tuesday, March 25, 2025 17:16
> *To:* Frédéric Nass
> *Cc:* ceph-users
> *Subject:* Re: [ceph-users] ceph-ansible LARGE OMAP in RGW pool
>
> Hi Frédéric,
>
> Thank you for replying.
>
> I followed the steps mentioned in https://tracker.ceph.com/issues/62845
> and was able to trim all the errors.
>
> Everything seemed to be working fine until the same error appeared again.
>
> I still assume the main culprit of this issue is one missing object, and
> all the errors refer to this object only.
>
> I am able to list this object using the s3cmd tool, but I am unable to
> perform any action on it: I cannot delete it, overwrite it, or get it.
>
> I tried stopping the RGWs one by one, and even tried with all of the RGWs
> stopped, but recovery is still not completing.
>
> And the LARGE OMAP warning is now only growing.
>
> Is there a way I can delete this object from the index, or directly from the
> pool on the Ceph side, so that it no longer tries to recover it?
>
> Regards,
> Danish
>
>
>
> On Tue, Mar 25, 2025 at 11:29 AM Frédéric Nass <
> frederic.n...@univ-lorraine.fr> wrote:
>
>> Hi Danish,
>>
>> While reviewing the backports for upcoming v18.2.5, I came across this
>> [1]. Could be your issue.
>>
>> Can you try the suggested workaround (--marker=9) and report back?
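>>
>> (My reading of that tracker is that the workaround is to pass an end marker
>> that sorts after the "1_..." entries, something along the lines of the
>> command below; please double-check it against the tracker first:)
>>
>> radosgw-admin sync error trim --shard-id=X --marker=9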
>>
>> Regards,
>> Frédéric.
>>
>> [1] https://tracker.ceph.com/issues/62845
>>
>> ------------------------------
>> *From:* Danish Khan <danish52....@gmail.com>
>> *Sent:* Friday, March 14, 2025 23:11
>> *To:* Frédéric Nass
>> *Cc:* ceph-users
>> *Subject:* Re: [ceph-users] ceph-ansible LARGE OMAP in RGW pool
>>
>> Dear Frédéric,
>>
>> 1/ Identify the shards with the most sync error log entries:
>>
>> I have identified that the shard causing the issue is shard 31, but almost
>> all of the errors point to a single object in one bucket. The object exists
>> in the master zone, but I'm not sure why the replication site is unable to
>> sync it.
>>
>> 2/ For each shard, list every sync error log entry along with their ids:
>>
>> radosgw-admin sync error list --shard-id=X
>>
>> The output of this command mostly shows the same shard and the same object
>> (shard 31 and object
>> /plugins/plugins/yellow-pencil-visual-theme-customizer/images/cursor.png).
>>
>> 3/ Remove them **except the last one** with:
>>
>> radosgw-admin sync error trim --shard-id=X
>> --marker=1_1682101321.201434_8669.1
>>
>> Trimming did remove a few entries from the error log, but there are still
>> many error log entries for the same object which I am unable to trim.
>>
>> Now the trim command executes successfully but does not actually do anything.
>>
>> I am still getting errors in the radosgw log about the object that is not
>> syncing:
>>
>> 2025-03-15T03:05:48.060+0530 7fee2affd700  0
>> RGW-SYNC:data:sync:shard[80]:entry[mbackup:70134e66-872072ee2d32.2205852207.1:48]:bucket_sync_sources[target=:[]):source_bucket=:[]):source_zone=872072ee2d32]:bucket[mbackup:70134e66-872072ee2d32.2205852207.1:48<-mod-backup:70134e66-872072ee2d32.2205852207.1:48]:full_sync[mod-backup:70134e66-872072ee2d32.2205852207.1:48]:entry[wp-content/plugins/plugins/yellow-pencil-visual-theme-customizer/images/cursor.png]:
>> ERROR: failed to sync object:
>> mbackup:70134e66-872072ee2d32.2205852207.1:48/wp-content/plugins/plugins/yellow-pencil-visual-theme-customizer/images/cursor.png
>>
>> I have been getting this error for approximately two months, and if I
>> remember correctly, we have been getting the LARGE OMAP warning since then.
>>
>> I will try to delete this object from the Master zone on Monday and will
>> see if this fixes the issue.
>>
>> Do you have any other suggestions on this that I should consider?
>>
>> Regards,
>> Danish
>>
>>
>>
>>
>>
>>
>> On Thu, Mar 13, 2025 at 6:07 PM Frédéric Nass <
>> frederic.n...@univ-lorraine.fr> wrote:
>>
>>> Hi Danish,
>>>
>>> Can you access this KB article [1]? A free developer account should
>>> allow you to.
>>>
>>> It pretty much describes what you're facing and suggests trimming the sync
>>> error log of the recovering shards: specifically, every log entry **except
>>> the last one**.
>>>
>>> 1/ Identify the shards with the most sync error log entries:
>>>
>>> radosgw-admin sync error list --max-entries=1000000 | grep shard_id |
>>> sort -n | uniq -c | sort -h
>>>
>>> 2/ For each shard, list every sync error log entry along with their ids:
>>>
>>> radosgw-admin sync error list --shard-id=X
>>>
>>> 3/ Remove them **except the last one** with:
>>>
>>> radosgw-admin sync error trim --shard-id=X
>>> --marker=1_1682101321.201434_8669.1
>>>
>>> the --marker above being the log entry id.
>>>
>>> Are the replication threads running on the same RGWs that S3 clients are
>>> using?
>>>
>>> If so, using dedicated RGWs for the sync job might help you avoid
>>> non-recovering shards in the future, as described in Matthew's post [2].
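>>>
>>> As a sketch of that setup, you would typically disable the sync thread on
>>> the client-facing RGWs, e.g. in ceph.conf (the section name below is only
>>> an example):
>>>
>>> [client.rgw.client-facing]
>>> rgw_run_sync_thread = false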
>>>
>>> Regards,
>>> Frédéric.
>>>
>>> [1] https://access.redhat.com/solutions/7023912
>>> [2] https://www.spinics.net/lists/ceph-users/msg83988.html
>>>
>>> ----- On 12 Mar 25, at 11:15, Danish Khan danish52....@gmail.com wrote:
>>>
>>> > Dear All,
>>> >
>>> > My Ceph cluster has been giving the Large OMAP warning for approximately
>>> > 2-3 months. I have tried a few things, such as:
>>> > *Deep scrub of PGs*
>>> > *Compact OSDs*
>>> > *Trim log*
>>> > But these didn't work out.
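>>> >
>>> > (For the first two items, what I ran was along these lines; the PG and
>>> > OSD ids are placeholders:)
>>> > ceph pg deep-scrub <pg_id>
>>> > ceph tell osd.<id> compact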
>>> >
>>> > I guess the main issue is that 4 shards on the replication site have been
>>> > stuck recovering for 2-3 months.
>>> >
>>> > Any suggestions are highly appreciated.
>>> >
>>> > Sync status:
>>> > root@drhost1:~# radosgw-admin sync status
>>> >          realm e259e0a92 (object-storage)
>>> >      zonegroup 7a8606d2 (staas)
>>> >           zone c8022ad1 (repstaas)
>>> >  metadata sync syncing
>>> >                full sync: 0/64 shards
>>> >                incremental sync: 64/64 shards
>>> >                metadata is caught up with master
>>> >      data sync source: 2072ee2d32 (masterstaas)
>>> >                        syncing
>>> >                        full sync: 0/128 shards
>>> >                        incremental sync: 128/128 shards
>>> >                        data is behind on 3 shards
>>> >                        behind shards: [7,90,100]
>>> >                        oldest incremental change not applied:
>>> > 2025-03-12T13:14:10.268469+0530 [7]
>>> >                        4 shards are recovering
>>> >                        recovering shards: [31,41,55,80]
>>> >
>>> >
>>> > Master site:
>>> > 1. *root@master1:~# for obj in $(rados ls -p masterstaas.rgw.log); do
>>> echo
>>> > "$(rados listomapkeys -p masterstaas.rgw.log $obj | wc -l) $obj";done |
>>> > sort -nr | head -10*
>>> > 1225387 data_log.91
>>> > 1225065 data_log.86
>>> > 1224662 data_log.87
>>> > 1224448 data_log.92
>>> > 1224018 data_log.89
>>> > 1222156 data_log.93
>>> > 1201489 data_log.83
>>> > 1174125 data_log.90
>>> > 363498 data_log.84
>>> > 258709 data_log.6
>>> >
>>> >
>>> > 2. *root@master1:~# for obj in data_log.91 data_log.86 data_log.87
>>> > data_log.92 data_log.89 data_log.93 data_log.83 data_log.90; do rados
>>> stat
>>> > -p masterstaas.rgw.log $obj; done*
>>> > masterstaas.rgw.log/data_log.91 mtime 2025-02-24T15:09:25.000000+0530,
>>> size
>>> > 0
>>> > masterstaas.rgw.log/data_log.86 mtime 2025-02-24T15:01:25.000000+0530,
>>> size
>>> > 0
>>> > masterstaas.rgw.log/data_log.87 mtime 2025-02-24T15:02:25.000000+0530,
>>> size
>>> > 0
>>> > masterstaas.rgw.log/data_log.92 mtime 2025-02-24T15:11:01.000000+0530,
>>> size
>>> > 0
>>> > masterstaas.rgw.log/data_log.89 mtime 2025-02-24T14:54:55.000000+0530,
>>> size
>>> > 0
>>> > masterstaas.rgw.log/data_log.93 mtime 2025-02-24T14:53:25.000000+0530,
>>> size
>>> > 0
>>> > masterstaas.rgw.log/data_log.83 mtime 2025-02-24T14:16:21.000000+0530,
>>> size
>>> > 0
>>> > masterstaas.rgw.log/data_log.90 mtime 2025-02-24T15:05:25.000000+0530,
>>> size
>>> > 0
>>> >
>>> > *3. ceph cluster log :*
>>> > 2025-02-22T04:18:27.324886+0530 osd.173 (osd.173) 19 : cluster [WRN]
>>> Large
>>> > omap object found. Object: 124:b2ddf551:::data_log.93:head PG:
>>> 124.8aafbb4d
>>> > (124.d) Key count: 1218170 Size (bytes): 297085860
>>> > 2025-02-22T04:18:28.735886+0530 osd.65 (osd.65) 308 : cluster [WRN]
>>> Large
>>> > omap object found. Object: 124:f2081d70:::data_log.92:head PG:
>>> 124.eb8104f
>>> > (124.f) Key count: 1220420 Size (bytes): 295240028
>>> > 2025-02-22T04:18:30.668884+0530 mon.master1 (mon.0) 7974038 : cluster
>>> [WRN]
>>> > Health check update: 3 large omap objects (LARGE_OMAP_OBJECTS)
>>> > 2025-02-22T04:18:31.127585+0530 osd.18 (osd.18) 224 : cluster [WRN]
>>> Large
>>> > omap object found. Object: 124:d1061236:::data_log.86:head PG:
>>> 124.6c48608b
>>> > (124.b) Key count: 1221047 Size (bytes): 295398274
>>> > 2025-02-22T04:18:33.189561+0530 osd.37 (osd.37) 32665 : cluster [WRN]
>>> Large
>>> > omap object found. Object: 124:9a2e04b7:::data_log.87:head PG:
>>> 124.ed207459
>>> > (124.19) Key count: 1220584 Size (bytes): 295290366
>>> > 2025-02-22T04:18:35.007117+0530 osd.77 (osd.77) 135 : cluster [WRN]
>>> Large
>>> > omap object found. Object: 124:6b9e929a:::data_log.89:head PG:
>>> 124.594979d6
>>> > (124.16) Key count: 1219913 Size (bytes): 295127816
>>> > 2025-02-22T04:18:36.189141+0530 mon.master1 (mon.0) 7974039 : cluster
>>> [WRN]
>>> > Health check update: 5 large omap objects (LARGE_OMAP_OBJECTS)
>>> > 2025-02-22T04:18:36.340247+0530 osd.112 (osd.112) 259 : cluster [WRN]
>>> Large
>>> > omap object found. Object: 124:0958bece:::data_log.83:head PG:
>>> 124.737d1a90
>>> > (124.10) Key count: 1200406 Size (bytes): 290280292
>>> > 2025-02-22T04:18:38.523766+0530 osd.73 (osd.73) 1064 : cluster [WRN]
>>> Large
>>> > omap object found. Object: 124:fddd971f:::data_log.91:head PG:
>>> 124.f8e9bbbf
>>> > (124.3f) Key count: 1221183 Size (bytes): 295425320
>>> > 2025-02-22T04:18:42.619926+0530 osd.92 (osd.92) 285 : cluster [WRN]
>>> Large
>>> > omap object found. Object: 124:7dc404fa:::data_log.90:head PG:
>>> 124.5f2023be
>>> > (124.3e) Key count: 1169895 Size (bytes): 283025576
>>> > 2025-02-22T04:18:44.242655+0530 mon.master1 (mon.0) 7974043 : cluster
>>> [WRN]
>>> > Health check update: 8 large omap objects (LARGE_OMAP_OBJECTS)
>>> >
>>> > Replica site:
>>> > 1. *for obj in $(rados ls -p repstaas.rgw.log); do echo "$(rados
>>> > listomapkeys -p repstaas.rgw.log $obj | wc -l) $obj";done | sort -nr |
>>> head
>>> > -10*
>>> >
>>> > 432850 data_log.91
>>> > 432384 data_log.87
>>> > 432323 data_log.93
>>> > 431783 data_log.86
>>> > 431510 data_log.92
>>> > 427959 data_log.89
>>> > 414522 data_log.90
>>> > 407571 data_log.83
>>> > 151015 data_log.84
>>> > 109790 data_log.4
>>> >
>>> >
>>> > 2. *ceph cluster log:*
>>> > grep -ir "Large omap object found" /var/log/ceph/
>>> > /var/log/ceph/ceph-mon.drhost1.log:2025-03-12T14:49:59.997+0530
>>> > 7fc4ad544700  0 log_channel(cluster) log [WRN] :     Search the
>>> cluster log
>>> > for 'Large omap object found' for more details.
>>> > /var/log/ceph/ceph.log:2025-03-12T14:49:02.078108+0530 osd.10 (osd.10)
>>> 21 :
>>> > cluster [WRN] Large omap object found. Object:
>>> > 6:b2ddf551:::data_log.93:head PG: 6.8aafbb4d (6.d) Key count: 432323
>>> Size
>>> > (bytes): 105505884
>>> > /var/log/ceph/ceph.log:2025-03-12T14:49:02.389288+0530 osd.48 (osd.48)
>>> 37 :
>>> > cluster [WRN] Large omap object found. Object:
>>> > 6:d1061236:::data_log.86:head PG: 6.6c48608b (6.b) Key count: 431782
>>> Size
>>> > (bytes): 104564674
>>> > /var/log/ceph/ceph.log:2025-03-12T14:49:07.166954+0530 osd.24 (osd.24)
>>> 24 :
>>> > cluster [WRN] Large omap object found. Object:
>>> > 6:0958bece:::data_log.83:head PG: 6.737d1a90 (6.10) Key count: 407571
>>> Size
>>> > (bytes): 98635522
>>> > /var/log/ceph/ceph.log:2025-03-12T14:49:09.100110+0530 osd.63 (osd.63)
>>> 5 :
>>> > cluster [WRN] Large omap object found. Object:
>>> > 6:9a2e04b7:::data_log.87:head PG: 6.ed207459 (6.19) Key count: 432384
>>> Size
>>> > (bytes): 104712350
>>> > /var/log/ceph/ceph.log:2025-03-12T14:49:08.703760+0530 osd.59 (osd.59)
>>> 11 :
>>> > cluster [WRN] Large omap object found. Object:
>>> > 6:6b9e929a:::data_log.89:head PG: 6.594979d6 (6.16) Key count: 427959
>>> Size
>>> > (bytes): 103773777
>>> > /var/log/ceph/ceph.log:2025-03-12T14:49:11.126132+0530 osd.40 (osd.40)
>>> 24 :
>>> > cluster [WRN] Large omap object found. Object:
>>> > 6:f2081d70:::data_log.92:head PG: 6.eb8104f (6.f) Key count: 431508
>>> Size
>>> > (bytes): 104520406
>>> > /var/log/ceph/ceph.log:2025-03-12T14:49:13.799473+0530 osd.43 (osd.43)
>>> 61 :
>>> > cluster [WRN] Large omap object found. Object:
>>> > 6:fddd971f:::data_log.91:head PG: 6.f8e9bbbf (6.1f) Key count: 432850
>>> Size
>>> > (bytes): 104418869
>>> > /var/log/ceph/ceph.log:2025-03-12T14:49:14.398480+0530 osd.3 (osd.3)
>>> 55 :
>>> > cluster [WRN] Large omap object found. Object:
>>> > 6:7dc404fa:::data_log.90:head PG: 6.5f2023be (6.1e) Key count: 414521
>>> Size
>>> > (bytes): 100396561
>>> > /var/log/ceph/ceph.log:2025-03-12T14:50:00.000484+0530 mon.drhost1
>>> (mon.0)
>>> > 207423 : cluster [WRN]     Search the cluster log for 'Large omap
>>> object
>>> > found' for more details.
>>> >
>>> > Regards,
>>> > Danish
>>> > _______________________________________________
>>> > ceph-users mailing list -- ceph-users@ceph.io
>>> > To unsubscribe send an email to ceph-users-le...@ceph.io
>>
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
