[ceph-users] Ceph Pacific bluefs enospc bug with newly created OSDs

2023-06-20 Thread Carsten Grommel
Hi all, we are experiencing the “bluefs enospc bug” again after redeploying all OSDs of our Pacific Cluster. I know that our cluster is a bit too utilized at the moment with 87.26 % raw usage but still this should not happen afaik. We never hat this problem with previous ceph versions and right

[ceph-users] Re: OpenStack (cinder) volumes retyping on Ceph back-end

2023-06-20 Thread Eugen Block
I can confirm, in a virtual test openstack environment (Wallaby) with ceph quincy I did a retype of an attached volume (root disk of a VM). Retyping works, the volume is copied to the other back end pool, but the IO is still going to the old pool/image although they already have been remove
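
A quick sketch for checking whether the VM is in fact still writing to the old image: rbd status lists the watchers (active clients) of an image, and rbd info confirms the copied image in the target pool. Pool and image names below are placeholders:

  rbd status <old-pool>/volume-<uuid>   # watchers still attached to the old image
  rbd info <new-pool>/volume-<uuid>     # the retyped copy in the new back-end pool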

[ceph-users] Re: OpenStack (cinder) volumes retyping on Ceph back-end

2023-06-20 Thread Andrea Martra
Hello, thank you for the confirmation. I reported the problem on the openstack-discuss mailing list. Thanks, Andrea On 20/06/23 10:15, Eugen Block wrote: I can confirm, in a virtual test openstack environment (Wallaby) with ceph quincy I did a retype of an attached volume (root disk of a VM).

[ceph-users] X large objects found in pool 'XXX.rgw.buckets.index'

2023-06-20 Thread Gilles Mocellin
Hello, I have had large OMAP objects for a year now. These objects are probably from an ancient bucket that has been removed, so I cannot use bilog trim. Deep-scrub does nothing. Also, even if I don't have a huge cluster (my Object Storage pool is only around 10 TB), the rgw-orphan-list is too
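
One way to tie a large index object back to a (possibly deleted) bucket instance, sketched with placeholder names: bucket index objects are named .dir.<bucket-instance-id>.<shard>, so the instance id embedded in the object name can be checked against the surviving bucket instance metadata.

  rados -p XXX.rgw.buckets.index ls | head                        # object names contain the bucket instance id
  radosgw-admin metadata list bucket.instance                     # does that instance id still exist?
  rados -p XXX.rgw.buckets.index listomapkeys .dir.<instance-id>.<shard> | wc -l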

[ceph-users] radosgw new zonegroup hammers master with metadata sync

2023-06-20 Thread Boris Behrens
Hi, yesterday I added a new zonegroup and it seems to cycle over the same requests over and over again. In the log of the main zone I see these requests: 2023-06-20T09:48:37.979+ 7f8941fb3700 1 beast: 0x7f8a602f3700: fd00:2380:0:24::136 - - [2023-06-20T09:48:37.979941+] "GET
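
For reference, the overall and metadata sync state of the new zone can be inspected with the standard sync status commands, run against the zone that is doing the polling; a minimal sketch:

  radosgw-admin sync status
  radosgw-admin metadata sync status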

[ceph-users] osd memory target not working

2023-06-20 Thread farhad kh
When setting osd_memory_target to limit memory usage for an OSD, I expected this value to be applied to the OSD container. But with the docker stats command, this value is not seen. Is my perception of this process wrong? --- [root@opcsdfpsbpp0201 ~]# ceph orch ps | grep osd.12 osd.12
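
As I understand it, osd_memory_target is a target for the OSD's own memory autotuning rather than a hard container/cgroup limit, so docker stats would not show it as a limit. A sketch for setting and verifying the option (the 4 GiB value and osd.12 are placeholders):

  ceph config set osd osd_memory_target 4294967296    # cluster-wide default for OSDs
  ceph config get osd.12 osd_memory_target             # effective value for one OSD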

[ceph-users] Re: Ceph Pacific bluefs enospc bug with newly created OSDs

2023-06-20 Thread Igor Fedotov
Hi Carsten, first of all, Quincy does have a fix for the issue, see https://tracker.ceph.com/issues/53466 (and its Quincy counterpart https://tracker.ceph.com/issues/58588). Could you please share a bit more info on the OSD disk layout? SSD or HDD? Standalone or shared DB volume? I presume the lat
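
For the layout question, the OSD metadata usually answers it without logging into the host; a sketch with a placeholder OSD id (exact field names can vary between releases):

  ceph osd metadata 12 | grep -E 'devices|rotational|bluefs'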

[ceph-users] Re: osd memory target not working

2023-06-20 Thread Mark Nelson
Hi Farhad, I wrote the underlying osd memory target code. OSDs won't always use all of the memory if there is nothing driving a need. The primary driver of memory usage is the meta and data caches needing more memory to keep the hit rates high. If you perform some reads/writes acros
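
To see where the memory is actually going (bluestore caches versus everything else), dumping the mempools is the usual starting point; a sketch, reusing osd.12 from the earlier output:

  ceph tell osd.12 dump_mempools                       # per-pool memory accounting
  ceph tell osd.12 config get osd_memory_target        # confirm the target the OSD is tuning towards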

[ceph-users] Recover OSDs from folder /var/lib/ceph/uuid/removed

2023-06-20 Thread Malte Stroem
Hello, is it possible to recover an OSD after it was removed? The systemd service was removed, but the block device is still listed under lsblk and the config files are still available under /var/lib/ceph/uuid/removed. It is a containerized cluster. So I think we need to add the cephx entries, u

[ceph-users] Re: radosgw new zonegroup hammers master with metadata sync

2023-06-20 Thread Casey Bodley
Hi Boris, we've been investigating reports of excessive polling from metadata sync. I just opened https://tracker.ceph.com/issues/61743 to track this. Restarting the secondary zone radosgws should help as a temporary workaround. On Tue, Jun 20, 2023 at 5:57 AM Boris Behrens wrote: > > Hi, > yeste
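
As a sketch of that workaround, depending on how the gateways are deployed (service and unit names below are placeholders):

  ceph orch restart rgw.<service-name>           # cephadm-managed gateways
  systemctl restart ceph-radosgw@rgw.<name>      # package-based installs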

[ceph-users] Re: Recover OSDs from folder /var/lib/ceph/uuid/removed

2023-06-20 Thread Malte Stroem
Well, things I would do: - add the keyring to ceph auth: ceph auth add osd.XX osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/uuid/osd.XX/keyring - add the OSD to CRUSH: ceph osd crush set osd.XX 1.0 root=default ... - create the systemd service: systemctl enable ceph-u...@osd.xx.service Is there som
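
Spelled out as a sketch with placeholder values (XX, the cluster fsid and the host name are assumptions; the caps below follow the standard OSD profile rather than the broader caps quoted above):

  ceph auth add osd.XX mon 'allow profile osd' mgr 'allow profile osd' osd 'allow *' \
      -i /var/lib/ceph/<fsid>/osd.XX/keyring
  ceph osd crush set osd.XX 1.0 root=default host=<hostname>
  systemctl enable --now ceph-<fsid>@osd.XX.service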

[ceph-users] Re: RGW STS Token Forbidden error since upgrading to Quincy 17.2.6

2023-06-20 Thread Austin Axworthy
Hi Pritha, I have increased the debug logs and pasted the output below. I have 2 users, austin and test. Austin is the owner user on the buckets, and I am trying to assume the role with the test user. I have also tried to assume the role of austin with the same user, but still get the same forb

[ceph-users] 1 PG stuck in "active+undersized+degraded" for a long time

2023-06-20 Thread siddhit . renake
Hello All, Ceph version: 14.2.5-382-g8881d33957 (8881d33957b54b101eae9c7627b351af10e87ee8) nautilus (stable) Issue: 1 PG stuck in "active+undersized+degraded" for a long time. Degraded data redundancy: 44800/8717052637 objects degraded (0.001%), 1 pg degraded, 1 pg undersized #ceph pg dump_stuck
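
To see why the PG cannot place its missing replica, querying the PG and checking the CRUSH tree around the acting OSDs is usually the first step; a sketch where 1.2f stands in for the actual PG id:

  ceph pg dump_stuck undersized
  ceph pg 1.2f query        # up/acting sets and recovery state
  ceph osd df tree          # are candidate OSDs full, down or out?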

[ceph-users] Re: radosgw new zonegroup hammers master with metadata sync

2023-06-20 Thread Boris
Hi Casey, I already restarted all RGW instances. It only helped for 2 minutes. We have now stopped the new site. I will remove and recreate it later. As two other sites don't have the problem, I currently think I made a mistake in the process. Kind regards - Boris Behrens > Am 20.06.202

[ceph-users] Re: Starting v17.2.5 RGW SSE with default key (likely others) no longer works

2023-06-20 Thread Jayanth Reddy
Thanks, Casey for the response. I'll track the fix there. Thanks, Jayanth Reddy

[ceph-users] Re: [rgw multisite] Perpetual behind

2023-06-20 Thread kchheda3
Hi Yixin, we faced a similar issue; this was the tracker, https://tracker.ceph.com/issues/57562, which has all the details.

[ceph-users] OSDs cannot join cluster anymore

2023-06-20 Thread Malte Stroem
Hello, we removed some nodes from our cluster. This worked without problems. Now, lots of OSDs do not want to join the cluster anymore if we reboot one of the still available nodes. It always runs into timeouts: --> ceph-volume lvm activate successful for osd ID: XX monclient(hunting): authe
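
Since ceph-volume activation succeeds but the monclient hangs while authenticating, the usual suspects are a cephx key mismatch or stale monitor addresses; a sketch with placeholder ids (on a cephadm cluster the per-OSD config and keyring live under /var/lib/ceph/<fsid>/osd.XX/):

  ceph auth get osd.XX     # compare with the keyring on the rebooted host
  ceph mon dump            # confirm the mon addresses the OSDs are trying to reach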

[ceph-users] Re: [rgw multisite] Perpetual behind

2023-06-20 Thread kchheda3
And as per the tracker, the fix was merged to Quincy and is available in 17.2.6 (looking at the release notes), so you might want to upgrade your cluster and re-run your tests. Note, the existing issue will not go away after upgrading to 17.2.6; you will have to manually sync the buckets that a
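
Per-bucket sync can be inspected and restarted on the zone that is behind with the bucket sync subcommands; a minimal sketch with a placeholder bucket name:

  radosgw-admin bucket sync status --bucket=<bucket>
  radosgw-admin bucket sync run --bucket=<bucket>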

[ceph-users] Error while adding host : Error EINVAL: Traceback (most recent call last): File /usr/share/ceph/mgr/mgr_module.py, line 1756, in _handle_command

2023-06-20 Thread Adiga, Anantha
Hi, I am seeing this error after an offline host was deleted and while adding the host again. Since then, I have removed the /var/lib/ceph folder and removed the Ceph Quincy image on the offline host. What is the cause of this issue, and what is the solution? root@fl31ca104ja0201:/home/general# cephadm sh
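
For reference, once the host has been cleaned up, re-adding it generally means re-installing the mgr's SSH key and running orch host add again; a sketch with placeholder hostname and address:

  ceph cephadm get-pub-key > ~/ceph.pub
  ssh-copy-id -f -i ~/ceph.pub root@<hostname>
  ceph orch host add <hostname> <ip-address>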

[ceph-users] Re: Error while adding host : Error EINVAL: Traceback (most recent call last): File /usr/share/ceph/mgr/mgr_module.py, line 1756, in _handle_command

2023-06-20 Thread Adam King
There was a cephadm bug that wasn't fixed by the time 17.2.6 came out (I'm assuming that's the version being used here, although it may have been present in some slightly earlier quincy versions) that caused this misleading error to be printed out when adding a host failed. There's a tracker for it

[ceph-users] Re: Error while adding host : Error EINVAL: Traceback (most recent call last): File /usr/share/ceph/mgr/mgr_module.py, line 1756, in _handle_command

2023-06-20 Thread Adiga, Anantha
Hi Adam, Thank you for the details. I see that the cephadm version on the Ceph cluster is different from that on the host being added. I will go through the ticket and the logs. Also, the cluster is on Ubuntu Focal and the new host is on Ubuntu Jammy. The utility: cephadm 16.2.1

[ceph-users] Re: radosgw new zonegroup hammers master with metadata sync

2023-06-20 Thread Boris Behrens
I recreated the site and the problem still persists. I've upped the logging and saw this for a lot of buckets (I've stopped the debug log after some seconds). 2023-06-20T23:32:29.365+ 7fcaab7fe700 20 get_system_obj_state: rctx=0x7fcaab7f9320 obj=dc3.rgw.meta:root:s3bucket-fra2 state=0x7fcba05a

[ceph-users] [question] Put with "tagging" is slow?

2023-06-20 Thread Louis Koo
2023-06-21T02:48:50.754+ 7f1cd5b84700 1 beast: 0x7f1c4b26e630: 10.x.x.83 - xx [21/Jun/2023:02:48:47.653 +] "PUT /zhucan/deb/content/vol-26/chap-41/3a917ec7-02b3-4b45-8c0c-be32f4914708.bytes?tagging HTTP/1.1" 200 0 - "aws-sdk-java/1.12.299 Linux/3.10.0-1127.el7.x86_64 OpenJDK_64-Bit_Ser

[ceph-users] alerts in dashboard

2023-06-20 Thread Ben
Hi, I got many critical alerts in the Ceph dashboard. Meanwhile, the cluster shows a health OK status. See the attached screenshot for details. My questions are: are they real alerts, and how do I get rid of them? Thanks, Ben