Hi all,
we are experiencing the “bluefs enospc bug” again after redeploying all OSDs of
our Pacific Cluster.
I know that our cluster is a bit too utilized at the moment with 87.26 % raw
usage, but this still should not happen, afaik.
We never had this problem with previous Ceph versions, and right
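For context, a quick way to see how close individual OSDs are to the full
thresholds (just a sketch, run from any node with an admin keyring):

# per-OSD utilization and variance
ceph osd df
# configured nearfull/backfillfull/full ratios
ceph osd dump | grep ratio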
I can confirm, in a virtual test openstack environment (Wallaby) with
ceph quincy I did a retype of an attached volume (root disk of a VM).
Retyping works, the volume is copied to the other back end pool, but
the IO is still going to the old pool/image although they have already
been removed.
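As a rough way to verify this from the Ceph side (pool and image names below
are placeholders for your volume), you can check which image still has
watchers and is still receiving IO:

# watchers on the old and the new image
rbd status old-pool/volume-xyz
rbd status new-pool/volume-xyz
# per-image IO, if the rbd_support mgr module is enabled
rbd perf image iostat old-pool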
Hello,
thank you for the confirmation.
I reported the problem on the openstack-discuss mailing list.
Thanks,
Andrea
On 20/06/23 10:15, Eugen Block wrote:
I can confirm, in a virtual test openstack environment (Wallaby) with
ceph quincy I did a retype of an attached volume (root disk of a VM).
Hello,
I have had large OMAP objects for a year now.
These objects probably belong to an old bucket that has been removed,
so I cannot use bilog trim. Deep-scrub does nothing.
Also, even though I don't have a huge cluster (my Object Storage pool is
only around 10 TB), the rgw-orphan-list is too
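In case it helps, this is roughly how I would locate the reported index
objects and check whether their bucket instance still exists (pool name,
bucket id and shard are placeholders):

# the health warning names the pool and objects involved
ceph health detail
# count the omap keys on one of the reported index objects
rados -p default.rgw.buckets.index listomapkeys .dir.<bucket-id>.<shard> | wc -l
# check whether that bucket instance is still known to RGW
radosgw-admin metadata list bucket.instance | grep <bucket-id>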
Hi,
yesterday I added a new zonegroup, and it seems to cycle over the same
requests over and over again.
In the log of the main zone I see these requests:
2023-06-20T09:48:37.979+ 7f8941fb3700 1 beast: 0x7f8a602f3700:
fd00:2380:0:24::136 - - [2023-06-20T09:48:37.979941+] "GET
When I set osd_memory_target to limit the memory usage of an OSD, I expect
this value to be applied to the OSD container. But with the docker stats
command, this value is not seen. Is my perception of this process wrong?
---
[root@opcsdfpsbpp0201 ~]# ceph orch ps | grep osd.12
osd.12
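As far as I know, osd_memory_target is a target the OSD itself tries to stay
under by shrinking its caches; it is not applied as a container/cgroup limit,
so docker stats only shows the actual usage. A quick sketch to check the
effective value for your example OSD:

# value stored in the config database
ceph config get osd.12 osd_memory_target
# value the running daemon is actually using
ceph tell osd.12 config get osd_memory_target
# memory usage/limit as reported by the orchestrator
ceph orch ps --daemon-type osd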
Hi Carsten,
first of all Quincy does have a fix for the issue, see
https://tracker.ceph.com/issues/53466 (and its Quincy counterpart
https://tracker.ceph.com/issues/58588)
Could you please share a bit more info on OSD disk layout?
SSD or HDD? Standalone or shared DB volume? I presume the lat
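For example, something like the following usually shows the relevant layout
details (osd.0 is just a placeholder):

# device class, rotational flag, separate DB/WAL devices
ceph osd metadata 0
# LVM layout as seen on the OSD host
ceph-volume lvm list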
Hi Farhad,
I wrote the underlying osd memory target code. OSDs won't always use
all of the memory if there is nothing driving a need. Primarily the
driver of memory usage will be the meta and data caches needing more
memory to keep the hit rates high. If you perform some reads/writes
acros
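If you want to see where that memory actually goes, a rough sketch (osd.12 is
just an example id):

# cache and other allocations broken down by mempool (run on the OSD host)
ceph daemon osd.12 dump_mempools
# heap usage as seen by tcmalloc
ceph tell osd.12 heap stats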
Hello,
is it possible to recover an OSD if it was removed?
The systemd service was removed but the block device is still listed under
lsblk
and the config files are still available under
/var/lib/ceph/uuid/removed.
It is a containerized cluster, so I think we need to add the cephx
entries, u
hi Boris,
we've been investigating reports of excessive polling from metadata
sync. i just opened https://tracker.ceph.com/issues/61743 to track
this. restarting the secondary zone radosgws should help as a
temporary workaround
On Tue, Jun 20, 2023 at 5:57 AM Boris Behrens wrote:
>
> Hi,
> yeste
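If the radosgws are cephadm-managed, the restart could look roughly like this
(the service name is a placeholder), with the sync state checked before and
after on the secondary zone:

# metadata sync state on the secondary zone
radosgw-admin metadata sync status
# restart all rgw daemons of that service
ceph orch restart rgw.myrgw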
Well, things I would do:
- add the keyring to ceph auth
ceph auth add osd.XX osd 'allow *' mon 'allow rwx' -i
/var/lib/ceph/uuid/osd.XX/keyring
- add OSD to crush
ceph osd crush set osd.XX 1.0 root=default ...
- create systemd service
systemctl enable ceph-u...@osd.xx.service
Is there som
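On a cephadm-managed cluster, once the auth and crush entries are back it may
also be enough to let the orchestrator re-adopt the still-intact LVM volumes,
roughly (the hostname is a placeholder):

# scan the host for existing OSD volumes and start containers for them
ceph cephadm osd activate <hostname>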
Hi Pritha,
I have increased the debug logs and pasted the output below. I have 2 users,
austin and test. Austin is the owner user on the buckets, and I am trying to
assume the role with the test user. I have also tried to assume the role of
austin with the same user, but still get the same forb
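In case it is useful, this is a minimal sketch of how I double-check the role
and its trust policy on the RGW side and then retry the call (role name,
endpoint and ARN are placeholders):

# role definition, including the assume-role (trust) policy document
radosgw-admin role get --role-name=my-role
# permission policies attached to the role
radosgw-admin role-policy list --role-name=my-role
# STS call with the 'test' user's credentials
aws --profile test --endpoint-url http://rgw.example.com:8000 sts assume-role \
  --role-arn arn:aws:iam:::role/my-role --role-session-name s1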
Hello All,
Ceph version: 14.2.5-382-g8881d33957 (8881d33957b54b101eae9c7627b351af10e87ee8)
nautilus (stable)
Issue:
1 PG stuck in "active+undersized+degraded" for a long time
Degraded data redundancy: 44800/8717052637 objects degraded (0.001%), 1 pg
degraded, 1 pg undersized
#ceph pg dump_stuck
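This is roughly what I would look at to see why the PG cannot find another
OSD (pg id and pool name are placeholders):

# up/acting sets and what recovery is blocked on
ceph pg 1.2f query
# pool replication settings and the crush rule in use
ceph osd pool get mypool size
ceph osd pool get mypool crush_rule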
Hi Casey,
I already restarted all RGW instances; it only helped for 2 minutes. We have now
stopped the new site.
I will remove and recreate it later.
As the two other sites don't have the problem, I currently think I made a
mistake in the process.
Kind regards
- Boris Behrens
> On 20.06.202
Thanks, Casey for the response. I'll track the fix there.
Thanks,
Jayanth Reddy
Hi Yixin,
we faced a similar issue; this is the tracker:
https://tracker.ceph.com/issues/57562, which has all the details.
Hello,
we removed some nodes from our cluster. This worked without problems.
Now, lots of OSDs do not want to join the cluster anymore if we reboot
one of the still available nodes.
It always runs into timeouts:
--> ceph-volume lvm activate successful for osd ID: XX
monclient(hunting): authe
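monclient(hunting) usually means the OSD cannot authenticate to or reach the
mons. A quick sketch of what I would compare (osd id and path are
placeholders):

# key the cluster expects for this OSD
ceph auth get osd.XX
# key the daemon actually has on disk
cat /var/lib/ceph/osd/ceph-XX/keyring
# verify the node can reach the mons at all
ceph -s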
And as per the tracker, the fix was merged to Quincy and is available in
17.2.6 (looking at the release notes), so you might want to upgrade your
cluster and re-run your tests.
Note, the existing issue will not go away after upgrading to 17.2.6; you will
have to manually sync the buckets that a
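For the manual sync, it should be something like this per affected bucket
(the bucket name is a placeholder):

# check the per-bucket sync state
radosgw-admin bucket sync status --bucket=mybucket
# kick off a sync run for that bucket
radosgw-admin bucket sync run --bucket=mybucket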
Hi,
I am seeing this error after an offline host was deleted and while adding the
host again. Thereafter, I removed the /var/lib/ceph folder and removed the Ceph
Quincy image on the offline host. What is the cause of this issue, and what is
the solution?
root@fl31ca104ja0201:/home/general# cephadm sh
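For reference, the usual sequence for removing a dead host and re-adding it
looks roughly like this (hostname and IP are placeholders):

# remove the unreachable host from the orchestrator
ceph orch host rm host1 --offline --force
# make sure the cluster's ssh key is on the reinstalled host
ssh-copy-id -f -i /etc/ceph/ceph.pub root@host1
# add it back
ceph orch host add host1 10.0.0.1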
There was a cephadm bug that wasn't fixed by the time 17.2.6 came out (I'm
assuming that's the version being used here, although it may have been
present in some slightly earlier quincy versions) that caused this
misleading error to be printed out when adding a host failed. There's a
tracker for it
Hi Adam,
Thank you for the details. I see that the cephadm on the Ceph cluster is
different from the cephadm on the host that is being added. I will go through
the ticket and the logs. Also, the cluster is on Ubuntu Focal and the new host
is on Ubuntu Jammy.
The utility:
cephadm 16.2.1
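A quick way to compare the two versions (just a sketch):

# on the new host
cephadm version
# versions of the daemons running in the cluster
ceph versions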
I recreated the site and the problem still persists.
I've increased the logging and saw this for a lot of buckets (I stopped the
debug log after a few seconds).
2023-06-20T23:32:29.365+ 7fcaab7fe700 20 get_system_obj_state:
rctx=0x7fcaab7f9320 obj=dc3.rgw.meta:root:s3bucket-fra2
state=0x7fcba05a
2023-06-21T02:48:50.754+ 7f1cd5b84700 1 beast: 0x7f1c4b26e630: 10.x.x.83 -
xx [21/Jun/2023:02:48:47.653 +] "PUT
/zhucan/deb/content/vol-26/chap-41/3a917ec7-02b3-4b45-8c0c-be32f4914708.bytes?tagging
HTTP/1.1" 200 0 - "aws-sdk-java/1.12.299 Linux/3.10.0-1127.el7.x86_64
OpenJDK_64-Bit_Ser
Hi,
I am getting many critical alerts in the Ceph dashboard, while the cluster
shows HEALTH_OK. See the attached screenshot for details. My questions are:
are these real alerts? How do I get rid of them?
Thanks
Ben
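As a rough first check of whether the alerts come from Ceph itself or from the
Prometheus/Alertmanager stack that the dashboard displays (a sketch):

# anything Ceph itself is warning about
ceph health detail
# recent daemon crashes that can trigger alerts
ceph crash ls
# is the monitoring stack deployed and running
ceph orch ps --daemon-type alertmanager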