Hi Jesper,
could you please provide more details about the cluster (the usual
like 'ceph osd tree', 'ceph osd df', 'ceph versions')?
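For example, the plain output of something along these lines would be a good
start (plus anything else you think is relevant):

  $ ceph osd tree      # OSD/host/CRUSH layout
  $ ceph osd df        # per-OSD utilization and PG counts
  $ ceph versions      # daemon versions across the cluster
  $ ceph -s            # overall cluster state at the time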
I find it unusual to enable maintenance mode to add OSDs; is there a
specific reason?
And why add OSDs manually with 'ceph orch osd add', why not have a
sp
Hi Thorn,
given the amount of files in the CephFS volume, I presume you don't have
a severe write load against it. Is that correct?
If so, we can assume that the numbers you're sharing mostly refer to
your experiment. At peak I can see bytes_used increase = 629,461,893,120
bytes (45978612027392
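For anyone who wants to track this over time, per-pool usage can be sampled
roughly as below; the jq filter and exact JSON field names are an assumption
and may vary between releases:

  $ ceph df detail
  $ ceph df detail -f json | \
      jq '.pools[] | {name: .name, bytes_used: .stats.bytes_used}'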
Hello Thorne,
Here is one more suggestion on how to debug this. Right now, there is
uncertainty about whether there is really a disk space leak or whether
something simply wrote new data during the test.
If you have at least three OSDs you can reassign, please set their
CRUSH device class to something di
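A rough sketch of what that reassignment could look like, assuming the intent
is to move a few OSDs into a dedicated device class and point a fresh test
pool at them (the class, rule, pool names and OSD ids below are placeholders):

  $ ceph osd crush rm-device-class osd.10 osd.11 osd.12
  $ ceph osd crush set-device-class leaktest osd.10 osd.11 osd.12
  # a rule restricted to that class, then a test pool using it
  $ ceph osd crush rule create-replicated leaktest-rule default host leaktest
  $ ceph osd pool set testpool crush_rule leaktest-rule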
Possibly a naive question, and perhaps a trivial one, but is there any
good reason to return a “1” on success for cephadm host-maintenance enter and
exit:
~$ sudo cephadm host-maintenance enter --fsid -XX--X
Inferring config /var/lib/ceph/-XX--Xconfig/c
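The behaviour is easy to confirm by checking the shell's exit status right
after the call (fsid redacted as above):

  ~$ sudo cephadm host-maintenance enter --fsid <fsid>
  ~$ echo $?
  1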
Hello Gregory and Nathan,
Having a look at our resource utilization, there doesn't seem to be a
CPU or memory bottleneck, as there is plenty of both available on the
host which has the blocked OSD as well as on the MDS's host.
We've had a repeat of this problem today where the OSD logging slow
Hi,
in our cluster (17.2.6) disks fail from time to time. Block devices are
HDD, DB devices are NVME. However, the OSD process does not reliably
die. That leads to blocked client IO for all requests for which the OSD
with the broken disk is the primary OSD. All pools on these OSDs are EC
pool
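A possible stopgap (not a fix) is to spot and kick the affected OSD by hand
once a disk goes bad; the OSD id below is a placeholder and the systemd unit
name depends on how the OSDs were deployed:

  $ ceph health detail | grep -i slow   # find the OSD with blocked requests
  $ ceph osd down osd.42                # force re-peering away from the bad primary
  $ systemctl stop ceph-osd@42          # or stop the daemon on the affected host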
On Tuesday, March 19, 2024 7:32:47 AM EDT Daniel Brown wrote:
> Possibly a naive question, and perhaps a trivial one, but is there any
> good reason to return a “1” on success for cephadm host-maintenance enter
> and exit:
No, I doubt that was intentional. The function is written in a way
Hello Ivan,
Do you observe any spikes in the memory utilization of the MDS when the
lock happens? Particularly in buffer_anon?
We are observing some rdlock issues that lead to a spinlock on the MDS,
but they do not seem to be related to a hanging operation on an OSD.
Cheers,
Enrico
O
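For reference, buffer_anon can be read from the MDS mempool dump on the MDS
host; the exact invocation below is an assumption and the daemon name is a
placeholder:

  $ ceph daemon mds.<name> dump_mempools   # buffer_anon bytes/items
  $ ceph tell mds.<name> perf dump         # broader counters to correlate with the lock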
Hi Daniel,
translating EIO to upper layers rather than crashing an OSD is a valid
default behavior. One can alter this by setting the bluestore_fail_eio
parameter to true.
Thanks,
Igor
On 3/19/2024 2:50 PM, Daniel Schreiber wrote:
Hi,
in our cluster (17.2.6) disks fail from time to time. Blo
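For completeness, and assuming the usual centralized config path is fine, the
switch would be flipped like this; note that with it set, an OSD hitting EIO
goes down instead of passing the error up:

  $ ceph config set osd bluestore_fail_eio true
  $ ceph config set osd.42 bluestore_fail_eio true   # or limit it to a single OSD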
Hello,
a user in our Ceph cluster is suddenly not able to write to one of
his buckets.
Reading works fine.
All other buckets work fine.
If we copy the bucket to another bucket on the same cluster, the error
persists: writing is not possible in the new bucket either.
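In case it helps anyone narrow this down, these are the kind of checks that
apply here, assuming radosgw-admin access (bucket name and uid are
placeholders):

  $ radosgw-admin bucket stats --bucket=<bucket>   # size, num_objects, bucket quota
  $ radosgw-admin user info --uid=<uid>            # user quota, caps, suspension state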
Hi,
On 3/19/24 13:00, Igor Fedotov wrote:
translating EIO to upper layers rather than crashing an OSD is a valid
default behavior. One can alter this by setting the bluestore_fail_eio
parameter to true.
What is the benefit of this behavior when, in the end, client IO stalls?
Regards
--
Robert Sand
Hello,
Over the last few weeks, we have observed an abnormal increase in a pool's data
usage (by a factor of 2). It turns out that we are hit by this bug [1].
In short, if you happened to take pool snapshots and removed them by using the
following command
'ceph osd pool rmsnap
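To check whether a pool ever had snapshots and how its usage looks now,
something like this helps (the pool name is a placeholder):

  $ rados -p <pool> lssnap   # any remaining pool snapshots
  $ ceph df detail           # compare STORED vs USED for the affected pool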
On Tue, Mar 19, 2024 at 01:19:34PM +0100, Malte Stroem wrote:
> I checked the policies, lifecycle and versioning.
>
> Nothing. The user has FULL_CONTROL. Same settings for the user's other
> buckets he can still write to.
>
> When setting debugging to higher numbers all I can see is something li
Igor,
Those files are VM disk images, and they're under constant heavy use, so
yes, there /is/ constant severe write load against this disk.
Apart from writing more test files into the filesystems, there must be
Ceph diagnostic tools to describe what those objects are being used for,
surely?
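One hedged way to see what those objects belong to, assuming the default
CephFS layout where data objects are named <inode-hex>.<index> (the pool name
and mount point below are placeholders):

  $ rados -p <cephfs-data-pool> ls | head   # object names start with the file's inode in hex
  $ printf '%d\n' 0x<inode-hex>             # convert that prefix to a decimal inode number
  $ find /mnt/cephfs -inum <decimal-inode>  # locate the file on a mounted client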
Alexander,
Thank you, but as I said to Igor: The 5.5TB of files on this filesystem
are virtual machine disks. They are under constant, heavy write load.
There is no way to turn this off.
On 19/03/2024 9:36 pm, Alexander E. Patrakov wrote:
Hello Thorne,
Here is one more suggestion on how to
> Those files are VM disk images, and they're under constant heavy use, so yes,
> there /is/ constant severe write load against this disk.
Why are you using CephFS for an RBD application?
Hi Community!!!
Are we logging IRC channels? I ask this because a lot of people only use
Slack, and the Slack we use doesn't have a subscription, so messages are
lost after 90 days (I believe)
I believe it's important to keep track of the technical knowledge we see
each day over IRC+Slack
Cheers!
A long time ago Wido used to have a bot logging IRC afaik, but I think
that's been gone for some time.
Mark
On 3/19/24 19:36, Alvaro Soto wrote:
Hi Community!!!
Are we logging IRC channels? I ask this because a lot of people only use
Slack, and the Slack we use doesn't have a subscription, s