Hi Jesper,
could you please provide more details about the cluster (the usual
like 'ceph osd tree', 'ceph osd df', 'ceph versions')?
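For example, the plain output of something along these lines would be a good
start (plus anything else you think is relevant):

  $ ceph osd tree      # OSD/host/CRUSH layout
  $ ceph osd df        # per-OSD utilization and PG counts
  $ ceph versions      # daemon versions across the cluster
  $ ceph -s            # overall cluster state at the time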
I find it unusual to enable maintenance mode to add OSDs; is there a
specific reason?
And why add OSDs manually with 'ceph orch osd add', why not have a
sp
Hi Thorn,
given the amount of files in the CephFS volume, I presume you don't have
a severe write load against it. Is that correct?
If so, we can assume that the numbers you're sharing mostly refer to
your experiment. At peak I can see bytes_used increase = 629,461,893,120
bytes (45978612027392
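For anyone who wants to track this over time, per-pool usage can be sampled
roughly as below; the jq filter and exact JSON field names are an assumption
and may vary between releases:

  $ ceph df detail
  $ ceph df detail -f json | \
      jq '.pools[] | {name: .name, bytes_used: .stats.bytes_used}'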
Hello Thorne,
Here is one more suggestion on how to debug this. Right now, there is
uncertainty about whether there is really a disk space leak or whether
something simply wrote new data during the test.
If you have at least three OSDs you can reassign, please set their
CRUSH device class to something di
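A rough sketch of what that reassignment could look like, assuming the intent
is to move a few OSDs into a dedicated device class and point a fresh test
pool at them (the class, rule, pool names and OSD ids below are placeholders):

  $ ceph osd crush rm-device-class osd.10 osd.11 osd.12
  $ ceph osd crush set-device-class leaktest osd.10 osd.11 osd.12
  # a rule restricted to that class, then a test pool using it
  $ ceph osd crush rule create-replicated leaktest-rule default host leaktest
  $ ceph osd pool set testpool crush_rule leaktest-rule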
Possibly a naive question, and perhaps a trivial one, but is there any
good reason to return a “1” on success for cephadm host-maintenance enter and
exit:
~$ sudo cephadm host-maintenance enter --fsid -XX--X
Inferring config /var/lib/ceph/-XX--Xconfig/c
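The behaviour is easy to confirm by checking the shell's exit status right
after the call (fsid redacted as above):

  ~$ sudo cephadm host-maintenance enter --fsid <fsid>
  ~$ echo $?
  1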
Hello Gregory and Nathan,
Having a look at our resource utilization, there doesn't seem to be a
CPU or memory bottleneck, as there is plenty of both available on the
host which has the blocked OSD as well as on the MDS's host.
We've had a repeat of this problem today where the OSD logging slow
Hi,
in our cluster (17.2.6) disks fail from time to time. Block devices are
HDD, DB devices are NVME. However, the OSD process does not reliably
die. That leads to blocked client IO for all requests for which the OSD
with the broken disk is the primary OSD. All pools on these OSDs are EC
pool
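A possible stopgap (not a fix) is to spot and kick the affected OSD by hand
once a disk goes bad; the OSD id below is a placeholder and the systemd unit
name depends on how the OSDs were deployed:

  $ ceph health detail | grep -i slow   # find the OSD with blocked requests
  $ ceph osd down osd.42                # force re-peering away from the bad primary
  $ systemctl stop ceph-osd@42          # or stop the daemon on the affected host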
On Tuesday, March 19, 2024 7:32:47 AM EDT Daniel Brown wrote:
> Possibly a naive question, and perhaps a trivial one, but is there any
> good reason to return a “1” on success for cephadm host-maintenance enter
> and exit:
No, I doubt that was intentional. The function is written in a way
Hello Ivan,
Do you observe any spikes in the memory utilization of the MDS when the
lock happens? Particularly in buffer_anon?
We are observing some rdlock issues that lead to a spinlock on the MDS,
but they do not seem to be related to a hanging operation on an OSD.
Cheers,
Enrico
O
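For reference, buffer_anon can be read from the MDS mempool dump on the MDS
host; the exact invocation below is an assumption and the daemon name is a
placeholder:

  $ ceph daemon mds.<name> dump_mempools   # buffer_anon bytes/items
  $ ceph tell mds.<name> perf dump         # broader counters to correlate with the lock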
Hi Daniel,
translating EIO to upper layers rather than crashing an OSD is a valid
default behavior. One can alter this by setting the bluestore_fail_eio
parameter to true.
Thanks,
Igor
On 3/19/2024 2:50 PM, Daniel Schreiber wrote:
Hi,
in our cluster (17.2.6) disks fail from time to time. Blo
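For completeness, and assuming the usual centralized config path is fine, the
switch would be flipped like this; note that with it set, an OSD hitting EIO
goes down instead of passing the error up:

  $ ceph config set osd bluestore_fail_eio true
  $ ceph config set osd.42 bluestore_fail_eio true   # or limit it to a single OSD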
Hello,
a user in our Ceph cluster is suddenly not able to write to one of
his buckets.
Reading works fine.
All other buckets work fine.
If we copy the bucket to another bucket on the same cluster, the error
persists: writing is not possible in the new bucket either.
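In case it helps anyone narrow this down, these are the kind of checks that
apply here, assuming radosgw-admin access (bucket name and uid are
placeholders):

  $ radosgw-admin bucket stats --bucket=<bucket>   # size, num_objects, bucket quota
  $ radosgw-admin user info --uid=<uid>            # user quota, caps, suspension state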
Hi,
On 3/19/24 13:00, Igor Fedotov wrote:
translating EIO to upper layers rather than crashing an OSD is a valid
default behavior. One can alter this by setting the bluestore_fail_eio
parameter to true.
What is the benefit of this behavior when, in the end, client IO stalls?
Regards
--
Robert Sand
Hello,
Over the last few weeks, we have observed an abnormal increase in a pool's data
usage (by a factor of 2). It turns out that we are hit by this bug [1].
In short, if you happened to take pool snapshots and removed them by using the
following command
'ceph osd pool rmsnap
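To check whether a pool ever had snapshots and how its usage looks now,
something like this helps (the pool name is a placeholder):

  $ rados -p <pool> lssnap   # any remaining pool snapshots
  $ ceph df detail           # compare STORED vs USED for the affected pool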
On Tue, Mar 19, 2024 at 01:19:34PM +0100, Malte Stroem wrote:
> I checked the policies, lifecycle and versioning.
>
> Nothing. The user has FULL_CONTROL. Same settings for the user's other
> buckets he can still write to.
>
> When setting debugging to higher numbers all I can see is something li
Igor,
Those files are VM disk images, and they're under constant heavy use, so
yes, there /is/ constant severe write load against this disk.
Apart from writing more test files into the filesystems, there must be
Ceph diagnostic tools to describe what those objects are being used for,
surely?
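One hedged way to see what those objects belong to, assuming the default
CephFS layout where data objects are named <inode-hex>.<index> (the pool name
and mount point below are placeholders):

  $ rados -p <cephfs-data-pool> ls | head   # object names start with the file's inode in hex
  $ printf '%d\n' 0x<inode-hex>             # convert that prefix to a decimal inode number
  $ find /mnt/cephfs -inum <decimal-inode>  # locate the file on a mounted client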
Alexander,
Thank you, but as I said to Igor: The 5.5TB of files on this filesystem
are virtual machine disks. They are under constant, heavy write load.
There is no way to turn this off.
On 19/03/2024 9:36 pm, Alexander E. Patrakov wrote:
Hello Thorne,
Here is one more suggestion on how to
> Those files are VM disk images, and they're under constant heavy use, so yes,
> there /is/ constant severe write load against this disk.
Why are you using CephFS for an RBD application?
Hi Community!!!
Are we logging IRC channels? I ask this because a lot of people only use
Slack, and the Slack we use doesn't have a subscription, so messages are
lost after 90 days (I believe)
I believe it's important to keep track of the technical knowledge we see
each day over IRC+Slack
Cheers!
A long time ago Wido used to have a bot logging IRC afaik, but I think
that's been gone for some time.
Mark
On 3/19/24 19:36, Alvaro Soto wrote:
Hi Community!!!
Are we logging IRC channels? I ask this because a lot of people only use
Slack, and the Slack we use doesn't have a subscription, s