On 25/05/2023 01.40, 胡 玮文 wrote:
> Hi Hector,
>
> Not related to fragmentation. But I see you mentioned CephFS, and your OSDs
> are at high utilization. Is your pool NEAR FULL? CephFS write performance is
> severely degraded if the pool is NEAR FULL. Buffered writes will be disabled,
> and every
Hi Patrick,
Thanks for the instructions. We started the MDS recovery scan with the commands
below, following the link below. The first pass, scan_extents, has finished and
we're waiting on scan_inodes. We probably shouldn't interrupt the process. If this
procedure fails, I'll follow your steps and let
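For anyone following along, the usual scan sequence from the upstream
disaster-recovery docs looks roughly like this (pool name is a placeholder;
with multiple workers, each step is run once per worker via --worker_n/--worker_m):
cephfs-data-scan scan_extents <data pool>
cephfs-data-scan scan_inodes <data pool>
cephfs-data-scan scan_links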
On Wed, May 17, 2023 at 9:26 PM Henning Achterrath
wrote:
> Hi all,
>
> we did a major update from Pacific to Quincy (17.2.5) a month ago
> without any problems.
>
> Now we have tried a minor update from 17.2.5 to 17.2.6 (ceph orch
> upgrade). It is stuck at the MDS upgrade phase. At this point the clu
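(To see where the orchestrator is stuck, something like the following is
usually the first stop; standard cephadm commands:)
ceph orch upgrade status
ceph orch ps --daemon-type mds
ceph health detail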
Hi Henning,
I think the increasing strays_created is normal. It is a counter that
increases monotonically whenever a file is deleted, and it is only reset when
the MDS is restarted.
num_strays is the actual number of strays in your system, and they do not
necessarily reside in memory.
W
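If it helps, the stray counters can be read off the MDS admin socket with
something like this (daemon name is an example):
ceph daemon mds.<name> perf dump mds_cache | grep strays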
Hi everyone,
Just a reminder that we will be starting at the top of the hour, 17:00 UTC.
https://ceph.io/en/community/tech-talks/
On Tue, May 16, 2023 at 10:19 AM Mike Perez wrote:
> Hello everyone,
>
> Join us on May 24th at 17:00 UTC for a long overdue Ceph Tech Talk! This
> month, Yuval Lifshitz
Hi Hector,
Not related to fragmentation. But I see you mentioned CephFS, and your OSDs are
at high utilization. Is your pool NEAR FULL? CephFS write performance is
severely degraded if the pool is NEAR FULL. Buffered writes will be disabled,
and every single write() system call needs to wait for
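A quick way to check, using standard commands (output formats vary by version):
ceph df detail
ceph osd dump | grep -i ratio
ceph health detail | grep -i full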
Hi Team,
I'm writing to bring to your attention an issue we have encountered with the
"mtime" (modification time) behavior for directories in the Ceph filesystem.
We have observed that when the mtime of a directory (let's
say: dir1) is explicitly changed in CephFS, subsequent a
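A minimal sketch of the kind of check we are running (paths are examples):
mkdir dir1
touch -d '2020-01-01' dir1    # explicitly set dir1's mtime to the past
stat -c '%y' dir1
touch dir1/newfile            # then create an entry under it
stat -c '%y' dir1             # check whether mtime moves forward as expected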
There was a memory issue with standby-replay that may have been resolved
since; the fix is in 16.2.10 (not sure). The suggestion at the time was to
avoid standby-replay.
Perhaps a dev can chime in on that status. Your MDSs look pretty inactive.
I would consider scaling them down (potentially to sin
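If you do scale down, the knobs are something like this (fs name is an
example; untested sketch):
ceph fs set cephfs allow_standby_replay false
ceph fs set cephfs max_mds 1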
On 5/24/23 14:03, Emmanuel Jaep wrote:
Hi,
I inherited a ceph fs cluster. Even though I have years of experience in
systems management, I fail to fully grasp its logic.
From what I found on the web, the documentation is either too "high level"
or too detailed.
Is this a setup based
Hi,
using standby-replay daemons is something to test, as it can have a
negative impact; it really depends on the actual workload. We stopped
using standby-replay in all clusters we (help) maintain. In one
specific case with many active MDSs and a high load, the failover time
decreased and
So I guess I'll end up doing:
ceph fs set cephfs max_mds 4
ceph fs set cephfs allow_standby_replay true
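and then verifying the resulting layout with:
ceph fs status cephfs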
On Wed, May 24, 2023 at 4:13 PM Hector Martin wrote:
> Hi,
>
> On 24/05/2023 22.02, Emmanuel Jaep wrote:
> > Hi Hector,
> >
> > thank you very much for the detailed explanation and link to th
On 24/05/2023 22.07, Mark Nelson wrote:
> Yep, bluestore fragmentation is an issue. It's sort of a natural result
> of using copy-on-write and never implementing any kind of
> defragmentation scheme. Adam and I have been talking about doing it
> now, probably piggybacking on scrub or other ope
Hi,
On 24/05/2023 22.02, Emmanuel Jaep wrote:
> Hi Hector,
>
> thank you very much for the detailed explanation and link to the
> documentation.
>
> Given our current situation (7 active MDSs and 1 standby MDS):
> RANK  STATE   MDS  ACTIVITY  DNS  INOS  DIRS  CAPS
> 0     active
On Wed, May 24, 2023 at 4:26 AM Stefan Kooman wrote:
>
> On 5/22/23 20:24, Patrick Donnelly wrote:
>
> >
> > The original script is here:
> > https://github.com/ceph/ceph/blob/main/src/tools/cephfs/first-damage.py
> >
> "# Suggested recovery sequence (for single MDS cluster):
> #
> # 1) Unmount al
Hello Justin,
Please do:
ceph config set mds debug_mds 20
ceph config set mds debug_ms 1
Then wait for a crash. Please upload the log.
To restore your file system:
ceph config set mds mds_abort_on_newly_corrupt_dentry false
Let the MDS purge the strays and then try:
ceph config set mds mds_a
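Once you have captured a crash log, remember to drop the debug levels back
with something like:
ceph config rm mds debug_mds
ceph config rm mds debug_ms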
On Tue, May 23, 2023 at 11:52 PM Dietmar Rieder
wrote:
>
> On 5/23/23 15:58, Gregory Farnum wrote:
> > On Tue, May 23, 2023 at 3:28 AM Dietmar Rieder
> > wrote:
> >>
> >> Hi,
> >>
> >> can the cephfs "max_file_size" setting be changed at any point in the
> >> lifetime of a cephfs?
> >> Or is it c
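For reference, inspecting and changing it looks something like this (fs name
is an example; the value is in bytes):
ceph fs get cephfs | grep max_file_size
ceph fs set cephfs max_file_size 4398046511104   # e.g. 4 TiB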
Yep, bluestore fragmentation is an issue. It's sort of a natural result
of using copy-on-write and never implementing any kind of
defragmentation scheme. Adam and I have been talking about doing it
now, probably piggybacking on scrub or other operations that are already
reading all of the ex
Hi Hector,
thank you very much for the detailed explanation and link to the
documentation.
Given our current situation (7 active MDSs and 1 standby MDS):
RANK  STATE   MDS         ACTIVITY     DNS    INOS   DIRS   CAPS
0     active  icadmin012  Reqs: 82 /s  2345k  2288k  97.2k  307k
1     a
On 24/05/2023 21.15, Emmanuel Jaep wrote:
> Hi,
>
> we are currently running a ceph fs cluster at the following version:
> MDS version: ceph version 16.2.10
> (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
>
> The cluster is composed of 7 active MDSs and 1 standby MDS:
> RANK STATE
Thanks, will keep an eye out for this version. Will report back to this thread
about these options and the recovery time/number of objects per second for
recovery.
Again, thank you all for the information and answers!
Yes, the fix should be in the next quincy upstream version. The version I
posted was the downstream one.
Hello again,
In two days, the number has increased by about one and a half million,
and the RAM usage of the MDS remains high at about 50 GB. We are very
unsure whether this is normal behavior.
Today:
"num_strays": 53695,
"num_strays_delayed": 4,
"num_strays_enqueuing": 0,
Hi,
I've been seeing relatively large fragmentation numbers on all my OSDs:
ceph daemon osd.13 bluestore allocator score block
{
"fragmentation_rating": 0.77251526920454427
}
These aren't that old, as I recreated them all around July last year.
They mostly hold CephFS data with erasure codin
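To survey every OSD at once, a loop like this should work (assuming ceph tell
can reach the allocator commands on your release; otherwise run ceph daemon on
each OSD host):
for osd in $(ceph osd ls); do
  echo -n "osd.$osd: "
  ceph tell osd.$osd bluestore allocator score block
done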
Hi,
we are currently running a ceph fs cluster at the following version:
MDS version: ceph version 16.2.10
(45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
The cluster is composed of 7 active MDSs and 1 standby MDS:
RANK  STATE  MDS  ACTIVITY  DNS  INOS  DIRS  CAPS
0
Hi,
I inherited a ceph fs cluster. Even though I have years of experience in
systems management, I fail to fully grasp its logic.
From what I found on the web, the documentation is either too "high level"
or too detailed.
Do you know any good resources to get fully acquainted with cep
Glancing at the commits on the quincy branch, shouldn't the mentioned
configuration options be included in 17.2.7?
The requested command output:
[ceph: root@mgrhost1 /]# ceph version
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
[ceph: root@mgrhost1 /]# ceph config s
Absolutely! :-)
root@icadmin011:/tmp# ceph --cluster floki daemon mds.icadmin011 dump cache
/tmp/dump.txt
root@icadmin011:/tmp# ll
total 48
drwxrwxrwt 12 root root 4096 May 24 13:23 ./
drwxr-xr-x 18 root root 4096 Jun 9 2022 ../
drwxrwxrwt 2 root root 4096 May 4 12:43 .ICE-unix/
drwxrwxrwt
> I'm on 17.2.6, but the option "osd_mclock_max_sequential_bandwidth_hdd"
> isn't available when I try to set it via "ceph config set osd.0
> osd_mclock_max_sequential_bandwidth_hdd 500Mi".
>
> Can you paste the output of
1. ceph version
2. ceph config show-with-defaults osd.0 | grep osd_mclock
3. ce
I hope the daemon mds.icadmin011 is running on the same machine on which
you are looking for /tmp/dump.txt, since the file is created on the system
where that daemon runs.
On Wed, May 24, 2023 at 2:16 PM Emmanuel Jaep
wrote:
> Hi Milind,
>
> you are absolutely right.
>
> The dump_ops_in_flig
Dear All,
I'm using the metadata repair tools to repair a damaged MDS, following the
document below. My storage has about 276 TB of data. cephfs-data-scan is using
32 workers. How long will it take to finish scanning extents? What about scanning
inodes? It has run for 6 hours and the metadata pool has dropped by 1 GB. Is th
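One rough way to gauge progress is to watch pool usage over time (pool names
are ours; adjust to match):
watch -n 60 "ceph df | grep -E 'cephfs_metadata|cephfs_data'"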
I'm on 17.2.6, but the option "osd_mclock_max_sequential_bandwidth_hdd" isn't
available when I try to set it via "ceph config set osd.0
osd_mclock_max_sequential_bandwidth_hdd 500Mi".
I need to use large numbers for hdd, because it looks like the mclock scheduler
isn't using the device class ov
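To confirm whether the option exists at all in a given build, something like:
ceph config help osd_mclock_max_sequential_bandwidth_hdd
should either print its description or report an unrecognized option.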
There is a test bucket; I have removed its index and metadata:
radosgw-admin bi purge --bucket abccc --yes-i-really-mean-it
radosgw-admin metadata rm
bucket.instance:abccc:17a4ce99-009e-40f2-a2d2-2afc218ebd9b.425824299.4
Now the index and metadata are gone, but how can I clean up its data? Or is there
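If there is no cleaner way, one brute-force sketch is to list the data objects
by their bucket marker prefix and remove them with rados (pool name assumes the
defaults; destructive, so review the listing first):
rados -p default.rgw.buckets.data ls | grep '^17a4ce99-009e-40f2-a2d2-2afc218ebd9b.425824299.4' > objs.txt
# review objs.txt, then:
while read -r obj; do rados -p default.rgw.buckets.data rm "$obj"; done < objs.txt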
As someone in this thread noted, the cost-related config options are
removed in the next version (ceph-17.2.6-45.el9cp).
The cost parameters may not work in all cases due to the inherent
differences in the underlying device types and other
external factors.
With the endeavor to achieve a more hand
Hi Milind,
you are absolutely right.
The dump_ops_in_flight is giving a good hint about what's happening:
{
  "ops": [
    {
      "description": "internal op exportdir:mds.5:975673",
      "initiated_at": "2023-05-23T17:49:53.030611+0200",
      "age": 60596.355186077999,
Emmanuel,
You probably missed the "daemon" keyword after the "ceph" command name.
Here's the docs for pacific:
https://docs.ceph.com/en/pacific/cephfs/troubleshooting/
So, your command should've been:
# ceph daemon mds.icadmin011 dump cache /tmp/dump.txt
You could also dump the ops in flight with
On 5/22/23 20:24, Patrick Donnelly wrote:
The original script is here:
https://github.com/ceph/ceph/blob/main/src/tools/cephfs/first-damage.py
"# Suggested recovery sequence (for single MDS cluster):
#
# 1) Unmount all clients."
Is this a hard requirement? This might not be feasible for an M
Hi,
we are running a cephfs cluster with the following version:
ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific
(stable)
Several MDSs are reporting slow requests:
HEALTH_WARN 4 MDSs report slow requests
[WRN] MDS_SLOW_REQUEST: 4 MDSs report slow requests
mds.icadmin011
On 5/22/23 17:28, huxia...@horebdata.cn wrote:
Hi, Stefan,
Thanks a lot for the message. It seems that client-side encryption (or
per use) is still on the way and not ready yet for today.
Are there practical methods to implement encryption for CephFS with
today' technique? e.g using LUKS or