Hi,
with 16.2.7, some OSDs are very slow to start, e.g. it takes ~30 min for an
HDD (12 TB, 5 TB used) to become active. After initialization, there is
20-40 min of heavy reading at ~150 MB/s from the OSD, just after
---
Uptime(secs): 602.2 total, 0.0 interval
Flush(GB): cumulative 0.101, int
Hi all,
This is my first post to this user group, I’m not a ceph expert, sorry if I
say/ask anything trivial.
On a Kubernetes cluster I have an issue creating volumes from a (CSI) ceph
EC pool.
I can reproduce the problem from the rbd CLI like this, from one of the k8s worker
nodes:
“””
root@f
Hi,
Is the memory ballooning while the MDS is active or could it be while
it is rejoining the cluster?
If the latter, this could be another case of:
https://tracker.ceph.com/issues/54253
Cheers, Dan
On Wed, Feb 9, 2022 at 7:23 PM Izzy Kulbe wrote:
>
> Hi,
>
> last weekend we upgraded one of ou
Can you share some more information about how exactly you upgraded? It looks
like a cephadm-managed cluster. Did you install OS updates on all nodes
without waiting for the first one to recover? Maybe I'm misreading, so
please clarify what your update process looked like.
Zitat von Mazzystr :
I
Hi,
is there a difference in PG size on new and old OSDs or are they all
similar in size? Is there some fsck enabled during OSD startup?
Zitat von Andrej Filipcic :
Hi,
with 16.2.7, some OSDs are very slow to start, e.g. it takes ~30 min
for an HDD (12 TB, 5 TB used) to become active. After
Hi,
MDSs are crashing on my production cluster when trying to unlink some files and
I need help :-).
When looking into the log files, I have identified some associated files and I
ran a scrub on the parent directory with force,repair,recursive options. No
errors were detected, but the problem p
Hi,
I've tried to rm the mds0.openfiles and did a `ceph config set mds
mds_oft_prefetch_dirfrags false` but still with the same result of ceph
status reporting the daemon as up and not a lot more.
I also tried setting the cache to a ridiculously small value (128M), but the MDS'
memory usage would still go
Hi,
the first thing that comes to mind is the user's caps. Which permissions
do they have? Have you compared 'ceph auth get client.fulen' on both
clusters? Please paste the output from both clusters and redact
sensitive information.
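For illustration only (a sketch, not from the original message: the metadata pool
name fulen-nvme-meta appears later in this thread, while fulen-ec-data is a
hypothetical EC data pool), the caps can be dumped and, if needed, extended like
this:

ceph auth get client.fulen
# grant rbd access to both the replicated metadata pool and the EC data pool
ceph auth caps client.fulen mon 'profile rbd' \
  osd 'profile rbd pool=fulen-nvme-meta, profile rbd pool=fulen-ec-data'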
Zitat von Lo Re Giuseppe :
Hi all,
This is my first po
Hello Frank,
We've observed a seemingly identical issue when `fstrim` is carried out on one
of the RBD-backed iSCSI multipath devices (we use ceph-iscsi to map an RBD image
to a local multipath device, which is formatted with an XFS filesystem). BTW, we
use Nautilus 14.2.22.
-- Origina
Hi,
It's a single ceph cluster, I'm testing from 2 different client nodes.
The caps are below.
I think it is unlikely that the caps are the cause, as they work from one client node,
same ceph user, and not from the other one...
Cheers,
Giuseppe
[root@naret-monitor01 ~]# ceph auth get client.fulen
exp
On Fri, Feb 11, 2022 at 3:40 PM Izzy Kulbe wrote:
>
> Hi,
>
> I've tried to rm the mds0.openfiles and did a `ceph config set mds
> mds_oft_prefetch_dirfrags false` but still with the same result of ceph
> status reporting the daemon as up and not a lot more.
>
> I also tried setting the cache to r
How are the permissions of the client keyring on both systems?
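For example (a sketch; the file names match the relative paths shown later in
this thread, and the exact location may differ per node):

# run on both client nodes, from the directory used for the rbd test
ls -l ceph.conf client.fulen.keyring
# the keyring must be readable by the user running rbd (root in this thread)
stat -c '%U:%G %a %n' client.fulen.keyring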
Zitat von Lo Re Giuseppe :
Hi,
It's a single ceph cluster, I'm testing from 2 different client nodes.
The caps are below.
I think it is unlikely that the caps are the cause as they work from one
client node, same ceph user, and not fro
Hi Arnaud,
On Fri, Feb 11, 2022 at 2:42 PM Arnaud MARTEL
wrote:
>
> Hi,
>
> MDSs are crashing on my production cluster when trying to unlink some files
> and I need help :-).
> When looking into the log files, I have identified some associated files and
> I ran a scrub on the parent directory w
On 11/02/2022 15:05, Josh Baergen wrote:
In particular, do you have bluestore_fsck_quick_fix_on_mount set to true?
no, that's set to false.
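For reference, one way this can be double-checked (osd.0 here is just an example
ID) is:

ceph config get osd bluestore_fsck_quick_fix_on_mount     # value in the config database
ceph config show osd.0 bluestore_fsck_quick_fix_on_mount  # effective value on the running daemon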
Andrej
Josh
On Fri, Feb 11, 2022 at 2:08 AM Eugen Block wrote:
Hi,
is there a difference in PG size on new and old OSDs or are they all
similar in si
Hi Andrej,
you might want to set debug_bluestore and debug_bluefs to 10 and check
what's happening during the startup...
Alternatively you might try to compact the slow OSD's DB using
ceph-kvstore-tool and check if it helps to speed up the startup...
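A rough sketch of those two suggestions as commands (assuming osd.0 is one of the
slow OSDs and the default data path; the offline compaction must run while the OSD
is stopped):

# verbose bluestore/bluefs logging for the next startup
ceph config set osd.0 debug_bluestore 10
ceph config set osd.0 debug_bluefs 10
# offline compaction of the OSD's RocksDB (with the OSD stopped)
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 compact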
Just in case - is bluefs_buffered_io set to
Hi,
at the moment no clients should be connected to the MDS (since the MDS
doesn't come up) and the cluster only serves these MDS. The MDS also didn't
start properly with mds_wipe_sessions = true.
ceph health detail with the MDS trying to run:
HEALTH_WARN 1 failed cephadm daemon(s); 3 large omap
My clusters are self-rolled. My start command is as follows:
podman run -it --privileged --pid=host --cpuset-cpus 0,1 --memory 2g --name
ceph_osd0 --hostname ceph_osd0 -v /dev:/dev -v
/etc/localtime:/etc/localtime:ro -v /etc/ceph:/etc/ceph/ -v
/var/lib/ceph/osd/ceph-0:/var/lib/ceph/osd/ceph-0 -v
/
I forgot to mention I freeze the cluster with 'ceph osd set
no{down,out,backfill}'. Then I zyp up all hosts and reboot them. Only
when everything is back up do I unset.
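A sketch of that maintenance sequence (the update/reboot step is a placeholder for
whatever each host needs):

ceph osd set nodown
ceph osd set noout
ceph osd set nobackfill
# ... update packages and reboot each host, wait until everything is back up ...
ceph osd unset nobackfill
ceph osd unset noout
ceph osd unset nodown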
My client IO patterns allow me to do this since it's a WORM data store with
long spans of time between writes and reads. I have
I'm suspicious of cross-contamination of devices here. I was on CentOS for
eons until Red Hat shenanigans pinned me to CentOS 7 and Nautilus. I had
very well-defined udev rules that ensured dm devices were statically set
and owned correctly and survived reboots.
I seem to be struggling with this
root@fulen-w006:~# ll client.fulen.keyring
-rw-r--r-- 1 root root 69 Feb 11 15:30 client.fulen.keyring
root@fulen-w006:~# ll ceph.conf
-rw-r--r-- 1 root root 118 Feb 11 19:15 ceph.conf
root@fulen-w006:~# rbd -c ceph.conf --id fulen --keyring client.fulen.keyring
map fulen-nvme-meta/test-loreg-3
rb
On Fri, Feb 11, 2022 at 9:36 PM Izzy Kulbe wrote:
>
> Hi,
>
> at the moment no clients should be connected to the MDS (since the MDS doesn't
> come up) and the cluster only serves these MDS. The MDS also didn't start
> properly with mds_wipe_sessions = true.
>
> ceph health detail with the MDS tr
Hi,
> If the MDS host has enough spare memory, setting
> `mds_cache_memory_limit`[*] to 9GB (or more if it permits) would get
> rid of this warning. Could you check if that improves the situation?
> Normally, the MDS starts trimming its cache when it overshoots the
> cache limit.
>
That won't work.
On Fri, Feb 11, 2022 at 10:53 AM Izzy Kulbe wrote:
> Hi,
>
> If the MDS host has enough spare memory, setting
> > `mds_cache_memory_limit`[*] to 9GB (or more if it permits) would get
> > rid of this warning. Could you check if that improves the situation?
> > Normally, the MDS starts trimming its
I set debug {bdev, bluefs, bluestore, osd} = 20/20 and restarted osd.0
Logs are here
-15> 2022-02-11T11:07:09.944-0800 7f93546c0080 10
bluestore(/var/lib/ceph/osd/ceph-0/block.wal) _read_bdev_label got
bdev(osd_uuid 7755e0c2-b4bf-4cbe-bc9a-26042d5bdc52, size 0xba420, btime
2019-04-11T08:46:
Hi,
thanks for the reply.
Since this was a secondary backup anyway, recreating the FS in case
everything fails was the plan; it would've just been good to know
how we got it into an inoperable and irrecoverable state like this by
simply running orch upgrade, to avoid running into it again in
This problem is solved. My links are indeed swapped
host0:/var/lib/ceph/osd/ceph-0 # ls -la block*
lrwxrwxrwx 1 ceph ceph 23 Jan 15 15:13 block -> /dev/mapper/ceph-0block
lrwxrwxrwx 1 ceph ceph 24 Jan 15 15:13 block.db -> /dev/mapper/ceph--0db
lrwxrwxrwx 1 ceph ceph 25 Jan 15 15:13 block.wa
Hi everyone!
Be sure to make your voice heard by taking the Ceph User Survey before
March 25, 2022. This information will help guide the Ceph community’s
investment in Ceph and the Ceph community's future development.
https://survey.zohopublic.com/zs/tLCskv
Thank you to the Ceph User Survey Work
On 11/02/2022 15:22, Igor Fedotov wrote:
Hi Andrej,
you might want to set debug_bluestore and debug_bluefs to 10 and check
what's happening during the startup...
Alternatively you might try to compact the slow OSD's DB using
ceph-kvstore-tool and check if it helps to speed up the startup...
wit