Are you sure you're not being hit by:
ceph config set osd bluestore_fsck_quick_fix_on_mount false
(see https://docs.ceph.com/docs/master/releases/octopus/)
Have all your OSDs successfully completed the fsck?
The reason I say that is that I can see "20 OSD(s) reporting legacy (not per-pool)
BlueStore omap usage stats" ...
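To check where things stand, something like this should show whether the
quick-fix conversion is enabled and which warnings are still outstanding
(just a sketch using stock ceph CLI commands):

  ceph config dump | grep bluestore_fsck_quick_fix_on_mount
  ceph health detail | grep -i legacy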
Hello,
I'm running a cluster with Ceph version 14.2.7
(3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable).
I've encountered an issue with my cluster where objects are marked as expired
but are not removed during lifecycle processing. These buckets have a mix of
objects with and without ...
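In case it helps with debugging, the lifecycle state can be inspected and a
pass kicked off manually (a sketch; run this on the RGW node):

  radosgw-admin lc list      # per-bucket lifecycle processing status
  radosgw-admin lc process   # force a lifecycle processing run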
Just to confirm this does not get better:
root@backup1:~# ceph status
  cluster:
    id:     9cd41f0f-936d-4b59-8e5d-9b679dae9140
    health: HEALTH_WARN
            20 OSD(s) reporting legacy (not per-pool) BlueStore omap usage stats
            4/50952060 objects unfound (0.000%)
            nob ...
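As an aside, the PGs behind the unfound objects can be tracked down with
something like this (the pg id below is only a placeholder):

  ceph health detail | grep -i unfound
  ceph pg 2.5f list_unfound   # substitute a pg id reported by health detail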
The CPU is used by userspace, not kernel space.
Here is the perf top output, see the attachment.
RocksDB eats everything :/
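If it helps to narrow things down, perf can also be attached to a single OSD
process rather than the whole box (which ceph-osd pidof picks here is
arbitrary; adjust as needed):

  perf top -p $(pidof -s ceph-osd)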
On 4/8/20 3:14 PM, Paul Emmerich wrote:
> What's the CPU busy with while spinning at 100%?
>
> Check "perf top" for a quick overview
>
>
> Paul
>
Samples: 1M of event 'cycles:ppp', ...
What's the CPU busy with while spinning at 100%?
Check "perf top" for a quick overview
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
On Wed, Apr 8, 2020 at 3:09 PM ...
I do:
root@backup1:~# ceph config dump | grep snap_trim_sleep
global  advanced  osd_snap_trim_sleep      60.00
global  advanced  osd_snap_trim_sleep_hdd  60.00
(cluster is fully rusty, i.e. HDD-only)
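For what it's worth, checking what an individual OSD is actually running with,
and dropping the overrides back to the defaults, would look roughly like this
(osd.0 is only an example; the daemon command has to run on that OSD's host):

  ceph daemon osd.0 config get osd_snap_trim_sleep_hdd
  ceph config rm global osd_snap_trim_sleep
  ceph config rm global osd_snap_trim_sleep_hdd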
On 4/8/20 2:53 PM, Dan van der Ster wrote:
> Do you have a custom value for osd_snap_trim_sleep ?
Do you have a custom value for osd_snap_trim_sleep ?
On Wed, Apr 8, 2020 at 2:03 PM Jack wrote:
>
> I set the nosnaptrim flag during the upgrade because I saw high CPU usage and
> thought it was somehow related to the upgrade process
> However, all my daemons are now running Octopus, and the issue is still
I set the nosnaptrim flag during the upgrade because I saw high CPU usage and
thought it was somehow related to the upgrade process.
However, all my daemons are now running Octopus, and the issue is still
here, so I was wrong.
On 4/8/20 1:58 PM, Wido den Hollander wrote:
>
>
> On 4/8/20 1:38 PM, Jack wrote:
On 4/8/20 1:38 PM, Jack wrote:
> Hello,
>
> I've an issue since my Nautilus -> Octopus upgrade
>
> My cluster has many rbd images (~3k or something)
> Each of them has ~30 snapshots
> Each day, I create and remove at least one snapshot per image
>
> Since Octopus, when I remove the "nosnaptrim"
Hello,
I've an issue since my Nautilus -> Octopus upgrade
My cluster has many rbd images (~3k or something)
Each of them has ~30 snapshots
Each day, I create and remove at least one snapshot per image
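(The daily churn is plain snapshot create/remove per image, roughly like the
following; pool, image and snapshot names are only placeholders:

  rbd snap create rbd/vm-disk-1@daily-2020-04-08
  rbd snap rm rbd/vm-disk-1@daily-2020-04-01
)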
Since Octopus, when I remove the "nosnaptrim" flag, each OSD uses 100%
of its CPU time.
The whole ...
A note of caution, though. "rbd status" just lists watches on the
image header object and a watch is not a reliable indicator of whether
the image is mapped somewhere or not.
It is true that all read-write mappings establish a watch, but it can
come and go due to network partitions, OSD crashes, or ...
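A quick way to see both sides is to compare the watchers with the advisory
locks (pool/image names below are only placeholders):

  rbd status rbd/vm-disk-1    # watchers on the image header object
  rbd lock ls rbd/vm-disk-1   # advisory locks held on the image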
I experienced an issue where locks didn't get cleared automatically on RBDs.
When a KVM host crashed, the locks never cleared. It was a permission issue with
cephx. Maybe test with an admin user? Maybe post what permissions you have for
that user with `ceph auth list`?
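For comparison, checking and (if needed) resetting the caps would look
something like this; the client name and pool are only examples, and as far as
I know the rbd profiles include the blacklist/blocklist permission a client
needs to break a dead peer's lock:

  ceph auth get client.rbd-user
  ceph auth caps client.rbd-user mon 'profile rbd' osd 'profile rbd pool=rbd'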
Glen
-----Original Message-----
On Tue, Apr 7, 2020 at 6:49 PM Void Star Nill wrote:
>
> Hello All,
>
> Is there a way to specify that a lock (shared or exclusive) on an rbd
> volume be released if the client machine becomes unreachable or
> unresponsive?
>
> In one of our clusters, we use rbd locks on volumes to make sure provi ...
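As far as I know there is no built-in lease or timeout on rbd advisory locks;
the usual manual recovery is to list and remove the stale lock by hand (image
name, lock id and locker below are placeholders taken from `rbd lock ls`
output):

  rbd lock ls rbd/vm-disk-1
  rbd lock rm rbd/vm-disk-1 "auto 123456789" client.4567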