[ceph-users] Increasing number of unscrubbed PGs

2022-09-12 Thread Burkhard Linke
Hi, our cluster is running Pacific 16.2.10. Since the upgrade, the cluster has started to report an increasing number of PGs without a timely deep-scrub: # ceph -s   cluster:     id:        health: HEALTH_WARN     1073 pgs not deep-scrubbed in time   services:     mon: 3 daemons, quo
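
For reference, a minimal sketch of how the overdue PGs and the configured deep-scrub interval can be inspected (the PG id is only an illustrative placeholder):

    # list the PGs flagged as not deep-scrubbed in time
    ceph health detail | grep 'not deep-scrubbed'
    # show the configured deep-scrub interval (default is one week)
    ceph config get osd osd_deep_scrub_interval
    # check the last deep-scrub timestamp of a single PG, e.g. 2.1f
    ceph pg 2.1f query | grep last_deep_scrub_stamp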

[ceph-users] Re: Increasing number of unscrubbed PGs

2022-09-12 Thread Eugen Block
Hi, I'm still not sure why increasing the interval doesn't help (maybe there's some flag set on the PG or something), but you could just increase osd_max_scrubs if your OSDs are not too busy. On one customer cluster with high load during the day we configured the scrubs to run during the
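
A rough sketch of the settings mentioned here; the values are placeholders and need tuning for the actual load, they are not taken from the thread:

    # allow more than one concurrent scrub per OSD
    ceph config set osd osd_max_scrubs 2
    # confine (deep-)scrubbing to an off-peak window, e.g. 19:00-06:00
    ceph config set osd osd_scrub_begin_hour 19
    ceph config set osd osd_scrub_end_hour 6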

[ceph-users] Re: mds's stay in up:standby

2022-09-12 Thread Eugen Block
Hi, what happened to the cluster? Several services report a short uptime (68 minutes). If you shared some MDS logs, someone might find a hint as to why they won't become active. If the regular logs don't reveal anything, enable debug logs. Quoting Tobias Florek: Hi! I am running a rook man
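
A minimal sketch of how MDS debug logging can be raised and the daemon states checked (the log levels are examples only):

    # show which MDS daemons are active/standby
    ceph fs status
    # raise MDS debug verbosity for all MDS daemons
    ceph config set mds debug_mds 10
    ceph config set mds debug_ms 1
    # reset the levels afterwards
    ceph config rm mds debug_mds
    ceph config rm mds debug_ms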

[ceph-users] Re: Increasing number of unscrubbed PGs

2022-09-12 Thread Burkhard Linke
Hi, On 9/12/22 11:44, Eugen Block wrote: Hi, I'm still not sure why increasing the interval doesn't help (maybe there's some flag set on the PG or something), but you could just increase osd_max_scrubs if your OSDs are not too busy. On one customer cluster with high load during the day we c

[ceph-users] RGW multisite Cloud Sync module with support for client side encryption?

2022-09-12 Thread Christian Rohmann
Hello Ceph-Users, I have a question regarding support for any client side encryption in the Cloud Sync Module for RGW (https://docs.ceph.com/en/latest/radosgw/cloud-sync-module/). While a "regular" multi-site setup (https://docs.ceph.com/en/latest/radosgw/multisite/) is usually syncing data
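
For context, the cloud sync module is configured as a zone with tier type "cloud"; a rough sketch assuming a generic remote S3 target (zone names, endpoints and credentials below are placeholders, not from the thread):

    # create a zone that forwards objects to a remote S3-compatible endpoint
    radosgw-admin zone create --rgw-zonegroup=default --rgw-zone=cloud-sync \
        --endpoints=http://rgw-host:8000 --tier-type=cloud
    # point it at the target and supply credentials
    radosgw-admin zone modify --rgw-zone=cloud-sync \
        --tier-config=connection.endpoint=https://s3.example.com,connection.access_key=KEY,connection.secret=SECRET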

[ceph-users] OSD Crash in recovery: SST file contains data beyond the point of corruption.

2022-09-12 Thread Benjamin Naber
Hi everybody, I've been struggling for a couple of days now with a degraded Ceph cluster. It's a simple 3-node cluster with 6 OSDs: 3 SSD-based, 3 HDD-based. A couple of days ago one of the nodes crashed due to a hard disk failure; I replaced the hard disk and the recovery process started without any is

[ceph-users] Re: CephFS MDS sizing

2022-09-12 Thread Patrick Donnelly
On Tue, Sep 6, 2022 at 11:29 AM Vladimir Brik wrote: >> What problem are you actually trying to solve with that information? > I suspect that the mds_cache_memory_limit we set (~60GB) is sub-optimal and I am wondering if we would be better off if, say, we halved the cache limits and d
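
A small sketch of how the cache limit can be changed and the actual usage observed; the 30 GiB value and the MDS name are only examples:

    # lower the MDS cache limit, e.g. to 30 GiB
    ceph config set mds mds_cache_memory_limit 32212254720
    # on the MDS host, compare the limit against the actual cache usage
    ceph daemon mds.<name> cache status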

[ceph-users] Re: OSD Crash in recovery: SST file contains data beyond the point of corruption.

2022-09-12 Thread Igor Fedotov
Hi Benjamin, honestly, the following advice is unlikely to help, but you may want to try setting bluestore_rocksdb_options_annex to one of the following options: - wal_recovery_mode=kTolerateCorruptedTailRecords - wal_recovery_mode=kSkipAnyCorruptedRecord. The indication that the setting is in
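
A sketch of how the annex option can be applied and verified, using the first of the two values; the OSD id is the one mentioned later in the thread:

    # apply the option to the affected OSD only
    ceph config set osd.4 bluestore_rocksdb_options_annex \
        "wal_recovery_mode=kTolerateCorruptedTailRecords"
    # confirm what the OSD will pick up on its next start
    ceph config get osd.4 bluestore_rocksdb_options_annex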

[ceph-users] Re: [ceph-users] OSD Crash in recovery: SST file contains data beyond the point of corruption.

2022-09-12 Thread Benjamin Naber
Hi Igor, it looks like the setting won't work; the container now starts with a different error message saying that the setting is an invalid argument. Did I do something wrong by setting: ceph config set osd.4 bluestore_rocksdb_options_annex "wal_recovery_mode=kSkipAnyCorruptedRecord" ? debug 2022-09-12T20:
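
One thing worth double-checking here (not stated in the thread): the RocksDB enum is spelled kSkipAnyCorruptedRecords, with a trailing "s", which could explain an invalid-argument error. A corrected attempt might look like:

    # note the plural spelling of the RocksDB recovery mode
    ceph config set osd.4 bluestore_rocksdb_options_annex \
        "wal_recovery_mode=kSkipAnyCorruptedRecords"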

[ceph-users] Re: Ceph iSCSI rbd-target.api Failed to Load

2022-09-12 Thread Xiubo Li
On 10/09/2022 12:50, duluxoz wrote: Hi Guys, So, I finally got things sorted :-) Time to eat some crow-pie :-P Turns out I had two issues, both of which involved typos (don't they always?). The first was that I had transposed two digits of an IP address in the `iscsi-gateway.cfg` -> `trusted_i
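
For anyone hitting the same thing, a minimal sketch of the relevant section of iscsi-gateway.cfg (the addresses are placeholders):

    [config]
    cluster_name = ceph
    gateway_keyring = ceph.client.admin.keyring
    api_secure = false
    # IPs allowed to access the rbd-target-api service
    trusted_ip_list = 192.168.122.10,192.168.122.11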