[ceph-users] Re: OSDs crush - Since Pacific

2022-08-30 Thread Wissem MIMOUNA
Hi Stefan, We don’t have automatic conversion going on, and « bluestore_fsck_quick_fix_on_mount » is not set. So we did an offline compaction as suggested, but this didn’t fix the problem of OSD crashes. In the meantime we are rebuilding all OSDs on the cluster and it seems to improve the cl
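For reference, a minimal sketch of the offline compaction step discussed here, run against a stopped OSD (the OSD id, data path and unit name are assumptions; containerized deployments use different unit names):
# systemctl stop ceph-osd@12
# ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-12 compact
# systemctl start ceph-osd@12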

[ceph-users] Re: Downside of many rgw bucket shards?

2022-08-30 Thread Boris Behrens
Thanks for your input: There are buckets with over 15m files and >300 shards, but yesterday a customer with 2.5m files and 101 shards complained about the slowness of listing files. We do not have indexless buckets. I am not sure if a customer can create such a bucket on their own via the usual to
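For context, a hedged sketch of how one might inspect and raise a bucket's shard count (bucket name and shard count are placeholders):
# radosgw-admin bucket stats --bucket=mybucket      # shows num_shards and object counts
# radosgw-admin bucket limit check                  # per-shard fill status across buckets
# radosgw-admin reshard add --bucket=mybucket --num-shards=199
# radosgw-admin reshard process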

[ceph-users] Re: Bug in crush algorithm? 1 PG with the same OSD twice.

2022-08-30 Thread Dan van der Ster
Hi Frank, I suspect this is a combination of issues. 1. You have "choose" instead of "chooseleaf" in rule 1. 2. osd.7 is destroyed but still "up" in the osdmap. 3. The _tries settings in rule 1 are not helping. Here are my tests: # osdmaptool --test-map-pg 4.1c osdmap.bin osdmaptool: osdmap file
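For anyone wanting to reproduce this kind of mapping test, a rough sketch (the PG id 4.1c comes from the thread; filenames are arbitrary):
# ceph osd getmap -o osdmap.bin
# osdmaptool osdmap.bin --test-map-pg 4.1c
# osdmaptool osdmap.bin --export-crush crush.bin
# crushtool -d crush.bin -o crush.txt               # inspect the rule being exercised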

[ceph-users] Re: Bug in crush algorithm? 1 PG with the same OSD twice.

2022-08-30 Thread Dan van der Ster
> 2. osd.7 is destroyed but still "up" in the osdmap. Oops, you can ignore this point -- this was an observation I had while playing with the osdmap -- your osdmap.bin has osd.7 down correctly. In case you're curious, here was what confused me: # osdmaptool osdmap.bin2 --mark-up-in --mark-out 7

[ceph-users] Re: Bug in crush algorithm? 1 PG with the same OSD twice.

2022-08-30 Thread Dan van der Ster
BTW, I vaguely recalled seeing this before. Yup, found it: https://tracker.ceph.com/issues/55169 On Tue, Aug 30, 2022 at 11:46 AM Dan van der Ster wrote: > > > 2. osd.7 is destroyed but still "up" in the osdmap. > > Oops, you can ignore this point -- this was an observation I had while > playing

[ceph-users] Re: how to fix slow request without remote or restart mds

2022-08-30 Thread zxcs
Thanks a ton! Yes, restarting the MDS fixed this. But we can’t confirm it hit bug 50840; it seems we hit this when reading a huge number of small files (more than 10,000 small files in one directory). Thanks Xiong > On 26 Aug 2022, at 19:13, Stefan Kooman wrote: > > On 8/26/22 12:33, zxcs wrote: >> Hi, experts >> w

[ceph-users] Re: Bug in crush algorithm? 1 PG with the same OSD twice.

2022-08-30 Thread Dan van der Ster
BTW, the defaults for _tries seems to work too: # diff -u crush.txt crush.txt2 --- crush.txt 2022-08-30 11:27:41.941836374 +0200 +++ crush.txt2 2022-08-30 11:55:45.601891010 +0200 @@ -90,10 +90,10 @@ type erasure min_size 3 max_size 6 - step set_chooseleaf_tries 50 - step set_choose_tries 2
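As a general sketch of the decompile/edit/recompile cycle behind a crush.txt diff like the one above (filenames are arbitrary; the rule number and rep count are examples):
# ceph osd getcrushmap -o crush.bin
# crushtool -d crush.bin -o crush.txt
#   ... edit crush.txt, e.g. drop the set_chooseleaf_tries / set_choose_tries overrides ...
# crushtool -c crush.txt -o crush.new
# crushtool -i crush.new --test --rule 1 --num-rep 6 --show-mappings
# ceph osd setcrushmap -i crush.new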

[ceph-users] Re: OSDs crush - Since Pacific

2022-08-30 Thread Igor Fedotov
Hi Wissem, sharing an OSD log snippet preceding the crash (e.g. the prior 20K lines) could be helpful and hopefully will provide more insight - there might be some errors/assertion details and/or other artefacts... Thanks, Igor On 8/30/2022 10:51 AM, Wissem MIMOUNA wrote: Hi Stefan, We don’t have

[ceph-users] Re: Automanage block devices

2022-08-30 Thread Dominique Ramaekers
Hi Robert, Thanks for the input. > -Original Message- > From: Robert Sander > Sent: Monday, 29 August 2022 16:23 > To: ceph-users@ceph.io > Subject: [ceph-users] Re: Automanage block devices > > On 29.08.22 at 14:14, Dominique Ramaekers wrote: > > > Nevertheless, I woul

[ceph-users] Re: Bug in crush algorithm? 1 PG with the same OSD twice.

2022-08-30 Thread Dan van der Ster
> Note: "step choose" was selected by creating the crush rule with ceph on pool > creation. If the default should be "step chooseleaf" (with OSD buckets), then > the automatic crush rule generation in ceph ought to be fixed for EC profiles. Interesting. Which exact command was used to create the p
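For reference, a hedged sketch of creating an EC pool from a profile, which is what auto-generates a rule like the one under discussion (profile name, k/m values, failure domain and pg counts are assumptions; the generated rule is usually named after the pool):
# ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
# ceph osd pool create ecpool 128 128 erasure ec42
# ceph osd crush rule dump ecpool                   # inspect the generated rule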

[ceph-users] Re: Bug in crush algorithm? 1 PG with the same OSD twice.

2022-08-30 Thread Dan van der Ster
>> Note: "step choose" was selected by creating the crush rule with ceph on pool >> creation. If the default should be "step chooseleaf" (with OSD buckets), then >> the automatic crush rule generation in ceph ought to be fixed for EC >> profiles. > Interesting. Which exact command was used to crea

[ceph-users] Fwd: radosgw-admin hangs

2022-08-30 Thread Magdy Tawfik
Thank you Boris. I used cephadm to install that cluster, so I re-added the mon/mgr with no issue and the cluster health seems OK. I reviewed ceph.conf and it's in place; however, I am still not able to run radosgw-admin. Thank you in advance for your help -- Forwarded message - From: B

[ceph-users] Re: OSDs growing beyond full ratio

2022-08-30 Thread Wyll Ingersoll
Yes, this cluster has both - a large cephfs FS (60TB) that is replicated (2-copy) and a really large RGW data pool that is EC (12+4). We cannot currently delete any data from either of them because commands to access them are not responsive. The cephfs will not mount and radosgw-admin just h

[ceph-users] Re: OSDs growing beyond full ratio

2022-08-30 Thread Wyll Ingersoll
OSDs are bluestore on HDD with SSD for DB/WAL. We already tuned the sleep_hdd to 0 and cranked up the max_backfills and recovery parameters to much higher values. From: Josh Baergen Sent: Tuesday, August 30, 2022 9:46 AM To: Wyll Ingersoll Cc: Dave Schulz ;
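For reference, a hedged sketch of the kind of tuning mentioned here (values are illustrative only; option names as in recent releases):
# ceph config set osd osd_recovery_sleep_hdd 0
# ceph config set osd osd_max_backfills 8
# ceph config set osd osd_recovery_max_active 8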

[ceph-users] Re: OSDs growing beyond full ratio

2022-08-30 Thread Dave Schulz
Hi Wyll, The only way I could get my OSDs to start dropping their utilization because of a similar "unable to access the fs" problem was to run "ceph osd crush reweight 0" on the full OSDs, then wait while they start to empty and get below the full ratio.  Note this is different from ceph osd
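A minimal sketch of that approach (OSD id and values are examples): "ceph osd crush reweight" changes the CRUSH weight, while "ceph osd reweight" sets the temporary 0-1 override weight, which is a different mechanism:
# ceph osd crush reweight osd.42 0      # drain the OSD via its CRUSH weight
# ceph osd reweight 42 0.8              # override weight, not the same thing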

[ceph-users] Re: OSDs growing beyond full ratio

2022-08-30 Thread Wyll Ingersoll
Thanks, we may resort to that if we can't make progress in rebalancing things. From: Dave Schulz Sent: Tuesday, August 30, 2022 11:18 AM To: Wyll Ingersoll ; Josh Baergen Cc: ceph-users@ceph.io Subject: Re: [ceph-users] Re: OSDs growing beyond full ratio Hi

[ceph-users] compile cephadm - call for feedback

2022-08-30 Thread John Mulligan
TLDR - Last call for feedback & reviews on PR https://github.com/ceph/ceph/pull/41855 before our deadline in two weeks. --- The Ceph Orchestration team has had a long term project [1] to refactor the 'cephadm binary' into something more manageable. The first step in the process is to turn the

[ceph-users] Re: S3 Object Returns Days after Deletion

2022-08-30 Thread J. Eric Ivancich
A couple of questions, Alex. Is it the case that the object does not appear when you list the RGW bucket it was in? You referred to "one side of my cluster". Does that imply you’re using multisite? And just for completeness, this is not a versioned bucket? With a size of 6252 bytes, it wouldn
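For completeness, a hedged sketch of how one might check the object's state while answering those questions (bucket and object names are placeholders):
# radosgw-admin bucket stats --bucket=mybucket
# radosgw-admin object stat --bucket=mybucket --object=mykey
# radosgw-admin bucket list --bucket=mybucket --max-entries=10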

[ceph-users] Re: OSDs growing beyond full ratio

2022-08-30 Thread 胡 玮文
On 30 Aug 2022, at 23:20, Dave Schulz wrote: Is a file in ceph assigned to a specific PG? In my case it seems like a file that's close to the size of a single OSD gets moved from one OSD to the next, filling it up and domino-ing around the cluster filling up OSDs. I believe not. Each large file is split
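As a hedged illustration of that split in practice (mount path, pool name and inode are placeholders; CephFS data objects are named <inode-hex>.<index>, 4 MiB each by default):
# ls -i /mnt/cephfs/bigfile                        # inode number of the file
# printf '%x\n' 1099511627776                      # inode in hex -> 10000000000
# ceph osd map cephfs_data 10000000000.00000000    # first object -> its PG and OSDs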

[ceph-users] Re: OSDs growing beyond full ratio

2022-08-30 Thread Dave Schulz
Hi Weiwen, Thanks for the reference link.  That does indeed indicate the opposite.  I'm not sure why our issues lessened so much when the big files were deleted.  I suppose there was simply more space available after deleting the big files. -Dave On 2022-08-30 11:56 a.m., 胡 玮文 wrote

[ceph-users] Re: OSDs growing beyond full ratio

2022-08-30 Thread Wyll Ingersoll
One of our OSDs eventually reached 100% capacity (in spite of the full ratio being 95%). Now it is down and we cannot restart the osd process on it because there is not enough space on the device. Is there a way to find PGs on that disk that can be safely removed without destroying data so we
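One hedged possibility, if the OSD can no longer start, is to inspect it offline (OSD id and data path are assumptions; do not remove PGs that are not fully available elsewhere):
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-42 --op list-pgs
# ceph pg ls-by-osd 42                  # cross-check which PGs map to that OSD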

[ceph-users] how to fix mds stuck at dispatched without restart ads

2022-08-30 Thread zxcs
Hi, experts, we have a CephFS (15.2.13) cluster with a kernel mount, and when we read from 2000+ processes against one ceph path (/path/to/A/), all of the processes hang and ls -lrth /path/to/A/ always gets stuck, but listing other directories is healthy (/path/to/B/); health detail always reports md
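For context, a hedged sketch of the usual way to inspect what an MDS is stuck on before resorting to a restart (the daemon name is a placeholder, and these are run on the MDS host via the admin socket):
# ceph health detail
# ceph daemon mds.<name> dump_ops_in_flight        # what the MDS is currently blocked on
# ceph daemon mds.<name> session ls                # identify the client(s) involved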