[ceph-users] Ceph cluster not recover after OSD down

2021-05-05 Thread Andres Rojas Guerrero
Hi, I have a Nautilus cluster, version 14.2.6, and I have noticed that when some OSDs go down the cluster doesn't start to recover. I have checked that the noout option is unset. What could be the reason for this behavior? -- *** Andrés Rojas Guerr
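
A minimal sketch of the usual first checks in this situation, using standard ceph CLI commands (the output will of course depend on the cluster):

# show cluster-wide flags such as noout, norecover or nobackfill
ceph osd dump | grep flags
# list only the OSDs that are currently down and where they sit in the CRUSH tree
ceph osd tree down
# show which PGs/OSDs the cluster considers unhealthy and why
ceph health detail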

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread David Caro
Can you share more information? The output of 'ceph status' when the OSD is down would help; 'ceph health detail' could also be useful. On 05/05 10:48, Andres Rojas Guerrero wrote: > Hi, I have a Nautilus cluster version 14.2.6 , and I have noted that > when some OSD go down the cluster doesn't

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Andres Rojas Guerrero
Yes, the main problem is that the MDS starts to respond slowly, the information is no longer accessible, and the cluster never recovers. # ceph status cluster: id: c74da5b8-3d1b-483e-8b3a-739134db6cf8 health: HEALTH_WARN 2 clients failing to respond to capability release
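
For reference, a hedged sketch of how a stuck MDS can be inspected, assuming shell access to the MDS host and its admin socket (the daemon name is a placeholder):

# requests currently blocked in the MDS
ceph daemon mds.<name> ops
# client sessions, useful for spotting the clients that fail to release caps
ceph daemon mds.<name> session ls
# overall file system and MDS state
ceph fs status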

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Andres Rojas Guerrero
Sorry, I had not understood the problem well. The problem I see is that once the OSDs fail, the cluster recovers but the MDS remains faulty: # ceph status cluster: id: c74da5b8-3d1b-483e-8b3a-739134db6cf8 health: HEALTH_WARN 3 clients failing to respond to capability rel

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread David Caro
I think that the recovery might be blocked due to all those PGs in inactive state: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html/administration_guide/monitoring-a-ceph-storage-cluster#identifying-stuck-placement-groups_admin """ Inactive: Placement groups cannot proc
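
A minimal sketch for finding and inspecting those inactive PGs (the PG id in the second command is a placeholder taken from the first command's output):

# list PGs stuck in the inactive state
ceph pg dump_stuck inactive
# ask one of the reported PGs why it cannot become active
ceph pg <pgid> query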

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Andres Rojas Guerrero
The problem I see is that when the OSDs fail, the MDS also fails, with errors of the type "slow metadata, slow requests", but it does not recover once the cluster has recovered ... Why? On 5/5/21 at 11:07, Andres Rojas Guerrero wrote: > Sorry, I have not understood the problem well, the problem I see is that

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Burkhard Linke
Hi, On 05.05.21 11:07, Andres Rojas Guerrero wrote: Sorry, I have not understood the problem well, the problem I see is that once the OSD fails, the cluster recovers but the MDS remains faulty: *snipsnap* pgs: 1.562% pgs not active 16128 active+clean 238

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Andres Rojas Guerrero
They are located on a single node ... On 5/5/21 at 11:17, Burkhard Linke wrote: > Hi, > > On 05.05.21 11:07, Andres Rojas Guerrero wrote: >> Sorry, I have not understood the problem well, the problem I see is that >> once the OSD fails, the cluster recovers but the MDS remains faulty: > >

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Andres Rojas Guerrero
I have 768 OSDs in the cluster; it is enough for 32 (~4%) of them (on the same node) to fail for the information to become inaccessible. Is it possible to improve this behavior? # ceph status cluster: id: c74da5b8-3d1b-483e-8b3a-739134db6cf8 health: HEALTH_WARN 1 clients fail

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Robert Sander
Hi, On 05.05.21 at 11:44, Andres Rojas Guerrero wrote: > I have in the cluster 768 OSD, it is enough that 32 (~ 4%) of them (in > the same node) fall and the information becomes inaccessible. Is it > possible to improve this behavior? You need to spread your failure zone in the crush map. It loo
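
As an illustration of what Robert is describing, a hedged sketch for a replicated pool (rule and pool names are placeholders; always test rule changes on a non-production pool first):

# create a rule that puts each replica on a different host
ceph osd crush rule create-replicated replicated_host default host
# assign the new rule to a pool
ceph osd pool set <pool> crush_rule replicated_host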

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Andres Rojas Guerrero
Thanks for the answer. > For the default redundancy rule and pool size 3 you need three separate > hosts. I have 24 separate server nodes with 32 OSDs in each one, 768 OSDs in total; my question is why the MDS suffers when only 4% of the OSDs go down (on the same node). I need to modify the cr

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Robert Sander
On 05.05.21 at 12:34, Andres Rojas Guerrero wrote: > Thanks for the answer. > >> For the default redundancy rule and pool size 3 you need three separate >> hosts. > > I have 24 separate server nodes with with 32 osd in everyone in total > 768 osd, my question is why the mds suffer when only 4%

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Andres Rojas Guerrero
# ceph osd crush rule dump [ { "rule_id": 0, "rule_name": "replicated_rule", "ruleset": 0, "type": 1, "min_size": 1, "max_size": 10, "steps": [ { "op": "take", "item": -1, "item_n

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Robert Sander
Hi, On 05.05.21 at 13:39, Joachim Kraftmayer wrote: > the crush rule with ID 1 distributes your EC chunks over the osds > without considering the ceph host. As Robert already suspected. Yes, the "nxtcloudAF" rule is not fault tolerant enough. Having the OSD as failure zone will lead to data los

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Andres Rojas Guerrero
Nice observation, how can I avoid this problem? On 5/5/21 at 14:54, Robert Sander wrote: Hi, On 05.05.21 at 13:39, Joachim Kraftmayer wrote: the crush rule with ID 1 distributes your EC chunks over the osds without considering the ceph host. As Robert already suspected. Yes, the "nxt

[ceph-users] dashboard connecting to the object gateway

2021-05-05 Thread Fabrice Bacchella
I'm still trying to understand how the manager and dashboard connect to the different object gateways, and I don't really understand how it works. Initially, I wanted each gateway to listen only on localhost, over HTTP: [client.radosgw.<%= $id%>] rgw_frontends = beast endpoint=127.0.0.1:9080 It
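
For context, a sketch of how the dashboard is typically pointed at an RGW endpoint by hand on Nautilus/Octopus era releases (host, port and credentials are placeholders, and the exact set of dashboard settings varies between releases):

# endpoint the dashboard should talk to
ceph dashboard set-rgw-api-host 127.0.0.1
ceph dashboard set-rgw-api-port 9080
ceph dashboard set-rgw-api-scheme http
# credentials of an RGW user created with the --system flag
ceph dashboard set-rgw-api-access-key <access-key>
ceph dashboard set-rgw-api-secret-key <secret-key>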

[ceph-users] radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-05 Thread Boris Behrens
Hi, for a couple of days we have been experiencing strange slowness in some radosgw-admin operations. What is the best way to debug this? For example, creating a user takes over 20s. [root@s3db1 ~]# time radosgw-admin user create --uid test-bb-user --display-name=test-bb-user 2021-05-05 14:08:14.297 7f6942
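
A hedged debugging sketch using the standard debug switches of the ceph/rgw tooling (the log file path is arbitrary):

# rerun a slow admin operation with rgw and messenger debugging to see where the time goes
time radosgw-admin user info --uid test-bb-user --debug-rgw=20 --debug-ms=1 2> /tmp/radosgw-admin-debug.log
# the "failed to distribute cache" message refers to cache notifications sent to the
# other running gateways, so it is worth checking that all registered gateways are reachable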

[ceph-users] Out of Memory after Upgrading to Nautilus

2021-05-05 Thread Christoph Adomeit
I manage an older cluster of several Ceph nodes, each with 128 GB RAM and 36 OSDs of 8 TB each. The cluster is just for archive purposes and performance is not so important. The cluster was running fine for a long time on Ceph Luminous. Last week I updated it to Debian 10 and Ceph Nautilus

[ceph-users] Re: Out of Memory after Upgrading to Nautilus

2021-05-05 Thread Mark Nelson
Hi Christoph, 1GB per OSD is tough!  The osd memory target only shrinks the size of the caches but can't control things like osd map size, pg log length, rocksdb wal buffers, etc.  It's a "best effort" algorithm that tries to fit the OSD mapped memory into that target, but on its own it doesn't re
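
A minimal sketch of the knobs and counters mentioned above (the 2 GB value is purely an example; as noted, anything close to 1 GB per OSD is very tight):

# set a memory target for all OSDs (value in bytes)
ceph config set osd osd_memory_target 2147483648
# see where an individual OSD's memory is actually going
ceph daemon osd.0 dump_mempools
ceph tell osd.0 heap stats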

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Joachim Kraftmayer
Hi Andres, the crush rule with ID 1 distributes your EC chunks over the OSDs without considering the Ceph host, as Robert already suspected. Greetings, Joachim ___ Clyso GmbH Homepage: https://www.clyso.com On 05.05.2021 at 13:16, Andres Rojas Guerrero wrote

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Andres Rojas Guerrero
Thanks, I will test it. On 5/5/21 at 16:37, Joachim Kraftmayer wrote: Create a new crush rule with the correct failure domain, test it properly and assign it to the pool(s). -- *** Andrés Rojas Guerrero Unidad Sistemas Linux Area Arqu

[ceph-users] Call For Submissions IO500 ISC21 List

2021-05-05 Thread IO500 Committee
https://io500.org/cfs Stabilization Period: 05 - 14 May 2021 AoE Submission Deadline: 11 June 2021 AoE The IO500 is now accepting and encouraging submissions for the upcoming 8th IO500 list. Once again, we are also accepting submissions to the 10 Node Challenge to encourage the submission of sm

[ceph-users] v16.2.2 Pacific released

2021-05-05 Thread David Galloway
This is the second backport release in the Pacific stable series. For detailed release notes with links & changelog, please refer to the official blog entry at https://ceph.io/releases/v16-2-2-pacific-released Notable Changes --- * Cephadm now supports an *ingress* service type that p
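
For illustration, a sketch of what an ingress service specification for cephadm could look like (service names, hosts and addresses are placeholders; the release notes and cephadm documentation are authoritative):

service_type: ingress
service_id: rgw.default
placement:
  hosts:
    - host1
    - host2
spec:
  backend_service: rgw.default
  virtual_ip: 10.0.0.100/24
  frontend_port: 8080
  monitor_port: 1967

It would be applied with something like "ceph orch apply -i ingress.yaml".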

[ceph-users] pgremapper released

2021-05-05 Thread Josh Baergen
Hello all, I just wanted to let you know that DigitalOcean has open-sourced a tool we've developed called pgremapper. Originally inspired by CERN's upmap exception table manipulation scripts, pgremapper is a CLI written in Go which exposes a number of upmap-based algorithms for backfill-related u
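
As an illustration only (the subcommand name is taken from the project's README; check https://github.com/digitalocean/pgremapper for the current interface), a common use is to neutralise a wave of backfill right after a topology change:

# create upmap entries that map PGs back to where their data currently is,
# so the backfill can then be released gradually instead of all at once
pgremapper cancel-backfill --yes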

[ceph-users] Re: pgremapper released

2021-05-05 Thread Janne Johansson
Looks great! On Wed, 5 May 2021 at 15:27, Josh Baergen wrote: > > Hello all, > > I just wanted to let you know that DigitalOcean has open-sourced a > tool we've developed called pgremapper. > > Originally inspired by CERN's upmap exception table manipulation > scripts, pgremapper is a CLI written

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Joachim Kraftmayer
Create a new crush rule with the correct failure domain, test it properly and assign it to the pool(s). -- Best regards, Joachim Kraftmayer ___ Clyso GmbH On 05.05.2021 at 15:11, Andres Rojas Guerrero wrote: Nice observation, how can I avoid this problem? El 5
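
A hedged sketch of what that could look like for an EC pool, with an 8+2 profile purely as an example (profile, rule and pool names are placeholders; test the rule, e.g. with crushtool, before assigning it to a production pool):

# define an EC profile whose failure domain is the host rather than the OSD
ceph osd erasure-code-profile set ec-host-profile k=8 m=2 crush-failure-domain=host
# create a CRUSH rule from that profile
ceph osd crush rule create-erasure ec-host-rule ec-host-profile
# point the pool at the new rule (the k+m layout of an existing EC pool cannot be changed)
ceph osd pool set <pool> crush_rule ec-host-rule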

[ceph-users] Re: Out of Memory after Upgrading to Nautilus

2021-05-05 Thread Joachim Kraftmayer
Hi Christoph, can you send me the ceph config set ... command you used and/or the ceph config dump output? Regards, Joachim Clyso GmbH Homepage: https://www.clyso.com On 05.05.2021 at 16:30, Christoph Adomeit wrote: I manage a historical cluster of severak ceph nodes with each 128 GB R

[ceph-users] Re: Out of Memory after Upgrading to Nautilus

2021-05-05 Thread Mark Nelson
FWIW, I believe in master those settings should be properly updated now when you change them at runtime.  I don't remember if that ever got backported to older releases though.  This is where the mempool, priority cache perf counters, and tcmalloc stats can all be useful to help diagnose where

[ceph-users] Re: How to set bluestore_rocksdb_options_annex

2021-05-05 Thread ceph
Hello Igor, thank you for this hint. I had to restart the OSDs and then take a look in the OSD logs... and what do I see? The line you mentioned :thumbsup: :) Have a nice week Mehmet On 4 May 2021 at 15:26:59 CEST, Igor Fedotov wrote: >OSD to be restarted similar to altering bluestore_rocksdb_opt
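
For reference, a sketch of that workflow (the option value is only an illustration):

# append extra rocksdb options on top of bluestore_rocksdb_options
ceph config set osd bluestore_rocksdb_options_annex "compaction_readahead_size=2097152"
# the annex is only picked up at start-up, so restart the OSDs afterwards
systemctl restart ceph-osd@<id>
# the effective rocksdb option line then shows up in the OSD log, as described above
grep -i rocksdb /var/log/ceph/ceph-osd.<id>.log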

[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync old data

2021-05-05 Thread 特木勒
Hi Jean: Thanks for your info. Unfortunately, I checked the secondary cluster and no objects had been synced. The only way I have is to force a rewrite of the objects for whole buckets. I have tried to set up multisite between Nautilus and Octopus. It works pretty well. But after I upgraded the primary clus
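
A hedged sketch of commands commonly used to inspect, and if necessary restart, multisite sync (bucket and zone names are placeholders; the init command restarts a full sync and should be used with care):

# overall replication state between zones
radosgw-admin sync status
# per-bucket sync state, run on the secondary zone
radosgw-admin bucket sync status --bucket=<bucket>
# force a full re-scan of data from the primary zone, then restart the gateways
radosgw-admin data sync init --source-zone=<primary-zone>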