[ceph-users] Re: Ceph very slow bucket listing performance! how to deal with it.

2021-10-18 Thread 126
I encountered a ceph 14.2.8 bug and had to upgrade to 14.2.11 to solve this issue. Sent from my iPhone > On 15 Oct 2021, at 17:01, Xianqiang Jing wrote: > > Here are the bucket stats. I have manually set the adasupload bucket reshard > number to 2000, with this command "radosgw-admin reshard add --bucket >
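The manual reshard mentioned above can be sketched with the radosgw-admin CLI. This is a hedged outline, not the poster's exact invocation: the bucket name (adasupload) and shard count (2000) come from the message, but the surrounding inspection commands are assumptions and require a running RGW cluster:

```shell
# Queue a manual reshard of the bucket to 2000 shards
# (bucket name and shard count taken from the message above)
radosgw-admin reshard add --bucket adasupload --num-shards 2000

# Inspect the reshard queue, then process pending reshard operations
radosgw-admin reshard list
radosgw-admin reshard process

# Afterwards, confirm the new shard count in the bucket stats
radosgw-admin bucket stats --bucket adasupload | grep num_shards
```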

[ceph-users] ceph IO are interrupted when OSD goes down

2021-10-18 Thread Denis Polom
Hi, I have an EC pool with these settings: crush-device-class= crush-failure-domain=host crush-root=default jerasure-per-chunk-alignment=false k=10 m=2 plugin=jerasure technique=reed_sol_van w=8 and my understanding is that if some of the OSDs go down because of a read error or just flapping due
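The settings listed can be reproduced as an erasure-code profile. A sketch under stated assumptions: the profile name `ec-k10-m2`, the pool name `ecpool`, and the PG counts are placeholders I introduce for illustration; only the k/m/plugin/technique/failure-domain values come from the message:

```shell
# Create an EC profile matching the message (k=10 data + m=2 coding chunks,
# jerasure plugin with reed_sol_van, one chunk per host)
ceph osd erasure-code-profile set ec-k10-m2 \
    k=10 m=2 \
    plugin=jerasure technique=reed_sol_van \
    crush-failure-domain=host crush-root=default

# Create an EC pool from that profile (name and PG count are placeholders)
ceph osd pool create ecpool 128 128 erasure ec-k10-m2
```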

[ceph-users] Re: ceph IO are interrupted when OSD goes down

2021-10-18 Thread Eugen Block
Hi, with this EC setup your pool min_size would be 11 (k+1), so in case one host goes down (or several OSDs fail on this host), your clients should not be affected. But as soon as a second host fails you'll notice an IO pause until at least one host has recovered. Do you have more than 12 ho
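The arithmetic behind this can be checked with plain shell arithmetic; a minimal sketch (no Ceph involved), assuming crush-failure-domain=host so each host holds exactly one chunk of every PG:

```shell
# k=10 data + m=2 coding chunks => 12 chunks per PG, one per host
k=10; m=2
total=$((k + m))

# Recommended default min_size is k + 1, so a single further failure
# during recovery cannot drop a PG below k readable chunks
min_size=$((k + 1))
echo "min_size=$min_size"                 # min_size=11

# Hosts that can fail before IO pauses: total - min_size
echo "tolerated=$((total - min_size))"    # tolerated=1
```

With min_size lowered to k (10), two hosts can fail before IO stops, but any additional failure during recovery risks data integrity, which is why the thread calls min_size = k a disaster-recovery setting.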

[ceph-users] Re: A change in Ceph leadership...

2021-10-18 Thread Florian Haas
On 15/10/2021 17:13, Josh Durgin wrote: Thanks so much Sage, it's difficult to put into words how much you've done over the years. You're always a beacon of the best aspects of open source - kindness, wisdom, transparency, and authenticity. So many folks have learned so much from you, and that's

[ceph-users] Re: Ceph Community Ambassador Sync

2021-10-18 Thread Etienne Menguy
Hi, Is tomorrow's meeting still planned? I joined last month but there was nobody there (could it be a timezone mix-up on my side?) - Etienne Menguy etienne.men...@croit.io > On 17 Sep 2021, at 20:46, Michel Niyoyita wrote: > > Hello Mike > > Where can we find a list of ambassadors and their p

[ceph-users] Re: ceph IO are interrupted when OSD goes down

2021-10-18 Thread Eugen Block
Hi, min_size = k is not the safest option; it should only be used in case of disaster recovery. But in this case it's not related to the IO interruption, it seems. Are some disks utilized around 100% (iostat) when this happens? Zitat von Denis Polom : Hi, it's min_size: 10 On 10/18/21

[ceph-users] Re: Multisite Pubsub - Duplicates Growing Uncontrollably

2021-10-18 Thread Yuval Lifshitz
Hi Alex, I also seemed to miss your email :-) On Mon, Oct 18, 2021 at 11:32 AM Alex Kershaw wrote: > Hi Yuval, > > Apologies - I'm having some trouble with my microsoft spam filter and I'm > not sure this email reached you. If it did please excuse the duplicate. > This is in response to: > "Mul

[ceph-users] Re: ceph IO are interrupted when OSD goes down

2021-10-18 Thread denispolom
no, disk utilization is around 86%. What is a safe value for min_size in this case? 18. 10. 2021 15:46:44 Eugen Block : > Hi, > > min_size = k is not the safest option; it should only be used in case  of > disaster recovery. But in this case it's not related to the IO  interruption, it > seems. Ar

[ceph-users] Re: ceph IO are interrupted when OSD goes down

2021-10-18 Thread Eugen Block
Well, the default is k + 1, so 11. Could it be that you reduced it during a recovery phase but didn't set it back to the default? Zitat von denispo...@gmail.com: no, disks utilization is around 86%. What is safe value for min_size in this case? 18. 10. 2021 15:46:44 Eugen Block : Hi, mi
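Checking whether min_size was left at a reduced value, and restoring the default Eugen describes, can be sketched as follows. The pool name is a placeholder; the value 11 is k + 1 for the k=10, m=2 profile from this thread:

```shell
# Show the current min_size of the EC pool (pool name is a placeholder)
ceph osd pool get ecpool min_size

# Restore the recommended default for k=10, m=2: k + 1 = 11
ceph osd pool set ecpool min_size 11
```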

[ceph-users] Multisite RGW - Object count differs

2021-10-18 Thread mhnx
Hello. I use multisite RGW and I had to move my secondary zone. Due to the move, it was unavailable for 5 days. After starting sync, I now see "Bucket is caught up with source" when I use sync status, but when I check bucket stats on the master and secondary zones, the object count on MasterZone-$mybucket
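A discrepancy like this is usually inspected by comparing per-zone bucket stats against the bucket sync status. A sketch, assuming a bucket named `mybucket` (placeholder) and a working admin keyring on each zone:

```shell
# Overall sync state of the zone you run this on
radosgw-admin sync status

# Per-bucket sync state ("caught up" reflects the sync logs,
# which can disagree with actual object counts after an outage)
radosgw-admin bucket sync status --bucket=mybucket

# Compare object counts: run on both the master and the secondary zone
radosgw-admin bucket stats --bucket=mybucket | grep -A2 '"usage"'
```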

[ceph-users] Re: ceph IO are interrupted when OSD goes down

2021-10-18 Thread Denis Polom
No, it's actually not. It's by design, by a colleague of mine. But anyway, it's not related to this issue. On 10/18/21 15:55, Eugen Block wrote: Well, the default is k + 1, so 11. Could it be that you reduced it during a recovery phase but didn't set it back to the default? Zitat von denispo...@

[ceph-users] Re: ceph IO are interrupted when OSD goes down

2021-10-18 Thread Szabo, Istvan (Agoda)
Octopus 15.2.14? I have exactly the same issue, and it's causing a production issue for me. Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com -

[ceph-users] Re: A change in Ceph leadership...

2021-10-18 Thread Patrick Donnelly
Hi Sage, On Fri, Oct 15, 2021 at 10:41 AM Sage Weil wrote: > > This fall I will be stepping back from a leadership role in the Ceph > project. My primary focus during the next two months will be to work with > developers and community members to ensure a smooth transition to a more > formal syste

[ceph-users] Re: OSD Crashes in 16.2.6

2021-10-18 Thread Marco Pizzolo
Hi Everyone, Update on this. The 5.4 kernel wasn't working well for us and we had to reinstall the HWE stack with the 5.11 kernel. We can now get all OSDs more or less up, but on a clean OS reinstall we are seeing this type of behavior that is causing slow ops even before any pool or filesystem has been create

[ceph-users] Re: [EXTERNAL] RE: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-10-18 Thread Dave Piper
On 9/30/2021 7:03 PM, Igor Fedotov wrote: > 3) reduce main space fragmentation by using the Hybrid allocator from > scratch - OSD redeployment is required as well. > >> We deployed these clusters at nautilus with the default allocator, which was >> bitmap I think? After redeploying condor on
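Selecting the hybrid allocator is a one-line config change, though, as Igor notes, reducing existing fragmentation also requires redeploying the OSD so the main space is rewritten from scratch. A sketch of the config step (the redeploy procedure depends on your deployment tooling and is omitted here):

```shell
# Select the hybrid allocator for BlueStore; takes effect on OSD restart
ceph config set osd bluestore_allocator hybrid

# Verify the setting
ceph config get osd bluestore_allocator
```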

[ceph-users] Re: Ceph Community Ambassador Sync

2021-10-18 Thread Mike Perez
Hi everyone, Just a reminder that our meeting will be taking place today at 6:00 UTC. Don't forget to add agenda items to the etherpad: https://pad.ceph.com/p/community-ambassadors A preview of the Ambassador team page is available: https://ambassador-page.ceph.io/en/community/ambassadors/ On M

[ceph-users] Re: Stretch cluster experiences in production?

2021-10-18 Thread Gregory Farnum
On Fri, Oct 15, 2021 at 8:22 AM Matthew Vernon wrote: > > Hi, > > Stretch clusters[0] are new in Pacific; does anyone have experience of > using one in production? > > I ask because I'm thinking about new RGW cluster (split across two main > DCs), which I would naturally be doing using RGW multi-s

[ceph-users] Which verison of ceph is better

2021-10-18 Thread norman.kern
Hi guys, I have had a long holiday since this summer. I came back to set up a new ceph server, and I want to know which stable version of ceph you're using for production?

[ceph-users] Re: Stretch cluster experiences in production?

2021-10-18 Thread Martin Verges
Hello Matthew, building stretch clusters is not a big deal. It works quite well and stably as long as you have your network under control. This is the most error-prone part of a stretch cluster, but it can easily be solved when you choose a good vendor and network gear. For 3 data centers make sure to h
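For the two-site-plus-tiebreaker layout the original question asks about, Pacific's stretch mode can be outlined roughly as below. This is a hedged sketch from my reading of the Pacific stretch-mode feature, not a tested recipe; all monitor names and datacenter labels are placeholders:

```shell
# Use the connectivity election strategy required by stretch mode
ceph mon set election_strategy connectivity

# Place monitors into CRUSH locations (names/sites are placeholders)
ceph mon set_location a datacenter=dc1
ceph mon set_location b datacenter=dc2
ceph mon set_location t datacenter=arbiter

# Enable stretch mode with "t" as the tiebreaker monitor and a CRUSH
# rule that replicates across both datacenters
ceph mon enable_stretch_mode t stretch_rule datacenter
```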

[ceph-users] Re: Which verison of ceph is better

2021-10-18 Thread Martin Verges
Use pacific for new deployments. -- Martin Verges Managing director Mobile: +49 174 9335695 | Chat: https://t.me/MartinVerges croit GmbH, Freseniusstr. 31h, 81247 Munich CEO: Martin Verges - VAT-ID: DE310638492 Com. register: Amtsgericht Munich HRB 231263 Web: https://croit.io | YouTube: https: