[ceph-users] disable stretch_mode possible?

2022-10-14 Thread Manuel Lausch
Hi, I am playing around with ceph_stretch_mode and now I have one question: Is it possible to disable stretchmode again? I didn't found anything about this in the documentation. -> Ceph pacific 16.2.9 Manuel ___ ceph-users mailing list -- ceph-us

[ceph-users] Re: why rgw generates large quantities orphan objects?

2022-10-14 Thread Ulrich Klein
Hi, I’m wondering if this problem will ever get fixed? This multipart-orphan-problem has made it now multiple times to the list, the tickets are up to 6 years old … and nothing changes. It screws up per-user space accounting and uses up space for nothing. I’d open another ticket with easy step

[ceph-users] Re: Upgrade from Mimic to Pacific, hidden zone in RGW?

2022-10-14 Thread Eugen Block
Hi, do you have the zone information in the ceph.conf? Do they match on all rgw hosts? Do you see any orphans or anything suspicious in 'rados -p .rgw.root ls' output? Zitat von Federico Lazcano : Hi everyone! I'm looking for help in an upgrade from Mimic. I've managed to upgrade MON, MG

[ceph-users] Re: why rgw generates large quantities orphan objects?

2022-10-14 Thread Dhairya Parmar
Hi Ulrich, I can file a tracker on your behalf if you want. Do let me know. On Fri, Oct 14, 2022 at 1:35 PM Ulrich Klein wrote: > Hi, > > I’m wondering if this problem will ever get fixed? > > This multipart-orphan-problem has made it now multiple times to the list, > the tickets are up to 6 ye

[ceph-users] Re: crush hierarchy backwards and upmaps ...

2022-10-14 Thread Dan van der Ster
Hi, On Thu, Oct 13, 2022 at 8:14 PM Christopher Durham wrote: > > > Dan, > > Again i am using 16.2.10 on rocky 8 > > I decided to take a step back and check a variety of options before I do > anything. Here are my results. > > If I use this rule: > > rule mypoolname { > id -5 > type era

[ceph-users] Cephadm migration

2022-10-14 Thread Jean-Marc FONTANA
Hello everyone ! We're operating a small cluster which contains 1 monitor-manager, 3 osds ans 1 RGW. Tjhe cluster was initially installed with ceph-deploy in version Nautilus (14.2.19) then upgraded in Octopus (15.2.16) and lastly in Pacific (16.2.9). Ceph-deploy does not work any more so we n

[ceph-users] Low space hindering backfill and 2 backfillfull osd(s)

2022-10-14 Thread Szabo, Istvan (Agoda)
Hi, I've added 5 more nodes to my cluster and got this issue. HEALTH_WARN 2 backfillfull osd(s); 17 pool(s) backfillfull; Low space hindering backfill (add storage if this doesn't resolve itself): 4 pgs backfill_toofull OSD_BACKFILLFULL 2 backfillfull osd(s) osd.150 is backfill full osd.1

[ceph-users] Re: Low space hindering backfill and 2 backfillfull osd(s)

2022-10-14 Thread Janne Johansson
Den fre 14 okt. 2022 kl 12:10 skrev Szabo, Istvan (Agoda) : > I've added 5 more nodes to my cluster and got this issue. > HEALTH_WARN 2 backfillfull osd(s); 17 pool(s) backfillfull; Low space > hindering backfill (add storage if this doesn't resolve itself): 4 pgs > backfill_toofull > OSD_BACKFIL

[ceph-users] Re: Low space hindering backfill and 2 backfillfull osd(s)

2022-10-14 Thread Szabo, Istvan (Agoda)
Thank you very much the detailed explanation. Will wait then, based on the speed 5 more hours, let's see Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com -

[ceph-users] Re: monitoring drives

2022-10-14 Thread Paul Mezzanini
smartctl can very much read sas drives so I would look into that chain first. Are they behind a raid controller that is masking the smart commands? As for monitoring, we run the smartd service to keep an eye on drives. More often than not I notice weird things with ceph long before smart thr

[ceph-users] Re: Cephadm migration

2022-10-14 Thread Adam King
For the weird image, perhaps just "ceph orch daemon redeploy rgw.testrgw.svtcephrgwv1.invwmo --image quay.io/ceph/ceph:v16.2.10" will resolve it. Not sure about the other things wrong with it yet but I think the image should be fixed before looking into that. On Fri, Oct 14, 2022 at 5:47 AM Jean-M

[ceph-users] Re: why rgw generates large quantities orphan objects?

2022-10-14 Thread Matt Benjamin
Hi Folks, I noted in another context, I have work-in-progress code to address this issue. I should be able to update and push a bugfix for ceph/main fairly soon. Matt On Thu, Oct 13, 2022 at 11:45 AM Haas, Josh wrote: > Hi Liang, > > My guess would be this bug: > > https://tracker.ceph.com/is

[ceph-users] Re: Cephadm migration

2022-10-14 Thread Jean-Marc FONTANA
Hi Adam, Thanks for your quick answering. Gonna try it  fastly and keep in touch for the result Best regards JM Le 14/10/2022 à 14:09, Adam King a écrit : For the weird image, perhaps just "ceph orch daemon redeploy rgw.testrgw.svtcephrgwv1.invwmo --image quay.io/ceph/ceph:v16.2.10" will re

[ceph-users] Re: monitoring drives

2022-10-14 Thread Marc
> smartctl can very much read sas drives so I would look into that chain > first. I have smartd running and it does recognize the sas drives, however I have collectd is grabbing smart data and I am getting nothing from it. This is all the stuff I am getting from a sata drive # SELECT * FROM "sm

[ceph-users] Re: How to remove remaining bucket index shard objects

2022-10-14 Thread Konstantin Shalygin
Hi, What you mean "strange"? It is normal, the object need only for OMAP data, not for actual data. Is only key for k,v database I see that you have lower number of objects, some of your PG don't have data at all. I suggest check your buckets for properly resharding process How this look (the i

[ceph-users] Re: monitoring drives

2022-10-14 Thread John Petrini
We run a mix of Samsung and Intel SSD's, our solution was to write a script that parses the output of the Samsung SSD Toolkit and Intel ISDCT CLI tools respectively. In our case, we expose those metrics using node_exporter's textfile collector for ingestion by prometheus. It's mostly the same smart

[ceph-users] Re: Cluster crashing when stopping some host

2022-10-14 Thread Murilo Morais
Eugen, it worked and it didn't. I had to bootstrap in v17.2.3, using v17.2.4 this behavior is occurring. I did numerous tests with 3 VMs, two with disks and another only for MON, in v17.2.4 the cluster simply crashes when one of the hosts with disk dies even with three MONs. I don't understand why

[ceph-users] Re: monitoring drives

2022-10-14 Thread Konstantin Shalygin
Hi, You can get this metrics, even wear level, from official smartctl_exporter [1] [1] https://github.com/prometheus-community/smartctl_exporter k Sent from my iPhone > On 14 Oct 2022, at 17:12, John Petrini wrote: > > We run a mix of Samsung and Intel SSD's, our solution was to write a > sc

[ceph-users] strange OSD status when rebooting one server

2022-10-14 Thread Matthew Darwin
Hi, I am hoping someone can help explain this strange message.  I took 1 physical server offline which contains 11 OSDs.  "ceph -s" reports 11 osd down.  Great. But on the next line it says "4 hosts" are impacted.  It should only be 1 single host?  When I look the manager dashboard all the O

[ceph-users] Re: strange OSD status when rebooting one server

2022-10-14 Thread ceph
Could you please share output of Ceph osd df tree There could be an hint... Hth Am 14. Oktober 2022 18:45:40 MESZ schrieb Matthew Darwin : >Hi, > >I am hoping someone can help explain this strange message.  I took 1 physical >server offline which contains 11 OSDs.  "ceph -s" reports 11 osd down

[ceph-users] Re: strange OSD status when rebooting one server

2022-10-14 Thread Matthew Darwin
https://gist.githubusercontent.com/matthewdarwin/aec3c2b16ba5e74beb4af1d49e8cfb1a/raw/d8d8f34d989823b9f708608bb2773c7d4093c648/ceph-osd-tree.txt ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL%USE VAR PGS STATUS TYPE NAME -1 1331.88013

[ceph-users] Re: strange OSD status when rebooting one server

2022-10-14 Thread Frank Schilder
You have hosts in the crush map with no OSDs. They are out+down and will be counted while other hosts are also down. It will go back to normal when you start the host with disks again. If you delete the hosts with no disks, you will probably see misplaced objects. Why are they there in the first

[ceph-users] Re: CephFS constant high write I/O to the metadata pool

2022-10-14 Thread Patrick Donnelly
Hello Olli, On Thu, Oct 13, 2022 at 5:01 AM Olli Rajala wrote: > > Hi, > > I'm seeing constant 25-50MB/s writes to the metadata pool even when all > clients and the cluster is idling and in clean state. This surely can't be > normal? > > There's no apparent issues with the performance of the clus

[ceph-users] Re: strange OSD status when rebooting one server

2022-10-14 Thread Matthew Darwin
The hosts with no OSDs now previously had OSDs, but they no longer do. Those hosts don't exist any more.  I guess they could be removed from the crushmap, but I assume hosts with no OSD doesn't hurt anything? The OSDs that used to be on these servers were deleted, and then new OSDs were create

[ceph-users] Re: monitoring drives

2022-10-14 Thread Wyll Ingersoll
This looks very useful. Has anyone created a grafana dashboard that will display the collected data ? From: Konstantin Shalygin Sent: Friday, October 14, 2022 12:12 PM To: John Petrini Cc: Marc ; Paul Mezzanini ; ceph-users Subject: [ceph-users] Re: monitori

[ceph-users] Re: crush hierarchy backwards and upmaps ...

2022-10-14 Thread Christopher Durham
Dan, I added the relevant info to: https://tracker.ceph.com/issues/51729 Perhaps someone will take a look now ... I will be going with: rule mypoolname {   id -5   type erasure   step take myroot   step choose indep 4 type rack   step chooseleaf indep 2 type chassis   step emit I will let you

[ceph-users] Re: monitoring drives

2022-10-14 Thread Fox, Kevin M
Would it cause problems to mix the smartctl exporter along with ceph's built in monitoring stuff? Thanks, Kevin From: Wyll Ingersoll Sent: Friday, October 14, 2022 10:48 AM To: Konstantin Shalygin; John Petrini Cc: Marc; Paul Mezzanini; ceph-users Subjec

[ceph-users] pool size ...

2022-10-14 Thread Christopher Durham
Hi, I've seen Dan's talk: https://www.youtube.com/watch?v=0i7ew3XXb7Q and other similar ones that talk about CLUSTER size. But, I see nothing (perhaps I have not looked hard enough), on any recommendations regarding max POOL size. So, are there any limitations on a given pool that has all OSDs of