[ceph-users] Re: ceph-ansible in Pacific and beyond?

2021-05-20 Thread Gregory Orange
Hi, On 19/3/21 1:11 pm, Stefan Kooman wrote: Is it going to continue to be supported? We use it (and uncontainerised packages) for all our clusters, so I'd be a bit alarmed if it was going to go away... Just a reminder to all of you. Please fill in the Ceph-user survey and make your voice

[ceph-users] Bucket index OMAP keys unevenly distributed among shards

2021-05-20 Thread James, GleSYS
Hi, we're running 15.2.7 and our cluster is warning us about LARGE_OMAP_OBJECTS (1 large omap objects). Here is what the distribution looks like for the bucket in question, and as you can see all but 3 of the keys reside in shard 2. .dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.0
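For reference, a minimal way to see how the keys are spread across the index shards is to count the omap keys per shard object directly; the pool name below is the usual default and an assumption, the marker is the one shown above and the shard ids are examples:

  # count omap keys per bucket index shard object
  for shard in 0 1 2; do
    echo -n "shard $shard: "
    rados -p default.rgw.buckets.index listomapkeys \
      .dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.$shard | wc -l
  done

"radosgw-admin bucket limit check" reports a similar per-shard fill status in a more condensed form.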

[ceph-users] Re: fsck error: found stray omap data on omap_head

2021-05-20 Thread Igor Fedotov
I think there is no way to fix that at the moment other than manually identifying and removing the relevant record(s) in RocksDB with ceph-kvstore-tool. Which might be pretty tricky... Looks like we should implement removal of these stray records when repairing BlueStore... On 5/19/2021 11:12 PM, Picket
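For anyone hitting this before a repair-time fix lands, a rough sketch of the manual approach Igor describes, run with the OSD stopped; the store path is hypothetical and the exact prefix/key must be taken from the list output and the fsck report, not from this example:

  # dump all keys and look for the omap_head id reported by fsck
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-12 list > /tmp/keys.txt
  grep '<omap_head id from fsck>' /tmp/keys.txt
  # remove the offending record(s); prefix and key come from the list output
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-12 rm '<prefix>' '<key>'

Take a backup (or at least note the removed keys) before deleting anything.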

[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-05-20 Thread Fulvio Galeazzi
Hallo Dan, Bryan, I have a rule similar to yours, for an 8+4 pool, with the only difference being that I replaced the second "choose" with "chooseleaf", which I understand should make no difference: rule default.rgw.buckets.data { id 6 type erasure min_size 3 max_size

[ceph-users] Re: ceph orch status hangs forever

2021-05-20 Thread ManuParra
Hi Eugen, thank you very much for your reply. I'm Manuel, a colleague of Sebastián; I'll complete what you asked us. We have checked more ceph commands, not only ceph crash and ceph orch, and many other commands hang as well: [spsrc-mon-1 ~]# cephadm shell -- ceph pg stat hangs forever [spsrc-mon

[ceph-users] Re: "radosgw-admin bucket radoslist" loops when a multipart upload is happening

2021-05-20 Thread Boris Behrens
I try to bump it once more, because it makes finding orphan objects nearly impossible. On Tue, 11 May 2021 at 13:03, Boris Behrens wrote: > Hi together, > > I still search for orphan objects and came across a strange bug: > There is a huge multipart upload happening (around 4TB), and listi

[ceph-users] Re: "radosgw-admin bucket radoslist" loops when a multipart upload is happening

2021-05-20 Thread Boris Behrens
Reading through the bugtracker: https://tracker.ceph.com/issues/50293 Thanks for your patience. On Thu, 20 May 2021 at 15:10, Boris Behrens wrote: > I try to bump it once more, because it makes finding orphan objects nearly > impossible. > > On Tue, 11 May 2021 at 13:03, Boris
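For context, a rough way to observe the situation described in the tracker, assuming a hypothetical bucket name, endpoint and S3 credentials already configured for the AWS CLI:

  # check whether a multipart upload is in progress for the bucket
  aws s3api list-multipart-uploads --bucket mybucket --endpoint-url http://rgw.example.com
  # then run the radoslist that loops while the upload is ongoing
  radosgw-admin bucket radoslist --bucket=mybucket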

[ceph-users] Re: ceph orch status hangs forever

2021-05-20 Thread Eugen Block
Which mgr modules are enabled? Can you share (if it responds): ceph mgr module ls | jq -r '.enabled_modules[]' We have checked the call made from the container by checking DEBUG logs and I see that it is correct; some commands work but others hang: Do you see those shell sessions on the

[ceph-users] Re: ceph orch status hangs forever

2021-05-20 Thread Sebastian Luna Valero
Hi Eugen, Here it is: # ceph mgr module ls | jq -r '.enabled_modules[]' cephadm dashboard diskprediction_local iostat prometheus restful Should "crash" and "orchestrator" be part of the list? Why would they have disappeared in the first place? Best regards, Sebastian On Thu, 20 May 2021 at 15:

[ceph-users] mgr+Prometheus/grafana (+consul)

2021-05-20 Thread Jeremy Austin
I recently configured Prometheus to scrape mgr /metrics and add Grafana dashboards. All daemons at 15.2.11 I use Hashicorp consul to advertise the active mgr in DNS, and Prometheus points at a single DNS target. (Is anyone else using this method, or just statically pointing Prometheus at all poten
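For anyone wanting to reproduce the setup, a minimal sketch of the mgr side; the port is the module default and the hostname is hypothetical:

  ceph mgr module enable prometheus
  ceph config set mgr mgr/prometheus/server_port 9283
  # only the active mgr serves metrics, which is why the DNS indirection above is useful
  curl -s http://mgr-host.example.com:9283/metrics | head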

[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-05-20 Thread Dan van der Ster
Hi Fulvio, That's strange... It doesn't seem right to me. Are there any upmaps for that PG? ceph osd dump | grep upmap | grep 116.453 Cheers, Dan On Thu, May 20, 2021, 1:30 PM Fulvio Galeazzi wrote: > Hallo Dan, Bryan, > I have a rule similar to yours, for an 8+4 pool, with only >
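If an upmap exception does show up for that PG and really is what is pinning the extra chunk on one host (an assumption, not a confirmed diagnosis), it can be cleared so CRUSH places the PG again; a sketch using the PG id from the thread:

  ceph osd dump | grep upmap | grep 116.453
  ceph osd rm-pg-upmap-items 116.453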

[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-05-20 Thread Nathan Fish
The obvious thing to do is to set 4+2 instead - is that not an option? On Wed, May 12, 2021 at 11:58 AM Bryan Stillwell wrote: > > I'm trying to figure out a CRUSH rule that will spread data out across my > cluster as much as possible, but not more than 2 chunks per host. > > If I use the defaul

[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-05-20 Thread Dan van der Ster
Hold on: 8+4 needs 12 osds but you only show 10 there. Shouldn't you choose 6 type host and then chooseleaf 2 type osd? .. Dan On Thu, May 20, 2021, 1:30 PM Fulvio Galeazzi wrote: > Hallo Dan, Bryan, > I have a rule similar to yours, for an 8+4 pool, with only > difference that I replaced
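A sketch of the rule Dan is describing for 8+4 across 6 hosts with 2 chunks per host, reusing the id and style of Fulvio's rule above (the min_size/max_size and tries values are assumptions, not taken from his crushmap):

  rule default.rgw.buckets.data {
      id 6
      type erasure
      min_size 3
      max_size 12
      step set_chooseleaf_tries 5
      step set_choose_tries 100
      step take default
      step choose indep 6 type host
      step chooseleaf indep 2 type osd
      step emit
  }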

[ceph-users] Re: ceph df: pool stored vs bytes_used -- raw or not?

2021-05-20 Thread Igor Fedotov
This patch (https://github.com/ceph/ceph/pull/38354) should be present in Nautilus starting with v14.2.21. Perhaps you're facing a different issue; could you please share the "ceph osd tree" output? Thanks, Igor On 5/19/2021 6:18 PM, Konstantin Shalygin wrote: Dan, Igor Seems this wasn't backpor

[ceph-users] Re: ceph df: pool stored vs bytes_used -- raw or not?

2021-05-20 Thread Dan van der Ster
I can confirm that we still occasionally see stored==used even with 14.2.21, but I haven't had time yet to debug the pattern behind the observations. I'll let you know if we find anything useful. .. Dan On Thu, May 20, 2021, 6:56 PM Konstantin Shalygin wrote: > > > > On 20 May 2021, at 19:47,

[ceph-users] Re: [EXTERNAL] Re: fsck error: found stray omap data on omap_head

2021-05-20 Thread Pickett, Neale T
You are correct, even though the repair reports an error, I was able to join the disk back into the cluster, and it stopped reporting the legacy omap warning. I had assumed an "error" was something that needed to be rectified before anything could proceed, but apparently it's more like "warning:

[ceph-users] OSD's still UP after power loss

2021-05-20 Thread by morphin
Hello, I have a weird problem on a 3-node cluster ("Nautilus 14.2.9"). When I test a power failure, the OSDs are not marked as DOWN and the MDS no longer responds. If I manually set the OSDs down, the MDS becomes active again. BTW: only 2 nodes have OSDs; the third node is only for MON. I've set mon_osd_down_out_int
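One thing worth checking in a setup with only two OSD hosts (an assumption about the cause, not a confirmed diagnosis): with the default of two distinct reporting hosts, the single surviving host may never be enough to get the dead OSDs marked down. The OSD ids below are examples:

  ceph config get mon mon_osd_min_down_reporters
  ceph config get mon mon_osd_reporter_subtree_level
  # mark the unreachable OSDs down manually in the meantime
  ceph osd down osd.0 osd.1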

[ceph-users] MDS Stuck in Replay Loop (Segfault) after subvolume creation

2021-05-20 Thread Carsten Feuls
Hello, I want to test something with CephFS subvolumes: how to mount them and set quotas. After some "ceph fs" commands I got an e-mail from Prometheus that the cluster is in "Health Warn". The error was that every MDS crashes with a segfault. Following is some information about my cluster. The cluster
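For completeness, a minimal sketch of the subvolume-with-quota workflow being tested, assuming a filesystem named "cephfs" and example group/subvolume names:

  ceph fs subvolumegroup create cephfs testgroup
  # --size sets the quota in bytes (here 10 GiB)
  ceph fs subvolume create cephfs testvol --group_name testgroup --size 10737418240
  # path to mount, relative to the cephfs root
  ceph fs subvolume getpath cephfs testvol --group_name testgroup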

[ceph-users] Stray hosts and daemons

2021-05-20 Thread Vladimir Brik
I am not sure how to interpret CEPHADM_STRAY_HOST and CEPHADM_STRAY_DAEMON warnings. They seem to be inconsistent. I converted my cluster to be managed by cephadm by adopting mon and all other daemons, and they show up in ceph orch ps, but ceph health says mons are stray: [WRN] CEPHADM_STRAY
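A quick way to compare what cephadm thinks it manages against what is actually running; the hostname in the add command is an example and must match exactly the name the daemons report (an FQDN vs. short-name mismatch is a common cause of these warnings, though that is an assumption here, not a confirmed diagnosis):

  ceph orch host ls
  ceph orch ps
  # add the host under the exact name the stray daemons report
  ceph orch host add mon1.example.com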

[ceph-users] Application for mirror.csclub.uwaterloo.ca as an official mirror

2021-05-20 Thread Zachary Seguin
Hello, I am contacting you on behalf of the Computer Science Club of the University of Waterloo (https://csclub.uwaterloo.ca) to add our mirror (https://mirror.csclub.uwaterloo.ca) as an official mirror of the Ceph project. Our mirror is located at the University of Waterloo in Waterloo, Ontario,

[ceph-users] Re: Does dynamic resharding block I/Os by design?

2021-05-20 Thread Satoru Takeuchi
On Tue, 18 May 2021 at 14:09, Satoru Takeuchi wrote: > On Tue, 18 May 2021 at 9:23, Satoru Takeuchi wrote: > > > > Hi, > > > > I have a Ceph cluster used for RGW and RBD. I found that all I/Os to > > RGW seemed to be > > blocked while dynamic resharding. Could you tell me whether this > > behavior is by design or not? > > > >
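For reference, a few commands to observe resharding while it happens; the bucket name is an example, and disabling dynamic resharding is only a workaround, not a statement that the blocking is expected behaviour:

  radosgw-admin reshard list
  radosgw-admin reshard status --bucket mybucket
  # optional workaround: turn dynamic resharding off for RGW clients
  ceph config set client.rgw rgw_dynamic_resharding false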

[ceph-users] Fw: Welcome to the "ceph-users" mailing list

2021-05-20 Thread 274456...@qq.com
274456...@qq.com From: ceph-users-request Date: 2021-05-21 13:55 To: 274456...@qq.com Subject: Welcome to the "ceph-users" mailing list Welcome to the "ceph-users" mailing list! To post to this list, send your email to: ceph-users@ceph.io You can unsubscribe or make adjustments to yo

[ceph-users] Re: ceph orch status hangs forever

2021-05-20 Thread Eugen Block
Hi, if you check ceph mgr module ls | jq -r '.always_on_modules[]' you'll see that crash, orchestrator and other modules are always on and can't be disabled. Without the pipe to jq you can see the whole list, which is a bit long if you just want an overview. Anyway, comparing your enabled modules
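To round this off, the comparison Eugen suggests plus a common next step if orchestrator calls still hang; the mgr name is a placeholder to be taken from 'ceph mgr dump':

  ceph mgr module ls | jq -r '.always_on_modules[]'
  ceph mgr module ls | jq -r '.enabled_modules[]'
  # failing over to a standby mgr often unsticks a hung orchestrator module
  ceph mgr fail <active mgr name>
  ceph orch status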