[ceph-users] Re: Octopus 15.2.1
Hi Jeff,

Thank you for your quick and clear answer. I was not aware of the ceph-el8 repo. This is great! Installing Ceph on CentOS 8 now succeeds without any missing dependencies.

rgds,
-gw

On Fri, 2020-04-10 at 13:45 -0400, Jeff Bailey wrote:
> Leveldb is currently in epel-testing and should be moved to epel next week. You can get the rest of the dependencies from https://copr.fedorainfracloud.org/coprs/ktdreyer/ceph-el8/ It works fine. Hopefully, everything will make it into epel eventually, but for now this is good enough for me.
>
> On 4/10/2020 4:06 AM, gert.wieberd...@ziggo.nl wrote:
> > I am trying to install a fresh Ceph cluster on CentOS 8. Using the latest Ceph repo for el8, it still is not possible because of certain dependencies: libleveldb.so.1 is needed by ceph-osd. Even after manually downloading and installing the leveldb-1.20-1.el8.x86_64.rpm package, there are still dependencies:
> >
> >   Problem: package ceph-mgr-2:15.2.1-0.el8.x86_64 requires ceph-mgr-modules-core = 2:15.2.1-0.el8, but none of the providers can be installed
> >    - conflicting requests
> >    - nothing provides python3-cherrypy needed by ceph-mgr-modules-core-2:15.2.1-0.el8.noarch
> >    - nothing provides python3-pecan needed by ceph-mgr-modules-core-2:15.2.1-0.el8.noarch
> >
> > Is there a way to perform a fresh Ceph install on CentOS 8? Thanking you in advance for your answer.
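For anyone hitting the same dependency problem, this is roughly the sequence that ended up working here (a sketch; it assumes the dnf copr plugin from dnf-plugins-core is installed and the upstream Ceph Octopus el8 repo is already configured):

    # leveldb is still in epel-testing for the moment
    dnf install --enablerepo=epel-testing leveldb

    # remaining python3 dependencies (cherrypy, pecan, ...) come from the copr repo
    dnf copr enable ktdreyer/ceph-el8
    dnf install ceph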
[ceph-users] Re: How to fix 1 pg stale+active+clean
I had just one osd go down (31), why is ceph not auto-healing in this 'simple' case?

-----Original Message-----
To: ceph-users
Subject: [ceph-users] How to fix 1 pg stale+active+clean

How to fix 1 pg marked as stale+active+clean?

pg 30.4 is stuck stale for 175342.419261, current state stale+active+clean, last acting [31]
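A sketch of the usual first checks, assuming osd.31 still exists and merely went down. With last acting [31] alone, that OSD likely held the only copy of the PG, so there is nothing else for Ceph to recover from automatically:

    # where does the PG map, and where is osd.31?
    ceph pg map 30.4
    ceph osd find 31

    # if the OSD is only down (not destroyed), bringing it back should clear the stale state
    systemctl start ceph-osd@31

    # once an OSD in the acting set reports in again, this should respond
    ceph pg 30.4 query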
[ceph-users] radosgw garbage collection seems stuck and manual gc process didn't work
Ceph Version: Mimic 13.2.4

The cluster has been running steadily for more than a year. Recently I found that cluster usage was growing faster than usual, and we figured out the problem is garbage collection: 'radosgw-admin gc list' shows millions of objects waiting for gc. The earliest tag time is 2019-09, but 99% of them are from 2020-03 to now.

`ceph df`:

    GLOBAL:
        SIZE        AVAIL       RAW USED     %RAW USED
        1.7 PiB     1.1 PiB     602 TiB      35.22
    POOLS:
        NAME                         ID     USED        %USED     MAX AVAIL     OBJECTS
        .rgw.root                    10     1.2 KiB     0         421 TiB       4
        default.rgw.control          11     0 B         0         421 TiB       8
        default.rgw.data.root        12     0 B         0         421 TiB       0
        default.rgw.gc               13     0 B         0         421 TiB       0
        default.rgw.log              14     4.8 GiB     0         421 TiB       6414
        default.rgw.intent-log       15     0 B         0         421 TiB       0
        default.rgw.meta             16     110 KiB     0         421 TiB       463
        default.rgw.usage            17     0 B         0         421 TiB       0
        default.rgw.users.keys       18     0 B         0         421 TiB       0
        default.rgw.users.email      19     0 B         0         421 TiB       0
        default.rgw.users.swift      20     0 B         0         421 TiB       0
        default.rgw.users.uid        21     0 B         0         421 TiB       0
        default.rgw.buckets.extra    22     0 B         0         421 TiB       0
        default.rgw.buckets.index    23     0 B         0         421 TiB       118720
        default.rgw.buckets.data     24     263 TiB     38.41     421 TiB       138902771
        default.rgw.buckets.non-ec   25     0 B         0         421 TiB       16678

However, when we counted each bucket's usage with 'radosgw-admin bucket stats', it should only be about 160 TiB, so about 80 TiB is sitting in the GC list.

Former gc config settings, before we found the gc problem:

    rgw_gc_max_objs = 32
    rgw_gc_obj_min_wait = 3600
    rgw_gc_processor_period = 3600
    rgw_gc_processor_max_time = 3600

Yesterday we adjusted our settings and restarted rgw:

    rgw_gc_max_objs = 1024
    rgw_gc_obj_min_wait = 300
    rgw_gc_processor_period = 600
    rgw_gc_processor_max_time = 600
    rgw_gc_max_concurrent_io = 40
    rgw_gc_max_trim_chunk = 1024

Today we ran 'rados -p default.rgw.log listomapkeys gc.$i --cluster ceph -N gc | wc -l' (i from 0 to 1023); only gc.0 to gc.511 have data.

Here are some results, sorted:

- time 14:43 result:
    ……
        36  gc_202004111443/gc.502.tag
        38  gc_202004111443/gc.501.tag
        40  gc_202004111443/gc.136.tag
        46  gc_202004111443/gc.511.tag
       212  gc_202004111443/gc.9.tag
       218  gc_202004111443/gc.24.tag
     21976  gc_202004111443/gc.13.tag
     42956  gc_202004111443/gc.26.tag
     71772  gc_202004111443/gc.25.tag
     85766  gc_202004111443/gc.6.tag
    104504  gc_202004111443/gc.7.tag
    105444  gc_202004111443/gc.10.tag
    106114  gc_202004111443/gc.3.tag
    126860  gc_202004111443/gc.31.tag
    127352  gc_202004111443/gc.23.tag
    147942  gc_202004111443/gc.27.tag
    148046  gc_202004111443/gc.15.tag
    167116  gc_202004111443/gc.28.tag
    167932  gc_202004111443/gc.21.tag
    187986  gc_202004111443/gc.5.tag
    188312  gc_202004111443/gc.22.tag
    209084  gc_202004111443/gc.30.tag
    209152  gc_202004111443/gc.18.tag
    209702  gc_202004111443/gc.19.tag
    231100  gc_202004111443/gc.8.tag
    249622  gc_202004111443/gc.14.tag
    251092  gc_202004111443/gc.2.tag
    251366  gc_202004111443/gc.12.tag
    251802  gc_202004111443/gc.0.tag
    252158  gc_202004111443/gc.11.tag
    272114  gc_202004111443/gc.1.tag
    291518  gc_202004111443/gc.20.tag
    293646  gc_202004111443/gc.16.tag
    312998  gc_202004111443/gc.17.tag
    352984  gc_202004111443/gc.29.tag
    488232  gc_202004111443/gc.4.tag
   5935806  total

- time 16:53 result:
    ……
        28  gc_202004111653/gc.324.tag
        28  gc_202004111653/gc.414.tag
        30  gc_202004111653/gc.350.tag
        30  gc_202004111653/gc.456.tag
       204  gc_202004111653/gc.9.tag
       208  gc_202004111653/gc.24.tag
     21986  gc_202004111653/gc.13.tag
     42964  gc_202004111653/gc.26.tag
     71780  gc_202004111653/gc.25.tag
     85778  gc_202004111653/gc.6.tag
    104512  gc_202004111653/gc.7.tag
    105452  gc_202004111653/gc.10.tag
    106122  gc_202004111653/gc.3.tag
    126866  gc_202004111653/gc.31.tag
    127372  gc_202004111653/gc.23.tag
    147944  gc_202004111653/gc.27.tag
    148058  gc_202004111653/gc.15.tag
    167124  gc_202004111653/gc.28.tag
    167936  gc_202004111653/gc.21.tag
    187992  gc_202004111653/gc.5.tag
    188320  gc_202004111653/gc.22.tag
    209090  gc_202004111653/gc.30.tag
    209170  gc_202004111653/gc.18.tag
    209704  gc_202004111653/gc.19.tag
    231108  gc_202004111653/gc.8.tag
    249632  gc_202004111653/gc.14.tag
    251096
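For anyone who wants to reproduce the per-shard counts above, a small sketch of the loop we used (it assumes the gc shard objects live in the gc namespace of default.rgw.log and rgw_gc_max_objs is 1024, as in our setting):

    for i in $(seq 0 1023); do
        echo "$(rados -p default.rgw.log listomapkeys gc.$i --cluster ceph -N gc | wc -l)  gc.$i"
    done | sort -n

    # trigger a GC pass by hand and see whether the counts move
    radosgw-admin gc process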
[ceph-users] Possible to "move" an OSD?
This is an edge case and probably not something that would be done in production, so I suspect the answer is “lol, no,” but here goes:

I have three nodes running Nautilus courtesy of Proxmox. One of them is a self-built Ryzen 5 3600 system, and the other two are salvaged i5 Skylake desktops that I have pressed into service as virtualization and storage nodes. I want to replace the i5 systems with machines that are identical to the Ryzen 5 system. What I want to know is whether it’s possible to just take the devices that are currently hosting the OSDs, together with the hard drive that is hosting Proxmox, move them into the new machine, power up, and have everything working. I don’t *think* the device names should change. What does everyone think about this possibly insane plan? (Yes, I will back up all my important data before trying this.)

Thanks,
J
[ceph-users] Re: Possible to "move" an OSD?
As far as Ceph is concerned, as long as there are no separate journal/blockdb/wal devices, you absolutely can transfer OSDs between hosts. If there are separate journal/blockdb/wal devices, you can still do it, provided they move with the OSDs.

With Nautilus and up, make sure the osd bootstrap key is on the new host, and run 'ceph-volume lvm activate --all'. It will scan through the devices, identify the Ceph OSDs et al. and start them on the new host. There are no other "gotchas" that I remember. I cannot speak to Proxmox, however.

--
Adam
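For the archives, a rough sketch of what that looks like on the new host once the disks (and any separate db/wal devices) have been moved over; the paths are the usual defaults and may differ under Proxmox:

    # the bootstrap-osd key has to be present on the new host
    ls /var/lib/ceph/bootstrap-osd/ceph.keyring

    # discover and start every OSD found on the attached devices
    ceph-volume lvm activate --all

    # the OSDs should now report under the new host's CRUSH bucket
    ceph osd tree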
[ceph-users] Re: radosgw garbage collection seems stuck and manual gc process didn't work
An issue presenting exactly like this was fixed in spring of last year, for certain on nautilus and higher.

Matt
[ceph-users] Re: Understanding monitor requirements
Hi again, after all, this appears to be an MTU issue.

Baseline:

1) Two of the nodes have a straight ethernet path with a 1500 MTU; the third (problem) node is on a WAN tunnel with a restricted MTU. It appears that the MTUs were not set up correctly, so no surprise some software has problems.

2) I decided I knew Ceph well enough that I could handle recovery from disaster cases in Rook, and it has advantages I can use. So please keep that in mind as I discuss this issue. (For those who aren’t familiar, Rook just orchestrates containers that are built by the Ceph team.)

3) In Rook, monitors run as pods under a CNI. The CNI adds additional overhead for transit, in my case a VxLAN overlay network. This overhead is apparently not enough to cause problems when running between nodes on a full 1500 MTU local net, so the first two monitors come up cleanly.

After spending a lot of time looking at the logs, I could see the mon map of all three nodes properly distributed, but when it came to an election, all nodes knew the election epoch but the third was not joining. Comparing the logs of the second node as peon with the troubled third node on the other side of the restricted MTU, the difference appeared to be that the third node was not providing a feature proposal, when in fact it probably was and it was being dropped. So the election would end without the third node being part of the quorum. The third node stopped asking for a new election, and that’s how things ended.

What I did this morning was figure out the MTU of the WAN tunnel and then change the entire CNI to that number. My expectation was that everything would start working and the necessary fragmentation would be generated by the client end of any connection. Instead, the second node that was previously able to join as peon was no longer able to do so. It seems to follow that the smaller MTU (1340 to be exact) set on the overall CNI causes the elections to fail.

There are a number of things that I can do to improve the behavior of the cluster (such as PMTUD), but if Ceph is not going to work with a small MTU, all bets are off.

I tried looking for issues in tracker.ceph.com, but apparently I haven’t logged in there for a while and my account was deleted. I applied for a new one.

Any ideas what I can do here?

Thanks!
Brian
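In case it helps anyone comparing notes, a quick way to confirm what actually fits through the tunnel and what the overlay has left to work with. The interface name is just an example, and the numbers assume a 1340-byte path MTU with VxLAN adding roughly another 50 bytes on top of whatever the pod-side MTU is set to:

    # probe the WAN tunnel path MTU: 1312 bytes of payload + 28 bytes of IP/ICMP headers = 1340
    ping -M do -s 1312 <problem-node>

    # check/set the MTU on the overlay interface used by the CNI
    ip link show <cni-vxlan-interface>
    ip link set dev <cni-vxlan-interface> mtu 1290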
[ceph-users] Re: Understanding monitor requirements
> On Apr 11, 2020, at 5:54 PM, Anthony D'Atri wrote:
>
> Dumb question, can’t you raise the MTU of the tunnel?

I’m good with any question, and that got it, thank you! I’m not exactly sure what happened; I believe an MTU setting I tried didn’t actually take, or the CNI software was somehow not reloaded after I set it.

What a great adventure this has been to learn more of the internals of Ceph! Cheers all, hope all of you and yours are well.

best,
Brian
[ceph-users] Re: radosgw garbage collection seems stuck and manual gc process didn't work
Thanks a lot.

I'm not sure if the PR is https://github.com/ceph/ceph/pull/26601 ? And that has been backported to mimic in https://github.com/ceph/ceph/pull/27796, so it seems the cluster needs to be upgraded to 13.2.6 or higher.

After the upgrade, what else should I do? Should I manually execute the gc process to clean up those objects, or just let it run automatically?
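For what it's worth, a minimal way to check on things after the upgrade (the regular GC worker should drain the backlog on its own, but a pass can also be triggered by hand):

    # rough gauge of how much is still queued for GC
    radosgw-admin gc list | wc -l

    # trigger a GC pass manually, then re-check the count
    radosgw-admin gc process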