[ceph-users] Re: Best distro to run ceph.
Yes, I think the debate might be Debian vs Ubuntu. We've been a bit of a CentOS shop, but with Red Hat removing some of the support for last-gen high-end hardware, and the issues with CentOS 8 Stream, we may just need to keep our eyes open. Don't get me wrong, I'm using last-gen HPC kit, so it's not bad kit; it's just not new either.

Peter.

On Fri, 30 Apr 2021, 23:57 Mark Lehrer, wrote:

> I've had good luck with the Ubuntu LTS releases - no need to add extra
> repos. 20.04 uses Octopus.
>
> On Fri, Apr 30, 2021 at 1:14 PM Peter Childs wrote:
> >
> > I'm trying to set up a new Ceph cluster, and I've hit a bit of a blank.
> >
> > I started off with CentOS 7 and cephadm. It worked fine up to a point,
> > except that I had to upgrade podman, but it mostly worked with Octopus.
> >
> > Since this is a fresh cluster, and hence no data is at risk, I decided
> > to jump straight into Pacific when it came out and upgrade. That is
> > where my trouble began, mostly because Pacific needs a later version
> > of lvm than what's in CentOS 7.
> >
> > I can't upgrade to CentOS 8, as my boot drives are not supported by
> > CentOS 8 due to the way Red Hat disabled lots of disk drivers. I think
> > I'm looking at Ubuntu or Debian.
> >
> > Given that cephadm has a very limited set of dependencies, it would be
> > good to have a support matrix. It would also be good to have a check
> > in cephadm on upgrade that refuses to upgrade if the version of lvm2
> > is too low on any host, and lets the admin fix the issue and try again.
> >
> > I was thinking of upgrading to CentOS 8 for this project anyway, until
> > I realised that CentOS 8 can't support the hardware I've inherited.
> > But currently I've got a broken cluster unless I can work out some way
> > to upgrade lvm on CentOS 7.
> >
> > Peter.
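A minimal sketch of the preflight check described above, assuming passwordless SSH to every host; the host names and the minimum lvm2 version are placeholders, and this is not part of cephadm itself:

    #!/usr/bin/env bash
    # Hypothetical preflight check: refuse to upgrade if any host's lvm2
    # is older than MIN_LVM. Host names and MIN_LVM are placeholders.
    MIN_LVM=2.03.00
    for host in ceph-node-1 ceph-node-2 ceph-node-3; do
        ver=$(ssh "$host" lvm version | awk '/LVM version/ {print $3}')
        # sort -V puts the smaller version first; if that is not MIN_LVM,
        # this host is below the required version.
        if [ "$(printf '%s\n%s\n' "$MIN_LVM" "$ver" | sort -V | head -n1)" != "$MIN_LVM" ]; then
            echo "ABORT: $host has lvm2 $ver (need >= $MIN_LVM)" >&2
            exit 1
        fi
    done
    echo "All hosts meet lvm2 >= $MIN_LVM; safe to proceed."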
[ceph-users] Re: Best distro to run ceph.
Hello Peter,

if you want to get rid of all these troubles, just use croit. We ship a pre-built and tested Debian 10 (Buster) OS image, or optionally (in beta) an openSUSE Leap 15.2 based image, together with our deployment and management software. By the way, all of the OS installation and upgrade hassle is solved even with the free-forever version.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Fri, 30 Apr 2021 at 21:14, Peter Childs wrote:
>
> I'm trying to set up a new Ceph cluster, and I've hit a bit of a blank.
>
> I started off with CentOS 7 and cephadm. It worked fine up to a point,
> except that I had to upgrade podman, but it mostly worked with Octopus.
>
> Since this is a fresh cluster, and hence no data is at risk, I decided to
> jump straight into Pacific when it came out and upgrade. That is where my
> trouble began, mostly because Pacific needs a later version of lvm than
> what's in CentOS 7.
>
> I can't upgrade to CentOS 8, as my boot drives are not supported by
> CentOS 8 due to the way Red Hat disabled lots of disk drivers. I think
> I'm looking at Ubuntu or Debian.
>
> Given that cephadm has a very limited set of dependencies, it would be
> good to have a support matrix. It would also be good to have a check in
> cephadm on upgrade that refuses to upgrade if the version of lvm2 is too
> low on any host, and lets the admin fix the issue and try again.
>
> I was thinking of upgrading to CentOS 8 for this project anyway, until I
> realised that CentOS 8 can't support the hardware I've inherited. But
> currently I've got a broken cluster unless I can work out some way to
> upgrade lvm on CentOS 7.
>
> Peter.
[ceph-users] using ec pool with rgw
Hi,

I'm currently deploying a new cluster for cold storage with RGW. Is there a more elegant way to get the bucket data onto an erasure-coded pool than either moving the pool afterwards or creating the bucket.data pool prior to the first data upload?

Thanks,
Marco Savoca
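One way to do this (a sketch only: the pool names, EC profile parameters, and placement id below are illustrative) is to define a dedicated RGW placement target whose data pool is erasure-coded, and create cold-storage buckets against that placement:

    # Create an EC profile and an erasure-coded data pool for RGW.
    ceph osd erasure-code-profile set cold-ec-profile k=4 m=2
    ceph osd pool create cold.rgw.buckets.data 64 64 erasure cold-ec-profile
    ceph osd pool application enable cold.rgw.buckets.data rgw

    # Point a new placement target at the EC pool; the bucket index
    # stays on a replicated pool.
    radosgw-admin zonegroup placement add --rgw-zonegroup default \
        --placement-id cold-ec
    radosgw-admin zone placement add --rgw-zone default \
        --placement-id cold-ec \
        --data-pool cold.rgw.buckets.data \
        --index-pool default.rgw.buckets.index
    radosgw-admin period update --commit   # only needed with realms

New buckets can then select the target via the S3 LocationConstraint, e.g. "default:cold-ec", which avoids both moving pools and touching the default placement.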
[ceph-users] How can I get tail information for a parted rados object
Hello.

I'm trying to export objects from RADOS with "rados get". Some objects are bigger than 4M and they have tails. Is there an easy way to get the tail information for an object?

For example, this is an object:
- c106b26b.3_Img/2017/12/im034113.jpg

These are the object's parts:
- c106b26b.3__multipart_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1
- c106b26b.3__shadow_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1_1
- c106b26b.3__shadow_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1_2
- c106b26b.3__multipart_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.2

As you can see, the object has two multipart and two shadow objects. This jpg only works when I fetch all the parts and concatenate them in order:

    cat 9.1 9.1_1 9.1_2 9.2 > im034113.jpg

I'm trying to write code that reads objects from a list, finds all the parts, and joins them in the right order, but I couldn't find a good way to get the part information. I followed https://www.programmersought.com/article/31497869978/ and fetched the object manifest with getxattr, decoding it with "ceph-dencoder type RGWBucketEnt decode dump_json", but I cannot find a usable part path in that manifest, so it's not useful.

Is there a different place where I can get the part information for an object? Or better: is there a tool to export an object together with its tails?

By the way, these objects were created through RGW using S3, and RGW can no longer access them. That is why I'm trying to export them from RADOS and send them to a different RGW.
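For what it's worth, a rough sketch of the reassembly described above, driven by "rados ls". The pool and object names are taken from the example, and it assumes that the part/stripe number is always the last dot-separated field of the part name and that the head object carries no data (both hold for the names shown, but are assumptions):

    #!/usr/bin/env bash
    # Sketch: rebuild one multipart RGW object from its RADOS parts.
    # POOL, UPLOAD and OUT are illustrative values from the example.
    POOL=default.rgw.buckets.data
    UPLOAD='im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9'
    OUT=im034113.jpg

    : > "$OUT"
    tmp=$(mktemp)
    # List every multipart/shadow part of this upload, prefix each line
    # with its part/stripe suffix (1, 1_1, 1_2, 2, ...), version-sort on
    # that key, then fetch and append each part in order.
    rados -p "$POOL" ls \
      | grep -F "$UPLOAD" \
      | awk -F. '{print $NF, $0}' \
      | sort -V -k1,1 \
      | while read -r _ obj; do
            rados -p "$POOL" get "$obj" "$tmp"
            cat "$tmp" >> "$OUT"
        done
    rm -f "$tmp"

Separately, "radosgw-admin object stat --bucket <bucket> --object <key>" prints the decoded manifest, which may be easier than getxattr plus ceph-dencoder, assuming the bucket metadata is still readable.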
[ceph-users] Big OSD add, long backfill, degraded PGs, deep-scrub backlog, OSD restarts
Hello,

I recently added 2 OSD nodes to my Nautilus cluster, increasing the OSD count from 32 to 48 - all 12TB HDDs with NVMe for db. I generally keep an ssh session open where I can run 'watch ceph -s', and my observations are mostly based on what I saw from watching this.

Even with 10Gb networking, rebalancing 529 PGs took 10 days, during which there were always a few PGs undersized+degraded, frequent flashes of slow ops, occasional OSD restarts, and a steadily growing scrub and deep-scrub backlog. When the backfills completed I had 24 missed deep-scrubs and 10 missed scrubs.

I suspect that this is because of some settings I had fiddled with, so this post may be an advertisement for what not to do to your cluster. However, I'd like to know if my understanding is accurate. In short, I think I had my config set up so that there was contention from too many processes trying to do things to some OSDs all at once:

- osd_scrub_during_recovery: I think I had this set to true for the first 9 days, but set it to false when I started to realize that it might be causing contention.

- osd_max_scrubs: I had this set high - global: 30, osd: 10. At some earlier time, when I had a scrub backlog, I thought these were counts for simultaneous scrubs across all OSDs rather than per OSD. Now I see why the default is 1.
  - Assumption: on an HDD, multiple competing scrubs cause excessive seeking and thus compound the impact on scrub progress.

- osd_max_backfills: I had bumped this up as well - global: 30, osd: 10 - thinking it would speed up the rebalancing of my PGs onto the new OSDs.
  - The same thinking as for osd_max_scrubs applies: compounding contention, further compounded by the scrub activity that should have been inhibited by osd_scrub_during_recovery: false.

I believe that all of this also caused my EC PGs (8+2) to become degraded. My assumption here is that collisions between deep-scrubs and backfills sometimes locked the backfill process out of a piece of an EC PG, causing backfill to rebuild instead of copy.

The good news is that I haven't lost any data and, other than the scrub backlog, things seem to be working smoothly. With 1 or 2 scrubs (deep or regular) running, they take about 2 hours per scrub. As the scrubs progress, more scrub deadlines are missed, so it's not a steady march to zero.

Please feel free to comment. I'd be glad to know if I'm on the right track, as we expect the cluster to double in size over the next 12 to 18 months.

Thanks.

-Dave

--
Dave Hall
Binghamton University
kdh...@binghamton.edu
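For reference, a sketch of putting the settings discussed above back to conservative values using the Nautilus-era central config mechanism; the values shown are the shipped defaults:

    # Return scrub and backfill throttles to their defaults.
    ceph config set osd osd_max_scrubs 1
    ceph config set osd osd_max_backfills 1
    ceph config set osd osd_scrub_during_recovery false

    # Confirm what a given daemon actually sees:
    ceph config get osd.0 osd_max_scrubs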