[ceph-users] Re: Best distro to run ceph.

2021-05-01 Thread Peter Childs
Yes, I think the debate might be Debian vs Ubuntu.

We've been a bit of a CentOS shop, but with Red Hat removing some of the
support for last-gen high-end hardware, and the issues with CentOS 8 Stream,
we may just need to keep our eyes open.

Don't get me wrong, I'm using last-gen HPC kit, so it's not bad kit; it's
just not new either.

Peter.

On Fri, 30 Apr 2021, 23:57 Mark Lehrer,  wrote:

> I've had good luck with the Ubuntu LTS releases - no need to add extra
> repos.  20.04 uses Octopus.
>
> On Fri, Apr 30, 2021 at 1:14 PM Peter Childs  wrote:
> >
> > I'm trying to set up a new ceph cluster, and I've hit a bit of a blank.
> >
> > I started off with CentOS 7 and cephadm. It worked fine up to a point; I
> > had to upgrade podman, but it mostly worked with Octopus.
> >
> > Since this is a fresh cluster with no data at risk, I decided to jump
> > straight to Pacific when it came out and upgrade, which is where my
> > trouble began, mostly because Pacific needs a later version of LVM than
> > what ships in CentOS 7.
> >
> > I can't upgrade to CentOS 8 as my boot drives are not supported by CentOS 8
> > due to the way Red Hat disabled lots of disk drivers. I think I'm looking at
> > Ubuntu or Debian.
> >
> > Given cephadm has a very limited set of dependencies, it would be good to
> > have a supported matrix. It would also be good to have a check in cephadm
> > on upgrade that says "no, I won't upgrade if the version of lvm2 is too
> > low on any host" and lets the admin fix the issue and try again.
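> >
> > Something along these lines is roughly what I mean (just a sketch; the
> > host list and the minimum version below are placeholders, not anything
> > cephadm actually ships):
> >
> >     REQUIRED="2.03"                  # placeholder minimum lvm2 version
> >     HOSTS="host1 host2 host3"        # placeholder list of cluster hosts
> >     ok=1
> >     for h in $HOSTS; do
> >         # 'lvm version' prints the installed LVM version on each host
> >         v=$(ssh "$h" lvm version | awk '/LVM version/ {print $3}')
> >         if [ "$(printf '%s\n' "$REQUIRED" "$v" | sort -V | head -n1)" != "$REQUIRED" ]; then
> >             echo "lvm2 on $h is $v, older than $REQUIRED - fix and retry"
> >             ok=0
> >         fi
> >     done
> >     [ "$ok" = 1 ] || exit 1          # refuse to start the upgrade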
> >
> > I was thinking of upgrading to CentOS 8 for this project anyway, until I
> > realised that CentOS 8 can't support the hardware I've inherited. But
> > currently I've got a broken cluster unless I can work out some way to
> > upgrade LVM in CentOS 7.
> >
> > Peter.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Best distro to run ceph.

2021-05-01 Thread Martin Verges
Hello Peter,

if you want to get rid of all these troubles, just use croit. We bring a
pre-built and tested Debian 10 (Buster) OS, or optionally (in beta) an
openSUSE Leap 15.2 based image, with our deployment and management
software.

Btw, all the OS, installation, and upgrade hassle is solved with the
free-forever version.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Fri, 30 Apr 2021 at 21:14, Peter Childs  wrote:
>
> I'm trying to set up a new ceph cluster, and I've hit a bit of a blank.
>
> I started off with CentOS 7 and cephadm. It worked fine up to a point; I
> had to upgrade podman, but it mostly worked with Octopus.
>
> Since this is a fresh cluster with no data at risk, I decided to jump
> straight to Pacific when it came out and upgrade, which is where my
> trouble began, mostly because Pacific needs a later version of LVM than
> what ships in CentOS 7.
>
> I can't upgrade to CentOS 8 as my boot drives are not supported by CentOS 8
> due to the way Red Hat disabled lots of disk drivers. I think I'm looking at
> Ubuntu or Debian.
>
> Given cephadm has a very limited set of dependencies, it would be good to
> have a supported matrix. It would also be good to have a check in cephadm
> on upgrade that says "no, I won't upgrade if the version of lvm2 is too
> low on any host" and lets the admin fix the issue and try again.
>
> I was thinking of upgrading to CentOS 8 for this project anyway, until I
> realised that CentOS 8 can't support the hardware I've inherited. But
> currently I've got a broken cluster unless I can work out some way to
> upgrade LVM in CentOS 7.
>
> Peter.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] using ec pool with rgw

2021-05-01 Thread Marco Savoca
Hi,

I’m currently deploying a new cluster for cold storage with rgw. 

Is there actually a more elegant method to get the bucket data onto an
erasure-coded pool, other than moving the pool or creating the bucket.data
pool prior to data upload?
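
For reference, the "create the pool prior to upload" variant I mean looks
roughly like this (only a sketch; the pool names, EC profile, and placement
id are placeholders):

    # create an EC profile and the data pool before any upload
    ceph osd erasure-code-profile set cold-ec k=4 m=2
    ceph osd pool create default.rgw.cold.data 64 64 erasure cold-ec
    ceph osd pool application enable default.rgw.cold.data rgw

    # point a placement target's data pool at it (index stays replicated)
    radosgw-admin zonegroup placement add --rgw-zonegroup default \
        --placement-id cold-placement
    radosgw-admin zone placement add --rgw-zone default \
        --placement-id cold-placement \
        --data-pool default.rgw.cold.data \
        --index-pool default.rgw.buckets.index \
        --data-extra-pool default.rgw.buckets.non-ec

    # restart the radosgw daemons, then create buckets with, e.g., the S3
    # LocationConstraint "default:cold-placement"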

Thanks,

Marco Savoca
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] How can I get tail information for a parted rados object

2021-05-01 Thread by morphin
Hello.

I'm trying to export objects from rados with rados get. Some objects are
bigger than 4M and have tails. Is there any easy way to get the tail
information for an object?

For example, this is an object:
- c106b26b.3_Img/2017/12/im034113.jpg
These are the object parts:
- c106b26b.3__multipart_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1
- c106b26b.3__shadow_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1_1
- c106b26b.3__shadow_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1_2
- c106b26b.3__multipart_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.2

As you can see, the object has 2 multipart and 2 shadow objects.
This jpg only works when I get all the parts and join them in the right order:
"cat 9.1 9.1_1 9.1_2 9.2 > im034113.jpg"

I'm trying to write code that reads objects from a list, finds all the
parts, and brings them together in the right order, but I couldn't find a
good way to get the part information.
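
Roughly, the kind of thing I mean is below (only a sketch; the pool name is
an assumption, and the part list still has to be put into stripe order,
e.g. from the suffixes or the manifest, before concatenating):

    POOL=default.rgw.buckets.data      # assumed RGW data pool name
    PREFIX='c106b26b.3'                # bucket marker from the example above
    NAME='Img/2017/12/im034113.jpg'

    # find every multipart/shadow piece belonging to this object
    rados -p "$POOL" ls | grep -F "${PREFIX}__" | grep -F "$NAME" > parts.txt

    # parts.txt must be sorted into stripe order first
    # (multipart .1, shadow .1_1, .1_2, multipart .2, ...)
    : > im034113.jpg
    while read -r part; do
        rados -p "$POOL" get "$part" /tmp/part.bin
        cat /tmp/part.bin >> im034113.jpg
    done < parts.txt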

I followed the link https://www.programmersought.com/article/31497869978/
and got the object manifest with getxattr and decoded it with
"ceph-dencoder type RGWBucketEnt decode dump_json",
but in the manifest I cannot find anything I can use in code; it's not
useful. Is there any other place from which I can get the part information
for an object?

Or better! Is there any tool to export an object with its tails?

btw: these objects were created by RGW using S3. RGW cannot access these
files, so I'm trying to export them from rados and send them to a
different RGW.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Big OSD add, long backfill, degraded PGs, deep-scrub backlog, OSD restarts

2021-05-01 Thread Dave Hall
Hello,

I recently added 2 OSD nodes to my Nautilus cluster, increasing the OSD
count from 32 to 48 - all 12TB HDDs with NVMe for db.

I generally keep an ssh session open where I can run 'watch ceph -s'.  My
observations are mostly based on what I saw from watching this.

Even with 10Gb networking, rebalancing 529 PGs took 10 days, during which
there were always a few PGs undersized+degraded, frequent flashes of slow
ops, and occasional OSD restarts, and the scrub and deep-scrub backlog
steadily increased.  When the backfills completed I had 24 missed
deep-scrubs and 10 missed scrubs.

I suspect that this is because of some settings that I had fiddled with, so
this post may be an advertisement for what not to do to your cluster.
However, I'd like to know if my understanding is accurate.  I believe that
my settings resulted in the contention described below.

In short, I think I had my config set up so there was contention due to too
many processes trying to do things to some OSDs all at once (putting these
settings back is sketched after the list below):

   - osd_scrub_during_recovery:  I think I had this set to true for the
   first 9 days, but set it to false when I started to realize that it might
   be causing contention.
   - osd_max_scrubs:  I had this set high - global:30 osd:10.  At some
   earlier time when I had a scrub backlog I thought that these were counts
   for simultaneous scrubs across all OSDs rather than 'per OSD'.
  - Now I see why the default is 1.
  - Assumption:  on an HDD, multiple competing scrubs cause excessive
  seeking and thus compound the impact on scrub progress.
   - osd_max_backfills:  I had bumped this up as well - global:30 osd:10,
   thinking it would speed up the rebalancing of my PGs onto my new OSDs.
  - Now, the same thinking as for osd_max_scrubs:  compounding
  contention, further compounded by the scrub activity that should have been
  inhibited by osd_scrub_during_recovery:false.
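
The reset I mentioned above would be roughly the following (a sketch only;
1 and false are the shipped defaults, as far as I can tell):

    # put the contention-related settings back to conservative values
    ceph config set osd osd_max_scrubs 1
    ceph config set osd osd_max_backfills 1
    ceph config set osd osd_scrub_during_recovery false

    # or simply drop my global overrides entirely
    ceph config rm global osd_max_scrubs
    ceph config rm global osd_max_backfills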

I believe that all of this also resulted in my EC PGs (8 + 2) becoming
degraded.  My assumption here is that collisions between deep-scrubs and
backfills sometimes locked the backfill process out of a piece of an EC PG,
causing backfill to rebuild instead of copy.

The good news is that I haven't lost any data and, other than the scrub
backlog, things seem to be working smoothly.  It seems like with 1 or 2
scrubs (deep or regular) running, they are taking about 2 hours per scrub.
As the scrubs progress, more scrub deadlines are missed, so it's not a
steady march to zero.
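
To keep an eye on the backlog, something along these lines should do (a
sketch; the health wording is what I believe Nautilus prints):

    # list the PGs that are behind on (deep-)scrub
    ceph health detail | grep -E 'not (deep-)?scrubbed'

    # optionally kick one manually now that backfill has finished
    ceph pg deep-scrub <pgid>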

Please feel free to comment.  I'd be glad to know if I'm on the right track
as we expect the cluster to double in size over the next 12 to 18 months.

Thanks.

-Dave

--
Dave Hall
Binghamton University
kdh...@binghamton.edu
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io