Just upgraded from Ceph Nautilus to Ceph Octopus on Ubuntu 18.04 using the
standard Ubuntu packages from the Ceph repo.
The upgrade has gone OK, but we are having issues with our radosgw service,
which eventually fails after some load. Here's what we see in the logs:
2021-10-05T15:55:16.328-0400 7fa47700
oad' and bouncing the radosgw
process and we seem to be humming along nicely now.
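In case it helps, bouncing the gateway on an Ubuntu/systemd host is roughly the sketch below; the unit name pattern is an assumption and depends on how your rgw instance is named:

# List the radosgw units on this host (unit name pattern is an assumption)
systemctl list-units 'ceph-radosgw@*'
# Restart and then verify the gateway process
sudo systemctl restart ceph-radosgw@rgw.$(hostname -s).service
sudo systemctl status ceph-radosgw@rgw.$(hostname -s).service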
On Tue, Oct 5, 2021 at 4:55 PM shubjero wrote:
>
> Just upgraded from Ceph Nautilus to Ceph Octopus on Ubuntu 18.04 using
> standard ubuntu packages from the Ceph repo.
>
> Upgrade has gone OK
We've done 14.04 -> 16.04 -> 18.04 -> 20.04, all at various stages of our
Ceph cluster's life.
The latest 18.04 to 20.04 upgrade was painless and we ran:
apt update && apt dist-upgrade -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold"
do-release-upgrade --allow-third-party -f D
Hey all,
Recently upgraded to Ceph Octopus (15.2.14). We also run Zabbix 5.0.15 and
have had Ceph/Zabbix monitoring for a long time. After the Octopus update I
installed the latest version of the Ceph template in Zabbix
(https://github.com/ceph/ceph/blob/master/src/pybind/mgr/zabbix/zabbix_templ
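For anyone setting this up from scratch, the mgr-side wiring is roughly the sketch below; the Zabbix server name and host identifier are placeholders, not our actual values:

# Enable the mgr zabbix module (zabbix_sender must be installed on the mgr hosts)
ceph mgr module enable zabbix
# Point it at the Zabbix server and the host entry that uses the Ceph template
ceph zabbix config-set zabbix_host zabbix.example.com
ceph zabbix config-set identifier ceph-cluster
# Review the settings and push a first batch of data immediately
ceph zabbix config-show
ceph zabbix send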
Hi all,
I have a 39-node, 1404-spinning-disk Ceph Mimic cluster across 6 racks, for
a total of 9.1 PiB raw and about 40% utilized. These storage nodes started
their life on Ubuntu 14.04 and were in-place upgraded to 16.04 two years
ago; however, I have started a project to do fresh installs of
each OSD node
Good day,
I am having an issue with some multipart uploads to radosgw. I
recently upgraded my cluster from Mimic to Nautilus and began having
problems with multipart uploads from clients using the Java AWS SDK
(specifically 1.11.219). I do NOT have issues with multipart uploads
with other clients
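For anyone trying to reproduce this outside of the Java SDK, forcing a specific multipart part size with the aws CLI looks roughly like the sketch below; the endpoint, bucket, and sizes are placeholders:

# Force a large multipart part size for the default profile (placeholder sizes)
aws configure set default.s3.multipart_threshold 64MB
aws configure set default.s3.multipart_chunksize 512MB
# Upload a large test object against the radosgw endpoint (placeholder URL/bucket)
dd if=/dev/urandom of=/tmp/mp-test bs=1M count=2048
aws --endpoint-url https://objects.example.com s3 cp /tmp/mp-test s3://test-bucket/mp-test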
, Sep 2, 2020 at 3:15 PM shubjero wrote:
>
> Good day,
>
> I am having an issue with some multipart uploads to radosgw. I
> recently upgraded my cluster from Mimic to Nautilus and began having
> problems with multipart uploads from clients using the Java AWS SDK
> (specificall
Our object-storage endpoint FQDN round-robins in DNS to 2 IPs. Those 2 IPs
are managed by keepalived across 3 servers running haproxy; each haproxy
instance listens on one of the round-robined IPs and load balances to 5
servers running radosgw.
On Fri, Sep 4, 2020 at 12:35 PM Oliv
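The haproxy side of that is roughly the sketch below; IPs, ports, and server names are illustrative, not our actual config:

frontend rgw_frontend
    mode http
    bind 192.0.2.10:80          # one of the two keepalived VIPs
    default_backend rgw_backend

backend rgw_backend
    mode http
    balance roundrobin
    option httpchk GET /        # radosgw answers an anonymous GET / with an XML listing
    server rgw1 10.0.0.11:8080 check
    server rgw2 10.0.0.12:8080 check
    server rgw3 10.0.0.13:8080 check
    server rgw4 10.0.0.14:8080 check
    server rgw5 10.0.0.15:8080 check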
Hey all,
I'm creating a new post for this issue as we've narrowed the problem
down to a part-size limitation on multipart upload. We have discovered
in both our production Nautilus (14.2.11) cluster and our lab Nautilus
(14.2.10) cluster that multipart uploads with a configured part size
of greater
) breaks multipart uploads.
On Tue, Sep 8, 2020 at 12:12 PM shubjero wrote:
>
> Hey all,
>
> I'm creating a new post for this issue as we've narrowed the problem
> down to a partsize limitation on multipart upload. We have discovered
> that in our production Nautilus (
Will do Matt
On Tue, Sep 8, 2020 at 5:36 PM Matt Benjamin wrote:
>
> thanks, Shubjero
>
> Would you consider creating a ceph tracker issue for this?
>
> regards,
>
> Matt
>
> On Tue, Sep 8, 2020 at 4:13 PM shubjero wrote:
> >
> > I had been looking
> rgw_max_chunk_size > rgw_put_obj_min_window_size,
> because we try to write in units of chunk size but the window is too
> small to write a single chunk.
>
> On Wed, Sep 9, 2020 at 8:51 AM shubjero wrote:
> >
> > Will do Matt
> >
> > On Tue, Sep 8, 2020 at 5:36
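A quick way to check those two values on a running gateway is via the admin socket, roughly as below; the socket name is an assumption and depends on how the rgw instance is named:

# Inspect the effective values on a running radosgw (socket name is an assumption)
ceph daemon /var/run/ceph/ceph-client.rgw.$(hostname -s).asok config show \
  | grep -E 'rgw_max_chunk_size|rgw_put_obj_min_window_size'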
I'm having similar ceph-mgr stability problems since upgrading from 13.2.5
to 13.2.6. I have isolated the crashing to the prometheus module being
enabled, and I notice much better stability when the prometheus module is
NOT enabled. No more failovers; however, I do notice that even with pr
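For reference, toggling the module is roughly:

# See which mgr modules are enabled
ceph mgr module ls
# Disable the prometheus exporter that correlates with the crashes
ceph mgr module disable prometheus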
Good day,
We have a Ceph cluster and make use of object storage integrated with
OpenStack. Each OpenStack project/tenant is given a radosgw user, which
allows all Keystone users of that project to access object storage as
that single radosgw user. The radosgw user is the
project id of the Op
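To illustrate, the per-project radosgw user can be inspected like the sketch below; the uid shown is a made-up example, not a real project id:

# The rgw uid is the OpenStack project id (example value only)
radosgw-admin user info --uid=9c3f5e2ab1d94d0cbf1e5a7d8e2f4a6b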
Hey all,
Yesterday our cluster went into HEALTH_WARN due to 1 large omap
object in the .usage pool (I've posted about this in the past). Last
time we resolved the issue by trimming the usage log below the alert
threshold, but this time it seems like the alert won't clear even after
trimming and (th
rados -p .usage listomapkeys usage.22
root@infra:~#
On Thu, Sep 19, 2019 at 12:54 PM Charles Alva wrote:
>
> Could you please share how you trimmed the usage log?
>
> Kind regards,
>
> Charles Alva
> Sent from Gmail Mobile
>
>
> On Thu, Sep 19, 2019 at 11:4
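For reference, trimming the usage log is typically along these lines; the dates below are placeholders rather than the exact command we ran:

# Optionally summarize usage before trimming
radosgw-admin usage show --show-log-entries=false
# Trim usage log entries in the given window (placeholder dates)
radosgw-admin usage trim --start-date=2018-01-01 --end-date=2019-08-31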
> issued or cleared during scrub, so I'd expect them to go away the next
> time the usage objects get scrubbed.
>
> On 9/20/19 2:31 PM, shubjero wrote:
> > Still trying to solve this one.
> >
> > Here is the corresponding log entry when the large omap object was
The deep scrub of the PG updated the cluster state to reflect that the large omap object was gone.
HEALTH_OK!
On Fri., Sep. 20, 2019, 2:31 p.m. shubjero, wrote:
> Still trying to solve this one.
>
> Here is the corresponding log entry when the large omap object was found:
>
> ceph-osd.1284.log.2.gz:2
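For anyone hitting the same thing, kicking off the deep scrub manually instead of waiting looks roughly like this; the pg id is a placeholder taken from wherever the large omap warning points:

# Deep-scrub the PG that holds the offending usage object (placeholder pg id)
ceph pg deep-scrub 37.1a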
Hi all,
I'm running a Ceph Mimic 13.2.6 cluster and we use the ceph-balancer
in upmap mode. This cluster is fairly old, and pre-Mimic we used to set
OSD reweights to balance the standard deviation of the cluster. Since
moving to Mimic about 9 months ago I enabled the ceph-balancer with
upmap mode a
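For context, turning the balancer on in upmap mode is roughly:

# upmap requires all clients to be luminous or newer
ceph osd set-require-min-compat-client luminous
ceph balancer mode upmap
ceph balancer on
ceph balancer status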
Right, but should I be proactively returning any reweighted OSDs that
are not 1.0 back to 1.0?
On Wed, Feb 26, 2020 at 3:36 AM Konstantin Shalygin wrote:
>
> On 2/26/20 3:40 AM, shubjero wrote:
> > I'm running a Ceph Mimic cluster 13.2.6 and we use the ceph-balancer
>
I talked to some guys on IRC about going back over the OSDs with a
non-1.0 reweight and setting them to 1.0.
I went from a standard deviation of 2+ to 0.5.
Awesome.
On Wed, Feb 26, 2020 at 10:08 AM shubjero wrote:
>
> Right, but should I be proactively returning any reweighted OSD's that
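What that looked like in practice is roughly the sketch below; the osd id is a placeholder, and the real list comes from the REWEIGHT column:

# Find OSDs whose legacy reweight is not 1.0 (REWEIGHT column)
ceph osd df tree
# Return one of them to 1.0 (placeholder osd id); repeat for each non-1.0 OSD
ceph osd reweight 123 1.0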
I've reported stability problems with ceph-mgr with the prometheus plugin
enabled on all versions we ran in production, which were several
versions of Luminous and Mimic. Our solution was to disable the
prometheus exporter; I am using Zabbix instead. Our cluster is 1404
OSDs in size with about 9PB raw wi