[ceph-users] Re: RGW returning HTTP 500 during resharding

2024-09-28 Thread Anthony D'Atri
> On Sep 28, 2024, at 5:21 PM, Floris Bos wrote: > > "Anthony D'Atri" wrote on 28 September 2024 16:24: >>> No retries. >>> Is it expected that resharding can take so long? >>> (in a setup with all NVMe drives) >> >> Which drive SKU(s)? How full are they? Is their firmware up to date? How ...

[ceph-users] Re: RGW returning HTTP 500 during resharding

2024-09-28 Thread Floris Bos
"Anthony D'Atri" schreef op 28 september 2024 16:24: >> No retries. >> Is it expected that resharding can take so long? >> (in a setup with all NVMe drives) > > Which drive SKU(s)? How full are they? Is their firmware up to date? How many > RGWs? Have you tuned > your server network stack? Disab

[ceph-users] RGW Graphs in cephadm setup

2024-09-28 Thread bkennedy
We recently upgraded all our clusters to Rocky 9.4 and Reef 18.2.4. Two of the clusters show the RGW metrics in the Ceph dashboard and the other two don't. I made sure the firewalls were open for ceph-exporter and that Prometheus was gathering the stats on all 4 clusters. For the clusters that ...
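One way to narrow this down is to query the ceph-exporter endpoint directly on a cluster that is missing the graphs and see whether any RGW series are exposed at all. A minimal sketch in Java (matching the SDK language used elsewhere in these threads); the hostname is a placeholder and the 9926 port is an assumption based on the usual cephadm ceph-exporter default, so adjust both for your deployment:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Fetch the ceph-exporter metrics page and count RGW series.
public class CheckRgwMetrics {
    public static void main(String[] args) throws Exception {
        // Hypothetical host; 9926 is assumed to be the ceph-exporter port.
        String url = args.length > 0 ? args[0] : "http://ceph-node1:9926/metrics";
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        long rgwLines = response.body().lines()
                .filter(l -> l.startsWith("ceph_rgw"))
                .count();
        System.out.println("HTTP " + response.statusCode()
                + ", ceph_rgw* lines: " + rgwLines);
    }
}

If this prints zero RGW lines on the clusters without graphs, the gap is on the exporter side; if the series are there, the next place to look is the Prometheus scrape targets and the dashboard's Grafana/Prometheus configuration.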

[ceph-users] Re: RGW returning HTTP 500 during resharding

2024-09-28 Thread Anthony D'Atri
> > No retries. > Is it expected that resharding can take so long? > (in a setup with all NVMe drives) Which drive SKU(s)? How full are they? Is their firmware up to date? How many RGWs? Have you tuned your server network stack? Disabled Nagle? How many bucket OSDs? How many index OSDs?

[ceph-users] RGW returning HTTP 500 during resharding

2024-09-28 Thread Floris Bos
Hi, I am attempting to load a large number (billions) of tiny objects from another existing system into Ceph (18.2.4 Reef) RadosGW/S3. I wrote a little program for that using the AWS S3 Java SDK (using the CRT-based S3AsyncClient), and am using 250 simultaneous connections to get the data in. However ...
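Since the reshard surfaces to the client as a plain HTTP 500, one client-side mitigation is to let the CRT client retry those responses instead of failing the upload immediately (the "No retries" mentioned above). A minimal sketch of how that might look with the AWS SDK v2 CRT builder; the endpoint, credentials, and retry count are placeholders, and retryConfiguration() is only present in reasonably recent SDK versions:

import java.net.URI;

import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.services.s3.S3CrtRetryConfiguration;

public class RgwClientFactory {
    public static S3AsyncClient build() {
        return S3AsyncClient.crtBuilder()
                // Placeholder RGW endpoint and credentials -- substitute your own.
                .endpointOverride(URI.create("http://rgw.example.com:8080"))
                .credentialsProvider(StaticCredentialsProvider.create(
                        AwsBasicCredentials.create("ACCESS_KEY", "SECRET_KEY")))
                .region(Region.US_EAST_1)   // RGW ignores the region, but the SDK requires one
                .forcePathStyle(true)       // typical for RadosGW without wildcard DNS
                .maxConcurrency(250)        // matches the 250 simultaneous connections above
                // Retry 500s from RGW (e.g. while a bucket reshard is in progress)
                // instead of surfacing them to the application right away.
                .retryConfiguration(S3CrtRetryConfiguration.builder()
                        .numRetries(10)
                        .build())
                .build();
    }
}

Whether retries alone are enough depends on how long each reshard takes; with a known object count it may also help to create the bucket with enough index shards up front so dynamic resharding never kicks in during the bulk load.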