That is just one user; at peak the cluster handles 1.1M read IOPS with 10 GiB/s
read throughput across 27-30 gateways, and around 20-50k write IOPS with
1.5-2 GiB/s write throughput.
I'll try increasing the index pool PG count, aiming for 400 PGs per NVMe.
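For reference, a minimal sketch of how that could look, assuming the index pool is named default.rgw.buckets.index (adjust to your zone) and that the autoscaler is turned off for the pool first; the target pg_num below is purely illustrative:
ceph osd pool set default.rgw.buckets.index pg_autoscale_mode off
ceph osd pool set default.rgw.buckets.index pg_num 8192    # pick a target derived from PGs-per-NVMe goal and replica size
ceph osd pool set default.rgw.buckets.index pgp_num 8192   # recent releases track this automatically, setting it is harmless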
Istvan
From: Frédéric Nas
Hi,
I have heard nothing on this, but have done some more research.
Again, both sides of a multisite s3 configuration are ceph 18.2.4 on Rocky 9.
For a given bucket, there are thousands of 'missing' objects. I did:
radosgw-admin bucket sync init --bucket --src-zone
Sync starts after I restart a r
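For context, a hedged sketch of the usual way to check where per-bucket sync stands afterwards; <bucket> stands in for the elided bucket name:
radosgw-admin bucket sync status --bucket=<bucket>   # shows which shards are behind
radosgw-admin sync status                            # overall zone sync state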
You might also first try
ceph osd down 1701
This marks the OSD down in the map; it doesn't restart anything, but it does
serve in some cases to goose progress. The OSD will quickly mark itself back
up.
Where 1701 is the ID of said primary.
ceph health detail
These mirrors will sync very soon and delete the tree as well. This needs to be
fixed on the ceph repo side.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Ben Zieglmeier
Sent: Thursday, November 14, 2024 1:50 PM
I redid what I did before. I changed the OSD class and had a look at
the storage and found a VM image! I deleted it (it was a copy) and now
the storage is empty. I guess the image (assigned to a non-existent
storage after reverting) left PGs that could not be moved. I'm
reverting again now.
How many RGW gateways? With 300 update requests per second, I would start by
increasing the number of shards.
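As a rough illustration only (bucket name and shard count are placeholders, not a recommendation, and resharding behaviour in multisite setups depends on the Ceph release):
radosgw-admin bucket reshard --bucket=<bucket> --num-shards=101
radosgw-admin reshard status --bucket=<bucket>   # check progress of the reshard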
Frédéric.
- On 14 Nov 24, at 13:33, Istvan Szabo, Agoda wrote:
> This bucket receives 300 post/put/delete a sec.
> I'll take a look at that, thank you.
> 37x4/nvme, however y
Hi Roland,
Yes, you can. See mclock documentation here [1].
One thing I can think of is that these 113 PGs may have a common misbehaving
OSD (primary or not) with a ridiculous osd_mclock_max_capacity_iops_ssd value
set.
Restarting the primary and/or adjusting osd_mclock_max_capacity_iops_ssd
v
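A hedged sketch of how one might inspect and clear that value; the OSD id 1701 is reused from earlier in the thread purely as an example:
ceph config dump | grep osd_mclock_max_capacity_iops         # spot outlier values across OSDs
ceph config show osd.1701 osd_mclock_max_capacity_iops_ssd   # what the daemon is actually using
ceph config rm osd.1701 osd_mclock_max_capacity_iops_ssd     # drop the stored value so it gets re-measured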
I was able to get what I needed from http://mirrors.gigenet.com/ceph/ (one
of the mirrors listed in the Ceph doco).
On Thu, Nov 14, 2024, 6:05 AM Frank Schilder wrote:
> Hi all,
>
> +1 from me
>
> this is a really bad issue. We need access to these packages very soon.
> Please restore this folde
This bucket receives 300 post/put/delete a sec.
I'll take a look at that, thank you.
37x4/nvme, however yes, I think we need to increase for now.
Thank you.
From: Frédéric Nass
Sent: Thursday, November 14, 2024 5:50 PM
To: Szabo, Istvan (Agoda)
Cc: Ceph Users
Su
Hi all,
+1 from me
this is a really bad issue. We need access to these packages very soon. Please
restore this folder.
In the meantime, is there a mirror somewhere?
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Fro
On 2024/11/14 11:44, Joachim Kraftmayer wrote:
I've seen similar behaviour when mclock is active.
For osd.0 I see:
osd.0 basic osd_mclock_max_capacity_iops_ssd 14305.161403
I'm unfamiliar with mclock. Can one tune that to improve the situation?
Roland
Joachim
joachim.kra
I don't know how many pools you have in your cluster, but ~37 PGs per OSD seems
quite low, especially with NVMes. You could try increasing the number of PGs on
this pool, and maybe on the data pool as well.
I don't know how many IOPS this bucket receives, but the fact that the index is
spread over only 11 r
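For what it's worth, a quick way to see the per-OSD PG count and the autoscaler's view of each pool (these commands only report, they change nothing):
ceph osd df                      # PGS column shows placement groups per OSD
ceph osd pool autoscale-status   # per-pool pg_num and autoscaler targets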
156x NVMe OSDs.
For sharding I aim for roughly 100k objects per shard. The default is 11
shards, but they don't have 1.1M objects.
This is the tree:
https://gist.github.com/Badb0yBadb0y/835a45f8e82ddfcbbd82cf28126da728
From: Frédéric Nass
Sent: Thursday, November 14, 2024 4:28 PM
To: Szab
I've seen similar behaviour when mclock is active.
Joachim
joachim.kraftma...@clyso.com
www.clyso.com
Hohenzollernstr. 27, 80801 Munich
Utting a. A. | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE2754306
Roland Giesler wrote on Thu., 14 Nov 2024, 05:40:
> On 2024/11/13 21:05, Anthony D'
Hi Istvan,
> The only thing I have in mind is to increase the replica size from 3 to 5 so
> it could tolerate more OSD slowness, with size 5 min_size 2.
I wouldn't do that; it will only get worse, as every write IO will have to wait
for 2 more OSDs to ACK, and the slow ops you've seen refer t
It's not clear to me if you wanted to add some more details after "I
see this:" (twice).
So you do see backfilling traffic if you out the OSD? Then maybe the
remapped PGs are not even on that OSD? Have you checked 'ceph pg ls
remapped'?
To drain an OSD, you can either set it "out" as you alre
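For completeness, a hedged sketch of the two usual ways to drain an OSD, with <id> as a placeholder:
ceph osd out <id>                      # stop new data, trigger backfill off the OSD
ceph osd crush reweight osd.<id> 0     # alternative: take its CRUSH weight to zero
ceph pg ls remapped                    # watch which PGs are actually moving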
I had attached images, but these are not shown...
On 2024/11/14 10:12, Roland Giesler wrote:
On 2024/11/14 09:37, Eugen Block wrote:
Remapped PGs is exactly what to expect after removing (or adding) a
device class. Did you revert the change entirely? It sounds like you
maybe forgot to add the
On 2024/11/14 09:37, Eugen Block wrote:
Remapped PGs is exactly what to expect after removing (or adding) a
device class. Did you revert the change entirely? It sounds like you
maybe forgot to add the original device class back to the OSD where
you changed it? Maybe share 'ceph osd tree'? Do yo
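In case it helps, a sketch of putting a device class back, assuming osd.0 and class nvme purely as examples:
ceph osd crush rm-device-class osd.0          # clear whatever class is currently set
ceph osd crush set-device-class nvme osd.0    # restore the original class
ceph osd tree                                 # verify the CLASS column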