A short correction:
The IOPS from the bench in our pacific cluster are also down to 40 again
for the 4/8TB disks, but the apply latency seems to stay in the same place.
I still don't understand why they dropped again. Even when I fully drained
the OSD so it receives no traffic, it is still slow. After idling overnight
it is back up to 120 IOPS.
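
For reference, the numbers above come from the built-in OSD bench; a minimal
way to reproduce them (osd.12 is just a placeholder id):

    ceph tell osd.12 bench    # writes the default 1 GiB in 4 MiB objects and reports IOPS
    ceph osd perf             # per-OSD commit/apply latency for comparison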

Am Do., 30. März 2023 um 09:45 Uhr schrieb Boris Behrens <b...@kervyn.de>:

> After some digging in the nautilus cluster I see that the disks with the
> exceptionally high IOPS performance are actually SAS-attached NVMe disks
> (these:
> https://semiconductor.samsung.com/ssd/enterprise-ssd/pm1643-pm1643a/mzilt7t6hala-00007/
> ) and these disks make up around 45% of the cluster capacity. Maybe this
> explains the very low commit latency in the nautilus cluster.
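>
> (For anyone who wants to check their own cluster: the backing device model
> and rotational flag per OSD can be read from the OSD metadata; osd.0 below
> is just an example id.)
>
>     ceph osd metadata 0 | grep -E 'device_ids|rotational'
>     ceph osd tree    # also shows the device class per OSD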
>
> I did a bench on all SATA 8TB disks (nautilus) and almost all of them only
> reach ~30-50 IOPS.
> After redeploying one OSD with blkdiscard, the IOPS went from 48 to 120.
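>
> (Roughly what the redeploy looked like - a sketch only; the device path and
> OSD id are placeholders and the exact steps depend on how the OSDs were
> deployed:)
>
>     ceph osd out 42                                      # drain the OSD first
>     ceph osd destroy 42 --yes-i-really-mean-it           # after stopping the daemon
>     blkdiscard /dev/sdX                                  # trim the whole device
>     ceph-volume lvm create --osd-id 42 --data /dev/sdX   # redeploy on the trimmed disk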
>
> The IOPS from the bench in our pacific cluster are also down to 40 again
> for the 4/8TB disks, but the apply latency seems to stay in the same place.
> I still don't understand why they dropped again. Even when I fully drained
> the OSD so it receives no traffic, it is still slow.
>
> I am unsure how I should interpret this. It also looks like the AVG apply
> latency (4h resolution) is going up again (2023-03-01 was the upgrade to
> pacific, the dip around the 25th was the redeploy, and now it seems to go
> up again).
> [image: image.png]
>
>
>
> Am Mo., 27. März 2023 um 17:24 Uhr schrieb Igor Fedotov <
> igor.fedo...@croit.io>:
>
>>
>> On 3/27/2023 12:19 PM, Boris Behrens wrote:
>>
>> Nonetheless, the IOPS that the bench command generates are still VERY low
>> compared to the nautilus cluster (~150 vs ~250). But this is something I
>> would pin on this bug: https://tracker.ceph.com/issues/58530
>>
>> I've just run "ceph tell bench" against the main, octopus and nautilus
>> branches (fresh OSD deployed with vstart.sh) - I don't see any difference
>> between releases - a SATA drive shows around 110 IOPS in my case.
>>
>> So I suspect some difference between the clusters in your case. E.g. are
>> you sure disk caching is off for both?
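>>
>> A quick way to check the volatile write cache (sdX is just a placeholder;
>> hdparm for SATA, sdparm for SAS drives):
>>
>>     hdparm -W /dev/sdX            # show write-cache state
>>     hdparm -W 0 /dev/sdX          # disable it
>>     smartctl -g wcache /dev/sdX   # alternative query via smartctl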
>>
>> @Igor do you want me to update the ticket with my findings and the logs
>> from pastebin?
>>
>> Feel free to update it if you like, but IMO we still lack an understanding
>> of what the trigger for the perf improvements in your case was - OSD
>> redeployment, disk trimming, or both?
>>
>>
>>

-- 
The self-help group "UTF-8 problems" will, as an exception, meet in the
large hall this time.