[ceph-users] Snaptrim making cluster unusable

2021-01-10 Thread Pascal Ehlert
Hi all, We are running a small cluster with three nodes and 6-8 OSDs each. The OSDs are SSDs with sizes from 2 to 4 TB. The CRUSH map is configured so that all data is replicated to each node. The Ceph version is 15.2.6. Today I deleted four snapshots of the same two 400 GB and 500 GB RBD volumes. Shor…
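As a rough sketch of the situation described above (pool, image, and snapshot names are hypothetical, not from the thread), the snapshot removal and the resulting trim activity could be observed with the standard rbd and ceph CLIs roughly like this:

    rbd snap rm rbd/vm-disk-a@daily-2021-01-09    # delete one of the large snapshots
    ceph pg ls snaptrim                           # PGs actively trimming
    ceph pg ls snaptrim_wait                      # PGs queued for trimming
    ceph -s                                       # overall health while trimming runs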

[ceph-users] Re: Snaptrim making cluster unusable

2021-01-10 Thread Pascal Ehlert
Hi Frank, thanks for getting back!
- ceph version: 15.2.6 (now upgraded to 15.2.8, and I was able to reproduce the issue)
- rbd image config (meta- and data pool the same/different?): We are not using EC but regular replicated pools, so I assume meta and data pool are the same?
- how many PGs d…
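A small sketch of how one could verify the meta/data-pool question and the PG counts asked about above (pool and image names are placeholders):

    rbd info rbd/vm-disk-a       # prints a "data_pool:" line only if a separate data pool is configured
    ceph osd pool ls detail      # pg_num and replicated size per pool
    ceph osd df tree             # PG count and utilization per OSD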

[ceph-users] Re: Snaptrim making cluster unusable

2021-01-10 Thread Pascal Ehlert
I made the suggested changes. (Un)fortunately I am not able to reproduce the issue anymore, neither with the original settings nor with the updated ones. This may be because the problematic snapshots have now been removed/trimmed. When I make new snapshots of the same volumes, they a…

[ceph-users] Re: Snaptrim making cluster unusable

2021-01-10 Thread Frank Schilder
Hi Pascal, can you add a bit more information:
- ceph version
- rbd image config (meta- and data pool the same/different?)
- how many PGs do the affected pools have
- how many PGs per OSD (as stated by ceph osd df tree)
- what type of SSDs, do they have power loss protection, is write cache disabled…
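A minimal, hedged sketch of commands that would gather most of the requested details (device path and pool/image names are placeholders; power loss protection usually has to be looked up in the vendor datasheet):

    ceph versions                  # running daemon versions
    rbd info <pool>/<image>        # image features and (if set) separate data pool
    ceph osd pool ls detail        # PG count per pool
    ceph osd df tree               # PGs per OSD
    smartctl -i /dev/sdX           # SSD model and firmware
    smartctl -g wcache /dev/sdX    # volatile write cache state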

[ceph-users] Re: Snaptrim making cluster unusable

2021-01-10 Thread Frank Schilder
>> - do you have bluefs_buffered_io set to true
> No
Try setting it to true.
> Is there anything specific I can do to check the write cache configuration?
Yes, "smartctl -g wcache DEVICE" will tell you whether the writeback cache is disabled. If not, use "smartctl -s wcache,off DEVICE" to disable it. No…
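Putting the two suggestions together, a sketch of the corresponding commands (the device path is a placeholder; on Octopus the bluefs_buffered_io change may only take effect after an OSD restart):

    ceph config set osd bluefs_buffered_io true   # persist the setting in the mon config store
    smartctl -g wcache /dev/sdX                   # check whether the volatile write cache is on
    smartctl -s wcache,off /dev/sdX               # disable it (note the comma syntax)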

[ceph-users] Re: Snaptrim making cluster unusable

2021-01-10 Thread Frank Schilder
An SSD does not automatically mean high performance: https://yourcmc.ru/wiki/index.php?title=Ceph_performance&mobileaction=toggle_view_desktop Depending on your model, performance can be very poor. Best regards, Frank Schilder, AIT Risø Campus, Bygning 109, rum S14
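The linked page bases its verdicts largely on the latency of small synchronous writes, which it argues is what matters for Ceph. A hedged sketch of such a test with fio (this writes to the target, so only run it against a disk or file whose contents you can destroy):

    fio --name=synctest --filename=/dev/sdX --ioengine=libaio \
        --direct=1 --sync=1 --rw=randwrite --bs=4k --iodepth=1 \
        --runtime=30 --time_based

Consumer drives without power loss protection typically score far lower here than in cached benchmarks, which matches the "very poor" behavior mentioned above.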

[ceph-users] Re: Snaptrim making cluster unusable

2021-01-10 Thread Anthony D'Atri
When the page below was first published, my team tried to reproduce its results and couldn't. A couple of factors likely contribute to differing behavior:
* _Micron 5100_, for example, isn't a single model; the 5100 _Eco_, _Pro_, and _Max_ are different beasts. Similarly, implementation and firmware details vary by dri…

[ceph-users] Which version of Ceph fully supports CephFS Snapshot?

2021-01-10 Thread fantastic2085
I would like to use the CephFS snapshot feature. Which version of Ceph fully supports it?
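For reference, CephFS snapshots were declared stable in the Mimic (13.2.x) release; on current releases they are used roughly as sketched below (mount point and directory names are placeholders):

    ceph fs set cephfs allow_new_snaps true        # only needed where snapshots are still disabled
    mkdir /mnt/cephfs/data/.snap/before-upgrade    # create a snapshot of /mnt/cephfs/data
    rmdir /mnt/cephfs/data/.snap/before-upgrade    # delete it again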