Hi,

You should upgrade to the latest firefly release. You're probably suffering from the known issue with snapshot trimming.

Cheers,
Dan

On Jul 4, 2015 10:19, "Eino Tuominen" <[email protected]> wrote:
>
> Hello,
>
> We are running 0.80.5 on our production cluster and we are seeing slow requests when deleting rbd snapshots. We have now reduced snapshot counts to 4 weeklies but it seems that the snapshot count is not a factor of this problem. The cluster is practically unresponsive so long that clients timeout.
>
> Here are top ten slowest requests per osd from last night (times in seconds):
>
>  1 /var/log/ceph/ceph-osd.46.log 1920
>  2 /var/log/ceph/ceph-osd.42.log 1455
>  3 /var/log/ceph/ceph-osd.74.log 1292
>  4 /var/log/ceph/ceph-osd.77.log 1170
>  5 /var/log/ceph/ceph-osd.48.log 1083
>  6 /var/log/ceph/ceph-osd.0.log 960
>  7 /var/log/ceph/ceph-osd.40.log 960
>  8 /var/log/ceph/ceph-osd.57.log 960
>  9 /var/log/ceph/ceph-osd.61.log 960
> 10 /var/log/ceph/ceph-osd.76.log 960
>
> Some OSDs don't report slow requests at all, they are not evenly distributed.
>
> Currently we run journals on the osd sata drives, but are considering upgrading to SSD journals. However, we do not have any performance problems other than when deleting snapshots.
>
> Is there any way to mitigate the problem other than investing on SSD journals?
>
> --
> Eino Tuominen
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
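
For anyone wanting to compare before and after an upgrade, a per-OSD ranking like the one quoted above could be produced roughly as follows. This is only a sketch, not the script used in the original message: it assumes the OSD logs contain warnings of the form "slow request 30.268771 seconds old", and the function names (slowest_per_osd, top_n) are made up for illustration.

```python
import re

# Assumption: slow-request warnings contain "slow request <seconds> seconds old".
SLOW_RE = re.compile(r"slow request (\d+(?:\.\d+)?) seconds old")


def slowest_per_osd(lines_by_log):
    """Map each log name to the largest slow-request age (seconds) seen in it.

    lines_by_log: dict of log name -> iterable of log lines.
    Logs with no slow-request warnings are left out of the result.
    """
    worst = {}
    for name, lines in lines_by_log.items():
        for line in lines:
            match = SLOW_RE.search(line)
            if match is None:
                continue
            age = float(match.group(1))
            if age > worst.get(name, 0.0):
                worst[name] = age
    return worst


def top_n(worst, n=10):
    """Return the n (log name, seconds) pairs with the largest ages, slowest first."""
    return sorted(worst.items(), key=lambda item: item[1], reverse=True)[:n]
```

To run it against real files, build the input dict by reading each log under /var/log/ceph/ (for example with open(path).readlines()) and print the result of top_n().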
