It may be that having multiple MDSes is masking the issue, or that a 55GB inode cache truly wasn't large enough. Things are behaving for me now, even when daemonperf shows the same 0 entries in req and rlat.
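For clarity, this is roughly how the cache limit gets changed here, against the admin socket on the active MDS (the daemon name and the 55GB byte value are just examples; I'm showing the byte-based mds_cache_memory_limit, though the older inode-count mds_cache_size can be set the same way):

    # check the current limit and cache usage
    ceph daemon mds.$(hostname -s) config get mds_cache_memory_limit
    ceph daemon mds.$(hostname -s) cache status

    # raise (or lower) the limit at runtime; 55GB shown
    ceph daemon mds.$(hostname -s) config set mds_cache_memory_limit 59055800320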
If this happens again, I'll attempt to get perf trace logs, along with ops, ops_in_flight, perf dump and objecter requests (roughly the admin-socket commands sketched at the end of this mail). Thanks for your time.

--
Adam

On Mon, Oct 1, 2018 at 10:36 PM Adam Tygart <mo...@ksu.edu> wrote:
>
> Okay, here's what I've got: https://www.paste.ie/view/abe8c712
>
> Of note, I've changed things up a little bit for the moment. I've
> activated a second mds to see if it is a particular subtree that is
> more prone to issues. Maybe EC vs. replica... The one that is
> currently being slow has my EC volume pinned to it.
>
> --
> Adam
>
> On Mon, Oct 1, 2018 at 10:02 PM Gregory Farnum <gfar...@redhat.com> wrote:
> >
> > Can you grab the perf dump during this time, perhaps plus dumps of
> > the ops in progress?
> >
> > This is weird, but given it's somewhat periodic it might be
> > something like the MDS needing to catch up on log trimming (though
> > I'm unclear why changing the cache size would impact this).
> >
> > On Sun, Sep 30, 2018 at 9:02 PM Adam Tygart <mo...@ksu.edu> wrote:
> >>
> >> Hello all,
> >>
> >> I've got a Ceph (12.2.8) cluster with 27 servers, 500 OSDs, and
> >> 1000 CephFS mounts (kernel client). We're currently only using 1
> >> active MDS.
> >>
> >> Performance is great about 80% of the time. MDS responses (per
> >> "ceph daemonperf mds.$(hostname -s)") indicate 2k-9k requests per
> >> second, with a latency under 100.
> >>
> >> It is the other 20-ish percent I'm worried about. I'll check on it,
> >> and it will be going 5-15 seconds with "0" requests and "0"
> >> latency, then give me 2 seconds of reasonable response times, and
> >> then go back to nothing. Clients are actually seeing blocked
> >> requests for this period of time.
> >>
> >> The strange bit is that when I *reduce* the mds_cache_size,
> >> requests and latencies go back to normal for a while. When it
> >> happens again, I'll increase it back to where it was. It feels like
> >> the MDS decides that some of these inodes can't be dropped from the
> >> cache unless the cache size changes. Maybe something wrong with the
> >> LRU?
> >>
> >> I feel like I've got a reasonable cache size for my workload: 30GB
> >> on the small end, 55GB on the large. There's no real reason for a
> >> swing this large, except to potentially delay the recurrence for
> >> longer after expanding it.
> >>
> >> I also feel like there is probably some magic tunable to change how
> >> inodes get stuck in the LRU, perhaps mds_cache_mid. Anyone know
> >> what this tunable actually does? The documentation is a little
> >> sparse.
> >>
> >> I can grab logs from the MDS if needed; just let me know the
> >> settings you'd like to see.
> >>
> >> --
> >> Adam
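For completeness, here is roughly the collection I have in mind for the next time it wedges, all against the admin socket on the active MDS (the daemon name matches the daemonperf invocation above; the output file names are just examples):

    ceph daemon mds.$(hostname -s) ops                > ops.json
    ceph daemon mds.$(hostname -s) dump_ops_in_flight > ops_in_flight.json
    ceph daemon mds.$(hostname -s) perf dump          > perf_dump.json
    ceph daemon mds.$(hostname -s) objecter_requests  > objecter_requests.json

    # plus the live view that shows the req/rlat stalls
    ceph daemonperf mds.$(hostname -s)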