Making mds cache size 5 million seems to have helped significantly, but we’re still seeing issues occasionally on metadata reads while under load. Settings over 5 million don’t seem to have any noticeable impact on this problem. I’m starting the upgrade to Giant today.

--
Kevin Sumner
ke...@sumner.io
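P.S. In case it helps anyone following along, here is roughly how I’m applying
the setting. This is a sketch only: the runtime injection is just Sage’s
earlier command with a larger number, and the ceph.conf stanza makes the value
survive MDS restarts (assuming your config uses the same "mds cache size"
spelling that we do).

    # bump the cache on the running MDS (rank 0; we only have one MDS)
    ceph mds tell 0 injectargs '--mds-cache-size 5000000'

    # persist the same value in ceph.conf on the MDS host
    [mds]
        mds cache size = 5000000

The back-of-the-envelope reasoning for 5 million: we touch roughly 2 million
files a minute, so the cache needs room for at least that many inodes plus
their parent directories before it can stop thrashing.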
> On Nov 18, 2014, at 1:10 PM, Kevin Sumner <ke...@sumner.io> wrote:
>
> Hi Thomas,
>
> I looked over the mds config reference a bit yesterday, but mds cache size
> seems to be the most relevant tunable.
>
> As suggested, I upped mds-cache-size to 1 million yesterday and started the
> load generator. During load generation, we’re seeing similar behavior on the
> filesystem and the mds. The mds process is running a little hotter now, with
> a higher CPU average and 11GB resident size (it was just under 10GB, iirc).
> Enumerating files on the filesystem, e.g. with ls, is still hanging, though.
>
> With load generation disabled, the behavior is the same as before, i.e.,
> things work as expected.
>
> I’ve got a lot of memory and CPU headroom on the box hosting the mds, so
> unless there’s a good reason not to, I plan to continue increasing the mds
> cache iteratively in the hope of finding a size that produces good behavior.
> Right now, I’d expect us to hit around 2 million inodes each minute, so a
> cache of 1 million is still undersized. If that doesn’t work: we’re currently
> running Firefly on the cluster, and I’ll be upgrading it to Giant.
> --
> Kevin Sumner
> ke...@sumner.io
>
>
>> On Nov 18, 2014, at 1:36 AM, Thomas Lemarchand
>> <thomas.lemarch...@cloud-solutions.fr> wrote:
>>
>> Hi Kevin,
>>
>> Every MDS tunable (I think) is listed on this page with a short
>> description: http://ceph.com/docs/master/cephfs/mds-config-ref/
>>
>> Can you tell us how your cluster behaves after the mds-cache-size
>> change? What is your MDS RAM consumption, before and after?
>>
>> Thanks!
>> --
>> Thomas Lemarchand
>> Cloud Solutions SAS - Information Systems Manager
>>
>>
>>
>> On lun., 2014-11-17 at 16:06 -0800, Kevin Sumner wrote:
>>>> On Nov 17, 2014, at 15:52, Sage Weil <s...@newdream.net> wrote:
>>>>
>>>> On Mon, 17 Nov 2014, Kevin Sumner wrote:
>>>>> I’ve got a test cluster together with ~500 OSDs, 5 MONs, and 1 MDS.
>>>>> All the OSDs also mount CephFS at /ceph. I’ve got Graphite pointing
>>>>> at a space under /ceph. Over the weekend, I drove almost 2 million
>>>>> metrics, each of which creates a ~3MB file in a hierarchical path,
>>>>> each sending a datapoint into the metric file once a minute. CephFS
>>>>> seemed to handle the writes ok while I was driving load. All files
>>>>> containing each metric are at paths like this:
>>>>> /ceph/whisper/sandbox/cephtest-osd0013/2/3/4/5.wsp
>>>>>
>>>>> Today, however, with the load generator still running, reading
>>>>> metadata of files (e.g. directory entries and stat(2) info) in the
>>>>> filesystem (presumably MDS-managed data) seems nearly impossible,
>>>>> especially deeper into the tree. For example, in a shell, cd seems
>>>>> to work but ls hangs, seemingly indefinitely. After turning off the
>>>>> load generator and allowing a while for things to settle down,
>>>>> everything seems to behave better.
>>>>>
>>>>> ceph status and ceph health both return good statuses the entire
>>>>> time. During load generation, the ceph-mds process seems pegged at
>>>>> between 100% and 150% CPU, but with load generation turned off, the
>>>>> process shows high variability, from near-idle up to a similar
>>>>> 100-150% CPU.
>>>>>
>>>>> Hopefully, I’ve missed something in the CephFS tuning. However, I’m
>>>>> looking for direction on figuring out whether it is, indeed, a
>>>>> tuning problem or whether this behavior is a symptom of the “not
>>>>> ready for production” banner in the documentation.
>>>>
>>>> My first guess is that the MDS cache is just too small and it is
>>>> thrashing. Try
>>>>
>>>> ceph mds tell 0 injectargs '--mds-cache-size 1000000'
>>>>
>>>> That's 10x bigger than the default, though be aware that it will eat
>>>> up 10x as much RAM too.
>>>>
>>>> We've also seen the cache behave in a non-optimal way when evicting
>>>> things, making it thrash more often than it should. I'm hoping we can
>>>> implement something like MQ instead of our two-level LRU, but it
>>>> isn't high on the priority list right now.
>>>>
>>>> sage
>>>
>>>
>>> Thanks! I’ll pursue mds cache size tuning. Is there any guidance on
>>> setting the cache and other mds tunables correctly, or is it an
>>> adjust-and-test sort of thing? Cursory searching doesn’t return any
>>> relevant documentation on ceph.com. I’m plowing through some other
>>> list posts now.
>>> --
>>> Kevin Sumner
>>> ke...@sumner.io
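For anyone else chasing the same symptoms: a rough way to check whether the
cache is actually filling up to the configured limit (and whether the injected
value took effect) is to poke the MDS admin socket on the MDS host. A sketch,
not gospel: the daemon name mds.cephtest-mds01 below is made up, the socket
path may differ on your install, and the exact perf counter names can vary
between releases.

    # confirm the injected value took effect
    ceph daemon mds.cephtest-mds01 config get mds_cache_size

    # dump the perf counters and look at the "mds" section; the inode
    # counters there show roughly how full the cache is
    ceph daemon mds.cephtest-mds01 perf dump

    # equivalent, going through the admin socket path directly
    ceph --admin-daemon /var/run/ceph/ceph-mds.cephtest-mds01.asok perf dump

If the inode count sits pinned at the configured limit while ls hangs, that
lines up with Sage's thrashing theory; if it stays well below the limit, the
bottleneck is probably somewhere else.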
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com