On Tue, Oct 17, 2017 at 6:36 AM Yoann Moulin <yoann.mou...@epfl.ch> wrote:

> Hello,
>
> I have a luminous (12.2.1) cluster with 3 nodes for cephfs (no rbd or rgw)
> and we hit the "X clients failing to respond to cache pressure" message.
> I have 3 mds servers active.
>
> Is this something I have to worry about ?
>

This message means
* the MDS has exceeded the size of its cache, and
* the MDS has asked clients to reduce the number of files they hold
capabilities on (so the MDS can trim them out of cache), and
* the clients are not returning capabilities

It's entirely possible this is because the clients are actually holding
references to all those files. If you haven't configured your cache size
explicitly, you can probably increase it by a lot, and perhaps put this
warning to bed.
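
To see how many inodes and capabilities an MDS is tracking, you can
summarize the counters from "perf dump mds". The following is a minimal
sketch, not an official tool: the function name summarize_mds_perf is
hypothetical, the sample values are copied from the iccluster041 output
quoted below, and the hit ratio is only an illustrative figure, not a
health threshold Ceph itself uses.

```python
import json


def summarize_mds_perf(perf_json: str) -> dict:
    """Pick out the cache-related counters from `perf dump mds` output.

    Hypothetical helper for illustration; it only reads keys that
    appear in the perf dump quoted in this thread.
    """
    mds = json.loads(perf_json)["mds"]
    traverse = mds["traverse"]
    return {
        "inodes": mds["inodes"],                      # inodes currently in cache
        "inodes_with_caps": mds["inodes_with_caps"],  # inodes clients hold caps on
        "caps": mds["caps"],                          # capabilities issued to clients
        # Fraction of path traversals counted as hits (None if no traversals yet).
        "traverse_hit_ratio": mds["traverse_hit"] / traverse if traverse else None,
    }


# Subset of the counters reported for mds.iccluster041 in the quoted message.
sample = json.dumps({
    "mds": {
        "inodes": 11098,
        "inodes_with_caps": 87,
        "caps": 239,
        "traverse": 195425369,
        "traverse_hit": 192867085,
    }
})

summary = summarize_mds_perf(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In practice you would feed it the JSON printed by the "ceph daemon
mds.<name> perf dump mds" command shown in the quoted message, once per
active MDS, and compare the numbers before and after changing the cache
size.
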
-Greg


>
> Here is some information about the cluster:
>
> > root@iccluster054:~# ceph --cluster container -s
> >   cluster:
> >     id:     a294a95a-0baa-4641-81c1-7cd70fd93216
> >     health: HEALTH_WARN
> >             3 clients failing to respond to cache pressure
> >
> >   services:
> >     mon: 3 daemons, quorum iccluster041.iccluster.epfl.ch,iccluster042.iccluster.epfl.ch,iccluster054.iccluster.epfl.ch
> >     mgr: iccluster042(active), standbys: iccluster054
> >     mds: cephfs-3/3/3 up  {0=iccluster054.iccluster.epfl.ch=up:active,1=iccluster041.iccluster.epfl.ch=up:active,2=iccluster042.iccluster.epfl.ch=up:active}
> >     osd: 18 osds: 18 up, 18 in
> >
> >   data:
> >     pools:   3 pools, 544 pgs
> >     objects: 2357k objects, 564 GB
> >     usage:   2011 GB used, 65055 GB / 67066 GB avail
> >     pgs:     544 active+clean
> >
>
>
>
> > root@iccluster041:~# ceph --cluster container daemon mds.iccluster041.iccluster.epfl.ch perf dump mds
> > {
> >     "mds": {
> >         "request": 193508283,
> >         "reply": 192815355,
> >         "reply_latency": {
> >             "avgcount": 192815355,
> >             "sum": 457371.475011160,
> >             "avgtime": 0.002372069
> >         },
> >         "forward": 692928,
> >         "dir_fetch": 1717132,
> >         "dir_commit": 43521,
> >         "dir_split": 4197,
> >         "dir_merge": 4244,
> >         "inode_max": 2147483647,
> >         "inodes": 11098,
> >         "inodes_top": 7668,
> >         "inodes_bottom": 3404,
> >         "inodes_pin_tail": 26,
> >         "inodes_pinned": 143,
> >         "inodes_expired": 1386234444,
> >         "inodes_with_caps": 87,
> >         "caps": 239,
> >         "subtrees": 15,
> >         "traverse": 195425369,
> >         "traverse_hit": 192867085,
> >         "traverse_forward": 692723,
> >         "traverse_discover": 476,
> >         "traverse_dir_fetch": 1714684,
> >         "traverse_remote_ino": 0,
> >         "traverse_lock": 6,
> >         "load_cent": 19465322425,
> >         "q": 0,
> >         "exported": 1211,
> >         "exported_inodes": 845556,
> >         "imported": 1082,
> >         "imported_inodes": 1209280
> >     }
> > }
>
>
> > root@iccluster054:~# ceph --cluster container daemon mds.iccluster054.iccluster.epfl.ch perf dump mds
> > {
> >     "mds": {
> >         "request": 267620366,
> >         "reply": 255792944,
> >         "reply_latency": {
> >             "avgcount": 255792944,
> >             "sum": 42256.407340600,
> >             "avgtime": 0.000165197
> >         },
> >         "forward": 11827411,
> >         "dir_fetch": 183,
> >         "dir_commit": 2607,
> >         "dir_split": 27,
> >         "dir_merge": 19,
> >         "inode_max": 2147483647,
> >         "inodes": 3740,
> >         "inodes_top": 2517,
> >         "inodes_bottom": 1149,
> >         "inodes_pin_tail": 74,
> >         "inodes_pinned": 143,
> >         "inodes_expired": 2103018,
> >         "inodes_with_caps": 57,
> >         "caps": 272,
> >         "subtrees": 8,
> >         "traverse": 267626346,
> >         "traverse_hit": 255796915,
> >         "traverse_forward": 11826902,
> >         "traverse_discover": 77,
> >         "traverse_dir_fetch": 30,
> >         "traverse_remote_ino": 0,
> >         "traverse_lock": 0,
> >         "load_cent": 26824996745,
> >         "q": 3,
> >         "exported": 1319,
> >         "exported_inodes": 2037400,
> >         "imported": 418,
> >         "imported_inodes": 7347
> >     }
> > }
>
> --
> Yoann Moulin
> EPFL IC-IT
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>