On Tue, Oct 17, 2017 at 6:36 AM Yoann Moulin <yoann.mou...@epfl.ch> wrote:
> Hello,
>
> I have a luminous (12.2.1) cluster with 3 nodes for cephfs (no rbd or rgw)
> and we hit the "X clients failing to respond to cache pressure" message.
> I have 3 mds servers active.
>
> Is this something I have to worry about ?
>

This message means
* the MDS has exceeded the size of its cache, and
* the MDS has asked clients to reduce the number of files they hold
  capabilities on (so the MDS can trim them out of cache), and
* the clients are not returning capabilities

It's entirely possible this is because the clients are actually holding
references to all those files. If you haven't configured your cache size
explicitly, you can probably increase it by a lot, and perhaps put this
warning to bed.
-Greg

>
> here some information about the cluster :
>
> > root@iccluster054:~# ceph --cluster container -s
> >   cluster:
> >     id:     a294a95a-0baa-4641-81c1-7cd70fd93216
> >     health: HEALTH_WARN
> >             3 clients failing to respond to cache pressure
> >
> >   services:
> >     mon: 3 daemons, quorum iccluster041.iccluster.epfl.ch,iccluster042.iccluster.epfl.ch,iccluster054.iccluster.epfl.ch
> >     mgr: iccluster042(active), standbys: iccluster054
> >     mds: cephfs-3/3/3 up {0=iccluster054.iccluster.epfl.ch=up:active,1=iccluster041.iccluster.epfl.ch=up:active,2=iccluster042.iccluster.epfl.ch=up:active}
> >     osd: 18 osds: 18 up, 18 in
> >
> >   data:
> >     pools:   3 pools, 544 pgs
> >     objects: 2357k objects, 564 GB
> >     usage:   2011 GB used, 65055 GB / 67066 GB avail
> >     pgs:     544 active+clean
> >
> >
> > root@iccluster041:~# ceph --cluster container daemon mds.iccluster041.iccluster.epfl.ch perf dump mds
> > {
> >     "mds": {
> >         "request": 193508283,
> >         "reply": 192815355,
> >         "reply_latency": {
> >             "avgcount": 192815355,
> >             "sum": 457371.475011160,
> >             "avgtime": 0.002372069
> >         },
> >         "forward": 692928,
> >         "dir_fetch": 1717132,
> >         "dir_commit": 43521,
> >         "dir_split": 4197,
> >         "dir_merge": 4244,
> >         "inode_max": 2147483647,
> >         "inodes": 11098,
> >         "inodes_top": 7668,
> >         "inodes_bottom": 3404,
> >         "inodes_pin_tail": 26,
> >         "inodes_pinned": 143,
> >         "inodes_expired": 1386234444,
> >         "inodes_with_caps": 87,
> >         "caps": 239,
> >         "subtrees": 15,
> >         "traverse": 195425369,
> >         "traverse_hit": 192867085,
> >         "traverse_forward": 692723,
> >         "traverse_discover": 476,
> >         "traverse_dir_fetch": 1714684,
> >         "traverse_remote_ino": 0,
> >         "traverse_lock": 6,
> >         "load_cent": 19465322425,
> >         "q": 0,
> >         "exported": 1211,
> >         "exported_inodes": 845556,
> >         "imported": 1082,
> >         "imported_inodes": 1209280
> >     }
> > }
> >
> > root@iccluster041:~# ceph --cluster container daemon mds.iccluster041.iccluster.epfl.ch perf dump mds
> > {
> >     "mds": {
> >         "request": 193508283,
> >         "reply": 192815355,
> >         "reply_latency": {
> >             "avgcount": 192815355,
> >             "sum": 457371.475011160,
> >             "avgtime": 0.002372069
> >         },
> >         "forward": 692928,
> >         "dir_fetch": 1717132,
> >         "dir_commit": 43521,
> >         "dir_split": 4197,
> >         "dir_merge": 4244,
> >         "inode_max": 2147483647,
> >         "inodes": 11098,
> >         "inodes_top": 7668,
> >         "inodes_bottom": 3404,
> >         "inodes_pin_tail": 26,
> >         "inodes_pinned": 143,
> >         "inodes_expired": 1386234444,
> >         "inodes_with_caps": 87,
> >         "caps": 239,
> >         "subtrees": 15,
> >         "traverse": 195425369,
> >         "traverse_hit": 192867085,
> >         "traverse_forward": 692723,
> >         "traverse_discover": 476,
> >         "traverse_dir_fetch": 1714684,
> >         "traverse_remote_ino": 0,
> >         "traverse_lock": 6,
> >         "load_cent": 19465322425,
> >         "q": 0,
> >         "exported": 1211,
> >         "exported_inodes": 845556,
> >         "imported": 1082,
> >         "imported_inodes": 1209280
> >     }
> > }
> >
> > root@iccluster054:~# ceph --cluster container daemon mds.iccluster054.iccluster.epfl.ch perf dump mds
> > {
> >     "mds": {
> >         "request": 267620366,
> >         "reply": 255792944,
> >         "reply_latency": {
> >             "avgcount": 255792944,
> >             "sum": 42256.407340600,
> >             "avgtime": 0.000165197
> >         },
> >         "forward": 11827411,
> >         "dir_fetch": 183,
> >         "dir_commit": 2607,
> >         "dir_split": 27,
> >         "dir_merge": 19,
> >         "inode_max": 2147483647,
> >         "inodes": 3740,
> >         "inodes_top": 2517,
> >         "inodes_bottom": 1149,
> >         "inodes_pin_tail": 74,
> >         "inodes_pinned": 143,
> >         "inodes_expired": 2103018,
> >         "inodes_with_caps": 57,
> >         "caps": 272,
> >         "subtrees": 8,
> >         "traverse": 267626346,
> >         "traverse_hit": 255796915,
> >         "traverse_forward": 11826902,
> >         "traverse_discover": 77,
> >         "traverse_dir_fetch": 30,
> >         "traverse_remote_ino": 0,
> >         "traverse_lock": 0,
> >         "load_cent": 26824996745,
> >         "q": 3,
> >         "exported": 1319,
> >         "exported_inodes": 2037400,
> >         "imported": 418,
> >         "imported_inodes": 7347
> >     }
> > }
>
> --
> Yoann Moulin
> EPFL IC-IT
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
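
For comparing the cache-related counters across the three MDS daemons, a small script along these lines might help. It is only a sketch: the embedded JSON is a trimmed copy of the iccluster041 output quoted above, the function name summarize() is made up for illustration, and the "capped_fraction" figure is my own derived ratio, not something Ceph reports. It does not decide whether the cache is too small; it just puts the numbers side by side.

```python
import json

# Trimmed copy of the "perf dump mds" output quoted above for
# mds.iccluster041; only the counters used below are kept.
PERF_DUMP = """
{
  "mds": {
    "inode_max": 2147483647,
    "inodes": 11098,
    "inodes_pinned": 143,
    "inodes_with_caps": 87,
    "caps": 239
  }
}
"""


def summarize(perf_dump_text):
    """Return a few cache-related figures from 'perf dump mds' JSON text.

    summarize() is a hypothetical helper, not part of any Ceph tooling.
    """
    mds = json.loads(perf_dump_text)["mds"]
    return {
        "inodes": mds["inodes"],
        "inode_max": mds["inode_max"],
        "inodes_pinned": mds["inodes_pinned"],
        "inodes_with_caps": mds["inodes_with_caps"],
        "caps": mds["caps"],
        # Derived value (my own, not a Ceph counter): share of cached
        # inodes that at least one client holds capabilities on.
        "capped_fraction": mds["inodes_with_caps"] / mds["inodes"],
    }


if __name__ == "__main__":
    for name, value in summarize(PERF_DUMP).items():
        print(f"{name}: {value}")
```

To use it on live data, the output of the same "ceph --cluster container daemon mds.<name> perf dump mds" command shown above can be passed to summarize() in place of the embedded text.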