To the best of my knowledge, nobody has used hard links within the fs. So
I unmounted all the clients to see what would happen:
[root@005-s-ragnarok ragnarok]# ceph daemon mds.fast-test session ls
[]

-----mds------ --mds_server-- ---objecter--- -----mds_cache----- ---mds_log----
rlat inos caps|hsr hcs hcr |writ read actv|recd recy stry purg|segs evts subm|
 0  99k   0 |  0   0   0 |  0   0   0 |  0   0  27k   0 | 31  27k   0
 0  99k   0 |  0   0   0 |  0   0   0 |  0   0  27k   0 | 31  27k   0
 0  99k   0 |  0   0   0 |  0   0   0 |  0   0  27k   0 | 31  27k   0
 0  99k   0 |  0   0   0 |  0   0   0 |  0   0  27k   0 | 31  27k   0
 0  99k   0 |  0   0   0 |  0   0   0 |  0   0  27k   0 | 31  27k   0
(the same line repeated unchanged for the rest of the window, ~25 samples)

The number of objects in stry stays the same over time. Then I mounted one
of the clients and started deleting again.

ceph tell mds.fast-test injectargs --mds-max-purge-files 64 (the default):

2016-10-04 13:58:13.754666 7f39e0010700  0 client.1522041 ms_handle_reset on XXX.XXX.XXX.XXX:6800/5261
2016-10-04 13:58:13.773739 7f39e0010700  0 client.1522042 ms_handle_reset on XXX.XXX.XXX.XXX:6800/5261
mds_max_purge_files = '64' (unchangeable)

-----mds------ --mds_server-- ---objecter--- -----mds_cache----- ---mds_log----
rlat inos caps|hsr hcs hcr |writ read actv|recd recy stry purg|segs evts subm|
 0 100k 40k| 0 0 1.1k| 50 0  68 | 0 0 40k  46 | 35 21k 635
 0 100k 40k| 0 0 1.1k| 32 0  68 | 0 0 40k  31 | 35 22k 625
 0 101k 39k| 0 0 935 | 46 0  69 | 0 0 41k  43 | 31 22k 516
 0 101k 39k| 0 0 833 | 80 0  64 | 0 0 41k  75 | 32 23k 495
 0 101k 39k| 0 0 1.1k| 73 0  64 | 0 0 42k  73 | 33 24k 649
 0 100k 39k| 0 0 1.1k| 84 0  68 | 0 0 42k  79 | 31 22k 651
 0 100k 39k| 0 0 1.1k|100 0  67 | 0 0 42k 100 | 31 22k 695
 0 101k 33k| 0 0 1.1k| 38 0  69 | 0 0 43k  36 | 33 23k 607
 0 101k 33k| 0 0 1.1k| 72 0  68 | 0 0 44k  72 | 33 24k 668
 0 102k 33k| 0 0 1.2k| 64 0  68 | 0 0 44k  64 | 34 24k 666
 0 100k 33k| 0 0 1.0k|418 0 360 | 0 0 45k  33 | 35 25k 573
 0 100k 33k| 0 0 1.2k| 19 0 310 | 0 0 45k  19 | 36 25k 624
 0 101k 33k| 0 0 1.2k| 33 0 236 | 0 0 46k  31 | 37 26k 633
 0 102k 33k| 0 0 1.1k| 54 0 176 | 0 0 46k  54 | 37 27k 618
 0 102k 33k| 0 0 1.1k| 65 0 133 | 0 0 47k  63 | 39 27k 639
 0 100k 33k| 0 0 804 | 87 0  93 | 0 0 47k  79 | 39 28k 485
 0 100k 33k| 0 0 1.2k| 62 0  85 | 0 0 48k  62 | 40 28k 670
 0 101k 28k| 0 1 1.0k|109 0  65 | 0 0 48k 103 | 41 29k 617
 0 101k 28k| 0 0 1.1k| 92 0  65 | 0 0 49k  92 | 42 30k 690
 0 102k 28k| 0 0 1.1k| 80 0  65 | 0 0 49k  78 | 43 30k 672
 0 100k 28k| 0 0 1.0k|234 0 261 | 0 0 50k  35 | 34 24k 582
 0 100k 28k| 0 0 1.1k| 71 0 258 | 0 0 50k  71 | 35 25k 667
 0 101k 26k| 0 0 1.2k| 97 0 259 | 0 0 51k  95 | 36 26k 706
 0 102k 26k| 0 0 1.0k| 53 0 258 | 0 0 51k  53 | 37 26k 569

ceph tell mds.fast-test injectargs --mds-max-purge-files 1000:

2016-10-04 14:03:20.449961 7fd9e1012700  0 client.1522044 ms_handle_reset on XXX.XXX.XXX.XXX:6800/5261
2016-10-04 14:03:20.469952 7fd9e1012700  0 client.1522045 ms_handle_reset on XXX.XXX.XXX.XXX:6800/5261
mds_max_purge_files = '1000' (unchangeable)

-----mds------ --mds_server-- ---objecter--- -----mds_cache----- ---mds_log----
rlat inos caps|hsr hcs hcr |writ read actv|recd recy stry purg|segs evts subm|
 0  99k 1.0k| 0 0 0 |111 0 260 | 0 0 68k 110 | 39 29k 111
 0  99k 1.0k| 0 0 0 |198 0 260 | 0 0 68k 198 | 39 29k 198
 0  52k 1.0k| 0 0 0 |109 0 264 | 0 0 68k 102 | 39 23k 106
 0  52k 1.0k| 0 0 0 |130 0 265 | 0 0 68k 125 | 39 23k 125
 0  52k 1.0k| 0 1 0 |127 0 265 | 0 0 67k 127 | 39 23k 127
 0  52k 1.0k| 0 0 0 | 84 0 264 | 0 0 67k  84 | 39 24k  84
 0  52k 1.0k| 0 0 0 | 80 0 263 | 0 0 67k  80 | 39 24k  80
 0  52k 1.0k| 0 0 0 | 89 0 260 | 0 0 67k  87 | 32 24k  89
 0  52k 1.0k| 0 0 0 |134 0 259 | 0 0 67k 134 | 32 24k 134
 0  52k 1.0k| 0 0 0 |155 0 259 | 0 0 67k 152 | 33 24k 154
 0  52k 1.0k| 0 0 0 | 99 0 257 | 0 0 67k  99 | 33 24k  99
 0  52k 1.0k| 0 0 0 | 84 0 257 | 0 0 67k  84 | 33 24k  84
 0  52k 1.0k| 0 0 0 |117 0 257 | 0 0 67k 115 | 33 24k 115
 0  52k 1.0k| 0 0 0 |122 0 257 | 0 0 66k 122 | 33 24k 122
 0  52k 1.0k| 0 0 0 | 73 0 257 | 0 0 66k  73 | 33 24k  73
 0  52k 1.0k| 0 0 0 |123 0 257 | 0 0 66k 123 | 33 25k 123
 0  52k 1.0k| 0 0 0 | 87 0 257 | 0 0 66k  87 | 33 25k  87
 0  52k 1.0k| 0 0 0 | 85 0 257 | 0 0 66k  83 | 33 25k  83
 0  52k 1.0k| 0 0 0 | 55 0 257 | 0 0 66k  55 | 33 25k  55
 0  52k 1.0k| 0 0 0 | 34 0 257 | 0 0 66k  34 | 33 25k  34
 0  52k 1.0k| 0 0 0 | 58 0 257 | 0 0 66k  58 | 33 25k  58
 0  52k 1.0k| 0 0 0 | 35 0 257 | 0 0 66k  35 | 33 25k  35
 0  52k 1.0k| 0 0 0 | 65 0 259 | 0 0 66k  63 | 31 22k  64
 0  52k 1.0k| 0 0 0 | 52 0 258 | 0 0 66k  52 | 31 23k  52

Seems like the purge rate is virtually insensitive to mds_max_purge_files.
BTW, the rm completed well before stry approached the ground state.
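
Since mds_max_purge_files alone does not seem to move the needle, the next
thing I will try is bumping the ops-based throttles as well, roughly like
this (mds_max_purge_ops is the other throttle John mentions below;
mds_max_purge_ops_per_pg is its per-PG companion -- defaults 8192 and 0.5
respectively, if I read the jewel source right):

    ceph tell mds.fast-test injectargs '--mds_max_purge_ops 16384'
    ceph tell mds.fast-test injectargs '--mds_max_purge_ops_per_pg 1.0'

Note the "(unchangeable)" in the injectargs output above: I read it as the
MDS not picking up the new value at runtime, so these may have to go into
ceph.conf on the MDS node, followed by an MDS restart, to actually take
effect.
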
-Mykola

On 4 October 2016 at 09:16, John Spray <jsp...@redhat.com> wrote:
> (Re-adding list)
>
> The 7.5k stray dentries while idle is probably indicating that clients
> are holding onto references to them (unless you unmount the clients
> and they don't purge, in which case you may well have found a bug).
> The other way you can end up with lots of dentries sitting in stray
> dirs is if you had lots of hard links and unlinked the original
> location but left the hard link in place.
>
> The rate at which your files are purging seems to roughly correspond
> to mds_max_purge_files, so I'd definitely try changing that to get
> things purging faster.
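>
> If you want to cross-check the backlog independently of the MDS perf
> counters, you can count the stray dentries directly in the metadata
> pool. A rough sketch (untested; assumes rank 0, whose ten stray dirs
> are inodes 0x600-0x609, assumes they are unfragmented, and assumes
> your metadata pool is named 'metadata' -- substitute yours):
>
>     for i in 0 1 2 3 4 5 6 7 8 9; do
>         printf '60%s.00000000: ' "$i"
>         rados -p metadata listomapkeys "60$i.00000000" | wc -l
>     done
>
> Each omap key on those directory objects is one stray dentry.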
>
> John
>
> On Mon, Oct 3, 2016 at 3:21 PM, Mykola Dvornik <mykola.dvor...@gmail.com> wrote:
> > Hi John,
> >
> > This is how the daemonperf output looks:
> >
> > background
> >
> > -----mds------ --mds_server-- ---objecter--- -----mds_cache----- ---mds_log----
> > rlat inos caps|hsr hcs hcr |writ read actv|recd recy stry purg|segs evts subm|
> >  0  99k 177k| 0 0 0 | 0 0 0 | 0 0 7.5k 0 | 31 22k 0
> >  0  99k 177k| 0 0 0 | 0 0 0 | 0 0 7.5k 0 | 31 22k 0
> >  0  99k 177k| 0 0 0 | 0 0 0 | 0 0 7.5k 0 | 31 22k 0
> >  0  99k 177k| 0 5 0 | 0 0 0 | 0 0 7.5k 0 | 31 22k 1
> >  0  99k 177k| 0 0 0 | 0 0 0 | 0 0 7.5k 0 | 31 22k 0
> >  0  99k 177k| 0 0 0 | 2 0 0 | 0 0 7.5k 0 | 31 22k 0
> >  0  99k 177k| 0 2 0 | 0 0 0 | 0 0 7.5k 0 | 31 22k 0
> >  0  99k 177k| 0 2 0 | 0 0 0 | 0 0 7.5k 0 | 31 22k 0
> >  0  99k 177k| 0 1 0 | 0 0 0 | 0 0 7.5k 0 | 31 22k 0
> >  0  99k 177k| 0 2 0 | 0 0 0 | 0 0 7.5k 0 | 31 22k 0
> >  0  99k 177k| 0 0 0 | 0 0 0 | 0 0 7.5k 0 | 31 22k 0
> >  0  99k 177k| 0 1 0 | 0 0 0 | 0 0 7.5k 0 | 31 22k 0
> >  0  99k 177k| 0 6 0 | 0 0 0 | 0 0 7.5k 0 | 31 22k 0
> >
> > with 4 rm instances
> >
> > -----mds------ --mds_server-- ---objecter--- -----mds_cache----- ---mds_log----
> > rlat inos caps|hsr hcs hcr |writ read actv|recd recy stry purg|segs evts subm|
> >  0 172k 174k| 0 5 3.1k| 85 0 34 | 0 0  79k  83 | 45 31k 1.6k
> >  0 174k 174k| 0 0 3.0k| 76 0 35 | 0 0  80k  72 | 48 32k 1.6k
> >  0 175k 174k| 0 0 2.7k| 81 0 37 | 0 0  81k  69 | 42 28k 1.4k
> >  3 175k 174k| 0 2 468 | 41 0 17 | 0 0  82k  35 | 42 28k 276
> >  0 177k 174k| 0 2 2.2k|134 0 41 | 0 0  83k 118 | 44 29k 1.2k
> >  0 178k 174k| 0 1 2.7k|123 0 33 | 0 0  84k 121 | 46 31k 1.5k
> >  0 179k 162k| 0 2 2.6k|133 0 32 | 0 0  85k 131 | 48 32k 1.4k
> >  0 181k 162k| 0 0 2.3k|113 0 36 | 0 0  86k 102 | 40 27k 1.2k
> >  0 182k 162k| 0 1 2.7k| 83 0 36 | 0 0  87k  81 | 42 28k 1.4k
> >  0 183k 162k| 0 6 2.6k| 22 0 35 | 0 0  89k  22 | 43 30k 1.3k
> >  0 184k 162k| 0 1 2.5k|  9 0 35 | 0 0  90k   7 | 45 31k 1.2k
> >  0 186k 155k| 0 3 2.5k|  2 0 36 | 0 0  91k   0 | 47 32k 1.2k
> >  0 187k 155k| 0 3 1.9k| 18 0 49 | 0 0  92k   0 | 48 32k 970
> >  0 188k 155k| 0 2 2.5k| 46 0 30 | 0 0  93k  32 | 48 33k 1.3k
> >  0 189k 155k| 0 0 2.4k| 55 0 36 | 0 0  95k  50 | 50 34k 1.2k
> >  0 190k 155k| 0 0 2.7k|  2 0 36 | 0 0  96k   0 | 52 36k 1.3k
> >  0 192k 150k| 0 1 3.0k| 30 0 37 | 0 0  97k  28 | 54 37k 1.5k
> >  0 183k 150k| 0 0 2.7k| 58 0 40 | 0 0  99k  50 | 56 39k 1.4k
> >  0 184k 150k| 0 0 3.2k| 12 0 41 | 0 0 100k  10 | 59 40k 1.6k
> >  0 185k 150k| 0 0 2.1k|  3 0 41 | 0 0 102k   0 | 60 41k 1.0k
> >  0 186k 150k| 0 5 1.6k| 12 0 41 | 0 0 102k  10 | 62 42k 837
> >  0 186k 148k| 0 0 1.0k| 62 0 32 | 0 0 103k  57 | 62 43k 575
> >  0 170k 148k| 0 0 858 | 31 0 25 | 0 0 103k  27 | 40 27k 458
> >  5 165k 148k| 0 2 865 | 77 2 28 | 0 0 104k  45 | 41 28k 495
> >
> > with all the rm instances killed
> >
> > -----mds------ --mds_server-- ---objecter--- -----mds_cache----- ---mds_log----
> > rlat inos caps|hsr hcs hcr |writ read actv|recd recy stry purg|segs evts subm|
> >  0 194k 147k| 0 0 0 | 64 0 32 | 0 0 117k  63 | 31 22k  63
> >  0 194k 147k| 0 0 0 | 58 0 32 | 0 0 117k  58 | 31 22k  58
> >  0 194k 147k| 0 0 0 | 49 0 32 | 0 0 117k  49 | 31 22k  50
> >  0 194k 147k| 0 5 0 | 65 0 32 | 0 0 117k  65 | 31 22k  65
> >  0 194k 147k| 0 0 0 | 42 0 32 | 0 0 117k  40 | 31 22k  40
> >  0 194k 147k| 0 0 0 |  7 0 32 | 0 0 117k   7 | 31 22k   7
> >  0 194k 147k| 0 2 0 | 23 0 32 | 0 0 117k  23 | 31 22k  23
> >  0 194k 147k| 0 3 0 | 61 0 32 | 0 0 116k  61 | 31 23k  62
> >  0 194k 147k| 0 0 0 | 59 0 32 | 0 0 116k  59 | 31 23k  59
> >  0 194k 147k| 0 2 0 |107 0 32 | 0 0 116k 103 | 31 22k 103
> >  0 194k 147k| 0 1 0 |126 0 32 | 0 0 116k 125 | 31 22k 125
> >  0 194k 147k| 0 6 0 | 74 0 32 | 0 0 116k  74 | 31 22k  74
> >  0 194k 147k| 0 1 0 | 37 0 32 | 0 0 116k  37 | 31 23k  37
> >  0 194k 147k| 0 2 0 | 96 0 32 | 0 0 116k  96 | 31 23k  96
> >  0 194k 147k| 0 2 0 |111 0 33 | 0 0 116k 110 | 31 23k 110
> >  0 194k 147k| 0 3 0 |105 0 33 | 0 0 116k 105 | 31 23k 105
> >  0 194k 147k| 0 1 0 | 79 0 33 | 0 0 116k  79 | 31 23k  79
> >  0 194k 147k| 0 0 0 | 67 0 33 | 0 0 116k  67 | 31 23k  68
> >  0 194k 147k| 0 0 0 | 75 0 33 | 0 0 116k  75 | 31 23k  75
> >  0 194k 147k| 0 1 0 | 54 0 35 | 0 0 116k  51 | 31 23k  51
> >  0 194k 147k| 0 0 0 | 40 0 35 | 0 0 115k  40 | 31 23k  40
> >  0 194k 147k| 0 0 0 | 32 0 35 | 0 0 115k  32 | 31 23k  32
> >  0 194k 147k| 0 5 0 | 43 0 35 | 0 0 115k  43 | 31 23k  43
> >  0 194k 147k| 0 0 0 |  7 0 35 | 0 0 115k   7 | 31 23k   7
> >
> > So I guess the purge ops are extremely slow.
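> >
> > (Rough arithmetic: with ~117k entries in stry and purg ticking along
> > at ~60 files/s, draining the existing backlog alone would take over
> > half an hour, even with no new unlinks coming in.)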
> >
> > The first question: is it OK to have 7.5k objects in stry when the
> > cluster has been idle for a while?
> >
> > The second question: who is to blame for the slow purges, i.e. the
> > MDS or the OSDs?
> >
> > Regards,
> >
> > -Mykola
> >
> > On 2 October 2016 at 23:48, Mykola Dvornik <mykola.dvor...@gmail.com> wrote:
> >> Hi John,
> >>
> >> Many thanks for your reply. I will try to play with the mds tunables
> >> and report back to you ASAP.
> >>
> >> So far I see that the mds log contains a lot of errors of the
> >> following kind:
> >>
> >> 2016-10-02 11:58:03.002769 7f8372d54700  0 mds.0.cache.dir(100056ddecd) _fetched badness: got (but i already had) [inode 10005729a77 [2,head] ~mds0/stray1/10005729a77 auth v67464942 s=196728 nl=0 n(v0 b196728 1=1+0) (iversion lock) 0x7f84acae82a0] mode 33204 mtime 2016-08-07 23:06:29.776298
> >>
> >> 2016-10-02 11:58:03.002789 7f8372d54700 -1 log_channel(cluster) log [ERR] : loaded dup inode 10005729a77 [2,head] v68621 at /users/mykola/mms/NCSHNO/final/120nm-uniform-h8200/j002654.out/m_xrange192-320_yrange192-320_016232.dump, but inode 10005729a77.head v67464942 already exists at ~mds0/stray1/10005729a77
> >>
> >> The folders within mds.0.cache.dir that got the badness report a size
> >> of 16EB on the clients, and rm on them fails with 'Directory not
> >> empty'. (One way to inspect them from a client is sketched below.)
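> >>
> >> For instance, via the CephFS recursive-stats vxattrs (<mnt> stands
> >> for the client mountpoint; as far as I understand, the directory
> >> size that ls shows is backed by ceph.dir.rbytes, so 16EB presumably
> >> means the recursive stats underflowed):
> >>
> >>     getfattr -n ceph.dir.rentries <mnt>/users/mykola/mms/NCSHNO/final/120nm-uniform-h8200/j002654.out
> >>     getfattr -n ceph.dir.rbytes   <mnt>/users/mykola/mms/NCSHNO/final/120nm-uniform-h8200/j002654.out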
> >>
> >> As for the "Client failing to respond to cache pressure", I have 2
> >> kernel clients on 4.4.21, 1 on 4.7.5 and 16 fuse clients always
> >> running the most recent release version of ceph-fuse. The funny
> >> thing is that every single client misbehaves from time to time. I am
> >> aware of quite some discussion about this issue on the ML, but
> >> cannot really follow how to debug it.
> >>
> >> Regards,
> >>
> >> -Mykola
> >>
> >> On 2 October 2016 at 22:27, John Spray <jsp...@redhat.com> wrote:
> >>> On Sun, Oct 2, 2016 at 11:09 AM, Mykola Dvornik
> >>> <mykola.dvor...@gmail.com> wrote:
> >>> > After upgrading to 10.2.3 we frequently see messages like
> >>>
> >>> From which version did you upgrade?
> >>>
> >>> > 'rm: cannot remove '...': No space left on device'
> >>> >
> >>> > The folders we are trying to delete contain approx. 50K files of
> >>> > 193 KB each.
> >>>
> >>> My guess would be that you are hitting the new
> >>> mds_bal_fragment_size_max check. This limits the number of entries
> >>> that the MDS will create in a single directory fragment, to avoid
> >>> overwhelming the OSD with oversized objects. It is 100000 by default.
> >>> This limit also applies to "stray" directories where unlinked files
> >>> are put while they wait to be purged, so you could get into this state
> >>> while doing lots of deletions. There are ten stray directories that
> >>> get a roughly even share of files, so if you have more than about one
> >>> million files waiting to be purged, you could see this condition.
> >>>
> >>> The "Client failing to respond to cache pressure" messages may play a
> >>> part here -- if you have misbehaving clients then they may cause the
> >>> MDS to delay purging stray files, leading to a backlog. If your
> >>> clients are by any chance older kernel clients, you should upgrade
> >>> them. You can also unmount/remount them to clear this state, although
> >>> it will reoccur until the clients are updated (or until the bug is
> >>> fixed, if you're running the latest clients already).
> >>>
> >>> The high-level counters for strays are part of the default output of
> >>> "ceph daemonperf mds.<id>" when run on the MDS server (the "stry" and
> >>> "purg" columns). You can look at these to watch how fast the MDS is
> >>> clearing out strays. If your backlog is just because it's not doing
> >>> it fast enough, then you can look at tuning mds_max_purge_files and
> >>> mds_max_purge_ops to adjust the throttles on purging. Those settings
> >>> can be adjusted without restarting the MDS using the "injectargs"
> >>> command
> >>> (http://docs.ceph.com/docs/master/rados/operations/control/#mds-subsystem).
> >>>
> >>> Let us know how you get on.
> >>>
> >>> John
> >>>
> >>> > The cluster state and storage available are both OK:
> >>> >
> >>> >     cluster 98d72518-6619-4b5c-b148-9a781ef13bcb
> >>> >      health HEALTH_WARN
> >>> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >      monmap e1: 1 mons at {000-s-ragnarok=XXX.XXX.XXX.XXX:6789/0}
> >>> >             election epoch 11, quorum 0 000-s-ragnarok
> >>> >       fsmap e62643: 1/1/1 up {0=000-s-ragnarok=up:active}
> >>> >      osdmap e20203: 16 osds: 16 up, 16 in
> >>> >             flags sortbitwise
> >>> >       pgmap v15284654: 1088 pgs, 2 pools, 11263 GB data, 40801 kobjects
> >>> >             23048 GB used, 6745 GB / 29793 GB avail
> >>> >                 1085 active+clean
> >>> >                    2 active+clean+scrubbing
> >>> >                    1 active+clean+scrubbing+deep
> >>> >
> >>> > Has anybody experienced this issue so far?
> >>> >
> >>> > Regards,
> >>> > --
> >>> > Mykola
> >>
> >> --
> >> Mykola
> >
> > --
> > Mykola

--
Mykola
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com