On Fri, Aug 24, 2018 at 1:20 AM Stefan Kooman <ste...@bit.nl> wrote: > Hi Gregory, > > Quoting Gregory Farnum (gfar...@redhat.com): > > This is quite strange. Given that you have a log, I think what you want > to > > do is find one request in the log, trace it through its lifetime, and see > > where the time is elapsed. You may find a bifurcation, where some > > categories of requests happen instantly but other categories take a > second > > or more; focus on the second, obviously. > > So that is what I did. Turns out it's not the (slow) cache at all, probably > not to your surprise. The reads are quit fast actually, compared to > kernel client it's ~ 8 ms slower, or ~ 40%. It looks like couple > of writes / updates to, at least a session file, are slow: > > 2018-08-23 16:40:25.631 7f79156a8700 10 client.15158830 put_inode on > 0x10000693859.head(faked_ino=0 ref=5 ll_ref=1 cap_refs={} open={3=1} > mode=100600 size=0/4194304 nlink=1 btime=2018-08-23 16:40:25.632601 > mtime=2018-08-23 16:40:25.632601 ctime=2018-08-23 16:40:25.632601 > caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000693859 ts 0/0 > objects 0 dirty_or_tx 0] > parents=0x1000068547c.head["sess_ia0agoj01ul4rob7ji55ouca41"] > 0x5646ff0e8000) > > 2018-08-23 16:40:28.547 7f79156a8700 10 client.15158830 > update_inode_file_time 0x10000693859.head(faked_ino=0 ref=4 ll_ref=1 > cap_refs={} open={3=1} mode=100600 size=0/4194304 nlink=1 > btime=2018-08-23 16:40:25.632601 mtime=2018-08-23 16:40:25.632601 > ctime=2018-08-23 16:40:25.632601 > caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000693859 ts 0/0 > objects 0 dirty_or_tx 0] > parents=0x1000068547c.head["sess_ia0agoj01ul4rob7ji55ouca41"] > 0x5646ff0e8000) pAsxLsXsxFsxcrwb ctime 2018-08-23 16:40:25.632601 mtime > 2018-08-23 16:40:25.632601 >
Hmm, these aren't actually the start and end times to the same operation. put_inode() is literally adjusting a refcount, which can happen for reasons ranging from the VFS doing something that drops it to an internal operation completing to a response coming back from the MDS. You should be able to find requests coming in from the kernel and a response going back out (the function names will be prefixed with "ll_", eg "ll_lookup"). > > So, almost 3 seconds. Page is only served after this, and possibly, after > some cache files have been written. Note though that ceph-fuse is in > debug=20 mode. Apparently the kernel client is _much_ faster in writing > than ceph-fuse. If I write a file with "dd" (from /dev/urandom) it's in > the tens of milliseconds range, not seconds. atime / ctime changes are > handled in < 5 ms. > > I wonder if tuning file-striping [1] with stripe units of 4KB would be > beneficial in this case. > > Gr. Stefan > > [1]: http://docs.ceph.com/docs/master/dev/file-striping/ > > -- > | BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351 > | GPG: 0xD14839C6 +31 318 648 688 > <+31%20318%20648%20688> / i...@bit.nl >
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com