Hi, I am running CephFS (10.2.2) with kernel 4.7.0-1. I have noticed that static files frequently show up empty when served via a web server (Apache). I have tracked this down further: when I run a checksum against the file on the CephFS file system, on the node serving the empty HTTP response, the checksum is '00000'.
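For anyone wanting to look for the same symptom on their own nodes, here is a rough sketch of a check. It assumes that a checksum of 00000 means the file is reading back as all NUL bytes from that client, which is consistent with how sum behaves but is not something I have confirmed. reads_as_zeros is a made-up helper name, not part of any Ceph tooling.

```shell
#!/bin/sh
# Sketch only: report files that read back as all NUL bytes on this client.
# Assumption: the '00000' checksum corresponds to all-zero content.
# reads_as_zeros is a hypothetical helper, not a Ceph command.
reads_as_zeros() {
    file=$1
    # An empty file is not the symptom described here, so skip it.
    [ -s "$file" ] || return 1
    # Strip every NUL byte and count what is left; nothing left means
    # the whole file read back as NULs.
    remaining=$(tr -d '\000' < "$file" | wc -c | tr -d ' ')
    [ "$remaining" -eq 0 ]
}

# Check each path given on the command line.
for f in "$@"; do
    if reads_as_zeros "$f"; then
        echo "ZEROED $f"
    fi
done
```

Saved under any name and run with one or more paths under the CephFS mount as arguments, it prints a ZEROED line for each file that currently reads as zeros on that node. Note that it reads through the page cache, so it reports what the web server would see, not what is stored in the cluster.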
The output below shows the checksum on a defective node.

[root@server2]# ls -al /cephfs/webdata/static/456/JHL/66448H-755h.jpg
-rw-r--r-- 1 apache apache 53317 Aug 28 23:46 /cephfs/webdata/static/456/JHL/66448H-755h.jpg
[root@server2]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
00000 53

The output below shows the checksum on a working node.

[root@server1]# ls -al /cephfs/webdata/static/456/JHL/66448H-755h.jpg
-rw-r--r-- 1 apache apache 53317 Aug 28 23:46 /cephfs/webdata/static/456/JHL/66448H-755h.jpg
[root@server1]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
03620 53
[root@server1]#

If I flush the cache as shown below, the checksum returns as expected and the web server serves up valid content.

[root@server2]# echo 3 > /proc/sys/vm/drop_caches
[root@server2]# sum /cephfs/webdata/static/456/JHL/66448H-755h.jpg
03620 53

After some time, typically less than 1 hour, the issue repeats. It does not seem to repeat if I take any one of the servers out of the LB and serve requests from only one of the servers.

I may try the FUSE client, which has a mount option, direct_io, that looks to disable the page cache.

I have been hunting in the ML and tracker but could not see anything really close to this issue. Any input or feedback on similar experiences is welcome.

Thanks