> On Oct 22, 2015, at 3:57 PM, John-Paul Robinson <j...@uab.edu> wrote:
> 
> Hi,
> 
> Has anyone else experienced a problem with RBD-to-NFS gateways blocking
> nfsd server requests when their ceph cluster has a placement group that
> is not servicing I/O for some reason, eg. too few replicas or an osd
> with slow request warnings?

We have experienced exactly that kind of problem except that it sometimes 
happens even when ceph health reports "HEALTH_OK". This has been incredibly 
vexing for us. 


If the cluster is unhealthy for some reason, then I'd expect exactly your (and 
our) symptoms, since writes simply can't complete. 

I'm guessing that you have file systems with barriers turned on. Whichever file 
system has a barrier write stuck on the problem pg will also cause every other 
process trying to write anywhere in that FS to block. That likely means a 
cascade of blocked nfsd processes as they each try to service client writes to 
that FS. Even though, in theory, the rest of the "disk" (the rbd device) and 
the other file systems might still be writable, the NFS processes will sit in 
uninterruptible sleep just because of that one stuck write request (or such is 
my understanding). 
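If you want to confirm whether barriers are in play, the mount options are 
visible in /proc/mounts. This is a generic sketch, not specific to our setup; 
on kernels of this vintage XFS enables barriers by default, so "nobarrier" 
only shows up when someone has disabled them explicitly:

```shell
# List mounted XFS file systems (mount point and options).
# Barriers are on by default for XFS, so "nobarrier" in the
# options column means they have been explicitly turned off.
awk '$3 == "xfs" { print $2, $4 }' /proc/mounts
```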

Disabling barriers on the gateway machine might postpone the problem (we've 
never tried it and don't want to) until you hit your vm.dirty_bytes or 
vm.dirty_ratio thresholds, but it is dangerous: it makes data loss much more 
likely. You'd be better off fixing the underlying issues when they happen (too 
few replicas available or overloaded osds). 
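For reference, the writeback thresholds in question can be read straight out 
of /proc. A quick sketch (the values shown are simply whatever your kernel is 
configured with):

```shell
# Show the knobs that bound how much dirty page cache may accumulate
# before writers are throttled or blocked. For each pair, either the
# *_bytes or the *_ratio variant is in effect; the other reads as 0.
for f in dirty_bytes dirty_ratio dirty_background_bytes dirty_background_ratio; do
    printf '%-28s %s\n' "vm.$f" "$(cat /proc/sys/vm/$f)"
done
```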


For us, even when the cluster reports itself as healthy, we sometimes have this 
problem. All nfsd processes block. sync blocks. echo 3 > 
/proc/sys/vm/drop_caches blocks. There is a persistent 4-8MB "Dirty" in 
/proc/meminfo. None of the osds log slow requests. Everything seems fine on the 
osds and mons. Neither CPU nor I/O load is extraordinary on the ceph nodes, but 
at least one file system on the gateway machine will stop accepting writes. 
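When it happens, a couple of generic commands make those symptoms visible. 
This is just a sketch of what we look at; nothing here is Ceph-specific:

```shell
# Dirty/Writeback counters that refuse to drain to zero.
grep -E '^(Dirty|Writeback):' /proc/meminfo

# Processes stuck in uninterruptible sleep (state D) -- typically the
# nfsd threads and the XFS/writeback kernel workers. The wchan column
# hints at where in the kernel each one is blocked.
ps -eo state=,pid=,wchan=,comm= | awk '$1 ~ /^D/'
```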

If we just wait, the situation resolves itself in 10 to 30 minutes. A forced 
reboot of the NFS gateway "solves" the problem, but it is annoying and 
dangerous (we unmount all of the file systems that can still be unmounted, but 
the stuck ones force us to a sysrq-b). 

This is on Scientific Linux 6.7 systems with elrepo 4.1.10 kernels, running 
Ceph Firefly (0.80.10), with XFS file systems exported over NFS and Samba. 

Ryan
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com