Re: [ceph-users] cephfs kernel client hangs

2018-08-27 Thread Yan, Zheng
On Mon, Aug 27, 2018 at 6:10 AM Zhenshi Zhou wrote: > > Hi, > The kernel version is 4.12.8-1.el7.elrepo.x86_64. > Client.267792 is gone because I restarted the server over the weekend. > Is ceph-fuse more stable than the kernel client? > For old kernels such as 4.12, ceph-fuse is more stable. If you use kernel

Re: [ceph-users] cephfs kernel client hangs

2018-08-27 Thread Zhenshi Zhou
Hi, The kernel version is 4.12.8-1.el7.elrepo.x86_64. Client.267792 is gone because I restarted the server over the weekend. Is ceph-fuse more stable than the kernel client? Yan, Zheng wrote on Mon, Aug 27, 2018 at 11:41 AM: > Please check client.213528 instead of client.267792. Which kernel > version does client.213528 use?

Re: [ceph-users] cephfs kernel client hangs

2018-08-26 Thread Yan, Zheng
Please check client.213528 instead of client.267792. Which kernel version does client.213528 use? On Sat, Aug 25, 2018 at 6:12 AM Zhenshi Zhou wrote: > > Hi, > This time, osdc: > > REQUESTS 0 homeless 0 > LINGER REQUESTS > > monc: > > have monmap 2 want 3+ > have osdmap 4545 want 4546 > have fsmap

Re: [ceph-users] cephfs kernel client hangs

2018-08-25 Thread Zhenshi Zhou
Hi, This time, osdc: REQUESTS 0 homeless 0 LINGER REQUESTS monc: have monmap 2 want 3+ have osdmap 4545 want 4546 have fsmap.user 0 have mdsmap 446 want 447+ fs_cluster_id -1 mdsc: 649065 mds0 setattr #12e7e5a Anything useful? Yan, Zheng wrote on Sat, Aug 25, 2018 at 7:53 AM: > Are there hang
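The monc output quoted above lists, per map, the epoch the client has and the epoch it has asked for. A minimal Python sketch for pulling those numbers out of a pasted dump follows. The function name `parse_monc` is hypothetical, and the pattern is inferred only from the lines shown in this thread (`have <map> <epoch>`, optionally followed by `want <epoch>` with a trailing `+`), so treat it as an assumption rather than a description of the kernel's exact format.

```python
import re
from typing import Dict, Optional, Tuple

# Pattern inferred from the thread's output, e.g. "have osdmap 4545 want 4546"
# or "have mdsmap 446 want 447+". The "want" part is optional.
_HAVE_RE = re.compile(r"have (\S+) (\d+)(?: want (\d+)\+?)?")


def parse_monc(text: str) -> Dict[str, Tuple[int, Optional[int]]]:
    """Map each map name to (have_epoch, want_epoch or None)."""
    maps = {}
    for match in _HAVE_RE.finditer(text):
        name, have, want = match.groups()
        maps[name] = (int(have), int(want) if want else None)
    return maps


if __name__ == "__main__":
    sample = (
        "have monmap 2 want 3+ have osdmap 4545 want 4546 "
        "have fsmap.user 0 have mdsmap 446 want 447+ fs_cluster_id -1"
    )
    for name, (have, want) in parse_monc(sample).items():
        print(name, have, want)
```

Because the pattern is searched across the whole text, it works both on the original multi-line file and on the flattened single-line form in which the output was pasted here.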

Re: [ceph-users] cephfs kernel client hangs

2018-08-24 Thread Yan, Zheng
Are there hung requests in /sys/kernel/debug/ceph//osdc? On Fri, Aug 24, 2018 at 9:32 PM Zhenshi Zhou wrote: > > I'm afraid that the client hangs again...the log shows: > > 2018-08-24 21:27:54.714334 [WRN] slow request 62.607608 seconds old, > received at 2018-08-24 21:26:52.106633: client_req

Re: [ceph-users] cephfs kernel client hangs

2018-08-24 Thread Zhenshi Zhou
I'm afraid that the client hangs again...the log shows: 2018-08-24 21:27:54.714334 [WRN] slow request 62.607608 seconds old, received at 2018-08-24 21:26:52.106633: client_request(client.213528:241811 getattr pAsLsXsFs #0x12e7e5a 2018-08-24 21:26:52.106425 caller_uid=0, caller_gid=0{}) current
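The slow-request warning above carries the client id, request tid, operation, and inode that the MDS is stuck on. A small Python sketch for extracting those fields is shown below. The helper name `parse_slow_request` is hypothetical, and the regular expression is derived only from the single log line quoted in this message, so other request types may need a different pattern.

```python
import re
from typing import Optional

# Built from the one warning quoted in this thread:
#   slow request <age> seconds old, ... client_request(client.<id>:<tid> <op> ... #0x<ino> ...
_SLOW_RE = re.compile(
    r"slow request (?P<age>\d+(?:\.\d+)?) seconds old.*?"
    r"client_request\(client\.(?P<client>\d+):(?P<tid>\d+) (?P<op>\w+)"
    r".*?#(?P<ino>0x[0-9a-fA-F]+)"
)


def parse_slow_request(line: str) -> Optional[dict]:
    """Return age, client id, tid, op and inode number, or None if no match."""
    match = _SLOW_RE.search(line)
    if match is None:
        return None
    return {
        "age_seconds": float(match.group("age")),
        "client": int(match.group("client")),
        "tid": int(match.group("tid")),
        "op": match.group("op"),
        "ino": int(match.group("ino"), 16),
    }


if __name__ == "__main__":
    line = (
        "2018-08-24 21:27:54.714334 [WRN] slow request 62.607608 seconds old, "
        "received at 2018-08-24 21:26:52.106633: "
        "client_request(client.213528:241811 getattr pAsLsXsFs #0x12e7e5a "
        "2018-08-24 21:26:52.106425 caller_uid=0, caller_gid=0{})"
    )
    print(parse_slow_request(line))
```

The extracted client id is the one to look up on the MDS side, and the inode number can be compared against the entries in the client's mdsc debug file.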

Re: [ceph-users] cephfs kernel client hangs

2018-08-14 Thread Zhenshi Zhou
kernel client Yan, Zheng wrote on Tue, Aug 14, 2018 at 3:13 PM: > On Mon, Aug 13, 2018 at 9:55 PM Zhenshi Zhou wrote: > > > > Hi Burkhard, > > I'm sure the user has permission to read and write. Besides, we're not > using EC data pools. > > Now the situation is that for any operation on a specific file, the comm

Re: [ceph-users] cephfs kernel client hangs

2018-08-14 Thread Yan, Zheng
On Mon, Aug 13, 2018 at 9:55 PM Zhenshi Zhou wrote: > > Hi Burkhard, > I'm sure the user has permission to read and write. Besides, we're not using > EC data pools. > Now the situation is that for any operation on a specific file, the command will > hang. > Operations on any other files won't hang.

Re: [ceph-users] cephfs kernel client hangs

2018-08-13 Thread Zhenshi Zhou
Hi Burkhard, I'm sure the user has permission to read and write. Besides, we're not using EC data pools. Now the situation is that for any operation on a specific file, the command will hang. Operations on any other files won't hang. Burkhard Linke wrote on Mon, Aug 13, 2018 at 9:42 PM: > Hi, > > > On 08/13/2018

Re: [ceph-users] cephfs kernel client hangs

2018-08-13 Thread Burkhard Linke
Hi, On 08/13/2018 03:22 PM, Zhenshi Zhou wrote: Hi, Finally, I got a running server with files /sys/kernel/debug/ceph/xxx/ [root@docker27 525c4413-7a08-40ca-9a98-0a6df009025b.client213522]# cat mdsc [root@docker27 525c4413-7a08-40ca-9a98-0a6df009025b.client213522]# cat monc have monmap 2 want

Re: [ceph-users] cephfs kernel client hangs

2018-08-13 Thread Zhenshi Zhou
Hi, Finally, I got a running server with files /sys/kernel/debug/ceph/xxx/ [root@docker27 525c4413-7a08-40ca-9a98-0a6df009025b.client213522]# cat mdsc [root@docker27 525c4413-7a08-40ca-9a98-0a6df009025b.client213522]# cat monc have monmap 2 want 3+ have osdmap 4545 want 4546 have fsmap.user 0 have

Re: [ceph-users] cephfs kernel client hangs

2018-08-09 Thread Burkhard Linke
Hi, On 08/09/2018 03:21 PM, Yan, Zheng wrote: try 'mount -f', recent kernel should handle 'mount -f' pretty well On Wed, Aug 8, 2018 at 10:46 PM Zhenshi Zhou wrote: Hi, Is there any way other than rebooting the server when the client hangs? If the server is in a production environment, I can'

Re: [ceph-users] cephfs kernel client hangs

2018-08-09 Thread Yan, Zheng
try 'mount -f', recent kernel should handle 'mount -f' pretty well On Wed, Aug 8, 2018 at 10:46 PM Zhenshi Zhou wrote: > > Hi, > Is there any way other than rebooting the server when the client hangs? > If the server is in a production environment, I can't restart it every time. > > Webert de Souza

Re: [ceph-users] cephfs kernel client hangs

2018-08-09 Thread Jake Grimmett
Hi John, thanks for the advice, it's greatly appreciated. We have 45 x 8TB OSDs & 128GB RAM per node, this is 35% of the recommended quantity, so our OOM problems are predictable. I'll increase the RAM on one node to 256GB, and see if this handles OSD fault conditions without the bluestore RAM l

Re: [ceph-users] cephfs kernel client hangs

2018-08-08 Thread John Spray
On Wed, Aug 8, 2018 at 4:46 PM Jake Grimmett wrote: > > Hi John, > > With regard to memory pressure; Does the cephfs fuse client also cause a > deadlock - or is this just the kernel client? TBH, I'm not expert enough on the kernel-side implementation of fuse to say. Ceph does have the fuse_disab

Re: [ceph-users] cephfs kernel client hangs

2018-08-08 Thread Jake Grimmett
Hi John, With regard to memory pressure; Does the cephfs fuse client also cause a deadlock - or is this just the kernel client? We run the fuse client on ten OSD nodes, and use parsync (parallel rsync) to backup two beegfs systems (~1PB). Ordinarily fuse works OK, but any OSD problems can cause

Re: [ceph-users] cephfs kernel client hangs

2018-08-08 Thread Webert de Souza Lima
You can only try to remount the cephfs dir. It will probably not work, giving you I/O errors, so the fallback would be to use a fuse-mount. If I recall correctly you could do a lazy umount on the current dir (umount -fl /mountdir) and remount it using the FUSE client. It will work for new sessions

Re: [ceph-users] cephfs kernel client hangs

2018-08-08 Thread Zhenshi Zhou
Hi, Is there any way other than rebooting the server when the client hangs? If the server is in a production environment, I can't restart it every time. Webert de Souza Lima wrote on Wed, Aug 8, 2018 at 10:33 PM: > Hi Zhenshi, > > if you still have the client mount hanging but no session is connected, > you pro

Re: [ceph-users] cephfs kernel client hangs

2018-08-08 Thread Webert de Souza Lima
Hi Zhenshi, if you still have the client mount hanging but no session is connected, you probably have some PID waiting with blocked IO from cephfs mount. I face that now and then and the only solution is to reboot the server, as you won't be able to kill a process with pending IO. Regards, Weber

Re: [ceph-users] cephfs kernel client hangs

2018-08-08 Thread Zhenshi Zhou
Hi Webert, That command shows the current sessions, whereas the server from which I got the files (osdc, mdsc, monc) has been disconnected for a long time. So I cannot get useful information from the command you provided. Thanks Webert de Souza Lima wrote on Wed, Aug 8, 2018 at 10:10 PM: > You could also see open sessions at the

Re: [ceph-users] cephfs kernel client hangs

2018-08-08 Thread Webert de Souza Lima
You could also see open sessions at the MDS server by issuing `ceph daemon mds.XX session ls` Regards, Webert Lima DevOps Engineer at MAV Tecnologia *Belo Horizonte - Brasil* *IRC NICK - WebertRLZ* On Wed, Aug 8, 2018 at 5:08 AM Zhenshi Zhou wrote: > Hi, I find an old server which mounted ce
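The output of `ceph daemon mds.XX session ls` is JSON, so a specific client can be looked up by id and its reported version read from the session metadata. The sketch below is a rough Python illustration, not a confirmed schema: the helper names are hypothetical, and the field names `id`, `client_metadata`, `kernel_version`, and `ceph_version` are assumptions about what the MDS reports and should be checked against real output. The sample JSON is invented for illustration only.

```python
import json
from typing import Optional


def find_session(session_ls_json: str, client_id: int) -> Optional[dict]:
    """Return the session entry whose "id" equals client_id, if any."""
    for session in json.loads(session_ls_json):
        # Assumed field: "id" holds the numeric client id (e.g. 213528).
        if session.get("id") == client_id:
            return session
    return None


def client_version(session: dict) -> Optional[str]:
    """Best-effort version string from the session's client metadata."""
    # Assumed fields: kernel clients report "kernel_version",
    # userspace clients such as ceph-fuse report "ceph_version".
    meta = session.get("client_metadata", {})
    return meta.get("kernel_version") or meta.get("ceph_version")


if __name__ == "__main__":
    # Hypothetical sample; real output contains many more fields.
    sample = json.dumps([
        {"id": 213528, "client_metadata": {"kernel_version": "4.12.8"}},
        {"id": 267792, "client_metadata": {"ceph_version": "12.2.5"}},
    ])
    session = find_session(sample, 213528)
    print(client_version(session) if session else "no such session")
```

A client that has already been evicted or disconnected will not appear in the list, in which case `find_session` returns `None`.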

Re: [ceph-users] cephfs kernel client hangs

2018-08-08 Thread Zhenshi Zhou
Hi, I found an old server which mounted cephfs and has the debug files. # cat osdc REQUESTS 0 homeless 0 LINGER REQUESTS BACKOFFS # cat monc have monmap 2 want 3+ have osdmap 3507 have fsmap.user 0 have mdsmap 55 want 56+ fs_cluster_id -1 # cat mdsc 194 mds0 getattr #1036ae3 What does i

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread Zhenshi Zhou
I restarted the client server, so there are no files in that directory. I will take care of it if the client hangs next time. Thanks Yan, Zheng wrote on Wed, Aug 8, 2018 at 11:23 AM: > On Wed, Aug 8, 2018 at 11:02 AM Zhenshi Zhou wrote: > > > > Hi, > > I checked all my ceph servers and they do not have ceph

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread Yan, Zheng
On Wed, Aug 8, 2018 at 11:02 AM Zhenshi Zhou wrote: > > Hi, > I checked all my ceph servers and they do not have cephfs mounted on any of > them (maybe I unmounted it after testing). As a result, the cluster didn't encounter > a memory deadlock. Besides, I checked the monitoring system and the memory and > cpu

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread Zhenshi Zhou
Hi, I checked all my ceph servers and they do not have cephfs mounted on any of them (maybe I unmounted it after testing). As a result, the cluster didn't encounter a memory deadlock. Besides, I checked the monitoring system and the memory and cpu usage were at normal levels while the clients hung. Back to my quest

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread Zhenshi Zhou
Hi, I'm not sure whether just mounting cephfs, without using it or doing any operation within the mounted directory, would be affected by cache flushing. I mounted cephfs on osd servers only for testing and then left it there. Anyway, I will unmount it. Thanks John Spray wrote on Wed, Aug 8, 2018 at 03:37: > On Tue,

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread Webert de Souza Lima
That's good to know, thanks for the explanation. Fortunately we are in the process of cluster redesign and we can definitely fix that scenario. Regards, Webert Lima DevOps Engineer at MAV Tecnologia *Belo Horizonte - Brasil* *IRC NICK - WebertRLZ* On Tue, Aug 7, 2018 at 4:37 PM John Spray wrot

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread John Spray
On Tue, Aug 7, 2018 at 5:42 PM Reed Dier wrote: > > This is the first I am hearing about this as well. This is not a Ceph-specific thing -- it can also affect similar systems like Lustre. The classic case is when under some memory pressure, the kernel tries to free memory by flushing the client'

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread Reed Dier
This is the first I am hearing about this as well. Granted, I am using ceph-fuse rather than the kernel client at this point, but that isn’t etched in stone. Curious if there is more to share. Reed > On Aug 7, 2018, at 9:47 AM, Webert de Souza Lima > wrote: > > > Yan, Zheng mailto:uker...@

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread Webert de Souza Lima
Yan, Zheng wrote on Tue, Aug 7, 2018 at 7:51 PM: > On Tue, Aug 7, 2018 at 7:15 PM Zhenshi Zhou wrote: > this can cause memory deadlock. you should avoid doing this > > > Yan, Zheng wrote on Tue, Aug 7, 2018 at 19:12: > >> > >> did you mount cephfs on the same machines that run ceph-osd? > >> I didn't know about this. I ru

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread Zhenshi Zhou
Hi Yan, thanks for the advice. I will unmount cephfs on the osd servers and keep an eye on it. :) Yan, Zheng wrote on Tue, Aug 7, 2018 at 7:51 PM: > On Tue, Aug 7, 2018 at 7:15 PM Zhenshi Zhou wrote: > > > > Yes, some osd servers mount cephfs > > > > this can cause memory deadlock. you should avoid doing this >

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread Yan, Zheng
On Tue, Aug 7, 2018 at 7:15 PM Zhenshi Zhou wrote: > > Yes, some osd servers mount cephfs > this can cause memory deadlock. you should avoid doing this > Yan, Zheng wrote on Tue, Aug 7, 2018 at 19:12: >> >> did you mount cephfs on the same machines that run ceph-osd? >> >> On Tue, Aug 7, 2018 at 5:14 PM Zhens
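The advice above is to avoid kernel cephfs mounts on hosts that also run ceph-osd. A simple way to audit a host for that combination is sketched below in Python. The function names are hypothetical. The sketch assumes the kernel client shows up in `/proc/mounts` with filesystem type `ceph` (ceph-fuse mounts use a different type and are deliberately not matched), and it takes the list of process names as an argument instead of scanning `/proc` itself.

```python
from typing import Iterable, List


def kernel_cephfs_mounts(mounts_text: str) -> List[str]:
    """Return mount points whose filesystem type is "ceph" (kernel client)."""
    mountpoints = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fstype, options, ...
        if len(fields) >= 3 and fields[2] == "ceph":
            mountpoints.append(fields[1])
    return mountpoints


def is_risky_host(mounts_text: str, process_names: Iterable[str]) -> bool:
    """True if the host has a kernel cephfs mount and runs ceph-osd."""
    has_osd = "ceph-osd" in set(process_names)
    return has_osd and bool(kernel_cephfs_mounts(mounts_text))


if __name__ == "__main__":
    with open("/proc/mounts") as fh:
        mounts = fh.read()
    # Process names would normally come from ps or /proc/<pid>/comm.
    print(kernel_cephfs_mounts(mounts))
```

Running the check on every OSD node gives a quick list of hosts where the cephfs mount should be removed or moved to a separate client machine.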

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread Zhenshi Zhou
Yes, some osd servers mount cephfs Yan, Zheng wrote on Tue, Aug 7, 2018 at 19:12: > did you mount cephfs on the same machines that run ceph-osd? > > On Tue, Aug 7, 2018 at 5:14 PM Zhenshi Zhou wrote: > > > > Hi Burkhard, > > Files located in /sys/kernel/debug/ceph/ are all new files > generated after I reboot

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread Yan, Zheng
did you mount cephfs on the same machines that run ceph-osd? On Tue, Aug 7, 2018 at 5:14 PM Zhenshi Zhou wrote: > > Hi Burkhard, > Files located in /sys/kernel/debug/ceph/ are all new files generated > after I rebooted the server. > The clients were in the blacklist and I manually removed them from the

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread Zhenshi Zhou
Hi Burkhard, Files located in /sys/kernel/debug/ceph/ are all new files generated after I rebooted the server. The clients were in the blacklist and I manually removed them from the blacklist. But the clients still hung. Thanks Burkhard Linke wrote on Tue, Aug 7, 2018 at 4:54 PM: > Hi, > > > you are using the kernel

Re: [ceph-users] cephfs kernel client hangs

2018-08-07 Thread Burkhard Linke
Hi, you are using the kernel implementation of CephFS. In this case some information can be retrieved from the /sys/kernel/debug/ceph/ directory. Especially the mdsc, monc and osdc files are important, since they contain pending operations on mds, mon and osds. We have a similar problem in
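Since the mdsc file lists pending MDS operations, its contents can be turned into structured records for logging or alerting. The Python sketch below is one way to do that. The names `MdsRequest` and `parse_mdsc` are hypothetical, and the assumed layout (request tid, MDS rank, operation, then the target, separated by whitespace) is inferred from the short samples pasted elsewhere in this thread, not from a format specification. Reading the real files requires root and a mounted debugfs.

```python
from typing import List, NamedTuple


class MdsRequest(NamedTuple):
    tid: int      # request id
    mds: str      # MDS rank the request was sent to, e.g. "mds0"
    op: str       # operation name, e.g. "getattr"
    target: str   # remaining fields, e.g. "#12e7e5a"


def parse_mdsc(text: str) -> List[MdsRequest]:
    """Parse mdsc contents, assuming whitespace-separated fields per line."""
    requests = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank or unexpected lines instead of failing on them.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        target = " ".join(fields[3:])
        requests.append(MdsRequest(int(fields[0]), fields[1], fields[2], target))
    return requests


if __name__ == "__main__":
    sample = "649065\tmds0\tsetattr\t #12e7e5a\n"
    for request in parse_mdsc(sample):
        print(request)
```

An empty mdsc file yields an empty list, which corresponds to a client with no MDS requests in flight.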

[ceph-users] cephfs kernel client hangs

2018-08-07 Thread Zhenshi Zhou
Hi, I have a Ceph 12.2.5 cluster running on 4 CentOS 7.3 servers with kernel 4.17.0, including 3 mons, 16 osds, and 2 mds (1 active + 1 backup). Some clients mount cephfs in kernel mode. Client A is using kernel 4.4.145, and others are using kernel 4.12.8. All of them are using ceph client versi