Hello,
  I have been testing NFS over RBD recently. I am trying to build an NFS HA 
environment on Ubuntu 14.04 for testing, with the following package versions:
- Ubuntu 14.04 : 3.13.0-32-generic(Ubuntu 14.04.2 LTS)
- ceph : 0.80.9-0ubuntu0.14.04.2
- ceph-common : 0.80.9-0ubuntu0.14.04.2
- pacemaker (git20130802-1ubuntu2.3)
- corosync (2.3.3-1ubuntu1)
PS: I also tried ceph/ceph-common 0.87.1-1trusty and 0.87.2-1trusty on a 
3.13.0-48-generic (Ubuntu 14.04.2) server and ran into the same problem.

  The environment has five nodes in the Ceph cluster (3 MONs and 5 OSDs) plus two 
NFS gateways (nfs1 and nfs2) for high availability. To test failover, I issue 
'sudo service pacemaker stop' on 'nfs1' to force its resources to stop and 
migrate to 'nfs2', and vice versa.
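For reference, the failover drill boils down to two commands per round; a minimal 
script sketch (the `run` helper, its DRY_RUN flag, and the ssh-based execution are 
my own illustrative assumptions, not part of the cluster setup — node names come 
from the environment above):

```shell
#!/bin/sh
# Hypothetical helper for the failover drill described above.
NODE=${1:-nfs1}        # gateway to take down
PEER=${2:-nfs2}        # gateway expected to take over
DRY_RUN=${DRY_RUN:-1}  # default: only print what would be executed

run() {
    host=$1; shift
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run on $host: $*"
    else
        ssh "$host" "$@"
    fi
}

run "$NODE" sudo service pacemaker stop  # force resources off $NODE
run "$PEER" sudo crm_mon -1              # one-shot status: did $PEER take over?
```

With DRY_RUN=0 and working ssh access the same script performs the drill for 
real; 'crm_mon -1' on the surviving node should then show g_rbd_share_1 started 
there.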

When both nodes are up and I issue 'sudo service pacemaker stop' on one node, 
the other node takes over all resources and everything looks fine. However, if I 
then leave the NFS gateways untouched for about 30 minutes and repeat the same 
failover test, the 'umount' process ends up in state 'D' (uninterruptible 
sleep); 'ps' shows the following:

root 21047 0.0 0.0 17412 952 ? D 16:39 0:00 umount /mnt/block1
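The stuck process can be spotted with standard tools; a minimal sketch, assuming 
a stock Linux procfs (the PID in the commented line is simply the umount PID 
from the 'ps' output above):

```shell
# List processes stuck in uninterruptible sleep (state 'D'), plus the
# kernel symbol each one is blocked in (wchan). Guarded so the snippet
# is a no-op on systems without procps.
command -v ps >/dev/null &&
    ps -eo pid,stat,wchan:32,comm | awk 'NR == 1 || $2 ~ /^D/' || true

# For one PID, the kernel stack shows the exact blocking path
# (root required; 21047 is the umount PID from the output above):
# cat /proc/21047/stack
```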

Does anyone have an idea how to solve or work around this? Because 'umount' is 
stuck, neither 'reboot' nor 'shutdown' works properly, so unless I wait about 20 
minutes for 'umount' to time out, the only thing I can do is power off the 
server directly.
Any help would be much appreciated.

My configurations and logs are attached below.

================================================================
Pacemaker configurations:

crm configure primitive p_rbd_map_1 ocf:ceph:rbd.in \
params user="admin" pool="block_data" name="data01" cephconf="/etc/ceph/ceph.conf" \
op monitor interval="10s" timeout="20s"

crm configure primitive p_fs_rbd_1 ocf:heartbeat:Filesystem \
params directory="/mnt/block1" fstype="xfs" device="/dev/rbd1" \
fast_stop="no" options="noatime,nodiratime,nobarrier,inode64" \
op monitor interval="20s" timeout="40s" \
op start interval="0" timeout="60s" \
op stop interval="0" timeout="60s"

crm configure primitive p_export_rbd_1 ocf:heartbeat:exportfs \
params directory="/mnt/block1" clientspec="10.35.64.0/24" options="rw,async,no_subtree_check,no_root_squash" fsid="1" \
op monitor interval="10s" timeout="20s" \
op start interval="0" timeout="40s"

crm configure primitive p_vip_1 ocf:heartbeat:IPaddr2 \
params ip="10.35.64.90" cidr_netmask="24" \
op monitor interval="5"

crm configure primitive p_nfs_server lsb:nfs-kernel-server \
op monitor interval="10s" timeout="30s"

crm configure primitive p_rpcbind upstart:rpcbind \
op monitor interval="10s" timeout="30s"

crm configure group g_rbd_share_1 p_rbd_map_1 p_fs_rbd_1 p_export_rbd_1 p_vip_1 \
meta target-role="Started"

crm configure group g_nfs p_rpcbind p_nfs_server \
meta target-role="Started"

crm configure clone clo_nfs g_nfs \
meta globally-unique="false" target-role="Started"

================================================================
'crm_mon' status results for normal condition:
Online: [ nfs1 nfs2 ]

Resource Group: g_rbd_share_1
p_rbd_map_1 (ocf::ceph:rbd.in): Started nfs1
p_fs_rbd_1 (ocf::heartbeat:Filesystem): Started nfs1
p_export_rbd_1 (ocf::heartbeat:exportfs): Started nfs1
p_vip_1 (ocf::heartbeat:IPaddr2): Started nfs1
Clone Set: clo_nfs [g_nfs]
Started: [ nfs1 nfs2 ]

'crm_mon' status results for fail over condition:
Online: [ nfs1 nfs2 ]

Resource Group: g_rbd_share_1
p_rbd_map_1 (ocf::ceph:rbd.in): Started nfs1
p_fs_rbd_1 (ocf::heartbeat:Filesystem): Started nfs1 (unmanaged) FAILED
p_export_rbd_1 (ocf::heartbeat:exportfs): Stopped
p_vip_1 (ocf::heartbeat:IPaddr2): Stopped
Clone Set: clo_nfs [g_nfs]
Started: [ nfs2 ]
Stopped: [ nfs1 ]

Failed actions:
p_fs_rbd_1_stop_0 (node=nfs1, call=114, rc=1, status=Timed Out, last-rc-change=Wed May 13 16:39:10 2015, queued=60002ms, exec=1ms): unknown error

================================================================
'dmesg' messages:

[ 9470.284509] nfsd: last server has exited, flushing export cache
[ 9470.322893] init: rpcbind main process (4267) terminated with status 2
[ 9600.520281] INFO: task umount:2675 blocked for more than 120 seconds.
[ 9600.520445] Not tainted 3.13.0-32-generic #57-Ubuntu
[ 9600.520570] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9600.520792] umount D ffff88003fc13480 0 2675 1 0x00000000
[ 9600.520800] ffff88003a4f9dc0 0000000000000082 ffff880039ece000 ffff88003a4f9fd8
[ 9600.520805] 0000000000013480 0000000000013480 ffff880039ece000 ffff880039ece000
[ 9600.520809] ffff88003fc141a0 0000000000000001 0000000000000000 ffff88003a377928
[ 9600.520814] Call Trace:
[ 9600.520830] [<ffffffff817251a9>] schedule+0x29/0x70
[ 9600.520882] [<ffffffffa043b300>] _xfs_log_force+0x220/0x280 [xfs]
[ 9600.520891] [<ffffffff8109a9b0>] ? wake_up_state+0x20/0x20
[ 9600.520922] [<ffffffffa043b386>] xfs_log_force+0x26/0x80 [xfs]
[ 9600.520947] [<ffffffffa03f3b6d>] xfs_fs_sync_fs+0x2d/0x50 [xfs]
[ 9600.520954] [<ffffffff811edc22>] sync_filesystem+0x72/0xa0
[ 9600.520960] [<ffffffff811bfe30>] generic_shutdown_super+0x30/0xf0
[ 9600.520966] [<ffffffff811c0127>] kill_block_super+0x27/0x70
[ 9600.520971] [<ffffffff811c040d>] deactivate_locked_super+0x3d/0x60
[ 9600.520976] [<ffffffff811c09c6>] deactivate_super+0x46/0x60
[ 9600.520981] [<ffffffff811dd856>] mntput_no_expire+0xd6/0x170
[ 9600.520986] [<ffffffff811dedfe>] SyS_umount+0x8e/0x100
[ 9600.520991] [<ffffffff8173186d>] system_call_fastpath+0x1a/0x1f
[ 9720.520295] INFO: task xfsaild/rbd1:5577 blocked for more than 120 seconds.
[ 9720.520449] Not tainted 3.13.0-32-generic #57-Ubuntu
[ 9720.520574] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9720.520797] xfsaild/rbd1 D ffff88003fc13480 0 5577 2 0x00000000
[ 9720.520805] ffff88003b571d58 0000000000000046 ffff88003c404800 ffff88003b571fd8
[ 9720.520811] 0000000000013480 0000000000013480 ffff88003c404800 ffff88003c404800
[ 9720.520815] ffff88003fc141a0 0000000000000001 0000000000000000 ffff88003a377928
[ 9720.520819] Call Trace:
[ 9720.520835] [<ffffffff817251a9>] schedule+0x29/0x70
[ 9720.520887] [<ffffffffa043b300>] _xfs_log_force+0x220/0x280 [xfs]
[ 9720.520896] [<ffffffff8109a9b0>] ? wake_up_state+0x20/0x20
[ 9720.520927] [<ffffffffa043b386>] xfs_log_force+0x26/0x80 [xfs]
[ 9720.520958] [<ffffffffa043f920>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[ 9720.520986] [<ffffffffa043fa61>] xfsaild+0x141/0x5c0 [xfs]
[ 9720.521013] [<ffffffffa043f920>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[ 9720.521019] [<ffffffff8108b572>] kthread+0xd2/0xf0
[ 9720.521024] [<ffffffff8108b4a0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 9720.521029] [<ffffffff817317bc>] ret_from_fork+0x7c/0xb0
[ 9720.521033] [<ffffffff8108b4a0>] ? kthread_create_on_node+0x1c0/0x1c0

Sincerely yours.
WD Hwang

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
