Hi Sunil,

No NFS being used here. In this occurrence it's proftpd that is in the D state; the previous occurrence involved rsync. I don't believe rsync uses sendfile, and sendfile is disabled in the proftpd config.
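Next time it happens I'll try to capture everything in one pass with something like the rough sketch below, based on your instructions from last month. It's untested, and it assumes scanlocks2 still prints "<device> <lockname>" pairs the way it did for me:

  # List tasks in uninterruptible sleep (D state) and the kernel function
  # they are blocked in (wchan). Field widths depend on the procps version.
  ps -eo pid,stat,wchan:32,args | awk 'NR==1 || $2 ~ /^D/'

  # Busy lock resources, then the fs and dlm lock state for each one on
  # this node. /tmp/busy-locks.txt is just an example scratch file.
  ./scanlocks2 | tee /tmp/busy-locks.txt
  while read dev lockres; do
      echo "=== $dev $lockres ==="
      debugfs.ocfs2 -R "fs_locks $lockres" "$dev" | cat
      debugfs.ocfs2 -R "dlm_locks $lockres" "$dev" | cat
  done < /tmp/busy-locks.txt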
I guess it's probably worth trying out 1.4.7 in case it fixes the issue. It was almost a month between occurrences so not something that I'll know straight away.

Cheers,

Brad

On Mon, 19 Apr 2010 11:28:07 -0700 Sunil Mushran <sunil.mush...@oracle.com> wrote:
> RO Holders: 1  EX Holders: 0
>
> So node 18 wants to upgrade to EX. For that to happen,
> node 17 has to downgrade from PR. But it cannot because
> there is 1 RO (readonly) holder. If you are using NFS and
> see a nfsd in a D state, then that would be it. I've just
> released 1.4.7 in which this issue has been addressed.
>
> Sunil
>
> Brad Plant wrote:
> > Hi Sunil,
> >
> > I managed to collect the fs_locks and dlm_locks output on both nodes this
> > time. www1 is node 17 while www2 is node 18. I had to reboot www1 to fix
> > the problem but of course www1 couldn't unmount the file system so the
> > other nodes saw it as a crash.
> >
> > Both nodes are running 2.6.18-164.15.1.el5.centos.plusxen with the matching
> > ocfs2 1.4.4-1 rpm downloaded from
> > http://oss.oracle.com/projects/ocfs2/files/RedHat/RHEL5/x86_64/.
> >
> > Do you make anything of this?
> >
> > I read that there is going to be a new ocfs2 release soon. I'm sure there's
> > lots of bug fixes, but are there any in there that you think might solve
> > this problem?
> >
> > Cheers,
> >
> > Brad
> >
> >
> > www2 ~ # ./scanlocks2
> > /dev/xvdd3    M0000000000000000095a0300000000
> >
> > www2 ~ # debugfs.ocfs2 -R "fs_locks M0000000000000000095a0300000000" /dev/xvdd3 |cat
> > Lockres: M0000000000000000095a0300000000  Mode: Protected Read
> > Flags: Initialized Attached Busy
> > RO Holders: 0  EX Holders: 0
> > Pending Action: Convert  Pending Unlock Action: None
> > Requested Mode: Exclusive  Blocking Mode: No Lock
> > PR > Gets: 6802  Fails: 0  Waits (usec) Total: 0  Max: 0
> > EX > Gets: 16340  Fails: 0  Waits (usec) Total: 12000  Max: 8000
> > Disk Refreshes: 0
> >
> > www2 ~ # debugfs.ocfs2 -R "dlm_locks M0000000000000000095a0300000000" /dev/xvdd3 |cat
> > Lockres: M0000000000000000095a0300000000   Owner: 18    State: 0x0
> > Last Used: 0      ASTs Reserved: 0    Inflight: 0    Migration Pending: No
> > Refs: 4    Locks: 2    On Lists: None
> > Reference Map: 17
> >  Lock-Queue  Node  Level  Conv  Cookie       Refs  AST  BAST  Pending-Action
> >  Granted     17    PR     -1    17:62487955  2     No   No    None
> >  Converting  18    PR     EX    18:6599867   2     No   No    None
> >
> >
> > www1 ~ # ./scanlocks2
> >
> > www1 ~ # debugfs.ocfs2 -R "fs_locks M0000000000000000095a0300000000" /dev/xvdd3 |cat
> > Lockres: M0000000000000000095a0300000000  Mode: Protected Read
> > Flags: Initialized Attached Blocked Queued
> > RO Holders: 1  EX Holders: 0
> > Pending Action: None  Pending Unlock Action: None
> > Requested Mode: Protected Read  Blocking Mode: Exclusive
> > PR > Gets: 110  Fails: 3  Waits (usec) Total: 32000  Max: 12000
> > EX > Gets: 0  Fails: 0  Waits (usec) Total: 0  Max: 0
> > Disk Refreshes: 0
> >
> > www1 ~ # debugfs.ocfs2 -R "dlm_locks M0000000000000000095a0300000000" /dev/xvdd3 |cat
> > Lockres: M0000000000000000095a0300000000   Owner: 18    State: 0x0
> > Last Used: 0      ASTs Reserved: 0    Inflight: 0    Migration Pending: No
> > Refs: 3    Locks: 1    On Lists: None
> > Reference Map:
> >  Lock-Queue  Node  Level  Conv  Cookie       Refs  AST  BAST  Pending-Action
> >  Granted     17    PR     -1    17:62487955  2     No   No    None
> >
> >
> > On Fri, 19 Mar 2010 08:48:39 -0700
> > Sunil Mushran <sunil.mush...@oracle.com> wrote:
> >
> >> In findpath <lockname>, the lockname needs to be in angular brackets.
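(Noting this here for the archives: with the brackets, the command that failed for me further down the thread would presumably become

  debugfs.ocfs2 -R "findpath <M00000000000000007e89e400000000>" /dev/xvdc1 |cat

-- untested so far.)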
> >>
> >> Did you manage to trap the oops stack trace of the crash?
> >>
> >> So the dlm on the master says that node 250 has a PR, but the fs_locks
> >> on 250 says that it has requested a PR but not gotten a reply back as yet.
> >> Next time also dump the dlm_locks on 250. (The message flow is: the fs on 250
> >> talks to the dlm on 250, which talks to the dlm on the master, which may have
> >> to talk to other nodes but eventually replies to the dlm on 250, which then
> >> pings the fs on that node. The roundtrip happens in a couple hundred usecs on gige.)
> >>
> >> Running a mix of localflock and not is not advisable. Not the end of the world,
> >> though. It depends on how flocks are being used.
> >>
> >> Is this a mix of virtual and physical boxes?
> >>
> >> Brad Plant wrote:
> >>
> >>> Hi Sunil,
> >>>
> >>> I seem to have struck this issue, although I'm not using nfs. I've got other
> >>> processes stuck in the D state. It's a mail server and the processes are
> >>> postfix and courier-imap. As per your instructions, I've run scanlocks2
> >>> and debugfs.ocfs2:
> >>>
> >>> mail1 ~ # ./scanlocks2
> >>> /dev/xvdc1    M0000000000000000808bc800000000
> >>>
> >>> mail1 ~ # debugfs.ocfs2 -R "fs_locks -l M0000000000000000808bc800000000" /dev/xvdc1 |cat
> >>> Lockres: M0000000000000000808bc800000000  Mode: Protected Read
> >>> Flags: Initialized Attached Busy
> >>> RO Holders: 0  EX Holders: 0
> >>> Pending Action: Convert  Pending Unlock Action: None
> >>> Requested Mode: Exclusive  Blocking Mode: No Lock
> >>> Raw LVB:  05 00 00 00 00 00 00 01 00 00 01 99 00 00 01 99
> >>>           12 1f c9 67 29 71 32 86 12 e8 e2 f6 d1 07 8c 15
> >>>           12 e8 e2 f6 d1 07 8c 15 00 00 00 00 00 00 10 00
> >>>           41 c0 00 05 00 00 00 00 4b b6 12 7d 00 00 00 00
> >>> PR > Gets: 471598  Fails: 0  Waits (usec) Total: 64002  Max: 8000
> >>> EX > Gets: 8041  Fails: 0  Waits (usec) Total: 28001  Max: 4000
> >>> Disk Refreshes: 0
> >>>
> >>> mail1 ~ # debugfs.ocfs2 -R "dlm_locks -l M0000000000000000808bc800000000" /dev/xvdc1 |cat
> >>> Lockres: M0000000000000000808bc800000000   Owner: 1    State: 0x0
> >>> Last Used: 0      ASTs Reserved: 0    Inflight: 0    Migration Pending: No
> >>> Refs: 4    Locks: 2    On Lists: None
> >>> Reference Map: 250
> >>> Raw LVB:  05 00 00 00 00 00 00 01 00 00 01 99 00 00 01 99
> >>>           12 1f c9 67 29 71 32 86 12 e8 e2 f6 d1 07 8c 15
> >>>           12 e8 e2 f6 d1 07 8c 15 00 00 00 00 00 00 10 00
> >>>           41 c0 00 05 00 00 00 00 4b b6 12 7d 00 00 00 00
> >>>  Lock-Queue  Node  Level  Conv  Cookie        Refs  AST  BAST  Pending-Action
> >>>  Granted     250   PR     -1    250:10866405  2     No   No    None
> >>>  Converting  1     PR     EX    1:95          2     No   No    None
> >>>
> >>> mail1 *is* node number 1, so this is the master node.
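(Side note in case it helps anyone searching the archives: if I'm reading the lockres name layout correctly, it's the lock type character, six characters of padding, sixteen hex digits of the inode's block number, and eight hex digits of generation. If that holds, the file behind a busy lockres can be found by block number even when passing the raw name to findpath doesn't work, roughly:

  # Assumption: lockname = <type char><6-zero pad><16-hex blkno><8-hex generation>
  LOCK=M0000000000000000808bc800000000
  BLKNO=$(printf '%d' 0x$(echo $LOCK | cut -c8-23))
  debugfs.ocfs2 -R "findpath <$BLKNO>" /dev/xvdc1

Treat the character offsets as an assumption -- I haven't checked them against the source.)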
> >>>
> >>> I managed to run scanlocks2 on node 250 (backup1) and also managed to get
> >>> the following:
> >>>
> >>> backup1 ~ # debugfs.ocfs2 -R "fs_locks -l M00000000000000007e89e400000000" /dev/xvdc1 |cat
> >>> Lockres: M00000000000000007e89e400000000  Mode: Invalid
> >>> Flags: Initialized Busy
> >>> RO Holders: 0  EX Holders: 0
> >>> Pending Action: Attach  Pending Unlock Action: None
> >>> Requested Mode: Protected Read  Blocking Mode: Invalid
> >>> Raw LVB:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>           00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>           00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>           00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>> PR > Gets: 0  Fails: 0  Waits (usec) Total: 0  Max: 0
> >>> EX > Gets: 0  Fails: 0  Waits (usec) Total: 0  Max: 0
> >>> Disk Refreshes: 0
> >>>
> >>> A further run of scanlocks2, however, resulted in backup1 (node 250) crashing.
> >>>
> >>> The FS is mounted by 3 nodes: mail1, mail2 and backup1. mail1 and mail2
> >>> are running the latest centos 5 xen kernel with NO localflocks. backup1
> >>> is running a 2.6.28.10 vanilla mainline kernel (pv-ops) WITH localflocks.
> >>>
> >>> I had to switch backup1 to a mainline kernel with localflocks because
> >>> performing backups on backup1 using rsync seemed to take a long time (3-4
> >>> times longer) when using the centos 5 xen kernel with no localflocks. I
> >>> was running all nodes on recent-ish mainline kernels, but have only
> >>> recently converted most of them to centos 5 because of repeated ocfs2
> >>> stability issues with mainline kernels.
> >>>
> >>> When backup1 crashed, the lock held by mail1 seemed to be released and
> >>> everything went back to normal.
> >>>
> >>> I tried to do a
> >>> debugfs.ocfs2 -R "findpath M00000000000000007e89e400000000" /dev/xvdc1 |cat
> >>> but it said "usage: locate <inode#>" despite the man page stating otherwise.
> >>> -R "locate ..." said the same.
> >>>
> >>> I hope you're able to get some useful info from the above. If not, can
> >>> you please provide the next steps that you would want me to run *in case*
> >>> it happens again.
> >>>
> >>> Cheers,
> >>>
> >>> Brad
> >>>
> >>>
> >>> On Thu, 18 Mar 2010 11:25:28 -0700
> >>> Sunil Mushran <sunil.mush...@oracle.com> wrote:
> >>>
> >>>> I am assuming you are mounting the nfs mounts with the nordirplus
> >>>> mount option. If not, that is known to deadlock a nfsd thread, leading
> >>>> to what you are seeing.
> >>>>
> >>>> There are two possible reasons for this error. One is a dlm issue.
> >>>> The other is a local deadlock like the one above.
> >>>>
> >>>> To see if the dlm is the cause of the hang, run scanlocks2.
> >>>> http://oss.oracle.com/~smushran/.dlm/scripts/scanlocks2
> >>>>
> >>>> This will dump the busy lock resources. Run it a few times. If
> >>>> a lock resource comes up regularly, then it indicates a dlm problem.
> >>>>
> >>>> Then dump the fs and dlm lock state on that node.
> >>>> debugfs.ocfs2 -R "fs_locks LOCKNAME" /dev/sdX
> >>>> debugfs.ocfs2 -R "dlm_locks LOCKNAME" /dev/sdX
> >>>>
> >>>> The dlm lock will tell you the master node. Repeat the two dumps
> >>>> on the master node. The dlm lock on the master node will point
> >>>> to the current holder. Repeat the same on that node. Email all that
> >>>> to me asap.
> >>>>
> >>>> michael.a.jaqu...@verizon.com wrote:
> >>>>
> >>>>> All,
> >>>>>
> >>>>> I've seen a few posts about this issue in the past, but not a resolution.
> >>>>> I have a 3 node cluster sharing ocfs2 volumes to app nodes
> >>>>> via nfs. On occasion, one of our db nodes will have nfs go into an
> >>>>> uninterruptible sleep state. The nfs daemon is completely useless at
> >>>>> this point. The db node has to be rebooted to resolve it. It seems that
> >>>>> nfs is waiting on ocfs2_wait_for_mask. Any suggestions on a resolution
> >>>>> would be appreciated.
> >>>>>
> >>>>> root  18387  0.0  0.0  0  0  ?  S<  Mar15  0:00  [nfsd4]
> >>>>> root  18389  0.0  0.0  0  0  ?  D   Mar15  0:10  [nfsd]
> >>>>> root  18390  0.0  0.0  0  0  ?  D   Mar15  0:10  [nfsd]
> >>>>> root  18391  0.0  0.0  0  0  ?  D   Mar15  0:10  [nfsd]
> >>>>> root  18392  0.0  0.0  0  0  ?  D   Mar15  0:13  [nfsd]
> >>>>> root  18393  0.0  0.0  0  0  ?  D   Mar15  0:08  [nfsd]
> >>>>> root  18394  0.0  0.0  0  0  ?  D   Mar15  0:09  [nfsd]
> >>>>> root  18395  0.0  0.0  0  0  ?  D   Mar15  0:12  [nfsd]
> >>>>> root  18396  0.0  0.0  0  0  ?  D   Mar15  0:13  [nfsd]
> >>>>>
> >>>>> 18387  nfsd4  worker_thread
> >>>>> 18389  nfsd   ocfs2_wait_for_mask
> >>>>> 18390  nfsd   ocfs2_wait_for_mask
> >>>>> 18391  nfsd   ocfs2_wait_for_mask
> >>>>> 18392  nfsd   ocfs2_wait_for_mask
> >>>>> 18393  nfsd   ocfs2_wait_for_mask
> >>>>> 18394  nfsd   ocfs2_wait_for_mask
> >>>>> 18395  nfsd   ocfs2_wait_for_mask
> >>>>> 18396  nfsd   ocfs2_wait_for_mask
> >>>>>
> >>>>> -Mike Jaquays

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
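(One more note for anyone who finds this thread later: the per-process wait channels in the original report can be pulled with ps, and, assuming magic-sysrq is enabled on the kernel in question, a fuller stack trace of every blocked task can be dumped to the kernel log:

  ps -eo pid,comm,wchan:32 | grep nfsd
  echo w > /proc/sysrq-trigger   # dump stacks of D-state tasks to dmesg
  dmesg | tail -n 200

That only shows where things are stuck; the fs_locks/dlm_locks dumps described above are still what identify the node actually holding the lock.)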