Re: rpc.lockd brokenness (2)

2007-04-04 Thread Jun Kuriyama
At Wed, 8 Mar 2006 19:57:22 -0500, Kris Kennaway wrote: > > No, no, you got me wrong. The pidfile is left locked after cron stopped > > running (with /etc/rc.d/cron stop). This behaviour must be wrong. > > OK, I misunderstood. The rc.d script will signal cron to kill it, > which should be closing

Re: rpc.lockd brokenness (2)

2007-04-04 Thread Kris Kennaway
On Thu, Apr 05, 2007 at 12:16:43PM +0900, Jun Kuriyama wrote: > At Wed, 8 Mar 2006 19:57:22 -0500, > Kris Kennaway wrote: > > > No, no, you got me wrong. The pidfile is left locked after cron stopped > > > running (with /etc/rc.d/cron stop). This behaviour must be wrong. > > > > OK, I misunderstoo

Re: rpc.lockd brokenness (2)

2006-04-08 Thread Kris Kennaway
On Sat, Apr 08, 2006 at 09:52:35AM -0400, Rong-En Fan wrote: > On 4/8/06, Kris Kennaway <[EMAIL PROTECTED]> wrote: > > On Sat, Apr 08, 2006 at 01:28:55AM -0400, Rong-En Fan wrote: > > > On 3/6/06, Jun Kuriyama <[EMAIL PROTECTED]> wrote: > > > > > > > > I'm not yet received enough information to tra

Re: rpc.lockd brokenness (2)

2006-04-08 Thread Rong-En Fan
On 4/8/06, Kris Kennaway <[EMAIL PROTECTED]> wrote: > On Sat, Apr 08, 2006 at 01:28:55AM -0400, Rong-En Fan wrote: > > On 3/6/06, Jun Kuriyama <[EMAIL PROTECTED]> wrote: > > > > > > I'm not yet received enough information to track rpc.lockd problem. > > > > > > As Kris posted before, here is a patc

Re: rpc.lockd brokenness (2)

2006-04-07 Thread Kris Kennaway
On Sat, Apr 08, 2006 at 01:28:55AM -0400, Rong-En Fan wrote: > On 3/6/06, Jun Kuriyama <[EMAIL PROTECTED]> wrote: > > > > I'm not yet received enough information to track rpc.lockd problem. > > > > As Kris posted before, here is a patch to backout my suspected > > commit. If someone can easily rep

Re: rpc.lockd brokenness (2)

2006-04-07 Thread Rong-En Fan
On 3/6/06, Jun Kuriyama <[EMAIL PROTECTED]> wrote: > > I'm not yet received enough information to track rpc.lockd problem. > > As Kris posted before, here is a patch to backout my suspected > commit. If someone can easily reproduce this problem, please try with > this patch on both of server/clien

Re: rpc.lockd brokenness (2)

2006-03-13 Thread Kris Kennaway
On Mon, Mar 13, 2006 at 05:15:59PM +, Miguel Lopes Santos Ramos wrote: > > I'm not yet sure whether this is a regression in 6.x or another case > > that was broken forever. > > I didn't have problems in 5. I just compiled a 6.0-RELEASE kernel, and it > is also broken. I have verified (using

Re: rpc.lockd brokenness (2)

2006-03-13 Thread Miguel Lopes Santos Ramos
> I did some further testing and it turns out that rpc.lockd is broken > in some cases when operating over NFSv2 (this is the default for nfs > root mounts). > > Tracing the lock traffic I see the client making a request, the server > replying but the client never acting on the reply (or never rece

Re: rpc.lockd brokenness (2)

2006-03-10 Thread Kris Kennaway
I did some further testing and it turns out that rpc.lockd is broken in some cases when operating over NFSv2 (this is the default for nfs root mounts). Tracing the lock traffic I see the client making a request, the server replying but the client never acting on the reply (or never receiving it),

Re: rpc.lockd brokenness (2)

2006-03-08 Thread Kris Kennaway
On Thu, Mar 09, 2006 at 03:53:19AM +, Miguel Lopes Santos Ramos wrote: > > Can you try to narrow down this problem some more? e.g. look up the > > port used by rpc.lockd with rpcinfo on client and server and tcpdump > > to see what locking requests are being passed back and forth (you > > shou

Re: rpc.lockd brokenness (2)

2006-03-08 Thread Miguel Lopes Santos Ramos
> Can you try to narrow down this problem some more? e.g. look up the > port used by rpc.lockd with rpcinfo on client and server and tcpdump > to see what locking requests are being passed back and forth (you > should see the request from client -> server and the reply granting > the lock; or not

Re: rpc.lockd brokenness (2)

2006-03-08 Thread Kris Kennaway
On Thu, Mar 09, 2006 at 03:12:24AM +, Miguel Lopes Santos Ramos wrote: > > From: Kris Kennaway <[EMAIL PROTECTED]> > > Subject: Re: rpc.lockd brokenness (2) > > > > Yeah, the file is still locked on the server, and will never be > > unlocked unless you s

Re: rpc.lockd brokenness (2)

2006-03-08 Thread Miguel Lopes Santos Ramos
> From: Kris Kennaway <[EMAIL PROTECTED]> > Subject: Re: rpc.lockd brokenness (2) > > Yeah, the file is still locked on the server, and will never be > unlocked unless you stop and restart the rpc.lockd on the server > (which releases all the locks it holds). I did th

Re: rpc.lockd brokenness (2)

2006-03-08 Thread Kris Kennaway
On Thu, Mar 09, 2006 at 02:14:59AM +, Miguel Lopes Santos Ramos wrote: > > From: Kris Kennaway <[EMAIL PROTECTED]> > > > > The bug is triggered because the file is locked in the parent > > (i.e. the daemon process, which creates the pidfile) but unlocked by > > the child after the fork (in this

Re: rpc.lockd brokenness (2)

2006-03-08 Thread Miguel Lopes Santos Ramos
> From: Kris Kennaway <[EMAIL PROTECTED]> > > The bug is triggered because the file is locked in the parent > (i.e. the daemon process, which creates the pidfile) but unlocked by > the child after the fork (in this case, when the child is killed). On > the server, rpc.lockd compares the svid (=
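[Editor's illustration] The parent-locks, child-unlocks pattern in the quoted message can be reproduced on a local filesystem with a rough sketch like the one below. It shows only local flock() semantics, where the lock follows the open file description across fork() and is dropped when the last holder exits. It does not exercise NFS or rpc.lockd, and it is not the patch discussed in this thread. The helper names (try_lock, lock_survives_parent_close) and the temporary file are made up for the example.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return True if an exclusive flock() on path can be taken right now."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)  # closing this descriptor also drops a lock taken above


def lock_survives_parent_close():
    """Lock in the parent, fork, then let only the child keep the descriptor."""
    fd, path = tempfile.mkstemp()          # stand-in for a pidfile (hypothetical)
    fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)  # parent takes the lock

    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: inherits the locked descriptor; waits until told to exit.
        os.close(w)
        os.read(r, 1)
        os._exit(0)                        # exit closes the inherited descriptor

    os.close(r)
    os.close(fd)                           # parent gives up its copy, as a
                                           # daemonizing parent does on exit
    held_by_child = not try_lock(path)     # lock is still held via the child
    os.close(w)                            # EOF on the pipe lets the child exit
    os.waitpid(pid, 0)
    released_after_exit = try_lock(path)   # lock is gone once the child is gone
    os.unlink(path)
    return held_by_child, released_after_exit


if __name__ == "__main__":
    print(lock_survives_parent_close())
```

On a local filesystem the release is handled by the kernel when the child's descriptor goes away. The thread is about the case where that release has to travel over NFS through rpc.lockd instead.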

Re: rpc.lockd brokenness (2)

2006-03-08 Thread Miguel Lopes Santos Ramos
> From: Kris Kennaway <[EMAIL PROTECTED]> > Subject: Re: rpc.lockd brokenness (2) > [...] > OK, I misunderstood. The rc.d script will signal cron to kill it, > which should be closing the file descriptors and causing rpc.lockd to > release the lock. Perhaps this part

Re: rpc.lockd brokenness (2)

2006-03-08 Thread Kris Kennaway
On Wed, Mar 08, 2006 at 07:57:22PM -0500, Kris Kennaway wrote: > On Thu, Mar 09, 2006 at 12:26:44AM +, Miguel Lopes Santos Ramos wrote: > > > From: Kris Kennaway <[EMAIL PROTECTED]> > > > Subject: Re: rpc.lockd brokenness (2) > > > > > > This is int

Re: rpc.lockd brokenness (2)

2006-03-08 Thread Kris Kennaway
On Thu, Mar 09, 2006 at 12:26:44AM +, Miguel Lopes Santos Ramos wrote: > > From: Kris Kennaway <[EMAIL PROTECTED]> > > Subject: Re: rpc.lockd brokenness (2) > > > > This is intentional. It's how pidfile_*() tests whether the process > > is still r

Re: rpc.lockd brokenness (2)

2006-03-08 Thread Miguel Lopes Santos Ramos
> From: Kris Kennaway <[EMAIL PROTECTED]> > Subject: Re: rpc.lockd brokenness (2) > > This is intentional. It's how pidfile_*() tests whether the process > is still running. The intention is that if someone tries to open the > pidfile again while the first proce

Re: rpc.lockd brokenness (2)

2006-03-08 Thread Kris Kennaway
On Wed, Mar 08, 2006 at 02:01:24PM +, Miguel Lopes Santos Ramos wrote: > > I wonder if something else is going wrong and it's not rpc.lockd at > > all. > > Oh, it's a locking problem alright. But perhaps not in rpc.lockd... OK, I think I understand what is going on now...sort of. > > It loo

Re: rpc.lockd brokenness (2)

2006-03-08 Thread Miguel Lopes Santos Ramos
> From: Kris Kennaway <[EMAIL PROTECTED]> > Subject: Re: rpc.lockd brokenness (2) > > I wonder if something else is going wrong and it's not rpc.lockd at > all. Oh, it's a locking problem alright. But perhaps not in rpc.lockd... > It looks like this wasn

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Kris Kennaway
On Wed, Mar 08, 2006 at 12:30:02AM +, Miguel Lopes Santos Ramos wrote: > > From: Kris Kennaway <[EMAIL PROTECTED]> > > Subject: Re: rpc.lockd brokenness (2) > > > [...] > > but there's no evidence in the trace that it ever tries to write. Can > >

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Miguel Lopes Santos Ramos
> From: Kris Kennaway <[EMAIL PROTECTED]> > Subject: Re: rpc.lockd brokenness (2) > [...] > but there's no evidence in the trace that it ever tries to write. Can > you also obtain a ktrace -i dump from cron? The file remains empty. I really don't know enough ab

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Kris Kennaway
On Tue, Mar 07, 2006 at 05:43:37PM -0500, Kris Kennaway wrote: > but there's no evidence in the trace that it ever tries to write. Can > you also obtain a ktrace -i dump from cron? > > Kris Also while you're there, could you obtain a binary format tcpdump (tcpdump -w) instead? This may be pars

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Kris Kennaway
On Tue, Mar 07, 2006 at 10:04:46PM +, Miguel Lopes Santos Ramos wrote: > > From: Kris Kennaway <[EMAIL PROTECTED]> > > Subject: Re: rpc.lockd brokenness (2) > > > > > Ok. There are two versions: > > > http://mega.ist.utl.pt/~mlsr/nfs.dump > &

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Miguel Lopes Santos Ramos
> From: Kris Kennaway <[EMAIL PROTECTED]> > Subject: Re: rpc.lockd brokenness (2) > > > Ok. There are two versions: > > http://mega.ist.utl.pt/~mlsr/nfs.dump > > is the output of tcpdump -vvv host targa and udp port nfs > > http://mega.ist.ut

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Kris Kennaway
On Tue, Mar 07, 2006 at 07:38:48PM +, Miguel Lopes Santos Ramos wrote: > > Can you put it at a URL somewhere? If not, send it privately. > > > > Kris > > Ok. There are two versions: > http://mega.ist.utl.pt/~mlsr/nfs.dump > is the output of tcpdump -vvv host targa and udp port nfs

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Miguel Lopes Santos Ramos
> Can you put it at a URL somewhere? If not, send it privately. > > Kris Ok. There are two versions: http://mega.ist.utl.pt/~mlsr/nfs.dump is the output of tcpdump -vvv host targa and udp port nfs http://mega.ist.utl.pt/~mlsr/nfsx.dump is the output of tcpdump -X -vvv host

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Kris Kennaway
On Tue, Mar 07, 2006 at 06:58:45PM +, Miguel Lopes Santos Ramos wrote: > > OK, thanks. Please try to obtain a tcpdump -vvv trace of the broken > > operation. > > > > Kris > > I have it. I had just to disable cron startup, which was the only daemon > that used pidfile_open and started after st

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Miguel Lopes Santos Ramos
> OK, thanks. Please try to obtain a tcpdump -vvv trace of the broken > operation. > > Kris I have it. I had just to disable cron startup, which was the only daemon that used pidfile_open and started after statd/lockd, and then start it by hand. Now, perhaps it is better to post it off-list, sin

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Kris Kennaway
On Tue, Mar 07, 2006 at 06:17:00PM +, Miguel Lopes Santos Ramos wrote: > > OK, thanks. Please try to obtain a tcpdump -vvv trace of the broken > > operation. > > > > Kris > > Will you help me? I've been trying to use tcpdump -vvv on this but I get > either a lot of trash or nothing. Which por

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Miguel Lopes Santos Ramos
> OK, thanks. Please try to obtain a tcpdump -vvv trace of the broken > operation. > > Kris Will you help me? I've been trying to use tcpdump -vvv on this but I get either a lot of trash or nothing. Which ports should I dump? So far I tried running on the server # tcpdump -vvv 'host and (udp po

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Kris Kennaway
On Tue, Mar 07, 2006 at 05:56:22PM +, Miguel Lopes Santos Ramos wrote: > > > 1- Only one client machine of all I have, the only one which is remote > > > booted, hangs on startup with rpc.lockd/rpc.statd enabled. > > > > Just to verify: lockd is enabled on BOTH client AND server? > > > > Kris

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Miguel Lopes Santos Ramos
> > 1- Only one client machine of all I have, the only one which is remote > > booted, hangs on startup with rpc.lockd/rpc.statd enabled. > > Just to verify: lockd is enabled on BOTH client AND server? > > Kris Oh yes. If any of the daemons is not enabled on both machines there's no locking, ther

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Kris Kennaway
On Tue, Mar 07, 2006 at 05:38:56PM +, Miguel Ramos wrote: > > The problem I'm having is not exactly what is described in pr bin/80389, > although it is very much related. So, I'm unable to verify if the patch > you gave corrects the problem, since the problem is not visible on my > systems. >

Re: rpc.lockd brokenness (2)

2006-03-07 Thread Miguel Ramos
The problem I'm having is not exactly what is described in pr bin/80389, although it is very much related. So, I'm unable to verify if the patch you gave corrects the problem, since the problem is not visible on my systems. I do have a lockd hang, however... Here is as much data as I can gather:

rpc.lockd brokenness (2)

2006-03-06 Thread Jun Kuriyama
I have not yet received enough information to track down the rpc.lockd problem. As Kris posted before, here is a patch to back out my suspected commit. If someone can easily reproduce this problem, please try this patch on both the server and client sides of rpc.lockd (I'm not sure which of server/client side