Hi,

One of our FreeBSD servers acts as the NFS server for $HOME for approx. 50 HP-UX workstations, since the workstations themselves and the disks in them have become quite old.

That works quite well with FreeBSD 6.3-RELEASE-pxx but no longer works with 6.4/7.1.

I looked at the problem with 'wireshark' and it seems to be a locking problem, probably related to PR 'kern/130628', but I'm not sure.

Here is what I know so far:

Server-OS:      FreeBSD 6.4-RELEASE/7.1-RELEASE (same problems)
Workstation-OS: HP-UX 11iv1 (11.11)
NFS-Version:    V3/tcp or V3/udp (NFS-V2 works!)

I found no records of the problem on the client side (HP-UX) whereas on FreeBSD 'rpc.lockd -d 3'
produces the following entries in /var/log/messages:

Jan 21 12:07:33 bsd1dw kernel: NLM: new host hp13 (sysid 5)
Jan 21 12:07:33 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:07:53 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:08:13 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:08:32 bsd1dw kernel: nlm_do_lock(): caller_name = hp13 (sysid = 5)
Jan 21 12:08:33 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:08:43 bsd1dw kernel: nlm_do_lock(): caller_name = hp13 (sysid = 5)
Jan 21 12:08:53 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:09:03 bsd1dw kernel: nlm_do_lock(): caller_name = hp13 (sysid = 5)
Jan 21 12:09:13 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:09:13 bsd1dw kernel: nlm_do_lock(): caller_name = hp13 (sysid = 5)
Jan 21 12:09:23 bsd1dw kernel: nlm_do_lock(): caller_name = hp13 (sysid = 5)
Jan 21 12:09:33 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
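
For anyone who wants to summarise a longer log of this kind, here is a minimal sketch in Python that tallies the nlm_do_*() entries per client. The function name tally_nlm_calls and the regular expression are my own illustration, written only against the message format shown above; other rpc.lockd debug levels may log differently.

```python
import re
from collections import Counter

# Matches entries of the form shown above, e.g.
#   ... kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
# The pattern is an assumption based only on those sample lines.
NLM_ENTRY = re.compile(
    r"kernel: (nlm_do_\w+)\(\): caller_name = (\S+) \(sysid = (\d+)\)"
)


def tally_nlm_calls(text):
    """Count NLM handler calls per (caller_name, handler) pair."""
    counts = Counter()
    for handler, caller, _sysid in NLM_ENTRY.findall(text):
        counts[(caller, handler)] += 1
    return counts


# A few entries copied from the log excerpt above.
sample = """\
Jan 21 12:07:33 bsd1dw kernel: NLM: new host hp13 (sysid 5)
Jan 21 12:07:33 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:07:53 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:08:32 bsd1dw kernel: nlm_do_lock(): caller_name = hp13 (sysid = 5)
"""

if __name__ == "__main__":
    for (caller, handler), n in sorted(tally_nlm_calls(sample).items()):
        print(caller, handler, n)
```

To run it against the real log, read /var/log/messages into a string and pass that to tally_nlm_calls() instead of the sample.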


What happens is as follows:

When logging in to an account whose home directory is on the NFS server, the shell reads '.profile' and then tries to get a lock on '.sh_history'. From a FreeBSD 6.3 server the shell gets the lock, whereas a 6.4/7.1 server replies with "V4 LOCK_RES Call NLM_FAILED".

Of course the HP-UX shell assumes the file is already locked, waits some time and tries again. This cycle repeats and leaves the account completely blocked... :-( It does not happen if command-line history is disabled, but it is an error nonetheless.
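
To check the locking without going through a login shell, a small probe along these lines might help. It is only a sketch: it assumes Python is available on a client that has the export mounted, and it assumes an fcntl()-style advisory lock is comparable to what the HP-UX shell requests on '.sh_history' (I have not verified which call the shell actually uses). The name try_lock is my own.

```python
import errno
import fcntl
import os


def try_lock(path):
    """Try a non-blocking exclusive lock on path.

    Returns (True, None) if the lock was granted and released again,
    otherwise (False, <errno name>), e.g. 'EAGAIN' or 'ENOLCK'.
    """
    # Create the file if needed; lockf() with LOCK_EX needs a writable fd.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            return False, errno.errorcode.get(exc.errno, str(exc.errno))
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True, None
    finally:
        os.close(fd)
```

Calling try_lock() on a scratch file below an NFS-mounted home directory (any path of your choosing) and comparing the result between a 6.3 and a 6.4/7.1 server should show whether the lock request itself is refused.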


I have recorded the network traffic for an NFSv2 session, an NFSv3/tcp session with a 6.3 server and an NFSv3/tcp session with a 7-STABLE server. If the wireshark dumps are of interest beyond what I have described here, they are available on request.

I hope this information helps those who are able to fix it...

Matthias

--
Ciao/BSD - Matthias

Matthias Schuendehuette    <msch [at] snafu.de>, Berlin (Germany)


