Lucene 2.2, NFS, Lock obtain timed out

Patrick Kimber Fri, 29 Jun 2007 02:01:31 -0700

Hi,

We are sharing a Lucene index in a Linux cluster over an NFS share.  We have
multiple servers reading and writing to the index.

I am getting regular lock exceptions e.g.
Lock obtain timed out:
NativeFSLock@/mnt/nfstest/repository/lucene/lock/lucene-2d3d31fa7f19eabb73d692df44087d81-n-write.lock

- We are using Lucene 2.2.0
- We are using kernel NFS and lockd is running.
- We are using a modified version of the ExpirationTimeDeletionPolicy
found in the
Lucene test suite:
http://svn.apache.org/repos/asf/lucene/java/trunk/src/test/org/apache/lucene/index/TestDeletionPolicy.java
I have set the expiration time to 600 seconds (10 minutes).
- We are using the NativeFSLockFactory with the lock folder being
within the index
folder:
/mnt/nfstest/repository/lucene/lock/
- I have implemented a handler which will pause and retry an update or delete
operation if a LockObtainFailedException or StaleReaderException is
caught. The
handler will retry the update or delete once every second for 1 minute before
re-throwing the exception and aborting.

The issue appears to be caused by a lock file which is not deleted.
The handlers
keep retrying... the process holding the lock eventually aborts...
this deletes the
lock file - any applications still running then continue normally.

The application does not throw these exceptions when it is run on a
standard Linux
file system or Windows workstation.

I would really appreciate some help with this issue. The chances are I am doing
something stupid... but I cannot think what to try next.

Thanks for your help

Patrick

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Lucene 2.2, NFS, Lock obtain timed out

Reply via email to