Hi,

We are sharing a Lucene index in a Linux cluster over an NFS share.  We have
multiple servers reading and writing to the index.

I am getting regular lock exceptions e.g.
Lock obtain timed out:
NativeFSLock@/mnt/nfstest/repository/lucene/lock/lucene-2d3d31fa7f19eabb73d692df44087d81-n-write.lock

- We are using Lucene 2.2.0
- We are using kernel NFS and lockd is running.
- We are using a modified version of the ExpirationTimeDeletionPolicy
found in the
 Lucene test suite:
http://svn.apache.org/repos/asf/lucene/java/trunk/src/test/org/apache/lucene/index/TestDeletionPolicy.java
 I have set the expiration time to 600 seconds (10 minutes).
- We are using the NativeFSLockFactory with the lock folder being
within the index
 folder:
 /mnt/nfstest/repository/lucene/lock/
- I have implemented a handler which will pause and retry an update or delete
 operation if a LockObtainFailedException or StaleReaderException is
caught.  The
 handler will retry the update or delete once every second for 1 minute before
 re-throwing the exception and aborting.

The issue appears to be caused by a lock file which is not deleted.
The handlers
keep retrying... the process holding the lock eventually aborts...
this deletes the
lock file - any applications still running then continue normally.

The application does not throw these exceptions when it is run on a
standard Linux
file system or Windows workstation.

I would really appreciate some help with this issue.  The chances are I am doing
something stupid... but I cannot think what to try next.

Thanks for your help

Patrick

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to