On Tue, Apr 20, 2010 at 07:27:57AM -0400, Michael McCandless wrote:

> There are elements of IW that still must be "centralized" -- managing
> the merge policy/schedulers, deletion policy, writing/committing the
> segments files, managing ongoing addIndexes, tracking pending
> deletions, the reader pool, etc.

I've got a prototype BackgroundMerger working for KS/Lucy which can run
concurrently with an Indexer. It drops a "merge.lock" file which blocks
Indexer from merging any segments that existed at the moment of the
lockfile's creation. When it's done merging, it acquires the write.lock,
carries forward any deletions that Indexer has written against the
segments it's merging away, then commits and releases both locks.

Based on the success of this prototype, I believe merging policy is a
solvable problem in principle: use mutexes to lay claim to the mergeable
segments while doing the heavy lifting, and use the write lock to
coordinate deletions and committing.

However, the management of individual deletions seems like a daunting
problem every time I consider how to expand this model out to multiple
indexing processes operating against the same index, when those
processes must be allowed to create new deletions. I think there are
insoluble race conditions until you get into document-level locking. I
imagine NRT readers make this problem even harder.

Marvin Humphrey
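
For illustration only, the lock sequence described above might look roughly like the following Python sketch. This is not the KS/Lucy code. `ToyIndex`, `acquire_lock`, `background_merge` and `demo` are hypothetical names, the index is an in-memory stand-in with doc ids assumed unique across segments, and the concurrent Indexer is simulated by a callback rather than a second process.

```python
import os
import tempfile


def acquire_lock(path):
    """Atomically create a lockfile; return False if someone else holds it."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


class ToyIndex:
    """Hypothetical stand-in for an index: doc ids are unique across segments."""

    def __init__(self, lock_dir):
        self.lock_dir = lock_dir
        self.segments = {}    # segment name -> list of doc ids
        self.deletions = {}   # segment name -> set of deleted doc ids

    def lock_path(self, name):
        return os.path.join(self.lock_dir, name)


def background_merge(index, merged_name, concurrent_indexer=None):
    merge_lock = index.lock_path("merge.lock")
    write_lock = index.lock_path("write.lock")

    # 1. Drop merge.lock and claim every segment that exists right now.
    if not acquire_lock(merge_lock):
        return False
    try:
        claimed = sorted(index.segments)
        seen = {seg: set(index.deletions.get(seg, ())) for seg in claimed}

        # 2. Heavy lifting, done without holding write.lock.
        merged = [doc for seg in claimed
                  for doc in index.segments[seg] if doc not in seen[seg]]

        # Simulate an Indexer session that runs while the merge is under way.
        if concurrent_indexer is not None:
            concurrent_indexer(index)

        # 3. Take write.lock only for the short finishing step.
        if not acquire_lock(write_lock):
            return False
        try:
            # 4. Carry forward deletions written since the merge started.
            late = set()
            for seg in claimed:
                late |= index.deletions.get(seg, set()) - seen[seg]

            # 5. Commit: swap the claimed segments for the merged one.
            for seg in claimed:
                del index.segments[seg]
                index.deletions.pop(seg, None)
            index.segments[merged_name] = merged
            index.deletions[merged_name] = late
            return True
        finally:
            os.remove(write_lock)
    finally:
        os.remove(merge_lock)


def demo():
    """Run one merge while a simulated Indexer adds a segment and a deletion."""
    with tempfile.TemporaryDirectory() as lock_dir:
        index = ToyIndex(lock_dir)
        index.segments = {"seg_1": [1, 2, 3], "seg_2": [4, 5]}
        index.deletions = {"seg_1": {2}}

        def indexer(ix):
            ix.segments["seg_3"] = [6]                      # new, unclaimed
            ix.deletions.setdefault("seg_2", set()).add(4)  # hits a claimed segment

        ok = background_merge(index, "seg_4", indexer)
        leftover = os.listdir(lock_dir)
    return ok, index, leftover
```

The sketch covers only the single-merger, single-Indexer case. It does nothing for the harder problem of several indexing processes creating deletions against the same index, which is where the race conditions mentioned above come in.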
