My first question is: how many documents would you be deleting per pass with
option 2? If it's 10 documents out of 10,000, I'd consider just deleting
them and re-adding them (see IndexModifier).

Personally, if possible, I prefer your first option, building a completely
new index and switching between them. This is especially useful if something
catastrophic happens to the index as you build it and it winds up being
unusable (power failures *do* happen). You can keep using your old index and
be happy.
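
For what it's worth, here is a rough sketch of the "build new, then switch"
idea in plain Java. IndexSwitcher is a made-up helper, not anything in
Lucene, and T stands in for whatever handle your app searches through (for
example an IndexSearcher opened on the freshly built directory). The point
is only that the live handle is replaced after the rebuild succeeds, so a
failed build leaves the old index in service.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical helper (not part of Lucene): holds the "live" index handle
 * and replaces it only after a rebuild has completed successfully.
 */
class IndexSwitcher<T> {
    private final AtomicReference<T> live;

    IndexSwitcher(T initial) {
        this.live = new AtomicReference<>(initial);
    }

    /** The handle that searches should use right now. */
    T current() {
        return live.get();
    }

    /**
     * Runs the builder and swaps in its result only if it returns normally.
     * Returns the previous handle so the caller can close or delete it,
     * or an empty Optional if the build failed and nothing was swapped.
     */
    Optional<T> rebuild(Callable<T> builder) {
        final T fresh;
        try {
            fresh = builder.call();
        } catch (Exception e) {
            // Build failed (I/O error, etc.): keep serving the old index.
            return Optional.empty();
        }
        if (fresh == null) {
            // Treat a null result as a failed build as well.
            return Optional.empty();
        }
        return Optional.of(live.getAndSet(fresh));
    }
}
```

In a Spring setup, something like this could sit behind your IndexFactory
bean, with a scheduled job calling rebuild(). Searches already running
against the old handle would still need to finish before you close it, so
treat the returned value as "retire when idle" rather than "close now".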

Another question is how quickly the index builds, and how soon your users
need to see up-to-date data.

And remember that no matter what, you must re-open your searcher to see the
updates.

I'd be really reluctant to remove all the items and re-build the index for
several reasons...
1> You wouldn't get the new data being added until you closed/reopened your
searcher.
2> The documents you deleted wouldn't be "gone" until you closed/reopened
your searcher.
3> In the interim, your users wouldn't have access to much of anything....

Best
Erick

On 12/20/06, Adam Fleming <[EMAIL PROTECTED]> wrote:


Hello Gentlemen (+Ladies?),

I'm integrating Lucene into a Spring web-app, and have found a plethora of
great web + print resources to make the integration quick and seamless.  One
thing that I have been hard-pressed to find is a good solution for
rebuilding the index on a regular basis.

I'm curious if you know of a best practice (or have found something
personally that works) for rebuilding a Lucene Index w/o service
interruptions.  The assumptions are a spring IOC container w/ an
IndexFactory bean.  I have the project configured to work with both
FSDirectory and RamDirectory implementations.   If you don't know Spring,
you are free to ignore the details - I'll adapt your comments to my code :)

So far I tried rebuilding the index on a regular schedule, but foolishly
only added duplicate documents to an existing index.

Things I have considered are
- Using two index directories, and rebuilding one while the other is
   in use + switching when the rebuilt index is ready.  This would
   cause the app to alternate between two indexes.
- Using a single index, and iterating over the index entirely,
   deleting documents 1 by 1 and re-adding them with fresh data
- Using a single index, and deleting ALL the documents at once
   and then adding them all back as quickly as possible.


All of my proposed ideas seem to fly in the face of Lucene's simplicity, and
I will be so thankful to be pointed in the right direction.


Happy Holidays and  a big Thank You to the active list users,


Adam Fleming
