Why not switch where the searchers look rather than copy the index and
restart? That is, your searcher is pointing at index1, and you build the new
one in a new dir (index2). On some signal, your server closes the searcher
pointing to index1 and opens one pointing to index2 and uses that until
tomorrow, when you do the opposite.

You could even warm up the searcher after you open it but before you start
searching with it if you wanted.
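
A rough sketch of that switch, in case it helps. Everything named here is
hypothetical: SearcherSwitcher is not a Lucene class, S stands in for whatever
searcher type you use (an IndexSearcher, say), and the opener, warmer and
closer callbacks are assumed to be supplied by your own code.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Function;

/**
 * Hypothetical holder for the "live" searcher. S stands in for your searcher
 * type; the three callbacks are assumptions supplied by the caller, not
 * Lucene APIs.
 */
final class SearcherSwitcher<S> {
    private final AtomicReference<S> current = new AtomicReference<>();
    private final Function<String, S> opener; // opens a searcher on an index directory
    private final Consumer<S> warmer;         // runs a few typical queries against it
    private final Consumer<S> closer;         // releases a searcher that is no longer live

    SearcherSwitcher(Function<String, S> opener, Consumer<S> warmer, Consumer<S> closer) {
        this.opener = opener;
        this.warmer = warmer;
        this.closer = closer;
    }

    /** Searcher that request threads should use right now (null before the first switch). */
    S current() {
        return current.get();
    }

    /** Open and warm a searcher on indexDir, publish it, then close the previous one. */
    void switchTo(String indexDir) {
        S fresh = opener.apply(indexDir);
        warmer.accept(fresh);              // warm up before any user query sees it
        S old = current.getAndSet(fresh);  // new queries now go to the new index
        if (old != null) {
            closer.accept(old);            // the old index directory is free to be rebuilt
        }
    }
}
```

Your nightly signal would call switchTo("index2") one day and
switchTo("index1") the next. Note that this sketch closes the old searcher
right away; a real server would probably want to wait until queries already
running against it have finished.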

Or, if you are using Linux, say, your index directory could be a symlink and
your process would be
1> build/test the new index
2> shut down the server
3> switch the symlink to point at the new index directory
4> start the server.
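
If you would rather do step 3 from Java than from the shell, something like
the following might work. It is only a sketch: IndexSymlink and repoint are
made-up names, and it assumes a filesystem where renaming over an existing
symlink replaces it (the usual behaviour on Linux, but the Java docs call it
implementation specific).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for step 3: repoint the index symlink. */
final class IndexSymlink {
    private IndexSymlink() {}

    /** Make the symlink at link point to newIndexDir. */
    static void repoint(Path link, Path newIndexDir) {
        Path tmp = link.resolveSibling(link.getFileName() + ".tmp");
        try {
            Files.deleteIfExists(tmp);                  // clear a leftover from a failed run
            Files.createSymbolicLink(tmp, newIndexDir); // build the new link under a temp name
            // Rename the temp link over the old one; assumed to replace it on Linux.
            Files.move(tmp, link, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Steps 2 and 4 (stopping and starting the server) are still up to whatever
scripts you already use.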

You'd still have a small interruption for your users, but we're probably
talking 2 seconds plus however long it takes you to stop/start your
server.....

Erick


On 12/20/06, Scott Sellman <[EMAIL PROTECTED]> wrote:

Note: I have changed the title of this thread to match its content

I am currently facing a similar issue.  I am dealing with a large index
that is constantly used and needs to be updated on a daily basis.  For
fear of corruption I would rather rebuild the index each time,
performing tests against it before using it.  However the problem I am
having is switching out the old index without causing service
interruption.  As long as queries are being made against the index I am
running into locking issues with the index files, preventing me from
putting the new index in place. Any suggestions?

Thanks,
Scott
-----Original Message-----
From: Erick Erickson [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 20, 2006 7:59 AM
To: java-user@lucene.apache.org
Subject: Re: MultiFieldQueryParser doesn't properly filter out documents
when the query string specifies to exclude certain terms

My first question is how many documents would you be deleting on a pass for
option 2? If it's 10 documents out of 10,000, I'd consider just deleting
them and re-adding (see IndexModifier).

Personally, if possible, I prefer your first option, building a completely
new index and switching between them. This is especially useful if something
catastrophic happens to the index as you build it and it winds up being
unusable (power failures *do* happen). You can keep using your old index and
be happy.

Another question is how quickly the index builds and how soon do your users
require that they get up-to-date data?

And remember that no matter what, you must re-open your searcher to see the
updates.

I'd be really reluctant to remove all the items and re-build the index for
several reasons...
1> You wouldn't get the new data being added until you closed/reopened your
searcher.
2> The documents you deleted wouldn't be "gone" until you closed/reopened
your searcher.
3> In the interim, your users wouldn't have access to much of anything....

Best
Erick

On 12/20/06, Adam Fleming <[EMAIL PROTECTED]> wrote:
>
>
> Hello Gentlemen (+Ladies?),
>
> I'm integrating Lucene into a Spring web-app, and have found a plethora of
> great web + print resources to make the integration quick and seamless.  One
> thing that I have been hard-pressed to find is a good solution for
> rebuilding the index on a regular basis.
>
> I'm curious if you know of a best practice (or have found something
> personally that works) for rebuilding a Lucene index w/o service
> interruptions.  The assumptions are a Spring IoC container w/ an
> IndexFactory bean.  I have the project configured to work with both
> FSDirectory and RAMDirectory implementations.   If you don't know Spring,
> you are free to ignore the details - I'll adapt your comments to my code :)
>
> So far I tried rebuilding the index on a regular schedule, but foolishly
> only added duplicate documents to an existing index.
>
> Things I have considered are
> - Using two index directories, and rebuilding one while the other is
>    in use + switching when the rebuilt index is ready.  This would
>    cause the app to alternate between two indexes.
> - Using a single index, and iterating over the index entirely,
>    deleting documents 1 by 1 and re-adding them with fresh data
> - Using a single index, and deleting ALL the documents at once
>    and then adding them all back as quickly as possible.
>
>
> All of my proposed ideas seem to fly in the face of Lucene's simplicity, and
> I will be so thankful to be pointed in the right direction.
>
>
> Happy Holidays and  a big Thank You to the active list users,
>
>
> Adam Fleming
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

