Thank you . That does look like what I want JS Eyal <[EMAIL PROTECTED]> wrote: Run a search on "Lucene ParallelReader" in google - You'll find something Doug Cutting wrote that I believe is what you're looking for.
Eyal > -----Original Message----- > From: John Smith [mailto:[EMAIL PROTECTED] > Sent: Thursday, August 11, 2005 21:12 PM > To: java-user@lucene.apache.org > Subject: Updating existing documents in index: Solutions > > > Hi all > > > > This is a slightly long email. Pardon me. > > > > As Lucene does not allow for updating an existing document in > the index, the only option is to delete and reindex the > message.When you have too many updates, this gets a little > cumbersome. In our case, as such the actual content of the > document being indexed does > > not change, but the fields around the content, like say > "LastReadby" or something like Folder associated with it etc > change. These are all fields that have been indexed as a part > of the original document in the index. > > > > I have been contemplating putting these "commonly changing > fields" into one index and allow for delete and reindex on > this index alone and keep the static data in another index. > DocumentID will be a stored field and will be stored in both > the static and dynamic index, as a way of identifying the document. > > > > Static index: Contains content of document indexed and > documentID stored. > > Dynamic index: Contains all fields about the document which > change frequently indexed and documentID stored. > > > > > > Questions > > > > 1. First of all, is there a better solution to this > frequently changing fields having to be reindexed ? > > > > 2. Let's say I go with the 2 index approach, > > > > Example query: Content: "Hello world" AND Folder:Folder1 AND > LastReadBy: jane. If we execute these queries on our static > and dynamic indexes, they will obviously fail to get hits. > > > > > > Let's say I have a way of splitting my queries such that > all content queries go to static (content) index only and > queries on other fields go to the dynamic index, basically > allow for queries to come in such a way that it is always a > AND between the dynamic index result set and static index > result set. So on the results set, I would have to retrieve > the document ID and make sure we have the same documentID in > both the result sets, in order for it to be a match. > > In cases where the result sets are really huge from > both the queries, then even to get the number of hits, I will > have to retrieve each and every document from the results, in > order to get the documentID for comparison. Queries can get > really slow. > > > > Has anyone faced similar problems, If so what was your solution? > > Any comments/thoughts will be appreciated. > > > > Thank you > > JS > > > > > --------------------------------- > Start your day with Yahoo! - make it your home page --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com