The index contains several tens of thousands of documents, each with about fifty fields. It is rebuilt roughly once a day, though this varies, since the searchable content doesn't change very often. Now I face the challenge of working more dynamic data into the index and making it searchable, or rather, using it to sort the documents in the search result. This data changes very frequently: at least once a minute, and possibly every few seconds.

I know that a truly real-time index is impossible and that there will always be some lag. As long as the lag is not too large, that is OK. I've tried doing an incremental update every X seconds or minutes. This might work for small indexes with little data to index per document, but I don't think it is the right approach here. It also seems wrong because every update means collecting all the static, searchable data for a document again, even though only the dynamic information has changed.

I'm sorry, what exactly do you mean by "reopen the IndexSearcher every N minutes"?

You say that I could also store the data in an external data store. Do you mean the dynamic data? If so, I still somehow need this information within the index in order to sort by it, right? Or am I overlooking something?
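
For illustration, the external-store idea could be sketched roughly as follows, in plain Java with no Lucene calls. Everything here is an assumption made up for the example: the class `DynamicSortSketch`, the in-memory map standing in for the external store, and the method names. It also assumes the set of matched document ids is small enough to re-sort in memory after the search, which may not hold for large result sets.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch: keep fast-changing values outside the index and sort hits by them at query time. */
public class DynamicSortSketch {

    // Hypothetical external store: product id -> units currently in stock.
    // In practice this could be an RDBMS or BDB lookup instead of a map.
    private final Map<String, Integer> availability = new ConcurrentHashMap<>();

    /** Called whenever the transactional data changes; the index is not touched. */
    public void updateAvailability(String productId, int unitsInStock) {
        availability.put(productId, unitsInStock);
    }

    /**
     * Returns a copy of the matched ids sorted by availability, highest first.
     * Ids without an entry in the store are treated as having 0 units.
     * The sort is stable, so ties keep the order the search returned them in.
     */
    public List<String> sortByAvailability(List<String> matchedIds) {
        List<String> sorted = new ArrayList<>(matchedIds);
        sorted.sort(Comparator.comparingInt(
                (String id) -> availability.getOrDefault(id, 0)).reversed());
        return sorted;
    }
}
```

With this layout the index would only need rebuilding when the static, searchable fields change; the open question is whether sorting outside the index is acceptable for the result sizes involved.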

Thanks,
Tobias

Otis Gospodnetic wrote:
Tobias,

The question is a little too open, I think.  Perhaps start by saying what 
you've tried, what doesn't work, what you think won't work, the actual rate of 
change, the size of your index and, very importantly, how quickly you need to 
see index changes (adds, deletes, updates).

How about this for the bootstrap question: just update (delete+add) the whole 
Document and reopen the IndexSearcher every N minutes.  Would that work for 
you?
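
One way to read "reopen every N minutes" is sketched below in plain Java, with no Lucene calls. The class `PeriodicReopener` and the `opener` supplier are hypothetical names for this example; `opener` stands in for whatever code opens a fresh searcher over the updated index. Closing the old searcher is left to the caller.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Sketch: swap in a freshly opened searcher on a fixed schedule. */
public class PeriodicReopener<S> {

    private final Supplier<S> opener;          // hypothetical: opens a searcher over the current index
    private final AtomicReference<S> current;  // searcher that queries should use right now
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

    public PeriodicReopener(Supplier<S> opener) {
        this.opener = opener;
        this.current = new AtomicReference<>(opener.get());
    }

    /** Queries call this; they only read the current reference. */
    public S searcher() {
        return current.get();
    }

    /** Opens a new searcher, publishes it, and returns the previous one so the caller can close it. */
    public S reopenNow() {
        return current.getAndSet(opener.get());
    }

    /** Reopens every {@code period} units until {@link #stop()} is called. */
    public void start(long period, TimeUnit unit) {
        timer.scheduleAtFixedRate(this::reopenNow, period, period, unit);
    }

    public void stop() {
        timer.shutdownNow();
    }
}
```

Changes written to the index between two reopens stay invisible to searches until the next swap, so N bounds how stale the results can be.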

Does only the stored data change or also searchable data?  If the former, you 
could choose to store that in the external data store (e.g. RDBMS, BDB...)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Tobias Lohr <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, January 15, 2008 11:33:56 AM
Subject: Integrating dynamic data into Lucene search/ranking

I have a more architectural question, which is maybe somewhat off topic,
 but since I want to implement it using Java and Lucene, this seems like the
 right forum anyway:

I'm thinking of an approach to design a system that integrates dynamic
 information into a search (and a ranking) functionality using Lucene.
 By dynamic data I mean data that changes very often within the
 typical index rebuild cycle, i.e. transactional data.

A good example is the sorting of products in an online store by product
 availability.

Does anybody know good reading resources (approaches, whitepapers,
 books etc.) for integrating such dynamic information into a search/ranking
 functionality?

(I already searched on Google, but couldn't find anything useful.)

Thanks in advance!


