The index contains several tens of thousands of documents, each with
about fifty fields. It is rebuilt roughly once a day, though the interval
varies, since the searchable content doesn't change very often.
Now I face the challenge of working more dynamic data into the index
and making it searchable, or rather using it to sort the documents in
the search result. This data changes very frequently: at least once a
minute, and sometimes every few seconds.
I know that a truly real-time index is impossible and that there will
be a time lag. As long as the lag is not too large, that is OK. I've
tried doing an incremental update every X seconds or minutes. That might
work for small indexes with little data per document, but I don't think
it is the right approach here. It also means collecting all the static,
searchable data of a document again, even though only the dynamic
information has changed.
I'm sorry, what exactly do you mean by "reopen the IndexSearcher every
N minutes"?
You say that I could also store the data in an external data store. Do
you mean the dynamic data? If so, I still need this information in the
index somehow in order to sort by it, right? Or am I overlooking
something here?
Thanks,
Tobias
Otis Gospodnetic wrote:
Tobias,
The question is a little too open, I think. Perhaps start by saying what
you've tried, what doesn't work, what you think won't work, the actual rate of
change, the size of your index and, very importantly, how quickly you need to
see index changes (adds, deletes, updates).
How about this for the bootstrap question: just update (delete+add) the whole
Document and reopen the IndexSearcher every N minutes. Would that work for
you?
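A minimal sketch of that reopen cycle, in plain Java so it stands on its own.
ReopeningHolder is a hypothetical helper, not a Lucene class: the type
parameter S stands in for IndexSearcher, and the opener is assumed to be
whatever code opens a fresh searcher on the updated index.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical holder, not part of Lucene. S stands in for IndexSearcher.
class ReopeningHolder<S> {
    private final Supplier<S> opener;          // assumed: opens a fresh searcher
    private final AtomicReference<S> current;  // instance used by new searches

    ReopeningHolder(Supplier<S> opener) {
        this.opener = opener;
        this.current = new AtomicReference<>(opener.get());
    }

    // Searches always pick up the most recently opened instance.
    S get() {
        return current.get();
    }

    // Opens a new instance and swaps it in. Returns the previous one so the
    // caller can close it once in-flight searches have finished.
    S reopen() {
        return current.getAndSet(opener.get());
    }

    // Calls reopen() every `period` units on a background thread.
    // Note: this scheduled variant drops the previous instance without
    // closing it; real code would have to take care of that.
    ScheduledExecutorService reopenEvery(long period, TimeUnit unit) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(this::reopen, period, period, unit);
        return ses;
    }
}
```

With N minutes as the period, index changes become visible to searches at
most about N minutes after they were written, which is the lag you would be
accepting.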
Does only the stored data change, or the searchable data as well? If the
former, you could choose to store that in an external data store (e.g. RDBMS,
BDB...).
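A rough sketch of how sorting could still work with the data kept outside the
index: run the search as usual, then re-order the hits by the externally
stored value. Everything here is an assumption for illustration: ExternalSort
and STOCK are hypothetical names, an in-memory map stands in for the external
store, and each hit is assumed to carry a stored id field.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Lucene: re-orders search hits by a value
// that lives outside the index.
class ExternalSort {

    // Stand-in for the external store (RDBMS, BDB, ...): product id -> units
    // in stock, updated by whatever process tracks the transactional data.
    static final Map<String, Integer> STOCK = new ConcurrentHashMap<>();

    // hitIds: the stored id of each hit, in the order the search returned them.
    // Returns a new list, highest stock first. Ids missing from the store
    // count as 0. The sort is stable, so hits with equal stock keep their
    // original relevance order.
    static List<String> sortByAvailability(List<String> hitIds) {
        List<String> sorted = new ArrayList<>(hitIds);
        sorted.sort(Comparator
                .comparingInt((String id) -> STOCK.getOrDefault(id, 0))
                .reversed());
        return sorted;
    }

    public static void main(String[] args) {
        STOCK.put("p1", 0);
        STOCK.put("p2", 12);
        STOCK.put("p3", 12);
        // prints [p2, p3, p1, p4]
        System.out.println(sortByAvailability(List.of("p1", "p2", "p3", "p4")));
    }
}
```

This only re-orders the hits that were collected, so sorting the complete
result set means collecting the ids of all matching documents first.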
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message ----
From: Tobias Lohr <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, January 15, 2008 11:33:56 AM
Subject: Integrating dynamic data into Lucene search/ranking
I have a more architectural question, which may be somewhat off topic,
but since I want to implement it using Java and Lucene, this seems to be
the right forum:
I'm thinking of an approach to design a system that integrates dynamic
information into a search (and a ranking) functionality using Lucene.
By dynamic data I mean data that changes very often within the
typical index rebuild cycle, i.e. transactional data.
A good example is the sorting of products in an online store by product
availability.
Does anybody know of good reading resources (approaches, whitepapers,
books etc.) for integrating such dynamic information into a search/ranking
functionality?
(I already searched Google, but couldn't find anything useful.)
Thanks in advance!