hi *,
i am searching for a fulltext index capable of meeting the following requirements:
index 3 000 000 new records every day, each valid for N days (e.g.
90 days expiration)
== about 34.7 records / s
one record is e.g. a url and can be up to 2 k big
http://example.com/somedir/some.html
lucene should use "/" as
e index unless you really have lots of RAM or you don't need
> queries to be quick. In other words, you may have to spread this over
> multiple indices/machines.
>
>
> Otis --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Origin
document inside the index (since this
consumes so much space)? e.g. extracting the keywords that were stored
for the item?
any hints appreciated.
regards chris
--
Christian Brennsteiner
Salzburg / Austria / Europe
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
earch
>> "XYZ ZAB MILESTONE1"
>> i am getting 3 times EVENTID 3
>> --> this is bad since when i get 100 of such events how do i rank them?
>>
>> CONCLUSION:
>> my biggest problem is that my lucene document given to the index
>> curren
could use this mechanism to ensure that. Simply choose
> an IncrementGap greater than the maximum number of terms in
> an event description, then when you want to search in the
> description field, just use a proximity less than the IncrementGap.
> It may not apply at all for you, but
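
To make the increment-gap suggestion quoted above concrete, here is a minimal sketch in plain Java. It is not Lucene code: it only simulates how term positions are assigned when several values are added to the same field, and how a proximity limit smaller than the gap stops a match from spanning two values. The class name, the method names, and the two sample event descriptions are made up for illustration. In Lucene itself the gap is, as far as I know, supplied by overriding Analyzer.getPositionIncrementGap(String fieldName).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: simulates position assignment
// for a multi-valued field with a configurable position increment gap.
class IncrementGapSketch {

    // Gives every term a position. The first term of each later value is
    // placed (gap + 1) positions after the last term of the previous value.
    static Map<String, List<Integer>> index(List<String> values, int gap) {
        Map<String, List<Integer>> positions = new LinkedHashMap<>();
        int pos = -1;
        boolean first = true;
        for (String value : values) {
            if (!first) {
                pos += gap; // the extra distance between two values
            }
            first = false;
            for (String term : value.trim().split("\\s+")) {
                if (term.isEmpty()) {
                    continue; // ignore empty values
                }
                pos++;
                positions.computeIfAbsent(term, k -> new ArrayList<>()).add(pos);
            }
        }
        return positions;
    }

    // True if some occurrence of a and some occurrence of b are at most
    // maxDistance positions apart (a simplified stand-in for a proximity query).
    static boolean near(Map<String, List<Integer>> positions,
                        String a, String b, int maxDistance) {
        List<Integer> pa = positions.getOrDefault(a, Collections.emptyList());
        List<Integer> pb = positions.getOrDefault(b, Collections.emptyList());
        for (int x : pa) {
            for (int y : pb) {
                if (Math.abs(x - y) <= maxDistance) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Two made-up event descriptions stored in the same field.
        List<String> descriptions = Arrays.asList("XYZ started", "ZAB MILESTONE1");
        int proximity = 5; // must stay below the gap used at index time

        // Without a gap the match leaks across the two descriptions.
        System.out.println(near(index(descriptions, 0), "XYZ", "MILESTONE1", proximity));   // true
        // With a gap of 100 the two descriptions are kept apart.
        System.out.println(near(index(descriptions, 100), "XYZ", "MILESTONE1", proximity)); // false
        // Terms inside one description still match.
        System.out.println(near(index(descriptions, 100), "ZAB", "MILESTONE1", proximity)); // true
    }
}
```

The gap of 100 is arbitrary. Following the advice above, it only has to be larger than the longest event description, and the proximity used at query time has to stay below it.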