Lucene suitable? 6 mio records / day 1k

2008-12-19 Thread Christian Brennsteiner
Hi *, I am searching for a full-text index capable of meeting the following requirements: index 3,000,000 new records every day, each with a validity of N days (e.g. 90 days expiration), which is about 34.7 records/s. One record is e.g. a URL and can be up to 2 k big: http://example.com/somedir/some.html Lucene should use "/" as
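The sizing figures in this post can be checked with a few lines of arithmetic. The following is only a rough sketch: the 90-day retention is the example value from the post, and reading "2 k" as 2048 bytes is an assumption. The function names are hypothetical, and the result is a worst-case raw data volume, not an estimate of the resulting index size.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def ingest_rate(records_per_day: int) -> float:
    """Average number of records that must be indexed per second."""
    return records_per_day / SECONDS_PER_DAY


def live_records(records_per_day: int, retention_days: int) -> int:
    """Records held at steady state, once old ones expire as fast as new ones arrive."""
    return records_per_day * retention_days


def worst_case_raw_bytes(records_per_day: int, retention_days: int,
                         max_record_bytes: int) -> int:
    """Upper bound on raw record data if every record has the maximum size."""
    return live_records(records_per_day, retention_days) * max_record_bytes


if __name__ == "__main__":
    per_day = 3_000_000   # from the post
    retention = 90        # example expiration from the post
    record_size = 2048    # assumption: "2 k" read as 2048 bytes

    print(f"ingest rate:   {ingest_rate(per_day):.1f} records/s")
    print(f"live records:  {live_records(per_day, retention):,}")
    print(f"raw data (max): {worst_case_raw_bytes(per_day, retention, record_size):,} bytes")
```

Under these assumptions the steady-state corpus is 270 million records, which gives a sense of why the follow-up discusses spreading the data over multiple indices or machines.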

Re: Lucene suitable? 6 mio records / day 1k

2008-12-21 Thread Christian Brennsteiner
> e index unless you really have lots of RAM or you don't need
> queries to be quick. In other words, you may have to spread this over
> multiple indices/machines.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

stream of events, never knowing when it ends? how to index such things & search

2009-02-18 Thread Christian Brennsteiner
document inside the index (since this consumes so much space)? E.g. extracting the keywords that were stored for the item? Any hints appreciated. Regards, Chris -- Christian Brennsteiner, Salzburg / Austria / Europe

Re: stream of events, never knowing when it ends? how to index such things & search

2009-02-19 Thread Christian Brennsteiner
earch
>> "XYZ ZAB MILESTONE1"
>> I am getting EVENTID 3 three times
>> --> this is bad: when I get 100 such events, how do I rank them?
>>
>> CONCLUSION:
>> my biggest problem is that my Lucene document given to the index
>> curren

Re: stream of events, never knowing when it ends? how to index such things & search

2009-02-19 Thread Christian Brennsteiner
> could use this mechanism to ensure that. Simply choose
> an IncrementGap greater than the maximum number of terms in
> an event description, then when you want to search in the
> description field, just use a proximity less than the IncrementGap.
> It may not apply at all for you, but
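The increment-gap idea quoted above can be illustrated with a small stand-alone model. This is a simplified sketch and not Lucene's API or its exact slop semantics: `assign_positions` and `within_proximity` are hypothetical helpers, the event descriptions are made up, and "proximity" here simply means the absolute difference between two term positions.

```python
from typing import List, Tuple

Posting = Tuple[int, str]  # (position, term)


def assign_positions(values: List[str], gap: int) -> List[Posting]:
    """Assign token positions to a multi-valued field.

    Tokens inside one value get consecutive positions; `gap` extra
    positions are skipped between one value and the next.
    """
    postings: List[Posting] = []
    pos = 0
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # leave a hole between two event descriptions
        for term in value.lower().split():
            postings.append((pos, term))
            pos += 1
    return postings


def within_proximity(postings: List[Posting], term_a: str, term_b: str,
                     proximity: int) -> bool:
    """True if some occurrence of term_a lies within `proximity` positions of term_b."""
    pos_a = [p for p, t in postings if t == term_a]
    pos_b = [p for p, t in postings if t == term_b]
    return any(abs(a - b) <= proximity for a in pos_a for b in pos_b)


if __name__ == "__main__":
    # Hypothetical event descriptions stored in one document.
    events = ["XYZ started", "ZAB MILESTONE1 reached"]

    with_gap = assign_positions(events, gap=100)
    no_gap = assign_positions(events, gap=0)

    # Terms from the same description can match with a proximity below the gap.
    print(within_proximity(with_gap, "zab", "milestone1", 10))
    # Terms from different descriptions cannot match across the gap.
    print(within_proximity(with_gap, "xyz", "zab", 10))
    # Without a gap, the same query leaks across descriptions.
    print(within_proximity(no_gap, "xyz", "zab", 10))
```

The point of the model is the one made in the quoted advice: as long as the gap exceeds the longest description and the query proximity stays below the gap, a proximity match can only come from terms inside a single event description.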