date:20081221

Re: Re: Re: lucene suiteable ? 6 mio recods / day 1k

2008-12-21 Thread tom

AUTOMATIC REPLY LUX is closed until 5th January 2009 - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: Re: lucene suiteable ? 6 mio recods / day 1k

2008-12-21 Thread tom

AUTOMATIC REPLY LUX is closed until 5th January 2009

Re: lucene suiteable ? 6 mio recods / day 1k

2008-12-21 Thread Christian Brennsteiner

hi otis, i think that out of 2 k 80 % can be stemmed and many of the words are duplicates so they would not need full space. can you give me an idea what in your opinion would mean "don't need queries to be quick" ... i have no idea in what timeframe it could be handled if it is not completely i

Re: lucene suiteable ? 6 mio recods / day 1k

2008-12-21 Thread Otis Gospodnetic

Christian, You can certainly purge old documents on a daily basis in order to keep the corpus from growing, but note that 3M*90=270M 2K docs may be a bit too much for a single index unless you really have lots of RAM or you don't need queries to be quick. In other words, you may have to spread
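The arithmetic in the message above (3M documents per day, kept for 90 days, about 2K each) can be written out as a small sketch. The 2,000-byte document size and the shard count of 4 are illustrative assumptions, not figures from the thread, and the byte total is raw text size, not the size of the resulting Lucene index.

```java
public class CorpusSizeEstimate {

    // Documents retained when old ones are purged after retentionDays.
    static long totalDocs(long docsPerDay, int retentionDays) {
        return docsPerDay * retentionDays;
    }

    // Raw (unindexed) text volume for the retained documents.
    static long rawBytes(long docs, long bytesPerDoc) {
        return docs * bytesPerDoc;
    }

    // Documents per index if the corpus is spread evenly over several indexes
    // (rounded up so no document is left out).
    static long docsPerShard(long docs, int shards) {
        return (docs + shards - 1) / shards;
    }

    public static void main(String[] args) {
        long docs = totalDocs(3_000_000L, 90);
        System.out.println(docs);                    // prints 270000000
        System.out.println(rawBytes(docs, 2_000L));  // prints 540000000000 (assumes 2K = 2,000 bytes)
        System.out.println(docsPerShard(docs, 4));   // prints 67500000 (4 shards is a hypothetical choice)
    }
}
```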

Re: Url Analyzer

2008-12-21 Thread Otis Gospodnetic

Mark, This is simple enough that it should be easy to put together. If you search the ML archives you'll see that one of the common "tricks" is to "flip" host name parts (e.g. com.sematext.www). The details of this have been discussed before, so have a look. Otis -- Sematext -- http://semat
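A minimal sketch of the host-name "flip" mentioned above, in plain Java. The class and method names are hypothetical, and how the flipped value is then fed into an analyzer or indexed is not shown in this snippet. A likely benefit, stated here as an assumption, is that hosts from the same domain then share a common prefix.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class HostFlipper {

    // Reverses the dot-separated parts of a host name,
    // e.g. "www.sematext.com" becomes "com.sematext.www".
    static String flipHost(String host) {
        List<String> parts = Arrays.asList(host.split("\\."));
        Collections.reverse(parts);
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        System.out.println(flipHost("www.sematext.com")); // prints com.sematext.www
    }
}
```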

Re: BooleanQuery Performance Help

2008-12-21 Thread Prafulla Kiran

Hi, Here's the code which I am using to time the query:

    long startTime = System.currentTimeMillis();
    TopDocCollector collector = new TopDocCollector(10);
    is.search(query, collector);
    ScoreDoc[] hits = collector.topDocs().scoreDocs;
    long endTime = System.currentTimeMillis();

Most of the clauses w
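A single System.currentTimeMillis() measurement like the one above can be noisy, since the first run of a query often includes JVM warm-up and cold caches. The following is a generic sketch, not taken from the thread, that runs a task several times and reports the median; the class name, method name, and run counts are hypothetical. In the poster's setup the Runnable would presumably wrap the is.search(query, collector) call.

```java
import java.util.Arrays;

public class QueryTimer {

    // Runs the task warmupRuns times untimed, then timedRuns times timed,
    // and returns the median elapsed time in nanoseconds.
    static long medianNanos(Runnable task, int warmupRuns, int timedRuns) {
        if (timedRuns <= 0) {
            throw new IllegalArgumentException("timedRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long[] samples = new long[timedRuns];
        for (int i = 0; i < timedRuns; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[timedRuns / 2];
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for the actual search call.
        long median = medianNanos(() -> Math.sqrt(12345.0), 5, 11);
        System.out.println("median ns: " + median);
    }
}
```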

Re: Default and optimal use of RAMDirectory

2008-12-21 Thread Otis Gospodnetic

Let me add to that that I clearly recall having a hard time getting the tests for that particular section of LIA1 to clearly and consistently show that using the RAMDirectory buffering approach instead of vanilla IndexWriter yields faster indexing. Even back then IndexWriter buffered indexed da

Re: Re: Re: Inquiry on Lucene Stemming

2008-12-21 Thread tom

AUTOMATIC REPLY LUX is closed until 5th January 2009

Re: Re: Inquiry on Lucene Stemming

2008-12-21 Thread tom

AUTOMATIC REPLY LUX is closed until 5th January 2009

Re: Inquiry on Lucene Stemming

2008-12-21 Thread Otis Gospodnetic

If Hoss is referring to synonym expansion, allow me to point out that freely downloadable code from Lucene in Action (first edition) has code for that, if you'd like to have a look, OP. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Chri

10 matches
