Re: PhraseQuery Performance Issues [Lucene 2.9.0]

2010-03-29 Thread Rafael Turk
unsubscribe

Re: PhraseQuery Performance Issues [Lucene 2.9.0]

2010-03-22 Thread Daniel Shane
java-user@lucene.apache.org Sent: Friday, March 19, 2010 7:14:06 PM Subject: Re: PhraseQuery Performance Issues [Lucene 2.9.0] Nutch/Solr's CommonGrams is the right way to solve this. It combines frequent terms (e.g. stopwords) with adjacent terms. So "the wizard of oz" will be indexed e

Re: PhraseQuery Performance Issues [Lucene 2.9.0]

2010-03-19 Thread Michael McCandless
Nutch/Solr's CommonGrams is the right way to solve this. It combines frequent terms (e.g. stopwords) with adjacent terms. So "the wizard of oz" will be indexed e.g. as the_wizard wizard_of of_oz. It'll require a full re-index though, and you have to fix up searching so that the same term expansion wo
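
To make the term combination concrete, here is a minimal sketch in plain Java. It is not the Nutch or Solr CommonGrams code and does not use any Lucene API; the class name CommonGramsSketch, the method expand, and the "_" separator are made up for illustration. It assumes the input is already tokenized and that a list of common words is supplied by the caller. Any adjacent pair in which either term is common is joined into one token, and a term is kept on its own only when it is not covered by a joined pair.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class CommonGramsSketch {

    // Hypothetical helper for illustration only; not part of Lucene, Solr, or Nutch.
    // terms:  the phrase, already split into tokens
    // common: the frequent terms (e.g. stopwords) that should be glued to neighbours
    static List<String> expand(List<String> terms, Set<String> common) {
        List<String> out = new ArrayList<>();
        boolean prevJoined = false; // was the previous term joined with the current one?
        for (int i = 0; i < terms.size(); i++) {
            String cur = terms.get(i);
            // Join with the next term if either side of the pair is a common term.
            boolean joinsNext = i + 1 < terms.size()
                    && (common.contains(cur) || common.contains(terms.get(i + 1)));
            if (joinsNext) {
                out.add(cur + "_" + terms.get(i + 1));
            } else if (!prevJoined) {
                // Not covered by any joined pair, so keep the plain term.
                out.add(cur);
            }
            prevJoined = joinsNext;
        }
        return out;
    }
}
```

Calling expand on the tokens the, wizard, of, oz with the and of as common terms yields the_wizard, wizard_of, of_oz, matching the example above. The point of the joined tokens is that each one is far rarer than a bare stopword, so a phrase match has much shorter postings to walk.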

PhraseQuery Performance Issues [Lucene 2.9.0]

2010-03-19 Thread Daniel Shane
I'm running a medium-size web search with an index just shy of 9GB with 80 docs in it. We are using Lucene version 2.9.0 (we have not yet checked whether this applies to older versions as well). Looking at my logs, I'm finding that phrase queries take especially long to run. In