Thanks, Erik.

I tried putting the query in "double quotes". It made some difference, but the result is still not exactly what I'm looking for.

So what's my best solution? Should I avoid the QueryParser and "parse" the query myself, or is there a different (better) query parser for this situation?


Igal


On 1/13/2013 5:42 AM, Erik Hatcher wrote:
QueryParser invokes the analyzer once per "clause", so in your example it is 
invoked 4 times and each invocation sees only one word/term.

     Erik
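
A minimal sketch of the behavior Erik describes, in plain Java with no Lucene dependency. The `analyze` method is a hypothetical stand-in for the custom Analyzer (whitespace split plus two-word shingles), and `analyzePerClause` is a toy model of a parser that splits an unquoted query on whitespace and analyzes each piece separately. Neither is a Lucene API; the names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ShingleClauseDemo {

    // Hypothetical stand-in for the custom Analyzer: splits on whitespace and
    // emits each word followed by the two-word shingle that ends at that word.
    public static List<String> analyze(String text) {
        String[] words = text.trim().split("\\s+");
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            terms.add(words[i]);
            if (i > 0) {
                terms.add(words[i - 1] + " " + words[i]);
            }
        }
        return terms;
    }

    // Toy model of per-clause analysis: the query is split into clauses first,
    // so each analyze() call sees a single word and has nothing to shingle with.
    public static List<String> analyzePerClause(String query) {
        List<String> terms = new ArrayList<>();
        for (String clause : query.trim().split("\\s+")) {
            terms.addAll(analyze(clause));
        }
        return terms;
    }

    public static void main(String[] args) {
        String text = "word1 word2 word3 word4";
        // Whole text in one pass, as at index time:
        // [word1, word2, word1 word2, word3, word2 word3, word4, word3 word4]
        System.out.println(analyze(text));
        // One pass per clause, as with the unquoted query:
        // [word1, word2, word3, word4]
        System.out.println(analyzePerClause(text));
    }
}
```

The first call yields 7 terms and the second only 4, which mirrors the two dumps in the original message. It may also explain why quoting changes the result: a quoted phrase is handed to the analyzer as a single chunk, so the analyzer sees adjacent words and can form shingles from them.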

On Jan 13, 2013, at 2:13, "Igal @ getRailo.org" <i...@getrailo.org> wrote:

Hi,

I've created an Analyzer that performs a few filtering tasks, including 
creating shingles and term replacements, among other things.

I use that Analyzer with IndexWriter and it works as expected.  But when I use that same 
Analyzer with QueryParser (org.apache.lucene.queryparser.classic.QueryParser) it behaves 
differently.  Specifically, it does not create shingles.  See below the output for a 
simple phrase of 4 terms:  "word1 word2 word3 word4"

from IndexWriter (shingle terms created as expected -- total of 7 terms):

term: word1    term=word1,bytes=[77 6f 72 64 31],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
term: word2    term=word2,bytes=[77 6f 72 64 32],startOffset=6,endOffset=11,punct=0,positionIncrement=1,position=2,type=word,keyword=false
term: word1 word2    term=word1 word2,bytes=[77 6f 72 64 31 20 77 6f 72 64 32],startOffset=6,endOffset=11,punct=0,positionIncrement=0,position=2,type=SHINGLE,keyword=false
term: word3    term=word3,bytes=[77 6f 72 64 33],startOffset=12,endOffset=17,punct=0,positionIncrement=1,position=3,type=word,keyword=false
term: word2 word3    term=word2 word3,bytes=[77 6f 72 64 32 20 77 6f 72 64 33],startOffset=12,endOffset=17,punct=0,positionIncrement=0,position=3,type=SHINGLE,keyword=false
term: word4    term=word4,bytes=[77 6f 72 64 34],startOffset=18,endOffset=23,punct=0,positionIncrement=1,position=4,type=word,keyword=false
term: word3 word4    term=word3 word4,bytes=[77 6f 72 64 33 20 77 6f 72 64 34],startOffset=18,endOffset=23,punct=0,positionIncrement=0,position=4,type=SHINGLE,keyword=false


from QueryParser (shingle terms not created -- only 4 terms):

term: word1    term=word1,bytes=[77 6f 72 64 31],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
term: word2    term=word2,bytes=[77 6f 72 64 32],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
term: word3    term=word3,bytes=[77 6f 72 64 33],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
term: word4    term=word4,bytes=[77 6f 72 64 34],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false


Can anyone tell me what I'm doing wrong?

Thank you,

Igal

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
