QueryParser invokes the analyzer once per "clause", so in your example it is invoked four times, and each invocation sees only one word/term. With a single token in each stream, your shingle filter has nothing to combine.
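To make the difference concrete, here is a rough sketch in plain Java. It does not use Lucene at all: `ShingleDemo`, `analyze`, `analyzeWhole` and `analyzePerClause` are hypothetical stand-ins I made up for illustration, and `analyze` only imitates a whitespace tokenizer followed by two-word shingles. It shows why analyzing the whole text in one call yields 7 terms while analyzing each whitespace-separated clause on its own yields 4.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a toy model of the behavior, not Lucene's implementation.
class ShingleDemo {

    // Hypothetical stand-in for an analyzer chain: split on whitespace,
    // emit each word, and emit a 2-word shingle ending at that word.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return terms;
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i < words.length; i++) {
            terms.add(words[i]);
            if (i > 0) {
                // A shingle needs the previous token from the SAME stream.
                terms.add(words[i - 1] + " " + words[i]);
            }
        }
        return terms;
    }

    // Index-time model: the whole field value goes through one analyzer call.
    static List<String> analyzeWhole(String text) {
        return analyze(text);
    }

    // Query-time model (assumption: clauses are split on whitespace first):
    // each clause gets its own analyzer call, so no shingles can form.
    static List<String> analyzePerClause(String query) {
        List<String> terms = new ArrayList<>();
        for (String clause : query.trim().split("\\s+")) {
            terms.addAll(analyze(clause));
        }
        return terms;
    }

    public static void main(String[] args) {
        String text = "word1 word2 word3 word4";
        System.out.println(analyzeWhole(text));     // 7 terms, shingles included
        System.out.println(analyzePerClause(text)); // 4 terms, no shingles
    }
}
```

As far as I know, wrapping the terms in double quotes inside the query string makes the classic QueryParser treat them as a single phrase clause, so the analyzer would see all four words in one call. That is worth verifying against your Lucene version.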
Erik

On Jan 13, 2013, at 2:13, "Igal @ getRailo.org" <i...@getrailo.org> wrote:

> hi,
>
> I've created an Analyzer that performs a few filtering tasks, including
> creating Shingles and term Replacements among other things.
>
> I use that Analyzer with IndexWriter and it works as expected. but when I
> use that same Analyzer with QueryParser
> (org.apache.lucene.queryparser.classic.QueryParser) it behaves differently.
> specifically it does not create shingles. see below the output for a simple
> phrase of 4 terms: "word1 word2 word3 word4"
>
> from IndexWriter (shingle terms created as expected -- total of 7 terms):
>
> term: word1 term=word1,bytes=[77 6f 72 64 31],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
> term: word2 term=word2,bytes=[77 6f 72 64 32],startOffset=6,endOffset=11,punct=0,positionIncrement=1,position=2,type=word,keyword=false
> term: word1 word2 term=word1 word2,bytes=[77 6f 72 64 31 20 77 6f 72 64 32],startOffset=6,endOffset=11,punct=0,positionIncrement=0,position=2,type=SHINGLE,keyword=false
> term: word3 term=word3,bytes=[77 6f 72 64 33],startOffset=12,endOffset=17,punct=0,positionIncrement=1,position=3,type=word,keyword=false
> term: word2 word3 term=word2 word3,bytes=[77 6f 72 64 32 20 77 6f 72 64 33],startOffset=12,endOffset=17,punct=0,positionIncrement=0,position=3,type=SHINGLE,keyword=false
> term: word4 term=word4,bytes=[77 6f 72 64 34],startOffset=18,endOffset=23,punct=0,positionIncrement=1,position=4,type=word,keyword=false
> term: word3 word4 term=word3 word4,bytes=[77 6f 72 64 33 20 77 6f 72 64 34],startOffset=18,endOffset=23,punct=0,positionIncrement=0,position=4,type=SHINGLE,keyword=false
>
> from QueryParser (shingle terms not created -- only 4 terms):
>
> term: word1 term=word1,bytes=[77 6f 72 64 31],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
> term: word2 term=word2,bytes=[77 6f 72 64 32],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
> term: word3 term=word3,bytes=[77 6f 72 64 33],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
> term: word4 term=word4,bytes=[77 6f 72 64 34],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
>
> can anyone tell me what I'm doing wrong?
>
> thank you,
>
> Igal
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org