Hi,

I've created an Analyzer that performs a few filtering tasks, including creating shingles and replacing terms.

I use that Analyzer with IndexWriter and it works as expected. But when I use the same Analyzer with QueryParser (org.apache.lucene.queryparser.classic.QueryParser), it behaves differently: it does not create shingles. Below is the output for a simple four-term phrase, "word1 word2 word3 word4".

From IndexWriter (shingle terms created as expected -- total of 7 terms):

term: word1     term=word1,bytes=[77 6f 72 64 31],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
term: word2     term=word2,bytes=[77 6f 72 64 32],startOffset=6,endOffset=11,punct=0,positionIncrement=1,position=2,type=word,keyword=false
term: word1 word2       term=word1 word2,bytes=[77 6f 72 64 31 20 77 6f 72 64 32],startOffset=6,endOffset=11,punct=0,positionIncrement=0,position=2,type=SHINGLE,keyword=false
term: word3     term=word3,bytes=[77 6f 72 64 33],startOffset=12,endOffset=17,punct=0,positionIncrement=1,position=3,type=word,keyword=false
term: word2 word3       term=word2 word3,bytes=[77 6f 72 64 32 20 77 6f 72 64 33],startOffset=12,endOffset=17,punct=0,positionIncrement=0,position=3,type=SHINGLE,keyword=false
term: word4     term=word4,bytes=[77 6f 72 64 34],startOffset=18,endOffset=23,punct=0,positionIncrement=1,position=4,type=word,keyword=false
term: word3 word4       term=word3 word4,bytes=[77 6f 72 64 33 20 77 6f 72 64 34],startOffset=18,endOffset=23,punct=0,positionIncrement=0,position=4,type=SHINGLE,keyword=false


From QueryParser (shingle terms not created -- only 4 terms):

term: word1     term=word1,bytes=[77 6f 72 64 31],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
term: word2     term=word2,bytes=[77 6f 72 64 32],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
term: word3     term=word3,bytes=[77 6f 72 64 33],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
term: word4     term=word4,bytes=[77 6f 72 64 34],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
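
One thing I notice in the QueryParser output: every term reports startOffset=0, endOffset=5 and position=1, which looks as if each word were analyzed on its own rather than as part of the whole phrase. I have not confirmed that this is what happens, but the sketch below shows why it would produce exactly this difference. It is plain Java, not Lucene code: analyze() is a hypothetical stand-in for my shingling Analyzer, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ShingleSketch {

    // Hypothetical stand-in for a shingling analyzer (NOT a Lucene API):
    // emits every word and, from the second word on, the two-word shingle
    // that ends at that word.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        String[] words = text.trim().split("\\s+");
        for (int i = 0; i < words.length; i++) {
            terms.add(words[i]);
            if (i > 0) {
                terms.add(words[i - 1] + " " + words[i]);
            }
        }
        return terms;
    }

    // The whole phrase goes through the analyzer in one pass,
    // so neighbouring words are visible and shingles can be built.
    static List<String> analyzeWholeText(String text) {
        return analyze(text);
    }

    // The text is split on whitespace first and each word is analyzed
    // separately, so the analyzer never sees two words together.
    static List<String> analyzePerWord(String text) {
        List<String> terms = new ArrayList<>();
        for (String chunk : text.trim().split("\\s+")) {
            terms.addAll(analyze(chunk));
        }
        return terms;
    }

    public static void main(String[] args) {
        String phrase = "word1 word2 word3 word4";
        System.out.println(analyzeWholeText(phrase)); // 7 terms, shingles included
        System.out.println(analyzePerWord(phrase));   // 4 terms, no shingles
    }
}
```

If that is the cause, the Analyzer itself would be behaving identically in both cases and the difference would come from how the text reaches it.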


Can anyone tell me what I'm doing wrong?

Thank you,

Igal





