Hello,

For some time I have been trying to use ShingleFilter. I have a string:

"The users get program in the User RPC API in Apache Rave"
and I would like to get:

[the users get] [users get program] [get program in] [program in the] [in the user] [the user rpc] [user rpc api] [rpc api in] [api in apache] [in apache rave] [apache rave 0.11]

However, I'm getting:

[the users get] [users] [users get program] [get] [get program in] [program] [program in the] [in the user] [the user rpc] [user] [user rpc api] [rpc] [rpc api in] [api] [api in apache] [in apache rave] [apache] [apache rave 0.11] [rave]

Part of my code:

    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        StandardTokenizer source = new StandardTokenizer(Version.LUCENE_43, reader);
        TokenStream tokenStream = new StandardFilter(Version.LUCENE_43, source);
        tokenStream = new LowerCaseFilter(Version.LUCENE_43, tokenStream);
        // min and max shingle size are both 3
        tokenStream = new ShingleFilter(tokenStream, 3, 3);
        // stop words are removed after shingling
        tokenStream = new StopFilter(Version.LUCENE_43, tokenStream, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
        return new TokenStreamComponents(source, tokenStream);
    }

Could somebody please explain why I'm getting single-word shingles when I set the minimum size to 3?

Thanks,
-- gosia
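
P.S. To make the difference concrete, here is a minimal plain-Java sketch of the two behaviours. It is not Lucene code: the class ShingleSketch and its shingles method are hypothetical names made up for this illustration, and it ignores the StopFilter step. The second call produces the same kind of mix I am seeing (single words interleaved with 3-word shingles), which makes me suspect that unigrams are being emitted alongside the shingles. ShingleFilter does have a setOutputUnigrams(boolean) setter, so that may be the setting to check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShingleSketch {

    // Hypothetical helper: builds word shingles of exactly `size` tokens.
    // When outputUnigrams is true, each single token is emitted as well,
    // just before the shingle that starts at the same position.
    public static List<String> shingles(List<String> tokens, int size, boolean outputUnigrams) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (outputUnigrams) {
                out.add(tokens.get(i));
            }
            // Only emit a shingle if enough tokens remain to fill it.
            if (i + size <= tokens.size()) {
                out.add(String.join(" ", tokens.subList(i, i + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "users", "get", "program");
        // What I would like: only 3-word shingles.
        System.out.println(shingles(tokens, 3, false));
        // What I seem to get: single words mixed in with the shingles.
        System.out.println(shingles(tokens, 3, true));
    }
}
```

In the real analyzer the StopFilter then drops single stop words such as "the" and "in", which would explain why only some of the single words show up in my output.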