Hi, I'm trying to migrate to Lucene 4.
In Lucene 3.5 I extended org.apache.lucene.analysis.FilteringTokenFilter and overrode accept() to remove undesired shingles. In Lucene 4, org.apache.lucene.analysis.FilteringTokenFilter no longer seems to exist. Has it been moved or replaced?
I'm trying to achieve two things:
1) Remove shingles that have an empty item.
2) Remove shingles that span a comma in the original phrase.

For example, for the phrase "delicious red apples, green pears, and oranges" I want the following shingles (with a shingle size of 2):
"delicious red", "red apples", "green pears", "and oranges"
(no "apples green" and no "pears and", because a comma separates those words)

Any ideas? TIA
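
P.S. To make the expected output for the comma rule concrete, here is a rough plain-Java sketch. It uses no Lucene classes at all; the class and method names (ExpectedShingles, shingles) are made up for illustration, and it assumes words are separated by whitespace. It only shows the result I'm after, not how a TokenFilter should produce it, and it says nothing about the "empty item" case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: it only documents the output I expect.
final class ExpectedShingles {

    // Builds word shingles of exactly `size` words that never span a comma.
    static List<String> shingles(String text, int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        List<String> result = new ArrayList<>();
        // Each comma-separated segment is shingled on its own,
        // so no shingle can contain words from both sides of a comma.
        for (String segment : text.split(",")) {
            String trimmed = segment.trim();
            if (trimmed.isEmpty()) {
                continue; // nothing between two commas
            }
            String[] words = trimmed.split("\\s+");
            // Slide a window of `size` words across the segment.
            for (int i = 0; i + size <= words.length; i++) {
                result.add(String.join(" ", Arrays.copyOfRange(words, i, i + size)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints: [delicious red, red apples, green pears, and oranges]
        System.out.println(shingles("delicious red apples, green pears, and oranges", 2));
    }
}
```

Segments shorter than the shingle size simply produce nothing here, which is what I'd want from the real filter as well.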