Hello, I have a search project which uses the Lucene PatternAnalyzer for its text/query analysis.
At the moment it's configured like so:

    analyzer = new PatternAnalyzer(Version.LUCENE_35, Pattern.compile("\\s+"), true, null);

My goal here was to split words on whitespace and make matching case insensitive. Thinking about this further, however, I probably want to be a little more sophisticated: I'd like to ignore punctuation that occurs at the beginning or end of a word. Is this simply a matter of writing a regex that treats those cases the same as a space? Would I use something like this:

    analyzer = new PatternAnalyzer(Version.LUCENE_35, Pattern.compile("\\s+|\\p{Punct}+\\w|\\w\\p{Punct}"), true, null);

Thanks so much!

Dave
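One way to check a candidate delimiter regex before wiring it into the analyzer is to try it with plain java.util.regex. The sketch below does that without Lucene, so it only approximates what PatternAnalyzer would emit: dropping empty tokens and lowercasing are assumptions about the analyzer's behavior. The class name, the EDGE_PUNCT pattern, and the tokenize helper are hypothetical and used for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

final class PunctuationSplitSketch {
    // The delimiter pattern proposed in the question. Its second and third
    // alternatives also match a word character, so that character becomes
    // part of the delimiter and is lost from the token.
    static final Pattern PROPOSED =
            Pattern.compile("\\s+|\\p{Punct}+\\w|\\w\\p{Punct}");

    // Hypothetical alternative: whitespace together with any punctuation
    // touching it, plus punctuation at the very start or end of the input.
    // Punctuation inside a word (apostrophes, hyphens) is left alone.
    static final Pattern EDGE_PUNCT =
            Pattern.compile("\\p{Punct}*\\s+\\p{Punct}*|^\\p{Punct}+|\\p{Punct}+$");

    // Splits text on the delimiter, drops empty pieces, and lowercases
    // the rest (assumed to mirror the analyzer's toLowerCase flag).
    static List<String> tokenize(Pattern delimiter, String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : delimiter.split(text)) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Hello, (world)! It's state-of-the-art.";
        System.out.println(tokenize(PROPOSED, text));
        System.out.println(tokenize(EDGE_PUNCT, text));
    }
}
```

With this input, the proposed pattern turns "Hello," into "hell", because the "o" is matched as part of the delimiter. The alternative pattern yields hello, world, it's, state-of-the-art. Whether keeping in-word apostrophes and hyphens is desirable depends on the search requirements.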