On Jul 23, 2014, at 7:43 PM, Milind <mili...@gmail.com> wrote:
>>> input=esl2.gbr
>>> output=[esl2.gb][r]
>>>
>>> This is a bug, which was fixed in Lucene 4.7 - see <https://issues.apache.org/jira/browse/LUCENE-5391>
>
> BTW, I changed the POM dependency to 4.7.1, but I'm still seeing the same
> output. I can't go beyond 4.7 since it seems 4.8 onwards, Lucene is being
> compiled against Java 7 and I'm still on Java 6. Hopefully, this will be
> a non-issue with PerFieldAnalyzerWrapper. But I just wanted to point that
> out.

I checked out the source code for the 4.7.1 release and added a test for “esl2.gbr” to TestUAX29URLEmailAnalyzer.testNoSchemeURLs() <http://svn.apache.org/viewvc/lucene/dev/tags/lucene_solr_4_7_1/lucene/analysis/common/src/test/org/apache/lucene/analysis/core/TestUAX29URLEmailAnalyzer.java?view=markup#l262>:

    BaseTokenStreamTestCase.assertAnalyzesTo
        (a, "esl2.gbr",
         new String[] { "esl2", "gbr" },
         new String[] { "<ALPHANUM>", "<ALPHANUM>" });

This passes: the string is broken up into “esl2” and “gbr” tokens, both with type <ALPHANUM>.

Are you sure that you’re running against the 4.7.1 version for all Lucene dependencies (including lucene-analyzers-common)?

Also, you need to change the value of the matchVersion parameter to the constructor to match the version you’re using; unless you do this, the behavior will remain the same as that of the version referred to by the matchVersion parameter.

Steve
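
One way to check which Lucene jars are actually being loaded at runtime is to ask the JVM where a class from each module came from. The following is only a rough sketch: the class name LuceneJarCheck and the method describe() are made up for illustration, and it assumes the jars record an Implementation-Version in their manifests (if they don't, the version prints as null, but the jar location is still shown). It uses only the standard library and avoids Java 7 features, so it should compile on Java 6.

```java
import java.security.CodeSource;

public class LuceneJarCheck {

    /**
     * Reports the manifest version and the load location of the named class.
     * Hypothetical helper for illustration; not part of Lucene.
     */
    static String describe(String className) {
        try {
            // Load without initializing, so static initializers are not run.
            Class<?> cls = Class.forName(className, false,
                    LuceneJarCheck.class.getClassLoader());

            // Implementation-Version from the jar manifest, if one is present.
            Package pkg = cls.getPackage();
            String version = (pkg == null) ? null : pkg.getImplementationVersion();

            // The code source is null for classes loaded by the bootstrap loader.
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            String location = (src == null || src.getLocation() == null)
                    ? "unknown location"
                    : src.getLocation().toString();

            return className + ": version=" + version + ", loaded from " + location;
        } catch (ClassNotFoundException e) {
            return className + ": not found on classpath";
        }
    }

    public static void main(String[] args) {
        // One class from lucene-core and one from lucene-analyzers-common.
        System.out.println(describe("org.apache.lucene.index.IndexWriter"));
        System.out.println(describe(
                "org.apache.lucene.analysis.standard.UAX29URLEmailAnalyzer"));
    }
}
```

If the two lines report different versions or unexpected jar paths, an older lucene-analyzers-common is probably still on the classpath. If both show 4.7.1, the next thing to check is the matchVersion argument, for example that the analyzer is constructed with Version.LUCENE_47 rather than an earlier constant.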