Hi, folks! This is not a trivial question, but I appeal to your experience with Lucene...
Lucene Implementation Version: 2.9.1
Solr Implementation Version: 1.4
Java version: 1.6

This is a legacy environment with a huge amount of indexed data. A few days ago the idea came up to migrate it from Java 6 to Java 8, and my main question is about that migration.

For reference, JRE major versions with their corresponding Unicode versions:

* Java 6, Unicode 4.0
* Java 8, Unicode 6.2

The first thing I found was the JRE_VERSION_MIGRATION<https://github.com/apache/lucene-solr/blob/trunk/lucene/JRE_VERSION_MIGRATION.txt> document. But it describes only one known problem, the one associated with the change of Unicode version in the Java 1.4 to Java 5 transition.

So, do you know of any issues related to the Unicode 4.0 to Unicode 6.2 transition?

Additionally, this is the list of all analyzers and filters that I use now:

* WhitespaceTokenizerFactory
* WordDelimiterFilterFactory
* LowerCaseFilterFactory
* SnowballPorterFilterFactory
* RemoveDuplicatesTokenFilterFactory
* ElisionFilterFactory
* CJKTokenizerFactory
* ThaiWordFilterFactory
* ChineseSentenceTokenizerFactory
* ChineseWordTokenFilterFactory

Sincerely,
Bogdan.
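
P.S. To make the question more concrete: one way I could look for differences myself is to dump what each JRE reports for every code point and diff the two outputs. Below is only a rough sketch of that idea, not something taken from Lucene or Solr. The class name UnicodeProfile and the describe method are hypothetical names of my own, and the set of properties is my assumption about what whitespace, lowercase and word-delimiter style components might depend on. It uses only java.lang.Character methods that exist since Java 5, so the same source should compile and run on both Java 6 and Java 8.

```java
/**
 * Hypothetical helper (not part of Lucene/Solr): prints the running JRE's
 * view of every Unicode code point, one line each, so that the output of
 * two different JREs can be compared with a plain text diff.
 */
public class UnicodeProfile {

    /**
     * Builds one line for a code point: hex value, general category,
     * a few boolean properties, and the lowercase mapping in hex.
     */
    public static String describe(int cp) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format("%04X", cp));
        // General category as the int constant from java.lang.Character
        sb.append(" type=").append(Character.getType(cp));
        sb.append(" letter=").append(Character.isLetter(cp));
        sb.append(" digit=").append(Character.isDigit(cp));
        sb.append(" ws=").append(Character.isWhitespace(cp));
        sb.append(" lower=").append(String.format("%04X", Character.toLowerCase(cp)));
        return sb.toString();
    }

    public static void main(String[] args) {
        // Include unassigned code points too: ones that were unassigned in
        // an older Unicode version may have gained properties in a newer one.
        for (int cp = 0; cp <= Character.MAX_CODE_POINT; cp++) {
            System.out.println(describe(cp));
        }
    }
}
```

The idea would be to run this once under each JRE, redirect the output to two files, and diff them. Any line that differs is a code point whose category, letter/digit/whitespace status or lowercase form changed between the two JREs. It would not tell me which of the components listed above actually depend on those properties, so it could only narrow down where to look.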