I meant to say: my analyzer chain now looks like this.

<analyzer type="index">
  <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[-_]" replacement=" " />
  <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[^\p{L}\p{Nd}\p{Mn}\p{Mc}\s+]" replacement="" />
  <tokenizer class="solr.WhitespaceTokenizerFactory" />
  <filter class="solr.LowerCaseFilterFactory" />
  <filter class="solr.StopWordFilterFactory" ignoreCase="true" words="words.txt" />
  <filter class="org.ctown.solr.analysis.CTConcatFilterFactory" />
</analyzer>
<analyzer type="query">
  <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[-_]" replacement=" " />
  <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[^\p{L}\p{Nd}\p{Mn}\p{Mc}\s+]" replacement="" />
  <tokenizer class="solr.KeywordTokenizerFactory" />
</analyzer>
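As a rough illustration of what the two char filter patterns above do to the raw text before tokenization, here is a minimal sketch that applies the same regular expressions with plain java.util.regex. It does not use Solr at all; the class name CharFilterSketch and the method applyCharFilters are made up for this example, and it assumes the char filters behave like a simple replace-all over the input. Note that inside the character class the trailing + is a literal plus sign, not a quantifier, so plus signs are kept.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch: mimics the two PatternReplaceCharFilterFactory
// patterns from the config with plain regex replacement (no Solr involved).
public class CharFilterSketch {
    // First char filter: hyphens and underscores become spaces.
    private static final Pattern SEPARATORS = Pattern.compile("[-_]");

    // Second char filter: drop everything that is not a letter, decimal digit,
    // combining mark, whitespace, or a literal '+'.
    private static final Pattern UNWANTED =
            Pattern.compile("[^\\p{L}\\p{Nd}\\p{Mn}\\p{Mc}\\s+]");

    static String applyCharFilters(String input) {
        String spaced = SEPARATORS.matcher(input).replaceAll(" ");
        return UNWANTED.matcher(spaced).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(applyCharFilters("foo-bar_baz!"));   // foo bar baz
        System.out.println(applyCharFilters("C++ (2nd_ed.)"));  // C++ 2nd ed
    }
}
```

Running sample field values through something like this can help separate problems in the patterns from problems in the custom concat filter.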
But only my first document is getting indexed. Is there any logging I can enable to see what is going wrong?

--
View this message in context: http://lucene.472066.n3.nabble.com/Writing-a-TokenConcatenateFilter-junk-characters-appearing-on-output-tp3383684p3384419.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.