I created this simple StripSpacesAndSeparatorsAnalyzer to ignore certain characters, such as hyphens, in the field, so that I can search for either of

catno:WRATHCD25
catno:WRATHCD-25

and get the same results. That works (the original value of the field added to the index was WRATHCD-25).
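
To make the intent concrete, here is a plain-Java sketch (no Lucene classes) of the normalization the analyzer is meant to perform: strip the four separator characters, then lowercase. The class and method names are made up for illustration, and the sketch assumes the same normalization is applied to both the indexed value and the query term.

```java
import java.util.Locale;

// Hypothetical stand-in for the analyzer chain; this is not Lucene code.
class CatnoNormalizationSketch {

    // Strip space, hyphen, underscore and colon, then lowercase.
    static String normalize(String value) {
        return value.replaceAll("[ \\-_:]", "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // The value that was added to the index.
        String indexed = normalize("WRATHCD-25");
        // Both query forms reduce to the same term as the indexed value.
        System.out.println(indexed.equals(normalize("WRATHCD25")));
        System.out.println(indexed.equals(normalize("WRATHCD-25")));
    }
}
```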

However, there is a problem with wildcard searching:

catno:WRATHCD25*

works, but

catno:WRATHCD-25*

does not

If I amend the analyzer by commenting out the initReader() method, then

catno:WRATHCD-25*

now works, but of course

catno:WRATHCD25

no longer works.
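
One model that would be consistent with these observations is that the wildcard term is only lowercased and never passed through the char filter, while the indexed value is. That is an assumption, not something confirmed here. The plain-Java sketch below (no Lucene classes; `analyze` and `prefixQueryMatches` are hypothetical helpers) reproduces the wildcard results under that assumption.

```java
import java.util.Locale;

// Hypothetical model of the wildcard observations; this is not Lucene code.
class WildcardBypassSketch {

    // What the char filter plus lowercase filter would do to an indexed value.
    static String analyze(String value, boolean stripSeparators) {
        String s = stripSeparators ? value.replaceAll("[ \\-_:]", "") : value;
        return s.toLowerCase(Locale.ROOT);
    }

    // ASSUMPTION: a query ending in '*' is only lowercased, never char-filtered,
    // and is then compared as a prefix against the indexed term.
    static boolean prefixQueryMatches(String query, String indexedTerm) {
        String prefix = query.substring(0, query.length() - 1).toLowerCase(Locale.ROOT);
        return indexedTerm.startsWith(prefix);
    }

    public static void main(String[] args) {
        String withFilter = analyze("WRATHCD-25", true);     // indexed as wrathcd25
        String withoutFilter = analyze("WRATHCD-25", false); // indexed as wrathcd-25
        System.out.println(prefixQueryMatches("WRATHCD25*", withFilter));     // true
        System.out.println(prefixQueryMatches("WRATHCD-25*", withFilter));    // false
        System.out.println(prefixQueryMatches("WRATHCD-25*", withoutFilter)); // true
    }
}
```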


What am I doing wrong, please?


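// Imports use the Lucene 4.x package layout, matching the
// createComponents(String, Reader) signature below.
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.charfilter.MappingCharFilter;
import org.apache.lucene.analysis.charfilter.NormalizeCharMap;
import org.apache.lucene.analysis.core.KeywordTokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;

public class StripSpacesAndSeparatorsAnalyzer extends Analyzer {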

    protected NormalizeCharMap charConvertMap;

    protected void setCharConvertMap() {

        NormalizeCharMap.Builder builder = new NormalizeCharMap.Builder();
        builder.add(" ","");
        builder.add("-","");
        builder.add("_","");
        builder.add(":","");
        charConvertMap = builder.build();
    }

    public StripSpacesAndSeparatorsAnalyzer() {
        setCharConvertMap();
    }

    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        // Treat the whole field value as a single token, then lowercase it.
        Tokenizer source = new KeywordTokenizer(reader);
        TokenStream filter = new LowerCaseFilter(source);
        return new TokenStreamComponents(source, filter);
    }


    @Override
    protected Reader initReader(String fieldName,
                                Reader reader)
    {
        return new MappingCharFilter(charConvertMap, reader);
    }
}
