I've found a simpler subclass, that illustrates the same problem

package org.musicbrainz.search.analysis;

import org.apache.lucene.analysis.*;

import java.io.IOException;
import java.io.Reader;

/**
* For analyzing catalogno so can compare values containing spaces with values that do not
 * Removes any spaces and lowercases the remaining text
 */
public class StripSpacesAnalyzer extends Analyzer {

    protected NormalizeCharMap charConvertMap;

    protected void setCharConvertMap() {
        charConvertMap = new NormalizeCharMap();
        charConvertMap.add(" ","");

    }

    public StripSpacesAnalyzer() {
        setCharConvertMap();
    }

    public final TokenStream tokenStream(String fieldName,
                                   final Reader reader) {
CharFilter mappingCharFilter = new MappingCharFilter(charConvertMap,reader);
        TokenStream result = new KeywordTokenizer(mappingCharFilter);
        result = new LowercaseFilter(result);
        return result;
    }

    private static final class SavedStreams {
        KeywordTokenizer tokenStream;
        TokenStream filteredTokenStream;
    }

    public final TokenStream reusableTokenStream(String fieldName,
final Reader reader) throws IOException {

        SavedStreams streams = (SavedStreams) getPreviousTokenStream();
        if (streams == null) {
            streams = new SavedStreams();
            setPreviousTokenStream(streams);
streams.tokenStream = new KeywordTokenizer(new MappingCharFilter(charConvertMap, reader)); streams.filteredTokenStream = new LowercaseFilter(streams.tokenStream);
        }
        else {
streams.tokenStream.reset(new MappingCharFilter(charConvertMap,reader));
        }
        return streams.filteredTokenStream;
    }
}

So to reiterate looking at a dump one instance of a StripSpacesAnalyzer can find itself with multiple instances of the SavedStreams class conected to it via a WeakHashMap called hardRefs in the tokenStreams (CloseableThreadLocal) of the superclass, shoudnt there just be the one per instance of the analyzer ?

(Using Lucene 3.6.0)

Paul
Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to