(When I sent this message earlier, I used HTML to make it clearer and easier to read. I see now that the list software stripped the HTML, leaving an unreadable mess. I'm sending it again in case somebody could be kind enough to guide me a bit. =) )

------

I'm trying to get synonyms working correctly, and to do that I'm trying to better understand graphs in a token stream.

For that purpose I've built this code:

            Builder builder = CustomAnalyzer.builder();
            builder.withTokenizer(StandardTokenizerFactory.class);
            MySynonymGraphFilterFactory.registerSynonyms(Arrays.asList(
                    Arrays.asList("go to", "navigate", "open")
            ));
            builder.addTokenFilter(MySynonymGraphFilterFactory.class, "synonyms", "unused");

(MySynonymGraphFilterFactory is just a hack that lets me pass synonyms as a list of lists. It expands each group, mapping everything to everything.)

            builder.addTokenFilter(FlattenGraphFilterFactory.class); // nothing changes with this!
            Analyzer analyzer = builder.build();
            TokenStream ts = analyzer.tokenStream("*", new StringReader("go to the webpage!"));

Then I call a function that just dumps terms, position increments and position lengths:

System.out.println(LoggingFilter.tokenStreamToString(ts));

What I don't understand is this: I get the same output whether or not I include FlattenGraphFilter. This is the output:

   navigate<2>  (0)open<2>  (0)go  to  the  webpage

(angle brackets show the position length of the preceding term; parentheses show the position increment of the following term)
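
To make the notation concrete, here is a minimal sketch in plain Java (no Lucene dependency) that turns the dumped values into graph arcs. The names TokenGraphDecoder, Token and Arc are hypothetical and exist only for this illustration. It assumes that any term shown without an annotation has position increment 1 and position length 1, and it follows the Lucene convention that positions start at -1, so the first increment of 1 lands on position 0.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenGraphDecoder {

    /** One dumped token: term, position increment, position length. */
    static final class Token {
        final String term;
        final int posInc;
        final int posLen;

        Token(String term, int posInc, int posLen) {
            this.term = term;
            this.posInc = posInc;
            this.posLen = posLen;
        }
    }

    /** One arc of the token graph: the term spans from node 'from' to node 'to'. */
    static final class Arc {
        final String term;
        final int from;
        final int to;

        Arc(String term, int from, int to) {
            this.term = term;
            this.from = from;
            this.to = to;
        }
    }

    /** The tokens from the output above, with unannotated values assumed to be 1. */
    static List<Token> sampleTokens() {
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token("navigate", 1, 2));
        tokens.add(new Token("open", 0, 2));
        tokens.add(new Token("go", 0, 1));
        tokens.add(new Token("to", 1, 1));
        tokens.add(new Token("the", 1, 1));
        tokens.add(new Token("webpage", 1, 1));
        return tokens;
    }

    /** Accumulates position increments to get start nodes; end node = start + position length. */
    static List<Arc> toArcs(List<Token> tokens) {
        List<Arc> arcs = new ArrayList<>();
        int pos = -1; // positions start at -1, so the first increment of 1 gives position 0
        for (Token t : tokens) {
            pos += t.posInc;
            arcs.add(new Arc(t.term, pos, pos + t.posLen));
        }
        return arcs;
    }

    public static void main(String[] args) {
        for (Arc a : toArcs(sampleTokens())) {
            System.out.println(a.from + " -> " + a.to + " : " + a.term);
        }
    }
}
```

Read this way, "navigate" and "open" each span nodes 0 to 2, "go" spans 0 to 1, "to" spans 1 to 2, "the" spans 2 to 3, and "webpage" spans 3 to 4.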

There's something I'm not understanding here. I had thought that flattening the stream meant that no token would have a position length > 1. Was I wrong? I would greatly appreciate any help with understanding this.

Thanks!

Nicolás.-

