Back in 2007 I installed the Eurovoc Thesaurus to work with our search engine for multilingual search (terms and phrases in 22 languages).
http://www.ub.uni-bielefeld.de/~befehl/base/solr/InsideBase_eurovocThesaurus.html

The synonyms.txt file is 8.8 MB in size and, once compiled into an FST,
yields over 300,000 n-to-m mappings due to permutation.
A single term/token can produce several single- and multi-word synonyms,
and a multi-word term can likewise produce single- and multi-word synonyms.
Position increment and position length are handled correctly.
The original search term and its direct synonyms are (or can be) boosted.

I will look into SynonymGraphFilter and FlattenGraphFilter to see how
they compare to my development.

Regards
Bernd

On 07.02.2017 at 12:34, Michael McCandless wrote:
> That's great that multi-token synonyms are working for you; can you
> describe how you use them?
>
> This blog post describes some of the problems:
> http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
>
> I'm working on another blog post to describe the recent changes ...
> should be out in maybe a week or so.
>
> Anyway, to just keep doing what you are doing today, you should switch
> to SynonymGraphFilter followed by FlattenGraphFilter: it will make the
> same tokens as the current SynonymFilter, but will necessarily be
> buggy in the multi-token case.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Feb 7, 2017 at 6:07 AM, Bernd Fehling
> <bernd.fehl...@uni-bielefeld.de> wrote:
>> I just tried Solr 6.4.1 and noticed that SynonymFilterFactory is
>> deprecated, as reported in the logs.
>>
>> I hope that this is just to note that there is also an alternative
>> SynonymGraphFilterFactory now available.
>>
>> And _not_ that SynonymFilterFactory will disappear, because it has been
>> running my multi-word synonyms thesaurus for years like a charm.
>> I hate to reinvent the wheel.
>>
>> Regards
>> Bernd

-- 
*************************************************************
Bernd Fehling                    Bielefeld University Library
Dipl.-Inform. (FH)                LibTec - Library Technology
Universitätsstr. 25                  and Knowledge Management
33615 Bielefeld
Tel. +49 521 106-4060       bernd.fehling(at)uni-bielefeld.de

BASE - Bielefeld Academic Search Engine - www.base-search.net
*************************************************************

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
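
The n-to-m expansion mentioned in the message (one synonyms.txt line turning into many mappings "due to permutation") can be illustrated with a small sketch. It is not the Lucene/Solr implementation, which compiles the rules into an FST; it only assumes the usual Solr synonyms.txt conventions (comma-separated equivalence groups, `=>` for explicit rules, `#` for comments). The function names `parse_synonyms` and `_phrases` and the sample entries are made up for illustration and are not taken from the Eurovoc file.

```python
from collections import defaultdict


def _phrases(text):
    """Split a comma-separated list into phrases; each phrase is a tuple of tokens."""
    return [tuple(p.split()) for p in text.split(",") if p.strip()]


def parse_synonyms(lines, expand=True):
    """Build {input phrase: set of output phrases} from synonyms.txt-style lines.

    Hypothetical helper: it only models the mapping table, not token
    positions, boosting, or the FST that Lucene actually builds.
    """
    mapping = defaultdict(set)
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # Explicit rule: every left-hand phrase maps to every right-hand phrase.
            left, right = line.split("=>", 1)
            inputs, outputs = _phrases(left), _phrases(right)
        else:
            # Equivalence group: with expand=True each phrase maps to all
            # phrases in the group (itself included); otherwise every
            # phrase maps to the first phrase only.
            inputs = _phrases(line)
            outputs = inputs if expand else inputs[:1]
        for src in inputs:
            mapping[src].update(outputs)
    return dict(mapping)


# Made-up sample entries, mixing single- and multi-word terms.
sample = [
    "# illustrative entries only",
    "farm, agricultural holding, agricultural enterprise",
    "EU => European Union, EU",
]

table = parse_synonyms(sample)
total = sum(len(outputs) for outputs in table.values())
print(total)  # 3 x 3 mappings from the group + 1 x 2 from the explicit rule = 11
```

A group of n equivalent phrases produces n x n mappings under these assumptions, which is why a thesaurus with many large multilingual groups grows quickly once expanded.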