Re: Indexing of multilingual labels

2011-03-14 Thread Paul Libbrecht
Stephane, I think that you have the freedom to put whatever you want in the stored value of a field. The simplest approach would even be to have the fields that you want to use for display stored preformatted (xml-ished, owl-ified, or json-ized) and kept separate from the indexed fields (where yo
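
To illustrate the separation between what is stored for display and what is indexed for matching, here is a minimal sketch in plain Python. It does not use Lucene; the function name `build_document` and the keys `indexed_terms` and `stored_display` are assumptions made up for this example.

```python
import json

def build_document(concept_id, labels_by_lang):
    """Toy document: one part for matching, one preformatted part for display."""
    # Indexed part: lowercased terms, used only for matching.
    indexed_terms = sorted({
        term
        for labels in labels_by_lang.values()
        for label in labels
        for term in label.lower().split()
    })
    # Stored part: a JSON string returned verbatim to the caller for display.
    stored_display = json.dumps(
        {"id": concept_id, "labels": labels_by_lang},
        ensure_ascii=False,
        sort_keys=True,
    )
    return {"indexed_terms": indexed_terms, "stored_display": stored_display}

doc = build_document("c2", {"en": ["cat"], "fr": ["chat"]})
display = json.loads(doc["stored_display"])  # language structure survives intact
```

The stored value keeps the per-language structure, so the display layer can pick a language without the index having to model it.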

Re: Indexing of multilingual labels

2011-03-14 Thread Vinaya Kumar Thimmappa
Hello Stephane, I think a better way is to have resource files for the different languages and store a pointer in the index to get to the correct resource file (something like the I18N and L10N approach). Store the internationalised string in the index and all related localised strings in the resource file. Thi
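
A rough sketch of the pointer-into-resource-file idea, in plain Python rather than Lucene. The in-memory `RESOURCES` table stands in for per-language resource files, and the key `concept.cat`, the helper `localized_label`, and the English fallback are all assumptions for illustration.

```python
# Hypothetical per-language resource tables; in practice these might be files.
RESOURCES = {
    "en": {"concept.cat": "cat"},
    "fr": {"concept.cat": "chat"},
}

def localized_label(resource_key, lang, fallback_lang="en"):
    """Resolve a key stored in the index to a label in the requested language."""
    table = RESOURCES.get(lang, {})
    if resource_key in table:
        return table[resource_key]
    # Fall back to another language, then to the raw key if nothing matches.
    return RESOURCES[fallback_lang].get(resource_key, resource_key)

# The index entry carries only the language-neutral pointer.
index_entry = {"doc_id": "c2", "label_key": "concept.cat"}
label_fr = localized_label(index_entry["label_key"], "fr")
```

One trade-off of this layout is that the localised strings themselves are not searchable unless they are also indexed somewhere.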

Re: Indexing of multilingual labels

2011-03-11 Thread Stephane Fellah
Erick, I am trying to index multilingual taxonomies such as SKOS, WordNet, EuroWordNet. Taxonomies are composed of concepts which have preferred and alternative labels in different languages. Some labels have the same lexical form in different languages. I want to be able to index these concepts in

Re: Indexing of multilingual labels

2011-03-11 Thread Erick Erickson
It's not so much a matter of problems with indexing/searching as it is with search behavior. The reason these strategies are implemented is that using English stemming, say, on other languages will produce "interesting" results. There's no a priori reason you can't index multiple languages in the
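
As a small illustration of the stemming concern, the following deliberately crude suffix stripper (an assumption for this example, not a real stemmer and not Lucene's English analyzer) behaves sensibly on an English plural but mangles French words that happen to end in "s".

```python
def naive_english_stem(word):
    """Strip a few English suffixes; crude on purpose, for illustration only."""
    for suffix in ("ing", "es", "s"):
        # Keep at least three characters of stem.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word

english = naive_english_stem("cats")   # 'cat'  (reasonable)
french_a = naive_english_stem("mois")  # 'moi'  ("month" collapses onto "me")
french_b = naive_english_stem("bras")  # 'bra'  ("arm" loses its final letter)
```

Two unrelated French words can end up sharing a stem, so a query for one would match the other.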

Indexing of multilingual labels

2011-03-10 Thread Stephane Fellah
I am trying to index in Lucene a field that could hold labels of concepts in different languages. Most of the approaches I have seen so far are: - Use a single index, where each document has a field for each language it uses, or - Use M indexes, M being the number of languages in
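
The first approach (a single index with one field per language) can be sketched with a toy inverted index in plain Python. This is not Lucene code; the class `TinyIndex` and the `label_<lang>` field naming scheme are assumptions made up for the example.

```python
from collections import defaultdict

class TinyIndex:
    """Toy inverted index keyed by (field, term)."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, labels_by_lang):
        # labels_by_lang maps a language code to a list of labels.
        for lang, labels in labels_by_lang.items():
            field = "label_" + lang  # hypothetical per-language field name
            for label in labels:
                for term in label.lower().split():
                    self.postings[(field, term)].add(doc_id)

    def search(self, lang, term):
        key = ("label_" + lang, term.lower())
        return sorted(self.postings.get(key, set()))

index = TinyIndex()
index.add("c1", {"en": ["chat"], "fr": ["discussion"]})  # concept: a conversation
index.add("c2", {"en": ["cat"], "fr": ["chat"]})         # concept: the animal

print(index.search("en", "chat"))  # ['c1']
print(index.search("fr", "chat"))  # ['c2']
```

Because the language is part of the field name, the same lexical form ("chat") resolves to different concepts depending on which language field is queried.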