But for this, you need a skillfully designed:
- set of fields
- multiplexing analyzer
- query expansion

In one of my projects, we do not split languages by field and it's a pain... I keep running into issues one way or the other:
- the "die" example that Otis mentioned is a good one: a stop word in German, an essential verb in English
- I recently had issues with contributions containing the word Fourier (as in the name of the series): in English it stays fourier, in French it becomes fouri. So: if the resource is contributed in French, the indexed value is fouri and English seekers won't find it; if the resource is contributed in English, the indexed value is fourier and French seekers won't find it.

So my latest lesson: always keep a whitespace-tokenized, lowercased, unstemmed field at hand as well, and prefer it over the others in your query expansion.
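
The Fourier mismatch and the unstemmed-field workaround can be sketched roughly as follows. This is plain Python rather than Lucene's Java API, so that it stays self-contained. The stemmers are toy lookup tables that only model the fourier/fouri behaviour reported above, and the field names `text` and `text_exact`, the helper functions, and the boost value are all made up for illustration.

```python
# Toy stand-ins for per-language stemmers. They only reproduce the
# "fourier" behaviour described in this mail; they are NOT Snowball.
TOY_STEMS = {
    "en": {"fourier": "fourier"},
    "fr": {"fourier": "fouri"},
}


def analyze(word, lang=None):
    """Lowercase a word; apply the toy stemmer only if a language is given."""
    token = word.lower()
    if lang is None:
        return token  # whitespace-lowercase, unstemmed
    return TOY_STEMS[lang].get(token, token)


def index_document(text, lang):
    """Index one document into a stemmed field and an unstemmed field."""
    words = text.split()
    return {
        # stemmed with the contributor's language
        "text": {analyze(w, lang) for w in words},
        # hypothetical fallback field: lowercased, no stemming
        "text_exact": {analyze(w) for w in words},
    }


def expand_query(word, lang, exact_boost=2.0):
    """Return (field, term, boost) clauses, preferring the unstemmed field."""
    return [
        ("text_exact", analyze(word), exact_boost),
        ("text", analyze(word, lang), 1.0),
    ]


def matches(doc, clauses):
    """True if any clause's term is present in the corresponding field."""
    return any(term in doc[field] for field, term, _ in clauses)


def to_query_string(clauses):
    """Render the clauses as field:term^boost, joined with OR."""
    return " OR ".join(f"{f}:{t}^{b}" for f, t, b in clauses)


if __name__ == "__main__":
    french_doc = index_document("Séries de Fourier", "fr")
    stemmed_only = [("text", analyze("Fourier", "en"), 1.0)]
    print(matches(french_doc, stemmed_only))
    print(matches(french_doc, expand_query("Fourier", "en")))
    print(to_query_string(expand_query("Fourier", "en")))
```

With only the stemmed field, an English query for Fourier misses the French-contributed document; once the unstemmed clause is added to the expansion, the same query matches, and the higher boost on that clause is one way to "prefer it over the others".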

A wiki page should probably be made.

paul

On 19 Jan 2011, at 07:53, Vinaya Kumar Thimmappa wrote:

> I think we should be using Lucene with the Snowball jars, which means one index
> for all languages (of course, the size of the index is always a matter of concern).
>
> Hope this helps.
> -vinaya
>
> On Tuesday 18 January 2011 11:23 PM, Clemens Wyss wrote:
>> What is the "best practice" to support multiple languages, i.e.
>> Lucene-Documents that have multiple language content/fields?
>> Should
>> a) each language be indexed in a separate index/directory or should
>> b) the Documents (in a single directory) hold the diverse localized fields?
>>
>> We most often will be searching "language dependent" which (at least
>> performance wise) mandates one-directory-per-language...
>>
>> Any (lucene specific) white papers on this topic?
>>
>> Thx in advance
>> Clemens

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org