Hi David, thanks for your advice I'll keep it in mind. Best regards, François Jurain.
> Message du 22/03/11 à 17h40 > De : "David Causse" <dcau...@spotter.com> > A : java-user@lucene.apache.org > Copie à : > Objet : Re: Wanted: a directory of quick-and-(not too)dirty analyzers for > multi-language RDF. > > > On Tue, Mar 22, 2011 at 04:15:53PM +0100, fr.jur...@voila.fr wrote: > > The only thing I need is the middle layer: a Java component extending > > Lucene, that'd pull a plausible Analyzer out of its magic hat, for every > > ISO 639-1 language tag however unlikely that turns up in the RDF input. > > Not just an analyzer, mark: a *plausible* one. I mean one that'll > > generate usable indexes right out of the box in most cases; I cannot > > afford to bring the system back into dev & study the arcanes of > > automatic indexing in just every language we're working with. So > > lucene-contrib + covering the rest with StandardAnalyzer/English > > stopwords is not an option. > > Hi, > > maybe you could have a look at java.text.* and specifically BreakIterator > (Thai analyzer use it) it could be better than STDAnalyzer for a fallback. > Don't forget that if you use multiple analyzers at index time you'll > have to use multiple analyzers at query time (tricky part of the > process). > > Regards. > > -- > David Causse > Spotter > http://www.spotter.com/ > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > > ____________________________________________________ Retrouvez les 10 conseils pour économiser votre carburant sur Voila : http://actu.voila.fr/evenementiel/LeDossierEcologie/l-eco-conduite/ --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org