Hello everybody,
First of all, congratulations on this great piece of software!
I am working on a Europe-wide project where we have texts in more than
one European language, namely French, German, and English. I have tried
both the GermanAnalyzer and the FrenchAnalyzer, and neither does what I
need.
The GermanAnalyzer should do the classic German umlaut conversion:
ä -> ae
ö -> oe
ü -> ue
Instead it maps ä -> a, ö -> o, ü -> u, which is not useful: if a
document contains "Oeffner" and I search for "Öffner", I do not find it.
It does handle "ß" correctly, converting it to "ss".
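
As a rough illustration of the conversion described above, here is a
minimal sketch in plain Java with no Lucene dependency. The class name
GermanUmlautFolder and the method fold are hypothetical, and the handling
of capital umlauts (Ä -> Ae and so on) is an assumption. The umlauts are
written as Unicode escapes so the file compiles regardless of source
encoding.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of Lucene: folds German umlauts into
// their two-letter ASCII spellings.
final class GermanUmlautFolder {
    private GermanUmlautFolder() {}

    static String fold(String input) {
        // Compose first, so that "o" followed by a combining diaeresis
        // is seen as the single character o-umlaut.
        String text = Normalizer.normalize(input, Normalizer.Form.NFC);
        StringBuilder out = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\u00E4': out.append("ae"); break; // a-umlaut
                case '\u00F6': out.append("oe"); break; // o-umlaut
                case '\u00FC': out.append("ue"); break; // u-umlaut
                case '\u00C4': out.append("Ae"); break; // capital A-umlaut (assumed mapping)
                case '\u00D6': out.append("Oe"); break; // capital O-umlaut (assumed mapping)
                case '\u00DC': out.append("Ue"); break; // capital U-umlaut (assumed mapping)
                case '\u00DF': out.append("ss"); break; // sharp s
                default:       out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this mapping, both "Öffner" and "Oeffner" fold to "Oeffner", so
applying it at index time and at query time would make the two spellings
match. In Lucene it would presumably be called from a custom TokenFilter
placed after the tokenizer; that wiring is not shown here.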
For French I tried the FrenchAnalyzer, but it does not work (at least
not the one in lukeall.jar, which is pretty up-to-date, I guess).
Well, in short, it would be nice to have a simple Analyzer which does
the great job of the StandardAnalyzer, PLUS a few extras for European
languages, and that is pretty easy:
For German: see above.
For French: remove the grave, acute, and circumflex accents (` ´ ^) and
the cedilla under the c.
For Swedish, Polish, Czech, ...: remove every mark that crosses, slashes,
or otherwise decorates the ASCII letters a-z.
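
The accent removal listed above can be sketched with the JDK alone. This
is only an illustration under a few assumptions: the class name
AccentStripper and the method strip are hypothetical, and the short list
of stroked letters is an example, not a complete table. The idea is to
decompose each character (Unicode NFD) and drop the combining marks.
Letters with a stroke, such as the Polish l, have no decomposition and
need an explicit mapping.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of Lucene: reduces accented Latin
// letters to their plain ASCII base letters.
final class AccentStripper {
    private AccentStripper() {}

    static String strip(String input) {
        // NFD splits e.g. e-acute into "e" plus a combining acute accent.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        StringBuilder out = new StringBuilder(decomposed.length());
        for (int i = 0; i < decomposed.length(); i++) {
            char c = decomposed.charAt(i);
            int type = Character.getType(c);
            // Skip the combining marks produced by the decomposition.
            if (type == Character.NON_SPACING_MARK
                    || type == Character.COMBINING_SPACING_MARK
                    || type == Character.ENCLOSING_MARK) {
                continue;
            }
            switch (c) {
                // Stroked letters do not decompose; map them by hand.
                case '\u0142': out.append('l'); break; // l with stroke
                case '\u0141': out.append('L'); break; // L with stroke
                case '\u00F8': out.append('o'); break; // o with slash
                case '\u00D8': out.append('O'); break; // O with slash
                case '\u0111': out.append('d'); break; // d with stroke
                case '\u0110': out.append('D'); break; // D with stroke
                default:       out.append(c);
            }
        }
        return out.toString();
    }
}
```

For example, "garçon" becomes "garcon" and "Łódź" becomes "Lodz". Note
that this step alone would also turn ä into a, so German text would need
its ae/oe/ue mapping applied before it.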
Before I write my own Analyzer for this, I was wondering whether anybody
has had the same problem and has already found a solution.
Thanks for your help!
Take Care,
Martin