Otis,
Thanks for the quick reply.
The idea to emit multiple tokens is great!
I was looking for a solution to another problem: I want to present a
word completion list to the user, so I use reader.terms(new
Term("start","here")). If I start searching at "henrie", the
reader.terms() should re
Hi Martin,
When you write your own tokenizer/analyzer for this, you'll probably
want to emit multiple tokens for words that have umlauts and such - one
version with ä -> ae, the other with ä -> a perhaps.
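
To make the suggestion above concrete, here is a minimal sketch in plain Java of how the two variants of a word could be produced. It is not a Lucene filter: the class UmlautVariants and its methods are hypothetical names invented for this illustration, and the mapping covers only the German umlauts and sharp s. The non-ASCII characters are written as Unicode escapes so the source compiles regardless of file encoding. In a real analyzer you would presumably emit both variants as tokens for the same word.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Lucene.
public class UmlautVariants {

    // Variant 1: expand umlauts to two letters (a-umlaut -> ae, and so on).
    public static String expand(String word) {
        return word.replace("\u00e4", "ae")   // a-umlaut
                   .replace("\u00f6", "oe")   // o-umlaut
                   .replace("\u00fc", "ue")   // u-umlaut
                   .replace("\u00df", "ss");  // sharp s
    }

    // Variant 2: strip the umlaut dots (a-umlaut -> a, and so on).
    public static String strip(String word) {
        return word.replace("\u00e4", "a")
                   .replace("\u00f6", "o")
                   .replace("\u00fc", "u")
                   .replace("\u00df", "ss");
    }

    // Returns the distinct variants of a lowercased word, expanded form first.
    // A word without umlauts yields a single entry.
    public static List<String> variants(String word) {
        String lower = word.toLowerCase(Locale.ROOT);
        Set<String> out = new LinkedHashSet<>();
        out.add(expand(lower));
        out.add(strip(lower));
        return new ArrayList<>(out);
    }
}
```

Indexing both forms means a query typed as "baer" or as "bar" can match a document containing the umlaut spelling, at the cost of a somewhat larger index.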
As for stripping accents from characters, somebody posted
ISOLatinFilter.java (I think that
Hello everybody,
First off, congrats on that great piece of software!
I am working on a Europe-wide project where we have texts in more than
one European language, namely French, German, and English. I have tried
both the GermanAnalyzer and the FrenchAnalyzer, but neither is
satisfactory for what I need.
The