Re: European Languages search problem

2005-07-28 Thread Martin Rode
Otis, Thanks for the quick reply. The idea to emit multiple tokens is great! I was looking for a solution to another problem: I want to present a word completion list to the user, so I use reader.terms(new Term("start","here")). If I start searching at "henrie", the reader.terms() should re
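
A minimal sketch of the completion idea described above, not Martin's actual code. It assumes the index terms are available in sorted order, which is how reader.terms(Term) enumerates them (starting at the first term at or after the given one). A plain TreeSet stands in for the term enumeration here, and the class name PrefixCompletion, the method complete, and the limit parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

final class PrefixCompletion {
    // Collects up to `limit` terms that start with `prefix`.
    // The TreeSet is only a stand-in for a sorted term enumeration.
    static List<String> complete(NavigableSet<String> sortedTerms, String prefix, int limit) {
        List<String> out = new ArrayList<>();
        // Seek to the first term >= prefix, then walk forward in sorted order.
        for (String term : sortedTerms.tailSet(prefix, true)) {
            // Once a term no longer shares the prefix, no later term can.
            if (!term.startsWith(prefix) || out.size() >= limit) {
                break;
            }
            out.add(term);
        }
        return out;
    }
}
```

Because the walk stops at the first non-matching term, the cost depends on the number of completions rather than on the size of the term list.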

Re: European Languages search problem

2005-07-28 Thread Otis Gospodnetic
Hi Martin, When you write your own tokenizer/analyzer for this, you'll probably want to emit multiple tokens for words that have umlauts and such - one version with ä -> ae, the other with ä -> a perhaps. As for stripping accents from characters, somebody posted ISOLatinFilter.java (I think that
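
A rough sketch of the "emit multiple tokens" suggestion, kept independent of any Lucene class. It only shows how the two spellings of a word could be derived; the class name UmlautVariants, the method names, and the two mapping tables are assumptions made for illustration. In a real analyzer the second variant would presumably be emitted at the same position as the first.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class UmlautVariants {
    // Assumed table: umlaut -> two-letter transliteration (ä -> ae, ö -> oe, ü -> ue).
    private static final Map<Character, String> EXPANDED = Map.of(
        '\u00e4', "ae", '\u00f6', "oe", '\u00fc', "ue",
        '\u00c4', "Ae", '\u00d6', "Oe", '\u00dc', "Ue");

    // Assumed table: umlaut -> base letter (ä -> a, ö -> o, ü -> u).
    private static final Map<Character, String> STRIPPED = Map.of(
        '\u00e4', "a", '\u00f6', "o", '\u00fc', "u",
        '\u00c4', "A", '\u00d6', "O", '\u00dc', "U");

    // Replaces every character found in the table, leaves the rest untouched.
    static String fold(String word, Map<Character, String> table) {
        StringBuilder sb = new StringBuilder(word.length() + 4);
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            String replacement = table.get(c);
            sb.append(replacement != null ? replacement : String.valueOf(c));
        }
        return sb.toString();
    }

    // Returns the distinct variants to index; a word without umlauts yields one entry.
    static List<String> variants(String word) {
        Set<String> out = new LinkedHashSet<>();
        out.add(fold(word, EXPANDED));
        out.add(fold(word, STRIPPED));
        return new ArrayList<>(out);
    }
}
```

With both variants indexed, a query typed as either "Mueller" or "Muller" could match a document containing "Müller", provided the query side is folded the same way.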

European Languages search problem

2005-07-28 Thread Martin Rode
Hello everybody, First off, congrats on that great piece of software! I am working on a Europe-wide project where we have texts in more than one European language, namely French, German, and English. I have tried both the GermanAnalyzer and the FrenchAnalyzer, and neither is satisfactory for what I need. The