If you cannot use the FrenchAnalyzer, you might instead transform the input as an additional step at both indexing time and search time.

For example, use something like a regex that looks for é and always replaces it with e, both when adding text to the index and when building the query at search time. Expand this transformation step as needed to cover the other accented characters.
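A minimal sketch of such a transformation step, in plain Java. The class name AccentFolder and the method fold are hypothetical, not part of Lucene. Instead of one regex per character, it decomposes each accented character and strips the combining marks. It assumes java.text.Normalizer is available, which requires Java 6 or later; on an older JVM an explicit replacement table would be needed.

```java
import java.text.Normalizer;

// Hypothetical helper (not part of Lucene): folds accented characters to
// their unaccented base letters so both spellings yield the same term.
final class AccentFolder {
    private AccentFolder() {}

    static String fold(String input) {
        // NFD decomposition splits an accented letter into its base letter
        // followed by one or more combining marks (e.g. e + U+0301).
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Drop every combining mark, keeping only the base letters.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "m\u00e9tal" is the accented spelling, written with an escape so
        // the source file encoding does not matter.
        System.out.println(fold("m\u00e9tal"));
        System.out.println(fold("metal"));
    }
}
```

The same fold call would have to be applied to the text before it is indexed and to the user's query before it is parsed, otherwise the two sides will not match. Ligatures such as œ have no canonical decomposition, so they would need an explicit replacement of their own.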

You will likely also need to keep the original word somewhere, so I would suggest adding a second field that is stored but not indexed and holds the original value of the word. That way, when a document matches your search criteria, you also get the original form of the word back in your hits object.
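To illustrate the idea without depending on the Lucene API, here is a toy in-memory stand-in in plain Java. The class FoldedIndexSketch and its methods are hypothetical: the map key plays the role of the indexed (folded) field and the map value plays the role of the stored, unindexed field holding the original spelling. It again assumes Java 6 or later for java.text.Normalizer.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for an index (not Lucene): lookups use the folded term,
// results return the original spelling that was kept alongside it.
final class FoldedIndexSketch {
    private final Map<String, List<String>> originalsByFoldedTerm =
            new HashMap<String, List<String>>();

    // Same folding is used for indexing and for searching.
    static String fold(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // "Index" the folded form and keep the original value next to it.
    void add(String original) {
        String key = fold(original);
        List<String> originals = originalsByFoldedTerm.get(key);
        if (originals == null) {
            originals = new ArrayList<String>();
            originalsByFoldedTerm.put(key, originals);
        }
        originals.add(original);
    }

    // Fold the query the same way, then return the original spellings.
    List<String> search(String query) {
        List<String> hits = originalsByFoldedTerm.get(fold(query));
        return hits == null ? new ArrayList<String>() : new ArrayList<String>(hits);
    }
}
```

In a real index the two roles would be two fields on the same document: one indexed with the folded text, one stored only, carrying the untouched value for display.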

Hope this helps,

Matt

egrand thomas wrote:
Dear all,

I'd like my Lucene searches to be insensitive to (French) accents. For example, considering an indexed term
"métal", I want to get it when searching for "metal" or "métal". I use lucene-2.3.2 and
the searches are performed with: IndexSearcher.search(query,filter,sorter). Another filter is already used together
with a "Sort" object. Furthermore, I cannot use the FrenchAnalyzer as my index does not only contain French
words.

Can anybody help?
Thanks in advance,
Tom




