Hi All
I am trying to match accented characters with non-accented characters in French, Spanish, and other Western European languages. The use case is that users may type letters without accents in error, and we still want to be able to retrieve valid matches.

One idea, albeit naïve, is to normalize both the inbound data and the data in the database (prior to full-text indexing) and then retrieve matches. For instance, if the database contains a word like BE/BE/ (/ being the equivalent of aigu, since I don't have a French keyboard :-)) and the input is erroneously provided as BE/BE (last aigu missing), we still want to be able to retrieve BE/BE/ as a candidate match, admittedly with an error margin.

Has anyone used Lucene successfully (i.e. with decent performance AND valid results) to match non-accented characters with accented ones? Any method at all? Does anyone have suggestions to improve on the idea above?

Any input will be greatly appreciated!

Merci beaucoup :-)
Renuka
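
P.S. To make the normalization idea concrete, here is a minimal sketch in Python rather than Lucene code. It assumes the same folding function is applied both to the text before it is indexed and to the user's query. The names fold_accents and normalize_term are made up for illustration and are not part of any library; only the standard unicodedata module is used.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining accent marks, e.g. 'é' becomes 'e' and 'ñ' becomes 'n'."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (Unicode category "Mn"), keep everything else.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Recompose whatever is left into its canonical form.
    return unicodedata.normalize("NFC", stripped)


def normalize_term(text: str) -> str:
    """Hypothetical helper: fold accents, then fold case.

    Meant to be applied identically at index time and at query time.
    """
    return fold_accents(text).casefold()


if __name__ == "__main__":
    stored = "BÉBÉ"   # word as it appears in the database
    typed = "BÉBE"    # user input with the last accent missing
    print(normalize_term(stored) == normalize_term(typed))
```

Because both sides are reduced to the same form, the stored word and the mistyped query compare equal. The cost is that words differing only by accents become indistinguishable, which is the error margin mentioned above. Note also that this approach only handles letters that decompose into a base letter plus a mark; characters such as "ø" or "œ" have no such decomposition and would need an explicit mapping.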