With ArabicAnalyzer, only Arabic tokens are changed; English words are left as-is (they are not lowercased or otherwise modified).
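To make that concrete, here is a rough, self-contained sketch in plain Java. It deliberately does not use any Lucene classes: `MixedTextDemo`, `normalizeArabic` and `analyze` are made-up names for illustration, and the "normalization" is only a toy (it strips tatweel and the common diacritics), not what ArabicAnalyzer really does internally. The point is just that an Arabic-only step passes English tokens through untouched unless a lowercasing step is added to the chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; not part of Lucene.
class MixedTextDemo {

    // Toy stand-in for Arabic normalization: drops tatweel (U+0640) and the
    // diacritic marks U+064B..U+0652. Non-Arabic characters pass through.
    static String normalizeArabic(String token) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            boolean tatweel = (c == 0x0640);
            boolean diacritic = (c >= 0x064B && c <= 0x0652);
            if (!tatweel && !diacritic) {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Splits on whitespace, applies the Arabic-only step to every token,
    // and optionally lowercases (which only affects cased scripts).
    static List<String> analyze(String text, boolean lowercase) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue;
            }
            String token = normalizeArabic(raw);
            if (lowercase) {
                token = token.toLowerCase(Locale.ROOT);
            }
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "Lucene" followed by an Arabic word written with a tatweel in it.
        String text = "Lucene \u0643\u0640\u062A\u0627\u0628";
        System.out.println(analyze(text, false)); // English token keeps its capital L
        System.out.println(analyze(text, true));  // English token is lowercased too
    }
}
```

In both calls the Arabic token comes out the same; only the English token differs, depending on whether the lowercasing step is part of the chain.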
So if you want good behavior for both Arabic and English, you would want to create a custom analyzer that looks like ArabicAnalyzer but also invokes LowerCaseFilter, perhaps also an English stemmer, and so on.

On Thu, May 14, 2009 at 10:11 AM, weidong sun <lmcw...@gmail.com> wrote:
> Hello,
>
> I am a newbie in Lucene world. I might ask some obvious question which
> unfortunately I don't know the answer. Please help me 'grow'.
>
> We have a project intend to use Lucene search engine for search some user's
> info stored our system. The user info might not be in English even it will
> be stored in UTF-8 encoding.
>
> My question is, if I use one particular Lucene analyzer for a language other
> than English (e.g. ChineseAnalyzer or ArabicAnalyzer), can it still able to
> handle it correctly if user info is mixed with English character/word?
>
> Really appreciated with any answers.
>
> :-)
>

--
Robert Muir
rcm...@gmail.com