The whole question of multilingual indexing has been discussed at length; you might find some ideas if you search the archive.
Erick

On 10/1/07, Dino Korah <[EMAIL PROTECTED]> wrote:
>
> Thanks Erick.
>
> The PerFieldAnalyzerWrapper could fit in, but in the current world of
> multilingual anywhere (even in programming languages.. %$£%#@), almost
> any field in an email document (addresses, subject, body, attachment
> filenames, ... ) could be multilingual.
>
> I will have a go anyway.
>
> -----Original Message-----
> From: Erick Erickson [mailto:[EMAIL PROTECTED]
> Sent: 01 October 2007 14:35
> To: java-user@lucene.apache.org
> Subject: Re: mixing analyzer
>
> Sure, but there's a time/space tradeoff. Isn't there always <G>....
>
> PerFieldAnalyzerWrapper is your friend. It would require that your
> index be built on a per-language basis: say, indexing text from French
> documents in a field "french_text" and text from Chinese documents in a
> field "chinese_text". You'd construct your query something like
> "chinese_text:blah OR french_text:blah" and your
> PerFieldAnalyzerWrapper would just handle things for you.
>
> But this may not fit your problem space ideally....
>
> Best
> Erick
>
>
> On 10/1/07, Dino Korah <[EMAIL PROTECTED]> wrote:
> >
> > Hi,
> >
> > I am working on a Lucene email indexing system which can potentially
> > get documents in various languages. Currently I am using
> > StandardAnalyzer, which works for English but not for many of the
> > other languages. One of the requirements for the search interface is
> > that users have to be able to search without selecting languages.
> >
> > Is it possible to search across multiple indexes that have been
> > created with different analyzers; i.e., when someone searches for,
> > say, "earth", Lucene will search for "earth" over the CJK-analyzed
> > IndexCJK and the StandardAnalyzer-analyzed IndexSA in one search()
> > call? If not, is there a way of combining the results from multiple
> > search() calls?
> >
> > Thanks in advance.
> >
> > d i n o k o r a h
> > Tel: +44 795 66 65 283
> > --------------------------------
> > 51°21'52"N 0°5' 14.16"
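
For anyone reading this thread later, here is a rough sketch of the per-field idea described in the quoted messages, written in plain Java with no Lucene dependency. Everything in it is an assumption made for illustration: the class PerFieldSketch, its methods, and the two toy analyzers (lowercased words, overlapping character bigrams) are hypothetical stand-ins, not Lucene's API. In Lucene itself the dispatch role is played by PerFieldAnalyzerWrapper. The sketch only shows the shape of the approach: one analyzer per language field, a fallback for everything else, and a query that ORs the user's term across the language fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of per-field analysis; NOT Lucene's API.
final class PerFieldSketch {
    // Analyzer used for any field without a specific registration.
    private final Function<String, List<String>> fallback;
    // Field name -> analyzer, kept in registration order.
    private final Map<String, Function<String, List<String>>> byField = new LinkedHashMap<>();

    PerFieldSketch(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    // Registers an analyzer for one field and returns this for chaining.
    PerFieldSketch register(String field, Function<String, List<String>> analyzer) {
        byField.put(field, analyzer);
        return this;
    }

    // Picks the analyzer by field name, falling back to the default.
    List<String> analyze(String field, String text) {
        return byField.getOrDefault(field, fallback).apply(text);
    }

    // Builds "fieldA:term OR fieldB:term" over the registered fields.
    // The term is not escaped here; real query syntax would need that.
    String expand(String term) {
        List<String> clauses = new ArrayList<>();
        for (String field : byField.keySet()) {
            clauses.add(field + ":" + term);
        }
        return String.join(" OR ", clauses);
    }

    // Toy analyzer: lowercase, split on whitespace.
    static List<String> lowercaseWords(String text) {
        List<String> out = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(word);
            }
        }
        return out;
    }

    // Toy analyzer: overlapping two-character tokens, ignoring whitespace.
    static List<String> bigrams(String text) {
        int[] cps = text.codePoints().filter(cp -> !Character.isWhitespace(cp)).toArray();
        List<String> out = new ArrayList<>();
        if (cps.length == 1) {
            out.add(new String(cps, 0, 1));
        }
        for (int i = 0; i + 1 < cps.length; i++) {
            out.add(new String(cps, i, 2));
        }
        return out;
    }

    public static void main(String[] args) {
        PerFieldSketch sketch = new PerFieldSketch(PerFieldSketch::lowercaseWords)
                .register("chinese_text", PerFieldSketch::bigrams)
                .register("french_text", PerFieldSketch::lowercaseWords);
        System.out.println(sketch.expand("earth"));
        System.out.println(sketch.analyze("subject", "Hello World"));
    }
}
```

The user never picks a language here: the query is expanded over every registered language field, and each field's text goes through its own analyzer. This does not address the objection raised in the thread that a single field (a subject line, say) can itself mix languages.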