I would say in general, yes.
When I say 'change Arabic text', I mean the Arabic analyzer will standardize
and stem Arabic words, but it won't modify any of your English words.
And no, there is no case in Arabic. This is why, if you are handling mixed
Arabic/English text, I recommend creating a cust
> > -Original Message-
> > From: weidong sun [mailto:lmcw...@gmail.com]
> > Sent: Thursday, May 14, 2009 5:19 PM
> > To: java-user@lucene.apache.org
> > Subject: Re: Question wrt Lucene analyzer for different language
> >
Thanks for the quick answer. :-)
So can I say that ArabicAnalyzer can generally tokenize mixed content
with Arabic and English? :-)
I am not really familiar with the Arabic language. What do you mean by "change
Arabic tokens"? Does Arabic have something like upper/lower case, as English
does?
Thanks for the surprisingly quick response. :-)
What I mean by "correctly" here is that the specific analyzer can tokenize
text that mixes English and that specific language, for example, "12345 "
or "Text???" (where '?' is a character of that specific language and
"12345" and "Text" are English
In the case of ArabicAnalyzer, it will only change Arabic tokens and will
leave English words as-is (it will not convert them to lowercase or anything
like that).
So if you want good Arabic and English behavior, you would want to
create a custom analyzer that looks like the Arabic analyzer but a
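
To make the idea of script-dependent handling concrete, here is a rough sketch in plain Java. It does not use Lucene's API at all: the class MixedScriptSketch and its methods are names made up for this example, and the "normalization" (dropping tatweel and short-vowel marks) is only a stand-in for whatever a real Arabic analyzer does. It just shows the general shape: split the text into tokens, lowercase the non-Arabic ones, and send the Arabic ones through a separate step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; none of these names come from Lucene.
final class MixedScriptSketch {

    // A token counts as Arabic if any of its code points is in the Arabic script.
    public static boolean isArabic(String token) {
        return token.codePoints()
                .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.ARABIC);
    }

    // Stand-in for real Arabic normalization (an assumption for this sketch):
    // drop tatweel (U+0640) and the marks in the range U+064B..U+0652.
    public static String normalizeArabic(String token) {
        StringBuilder sb = new StringBuilder();
        token.codePoints()
                .filter(cp -> cp != 0x0640 && (cp < 0x064B || cp > 0x0652))
                .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    // Split on anything that is not a letter, mark, or digit, then treat each
    // token according to its script: Arabic tokens are normalized, everything
    // else (English words, numbers) is lowercased.
    public static List<String> analyze(String text) {
        List<String> out = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{M}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            out.add(isArabic(token)
                    ? normalizeArabic(token)
                    : token.toLowerCase(Locale.ROOT));
        }
        return out;
    }
}
```

Calling MixedScriptSketch.analyze("Text 12345") lowercases the English word and passes the number through unchanged. In a real Lucene setup the same split would be expressed as a chain of token filters inside a custom analyzer, which this sketch does not attempt to reproduce.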
No. What is "correctly"? Are you stemming? In which case using the same
analyzer on different languages will not work.
This topic has been discussed on the user list frequently, so if you
searched that archive (see: http://wiki.apache.org/lucene-java/MailingListArchives)
you'd find a wealth of inf
Hello,
I am a newbie in the Lucene world. I might ask some obvious questions to which
unfortunately I don't know the answers. Please help me 'grow'.
We have a project that intends to use the Lucene search engine to search some user
info stored in our system. The user info might not be in English, even though it will
be sto