Re: Text search in Arabic

2021-05-23 Thread Mete Kural
Thank you for all this information, Uwe and Walter! Let me digest it, educate myself on these matters, and figure out a way forward. Have a great one, Mete > On May 20, 2021, at 6:43 PM, Walter Underwood wrote: > > I recommend normalizing all characters with a compatibility

Re: Text search in Arabic

2021-05-20 Thread Walter Underwood
I recommend normalizing all characters with a compatibility transformation, whether they are Arabic or not. We use this charFilter as the first step in every query and indexing analysis chain. You’ll also need to include the ICU library, which should be included by default. Actually
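
To illustrate what a compatibility transformation does, here is a minimal sketch using Python's standard `unicodedata` module rather than the Lucene charFilter mentioned above. It assumes NFKC as the compatibility form; the function name `normalize_compat` and the sample strings are hypothetical and only for illustration. The thread does not confirm which normalization form the charFilter is configured with.

```python
import unicodedata


def normalize_compat(text: str) -> str:
    """Apply Unicode compatibility normalization (NFKC, assumed here) to text."""
    return unicodedata.normalize("NFKC", text)


# U+FEFB is a presentation-form ligature: ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM.
ligature = "\uFEFB"
# The same letters written as two ordinary code points: LAM (U+0644) + ALEF (U+0627).
plain = "\u0644\u0627"

# The raw strings differ, so an exact-match search for one would miss the other.
print(ligature == plain)

# After compatibility normalization both sides use the same code points.
print(normalize_compat(ligature) == normalize_compat(plain))
```

Applying the same normalization at both index time and query time is what makes the two spellings match, which is why the step goes first in both analysis chains.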

Re: Text search in Arabic

2021-05-20 Thread Uwe Schindler
This is only for the Arabic language. If you don't know the language and just want to help people search across scripts (for example, searching for Arabic text with Latin letters), see my other answer. Uwe On May 20, 2021 2:38:26 PM UTC, Mete Kural wrote: >Hello Michael, > >Thank you very much fo

Re: Text search in Arabic

2021-05-20 Thread Uwe Schindler
Hi, In answer to your question about character substitutions: the ICU library does this with ICU Transformers. It can also, for example, convert all Cyrillic text to Latin during indexing and search. This greatly helps people find things. A great example of a transformer is here as part of

Re: Text search in Arabic

2021-05-20 Thread Mete Kural
Hello Michael, Thank you very much for this information. I will also try java-u...@lucene.apache.org. By the way, is the Arabic analyzer referenced here (https://github.com/apache/lucene/tree/main/lucene/analysis/common/src/java/org/apache/lucene/analysis/ar) just for the Arabic language

Re: Text search in Arabic

2021-05-20 Thread Michael Wechner
Hi Mete, You might also want to try the java-u...@lucene.apache.org mailing list: https://lucene.apache.org/core/discussion.html#java-user-list-java-userluceneapacheorg Regarding languages other than English, you might find more information at https://cwiki.apache.org/confluence/display/lucene/LuceneFAQ

Text search in Arabic

2021-05-20 Thread Mete Kural
Hello Lucene Community, I hope this finds you all well. I want to ask whether this is the right medium to discuss some matters surrounding text search in relation to variant Unicode encodings of words in Arabic and Arabic-script languages. This is not a great example but the said matters a