If it is the diacritics, then the solution is a patch which was submitted (but probably never applied) a year or so ago.
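As a rough illustration of the mismatch described in the quoted messages below (this is neither the patch mentioned above nor SWORD code), here is a minimal Python sketch. The names `ARABIC_POINTS` and `strip_points` are made up for the example, and the range U+064B to U+0652 is my assumption about which marks count as vowel points; the set handled by SWORD's UTF8ArabicPoints filter may differ.

```python
# Hypothetical helper, for illustration only; not part of SWORD.
# Assumption: the marks to strip are U+064B (fathatan) through U+0652 (sukun).
ARABIC_POINTS = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_points(text):
    """Return text with the Arabic vowel points listed above removed."""
    return "".join(ch for ch in text if ch not in ARABIC_POINTS)


# The two spellings quoted in the thread, written as code points.
pointed = "\u064a\u064e\u0633\u064f\u0648\u0639\u064e"  # with vowel points
plain = "\u064a\u0633\u0648\u0639"                      # without vowel points

print(pointed == plain)                # False: 7 code points versus 4
print(strip_points(pointed) == plain)  # True once the points are removed
```

If both the indexed text and the search query went through the same stripping step, the two spellings would match.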
Peter

-------- Original Message --------
> Date: Mon, 26 Nov 2012 11:33:06 +0200
> From: pola ashraf <5...@hotmail.com>
> To: SWORD Developers' Collaboration Forum <sword-devel@crosswire.org>
> Subject: Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

> Sorry for choosing the wrong word.
> This Wikipedia article covers the topic:
> https://en.wikipedia.org/wiki/Arabic_diacritics
>
> Thanks, Chris, for your reply about the filter. Actually, I don't have any
> contact details for the developers of the frontends to report this
> problem to them; I hope someone on this list will pass this discussion on. :)
>
> So now we know the problem and the solution.
>
> > Date: Mon, 26 Nov 2012 01:05:16 -0800
> > From: chris...@crosswire.org
> > To: sword-devel@crosswire.org
> > Subject: Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version
> >
> > You're talking about vowels, not shaping. Shaping in Arabic changes the
> > shape of the letter according to its context in the word (initial,
> > medial, final, or isolated). I imagine unshaped Arabic would be very
> > difficult to read. Arabic without vowel marks, on the other hand, is
> > standard.
> >
> > I would have thought that the indexing would have been done without
> > vowels or both with and without vowels. It should be easy to recover the
> > vowel-less text for indexing by applying the UTF8ArabicPoints filter.
> >
> > --Chris
> >
> > On 11/25/2012 11:45 PM, pola ashraf wrote:
> > > Using a comparison tool from ICU, the two strings turned out to have
> > > different character counts.
> > > Words to compare:
> > > يَسُوعَ
> > > يسوع
> > > This is the name of JESUS Christ in Arabic, but one is shaped and the
> > > other isn't.
> > >
> > > Words converted to HEX format:
> > > \u064a \u064e \u0633 \u064f \u0648 \u0639 \u064e
> > > \u064a \u0633 \u0648 \u0639
> > >
> > > That's why the search engines of some frontends don't return any
> > > results for unshaped words.
> > >
> > > The suggestion is to make the index contain the shaped words plus the
> > > same words without shaping.
> > >
> > > Comparison tool link: https://ssl.icu-project.org/icu-bin/scompare
> > >
> > > Note: to clarify the meaning of shaping, shaping is the usage of
> > > characters like the following ( ٌ ُ ٍ َ ْ ً ).
> > > These special characters are shapes; they may change the whole meaning
> > > of a word and help in correct reading but, as mentioned before, they make
> > > reading harder and cause problems with search functions.
> > >
> > > Note: And Bible searches normally, without problems, but desktop
> > > programs like Xiphos and BibleTime have this problem.
> > >
> > > Pola
> > >
> > > ------------------------------------------------------------------------
> > >
> > > I think Arabic shapes add extra Unicode characters; that's why the 2 same
> > > words I mentioned before don't give the same results.
> > >
> > > ------------------
> > > Any Arabic search problem is unconnected to shaping.
> > >
> > > Modules are routinely created and stored in a normalised format; user
> > > entries, e.g. for search, are equally normalised.
> > >
> > > _______________________________________________
> > > sword-devel mailing list: sword-devel@crosswire.org
> > > http://www.crosswire.org/mailman/listinfo/sword-devel
> > > Instructions to unsubscribe/change your settings at above page