Re: [sword-devel] testing for diacritics

2015-09-02 Thread Peter von Kaehne
On Wed, 2015-09-02 at 07:18 -0700, David Haslam wrote: > Windows users may find it useful to know that BabelPad has a menu > option to > strip diacritics. Again, the point of my request is not to remove diacritics per se, but to test their presence (the presence only of those we have a filter for

Re: [sword-devel] testing for diacritics

2015-09-02 Thread Peter von Kaehne
On Wed, 2015-09-02 at 10:21 -0700, David Haslam wrote: > Isn't that only a LocalStripFilter? > > http://www.crosswire.org/wiki/DevTools:conf_Files#Strip_Filters > > Or can any of these existing filters be used as a GlobalOptionFilter, > becoming usable only when front-ends (and the relevant sword

Re: [sword-devel] testing for diacritics

2015-09-02 Thread David Haslam
Isn't that only a LocalStripFilter? http://www.crosswire.org/wiki/DevTools:conf_Files#Strip_Filters Or can any of these existing filters be used as a GlobalOptionFilter, becoming usable only when front-ends (and the relevant sword utilities) provide UI options to toggle them? cf. Diatheke alread

Re: [sword-devel] testing for diacritics

2015-09-02 Thread Matěj Cepl
On 2015-09-02, 14:43 GMT, David Haslam wrote: > For an online utility see http://www.harakat.ae/ > > فِي الْبَدْءِ خَلَقَ اللهُ السَّمَاوَاتِ وَالأَرْضَ، > becomes > في البدء خلق الله السماوات والأرض، With a bit of web-scraping, one could make a library using it as a webservice, couldn't we? M

Re: [sword-devel] testing for diacritics

2015-09-02 Thread Peter von Kaehne
We have a filter which does that - UTF8ArabicPoints. Peter On Wed, 2015-09-02 at 09:08 -0700, David Haslam wrote: > Peter, > > Are you also contemplating a new configuration item? e.g. > > GlobalOptionFilter=UTF8ArabicHarraket > > Might this be a useful enhancement to module AraNAV ? > > Dav

Re: [sword-devel] testing for diacritics

2015-09-02 Thread David Haslam
Peter, Are you also contemplating a new configuration item? e.g. GlobalOptionFilter=UTF8ArabicHarraket Might this be a useful enhancement to module AraNAV ? David -- View this message in context: http://sword-dev.350566.n4.nabble.com/testing-for-diacritics-tp4655091p4655188.html Sent from t

Re: [sword-devel] testing for diacritics

2015-09-02 Thread David Haslam
Devocalizing, i.e. stripping the diacritics from "Arabic with harakat", is not such a simple Unicode conversion. That is, one doesn't just have to remove diacritic characters from the encoded text, but rather replace individual characters with harakat by characters without them. For an online utility see http://
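For texts where the harakat are stored as separate combining characters, a presence test and a strip can be sketched in a few lines. This is only a minimal sketch under that assumption: the range U+064B (fathatan) to U+0652 (sukun) and the helper names has_harakat and strip_harakat are illustrative choices, not anything taken from SWORD or from the utility mentioned above. It does not touch other marks (for example U+0670) or any letter-form differences.

```python
import re

# Assumption: harakat are encoded as separate combining marks in the
# Arabic block, U+064B (fathatan) through U+0652 (sukun).
HARAKAT = re.compile("[\u064B-\u0652]")


def has_harakat(text):
    """Return True if text contains at least one mark in the assumed range."""
    return HARAKAT.search(text) is not None


def strip_harakat(text):
    """Return text with every mark in the assumed range removed."""
    return HARAKAT.sub("", text)
```

Whether this is enough for a given module depends on how its source text was encoded, so it is worth checking a sample by eye before relying on it.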

Re: [sword-devel] testing for diacritics

2015-09-02 Thread David Haslam
Windows users may find it useful to know that BabelPad has a menu option to strip diacritics. Convert | Other | Strip diacritics It certainly works well for Cyrillic & Latin scripts, as well as Hebrew & Greek. It may not work for Arabic/Persian scripts. Can you provide some examples of such wi

Re: [sword-devel] testing for diacritics

2015-09-01 Thread Peter von Kaehne
On Fri, 2015-08-28 at 14:13 -0400, Ryan wrote: > On Thu, 2015-08-27 at 23:22 +0100, Peter von Kaehne wrote: > > Is there a clever and reliable way one could test in a given OSIS > > text > > to see whether it contains diacritically enhanced texts or not? > > Perl, > > preferably. > > > > Specif

Re: [sword-devel] testing for diacritics

2015-08-28 Thread Ryan
On Thu, 2015-08-27 at 23:22 +0100, Peter von Kaehne wrote: > Is there a clever and reliable way one could test in a given OSIS text > to see whether it contains diacritically enhanced texts or not? Perl, > preferably. > > Specifically Hebrew, Arabic type alphabets and Greek - for all of which > w

Re: [sword-devel] testing for diacritics

2015-08-28 Thread Peter Von Kaehne
> Sent: Friday, 28 August 2015 at 16:59 > From: "Matěj Cepl" > > This would probably work on Latin scripts with diacritics, but not on > > the scripts I am interested in - Hebrew, Arabic-derived and Greek. > > Did you try? Yes :-)

Re: [sword-devel] testing for diacritics

2015-08-28 Thread Matěj Cepl
On 2015-08-28, 08:21 GMT, Peter von Kaehne wrote: > On Fri, 2015-08-28 at 01:27 +0200, Matěj Cepl wrote: >> iconv -f utf8 -t us-ascii//translit file.xml \ >> |diff -u - file.xml > > This would probably work on Latin scripts with diacritics, but not on > the scripts I am interested in - Hebr

Re: [sword-devel] testing for diacritics

2015-08-28 Thread Peter Von Kaehne
factor out of module making. Peter > Sent: Friday, 28 August 2015 at 15:42 > From: "David Troidl" > To: sword-devel@crosswire.org > Subject: Re: [sword-devel] testing for diacritics > > How about regular expressions: > > Modern Greek Accented > [\u03

Re: [sword-devel] testing for diacritics

2015-08-28 Thread David Troidl
How about regular expressions: Modern Greek Accented [\u0370-\u0390 \u03AA-\u03B0 \u03CA-\u03D4] Polytonic Greek Accented [\u1F00-\u1FFE] Hebrew Vowel Points [\u05B0-\u05BB] Hebrew Cantillation [\u0591-\u05AE] I don't know about Arabic. Peace, David On 8/28/2015 4:21 AM, Peter von Kaehne w
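The ranges in this post could be wired into a small presence test along the following lines. This is a sketch, not a tested tool: the class names and the function diacritic_classes are hypothetical, the spaces inside the Greek character class are dropped so that a plain space does not match, and the Hebrew vowel-point range is written low-to-high (U+05B0 to U+05BB) because a character class needs ascending bounds.

```python
import re

# Ranges as suggested in the post. The dictionary keys are illustrative
# labels only; they are not SWORD filter names.
RANGES = {
    "greek_modern_accented": "[\u0370-\u0390\u03AA-\u03B0\u03CA-\u03D4]",
    "greek_polytonic": "[\u1F00-\u1FFE]",
    "hebrew_vowel_points": "[\u05B0-\u05BB]",
    "hebrew_cantillation": "[\u0591-\u05AE]",
}
PATTERNS = {name: re.compile(rx) for name, rx in RANGES.items()}


def diacritic_classes(text):
    """Return the sorted labels of all classes with at least one match."""
    return sorted(name for name, rx in PATTERNS.items() if rx.search(text))
```

Reading the OSIS file as UTF-8 and passing its contents to diacritic_classes would then give a list of labels that someone could map by hand to the matching filter lines in the conf file.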

Re: [sword-devel] testing for diacritics

2015-08-28 Thread Peter von Kaehne
On Fri, 2015-08-28 at 09:21 +0100, Peter von Kaehne wrote: > On Fri, 2015-08-28 at 01:27 +0200, Matěj Cepl wrote: > > iconv -f utf8 -t us-ascii//translit file.xml \ > > |diff -u - file.xml > > Thanks Matej, > > This would probably work on Latin scripts with diacritics, but not on > the sc

Re: [sword-devel] testing for diacritics

2015-08-28 Thread Peter von Kaehne
On Fri, 2015-08-28 at 01:27 +0200, Matěj Cepl wrote: > iconv -f utf8 -t us-ascii//translit file.xml \ > |diff -u - file.xml Thanks Matej, This would probably work on Latin scripts with diacritics, but not on the scripts I am interested in - Hebrew, Arabic-derived and Greek. Peter

Re: [sword-devel] testing for diacritics

2015-08-27 Thread Matěj Cepl
On 2015-08-27, 22:22 GMT, Peter von Kaehne wrote: > Is there a clever and reliable way one could test in a given OSIS text > to see whether it contains diacritically enhanced texts or not? Perl, > preferably. What about the following? $ iconv -f utf8 -t us-ascii//translit file.xml \
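The same compare-against-a-stripped-copy idea can be sketched without ASCII transliteration, which may make it usable beyond Latin scripts. The sketch below assumes that "diacritic" can be approximated by Unicode general category Mn (nonspacing mark) after NFD decomposition; that is an approximation, since Mn also covers marks that some scripts would not call diacritics. The function names strip_marks and has_marks are hypothetical.

```python
import unicodedata


def strip_marks(text):
    """Decompose text (NFD) and drop every nonspacing mark (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def has_marks(text):
    """Return True if the NFD form of text contains any nonspacing mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(unicodedata.category(ch) == "Mn" for ch in decomposed)
```

Unlike the iconv pipeline, this reports marks in any script, so it answers "are there diacritics at all" rather than "which filter applies".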

[sword-devel] testing for diacritics

2015-08-27 Thread Peter von Kaehne
Is there a clever and reliable way one could test in a given OSIS text to see whether it contains diacritically enhanced texts or not? Perl, preferably. Specifically Hebrew, Arabic-type alphabets and Greek - for all of which we have a special GlobalOptionFilter. I create most of the conf files a