Re: [sword-devel] usfm2osis.py appears to be very broken

2017-02-23 Thread Peter von Kaehne
On Thu, 2017-02-23 at 14:59 -0500, Ryan V wrote: > And reporting it on this > list is the only way I know to guarantee that they see that there is > a problem with it. Thanks Ryan. > The breakage occurred on November 30th, 2016... and this is the > commit that broke it... > > https://github.co

Re: [sword-devel] usfm2osis.py appears to be very broken

2017-02-23 Thread Ryan V
On Thu, Feb 23, 2017 at 2:04 PM, Greg Hellings wrote: > You'll have to give more information before anyone can help you out > I don't need help. I am trying to make it known that there is a problem with your usfm2osis.py conversion script so that it can be fixed. > > On Thu, Feb 23, 2017 at 12

Re: [sword-devel] usfm2osis.py appears to be very broken

2017-02-23 Thread Greg Hellings
You'll have to give more information before anyone can help you out On Thu, Feb 23, 2017 at 12:46 PM, Ryan V wrote: > usfm2osis.py appears to be very broken now. I ran it on a number of bibles > and it failed to properly convert all of them. For the World English Bible > it reported the followin

[sword-devel] usfm2osis.py appears to be very broken

2017-02-23 Thread Ryan V
usfm2osis.py appears to be very broken now. I ran it on a number of bibles and it failed to properly convert all of them. For the World English Bible it reported the following: Unhandled USFM tags: \+bk, \c, \li1 ___ sword-devel mailing list: sw

[sword-devel] usfm2osis.py issue

2016-11-13 Thread Cyrille
Hi, When I use usfm2osis.py I have an issue with the \r marker. The script converts: \r (Mc 1, 7-8; Lc 3, 15-18; Jn 1, 24-28) like this: (Mc 1, 7-8; Lc 3, 15-18; Jn 1, 24-28) And when I try to follow the reference link in Xiphos, it doesn't work. Then I did some "odd jobs" to make it work, I

Re: [sword-devel] usfm2osis.py in bibledit-web

2013-06-09 Thread Chris Little
On 6/8/2013 11:12 PM, David Haslam wrote: Teus's method of [also] having a merged USFM file is not without merit. e.g. During source text development in collaboration with translators and/or publishers. I myself use the same technique - independent to Bibledit. FWIW, it readily facilitates doin

Re: [sword-devel] usfm2osis.py in bibledit-web

2013-06-08 Thread David Haslam
Teus's method of [also] having a merged USFM file is not without merit. e.g. During source text development in collaboration with translators and/or publishers. I myself use the same technique - independent to Bibledit. FWIW, it readily facilitates doing a global search, which is great to check f

Re: [sword-devel] usfm2osis.py in bibledit-web

2013-06-08 Thread Teus Benschop
Hi friends, Thank you for the input. I mentioned basic input consisting of \id, \p, and \v markers only, but did not mention that the \c marker is there too. Thank you Michael for pointing that out. On my workstation, it will now download the newest version of the Python script every ni

Re: [sword-devel] usfm2osis.py in bibledit-web

2013-06-08 Thread Chris Little
On 6/8/2013 8:36 AM, Kahunapule Michael Johnson wrote: On 06/07/2013 11:53 PM, Teus Benschop wrote: When converting basic USFM, consisting of \id, \p, and \v markers only, through the script, it says "Unhandled USFM tags: \id, \v (2 total)". Perhaps I am doing something wrong? Perhaps. You sho

Re: [sword-devel] usfm2osis.py in bibledit-web

2013-06-08 Thread Chris Little
On 6/8/2013 7:52 AM, David Haslam wrote: Teus, Despite recent edits to the developers' wiki, usfm2osis.py is not yet a production release. There are even some unfixed issues with it. See http://www.crosswire.org/tracker/browse/MODTOOLS as well as within the file itself. With MODTOOLS-42 now c

Re: [sword-devel] usfm2osis.py in bibledit-web

2013-06-08 Thread Kahunapule Michael Johnson
On 06/07/2013 11:53 PM, Teus Benschop wrote: When converting basic USFM, consisting of \id, \p, and \v markers only, through the script, it says "Unhandled USFM tags: \id, \v (2 total)". Perhaps I am doing something wrong?

Re: [sword-devel] usfm2osis.py in bibledit-web

2013-06-08 Thread David Haslam
Teus, Despite recent edits to the developers' wiki, usfm2osis.py is not yet a production release. There are even some unfixed issues with it. See http://www.crosswire.org/tracker/browse/MODTOOLS as well as within the file itself. Maintaining a Bibledit export choice between our Perl or Python sc

[sword-devel] usfm2osis.py in bibledit-web

2013-06-08 Thread Teus Benschop
Hi friends, I downloaded the usfm2osis.py script, and integrated it into bibledit-web for conversion from USFM to OSIS. The copyright notice in the script says "CrossWire Bible Society", so I'd like to thank the maker for creating the script. When converting basic USFM, consisting of \id, \p, an

Re: [sword-devel] usfm2osis.py

2013-05-22 Thread Chris Little
On 5/22/2013 5:49 PM, Robert Hunt wrote: Thanks Chris. Yes, multiprocessing is great on my i7. :-) Actually the bug I reported was "usfm2osis.py enters infinite loop". The bug that was fixed was something like "\periph USFM doesn't process pr

Re: [sword-devel] usfm2osis.py

2013-05-22 Thread Robert Hunt
Thanks Chris. Yes, multiprocessing is great on my i7. :-) Actually the bug I reported was "usfm2osis.py enters infinite loop". The bug that was fixed was something like "\periph USFM doesn't process properly". I don't think it really needs a new bug report. To

Re: [sword-devel] usfm2osis.py

2013-05-22 Thread Chris Little
On 5/22/2013 3:26 AM, Robert Hunt wrote: Yes, it seems that Chris did indeed fix the script so that my supplied minimal test case no longer causes the program to require a manual halt. :-) Unfortunately though, processing of that particular USFM field wasn't my main issue. The main issue seems t

Re: [sword-devel] usfm2osis.py

2013-05-22 Thread Robert Hunt
Yes, it seems that Chris did indeed fix the script so that my supplied minimal test case no longer causes the program to require a manual halt. :-) Unfortunately though, processing of that particular USFM field wasn't my main issue. The main issue seems to be that the program does not fail gr

Re: [sword-devel] usfm2osis.py

2013-05-22 Thread Nic Carter
Look at the latest SVN commit. Seems related :) Sent from my phone, hence this email may be short... On 22/05/2013, at 18:13, David Haslam wrote: > Robert, > > http://www.crosswire.org/tracker/browse/MODTOOLS-40 has been closed by > Chris, but without an explanatory closing comment. > > It doe

Re: [sword-devel] usfm2osis.py

2013-05-22 Thread David Haslam
Robert, http://www.crosswire.org/tracker/browse/MODTOOLS-40 has been closed by Chris, but without an explanatory closing comment. It does rather look as though you'll need to isolate the next USFM tag which causes such a loop, and then create a new issue. As it happens, I'm still waiting for Chri

Re: [sword-devel] usfm2osis.py

2013-05-21 Thread Robert Hunt
Hi all, Now that usfm2osis.pl is deprecated in favour of usfm2osis.py, is there any chance that the infinite loop problem noted in February might get looked at? (I've hit it a number of times on my Ubuntu setup on different USFM Bibles.) See a minimal test file at http://www.crosswire.or

Re: [sword-devel] usfm2osis.py

2013-05-09 Thread David Haslam
Further to my earlier reply, I have also moved content (from the talk page) to this main page. http://crosswire.org/wiki/List_of_eXtensions_to_OSIS_used_in_SWORD#eXtensions_generated_by_usfm2osis.py David -- View this message in context: http://sword-dev.350566.n4.nabble.com/usfm2osis-py-tp46

Re: [sword-devel] usfm2osis.py

2013-05-09 Thread David Haslam
Hi Chris, I had some time free - so I've just edited the relevant section of our wiki page. http://crosswire.org/wiki/Converting_SFM_Bibles_to_OSIS#Converting_USFM_files_to_OSIS Please check whether anything needs to be improved. Regards, David

Re: [sword-devel] usfm2osis.py

2013-05-08 Thread David Haslam
To clarify one item of terminology. *Python* is the name of the programming language. *CPython* is the default byte-code interpreter of Python, which is written in C. CPython is Guido van Rossum's reference version of the Python computing language. It's most often called simply "Python"; speake

Re: [sword-devel] usfm2osis.py

2013-05-08 Thread David Haslam
OK - thanks Chris. If I do some edits, feel free to improve or correct any details I get slightly wrong. Best regards, David

Re: [sword-devel] usfm2osis.py

2013-05-08 Thread Chris Little
On 05/08/2013 12:01 AM, Chris Burrell wrote: I think we should also add hailo to the page. .. We don't, as a rule, link to a lot of non-CrossWire/non-Sword software. Haiola in particular is currently not sufficiently mature to warrant linking to as a solution for USFM to OSIS conversion. It t

Re: [sword-devel] usfm2osis.py

2013-05-08 Thread Chris Little
On 05/07/2013 11:45 PM, David Haslam wrote: Hi Chris, Should we edit the wiki page to indicate that usfm2osis.py is released and is now the preferred conversion tool? Sure, feel free. And should we also

Re: [sword-devel] usfm2osis.py

2013-05-08 Thread David Haslam
Chris B, Spelling: It's called Haiola. "Haiola is derived from the Hawaiian phrase “ha’i ola”, which means “preach salvation”." David

Re: [sword-devel] usfm2osis.py

2013-05-08 Thread Chris Burrell
I think we should also add hailo to the page. .. On 8 May 2013 07:46, "David Haslam" wrote: > Hi Chris, > > Should we edit the wiki page > < > http://crosswire.org/wiki/Converting_SFM_Bibles_to_OSIS#Converting_USFM_files_to_OSIS > > > to indicate that usfm2osis.py is released and is now the pref

Re: [sword-devel] usfm2osis.py

2013-05-07 Thread David Haslam
Hi Chris, Should we edit the wiki page to indicate that usfm2osis.py is released and is now the preferred conversion tool? And should we also indicate that the Perl script usfm2osis.pl is henceforth deprec

Re: [sword-devel] usfm2osis.py

2013-05-07 Thread Kahunapule Michael Johnson
My apologies, Chris. I did indeed confuse usfm2osis.py with usfm2osis.pl. I have not tested the former. I'm pleased that a better USFM to OSIS option is being worked on. Haiola currently supports all USFM tags in normal "reader's

Re: [sword-devel] usfm2osis.py

2013-05-07 Thread Chris Little
On 05/07/2013 11:20 AM, Kahunapule Michael Johnson wrote: On 05/07/2013 02:18 AM, David Haslam wrote: Apart from Chris, has anyone else done any testing on his Python script usfm2osis.py? See http://crosswire.org/wiki/Converting_SFM_Bibles_to_OSIS#usfm2osis.py Yes. I found that it works for a

Re: [sword-devel] usfm2osis.py

2013-05-07 Thread Chris Little
It should work fine on CPython 2.6+ and 3.0+ (or might require 3.1 or so--I don't think I tested very low versions of Python3). I have tested on a few different platforms with various versions of Python and haven't seen any problems on that front. IIRC, one or two of the non-CPython Pythons do

Re: [sword-devel] usfm2osis.py

2013-05-07 Thread Kahunapule Michael Johnson
On 05/07/2013 02:18 AM, David Haslam wrote: Apart from Chris, has anyone else done any testing on his Python script usfm2osis.py? See http://crosswire.org/wiki/Converting_SFM_Bibles_to_OSIS#usfm2osis.py Yes. I found that it works for

Re: [sword-devel] usfm2osis.py

2013-05-07 Thread Greg Hellings
On Tue, May 7, 2013 at 12:46 PM, David Haslam wrote: > Hi Peter, > > Which version of Python did you use it with ? > > Has anyone been successful with Python 3.3.x ? > > Chris designed it for CPython 2.7+ (but support CPython 3 and other > interpreters if possible). > I used it with CPython 2.7.

Re: [sword-devel] usfm2osis.py

2013-05-07 Thread David Haslam
Hi Peter, Which version of Python did you use it with ? Has anyone been successful with Python 3.3.x ? Chris designed it for CPython 2.7+ (but support CPython 3 and other interpreters if possible). cf. python.org states: If you don't know which version to use, try Python 3.3. Some existing th

Re: [sword-devel] usfm2osis.py

2013-05-07 Thread ref...@gmx.net
Yes, it works fine Sent from my HTC - Reply message - From: "David Haslam" To: Subject: [sword-devel] usfm2osis.py Date: Tue, May 7, 2013 13:18 Apart from Chris, has anyone else done any testing on his Python script usfm2osis.py ? See http://crosswir

Re: [sword-devel] usfm2osis.py

2013-05-07 Thread Greg Hellings
It has been a while, but I also used it successfully. --Greg On Tue, May 7, 2013 at 7:18 AM, David Haslam wrote: > Apart from Chris, has anyone else done any testing on his Python script > usfm2osis.py ? > > See http://crosswire.org/wiki/Converting_SFM_Bibles_to_OSIS#usfm2osis.py > > David > >

Re: [sword-devel] usfm2osis.py

2013-05-07 Thread Chris Burrell
I've tested the Hailo outputs by Michael, which seem to work ok, apart from some of the metadata sometimes displaying when it shouldn't. Chris On 7 May 2013 13:18, David Haslam wrote: > Apart from Chris, has anyone else done any testing on his Python script > usfm2osis.py ? > > See http://cros

[sword-devel] usfm2osis.py

2013-05-07 Thread David Haslam
Apart from Chris, has anyone else done any testing on his Python script usfm2osis.py? See http://crosswire.org/wiki/Converting_SFM_Bibles_to_OSIS#usfm2osis.py David

Re: [sword-devel] usfm2osis.py and non-ASCII filenames

2012-12-16 Thread David Haslam
Hi Robert, Chris is already collecting issues relating to his Python script in http://www.crosswire.org/tracker/browse/MODTOOLS Please create a new issue therein. Thanks. David

[sword-devel] usfm2osis.py and non-ASCII filenames

2012-12-15 Thread Robert Hunt
Hi there, I'm running Ubuntu Linux and was trying to use usfm2osis in a script, but hit this error: .../sword-tools/modules/python/usfm2osis.py:1460: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them

Re: [sword-devel] usfm2osis.py and crossreferences

2012-10-13 Thread Peter von Kaehne
boration Forum > Subject: Re: [sword-devel] usfm2osis.py and crossreferences > On 10/12/2012 10:53 PM, Peter von Kaehne wrote: > > Currently usfm2osis.py does not produce complete cross references. > > > > a) It translates the origin reference contained in the \xo tag as a

Re: [sword-devel] usfm2osis.py and crossreferences

2012-10-13 Thread ref...@gmx.net
Sent from my HTC - Reply message - From: "Chris Little" To: "SWORD Developers' Collaboration Forum" Subject: [sword-devel] usfm2osis.py and crossreferences Date: Sat, Oct 13, 2012 7:41 am On 10/12/2012 10:53 PM, Peter von Kaehne wrote: > Currently u

Re: [sword-devel] usfm2osis.py and crossreferences

2012-10-12 Thread Chris Little
On 10/12/2012 10:53 PM, Peter von Kaehne wrote: Currently usfm2osis.py does not produce complete cross references. a) It translates the origin reference contained in the \xo tag as a There's a roadmap in usfm2osis.py that includes reference parsing as a post-1.0 feature. At present, usfm2

[sword-devel] usfm2osis.py and crossreferences

2012-10-12 Thread Peter von Kaehne
Currently usfm2osis.py does not produce complete cross references. a) It translates the origin reference contained in the \xo tag as a

Re: [sword-devel] usfm2osis.py and tag \cp

2012-10-12 Thread Peter von Kaehne
> From: Chris Little > \cp (like \vp) is a workaround for a limitation in Paratext. Thanks, this was me being confused. > You should look into \ca or \cl as alternatives. Thanks. \cl is probably what I looked for. Will see. Thanks, even more so, for fixing the bug/crash! Peter

Re: [sword-devel] usfm2osis.py and tag \cp

2012-10-12 Thread Chris Little
On 10/12/2012 4:00 AM, Peter von Kaehne wrote: Sorry, while the crash has gone, the function is not correct - at all. \cp is meant to give a printed chapter number which has no influence on the underlying counting of verses and chapters. How exactly to represent it in OSIS, we would need to figu

Re: [sword-devel] usfm2osis.py and tag \cp

2012-10-12 Thread Peter von Kaehne
> From: Chris Little > I hope I've fixed this now. (I haven't tested that it functions > correctly, but the error was fairly obvious from the traceback below.) Hi Chris, Sorry, while the crash has gone, the function is not correct - at all. \cp is meant to give a printed chapter number which

Re: [sword-devel] usfm2osis.py and tag \cp

2012-10-11 Thread Chris Little
I hope I've fixed this now. (I haven't tested that it functions correctly, but the error was fairly obvious from the traceback below.) The application will almost always need Ctrl-C to break out because of the multithreading (and because I haven't bothered to add much exception handling). --

Re: [sword-devel] usfm2osis.py and tag \cp

2012-10-11 Thread David Haslam
Bugs & tasks for usfm2osis.py may be reported as issues in JIRA under MODTOOLS. Chris has already begun to use JIRA for this purpose; see http://www.crosswire.org/tracker/browse/MODTOOLS-32 http://www.crosswire.org/tracker/browse/MODTOOLS-33 http://www.crosswire.org/tracker/browse/MODTOOLS-34 htt

[sword-devel] usfm2osis.py and tag \cp

2012-10-11 Thread Peter von Kaehne
The USFM \cp tag (used for chapter markers different from those of the versification in use) crashes usfm2osis.py reliably. The programme needs a Ctrl-C interrupt to get out of its state. The following minimal USFM code produces the error message attached below. \id EST \h ESTER \c 1 \cp A \s En Mordekai

Re: [sword-devel] usfm2osis.py

2012-09-30 Thread Chris Little
On 09/26/2012 03:15 PM, Greg Hellings wrote: Chris, I just tried to switch over to using usfm2osis.py and there are two minor issues: 1) The script is giving me an output language on the container tag of xml:lang="und". This should read xml:lang="tke" but I don't know if it's possible to determ

Re: [sword-devel] usfm2osis.py

2012-09-27 Thread David Haslam
Of the six instances of "x-indent-" in the script, three of them use the passed parameter in the form of "\1". I guess those must be the places to look. Probably something being mis-parsed. The regexps are not simple in these lines. Search "x-indent-\1" (3 hits in 1 file) D:\Download\Bible Soft
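
To illustrate the kind of substitution David is pointing at, here is a minimal sketch of a regex replacement that carries a captured digit into an "x-indent-" value through the "\1" backreference. The pattern and the output element are hypothetical, written for this example only, and are not copied from usfm2osis.py.

```python
import re

# Hypothetical pattern (an assumption, not taken from usfm2osis.py):
# a list-item marker such as \li1 or \li2 followed by its text.
LIST_ITEM = re.compile(r"\\li(\d)\s+(.*)")


def convert_list_item(line):
    """Rewrite a \\liN line, passing N through as the \\1 backreference."""
    return LIST_ITEM.sub(r'<item type="x-indent-\1">\2</item>', line)


if __name__ == "__main__":
    print(convert_list_item(r"\li2 Some text"))
    # A marker without a digit does not match and is passed through
    # unchanged, which is one way a tag can end up mis-parsed.
    print(convert_list_item(r"\li Some text"))
```

If a pattern like this expects a digit that the source text omits, the line is left untouched, so checking the capture groups is a reasonable first step.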

[sword-devel] usfm2osis.py

2012-09-26 Thread Greg Hellings
Chris, I just tried to switch over to using usfm2osis.py and there are two minor issues: 1) The script is giving me an output language on the container tag of xml:lang="und". This should read xml:lang="tke" but I don't know if it's possible to determine that. I'd like to be able to set that as a

Re: [sword-devel] usfm2osis.py

2012-08-07 Thread Chris Little
Further to the issue of wide (UCS-4) vs. narrow (UCS-2) builds, I just thought I'd report that the Python that ships with MacOS X is also compiled with UCS-2. That's both on Lion (Python 2.6.6, I think) and Mountain Lion (2.7.2). And trying to get them to acknowledge characters beyond 0x ge
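
For readers who want to check which kind of build they have, the following sketch (written for Python 3; the function names are ours) reads sys.maxunicode, which is 0xFFFF on a narrow build and 0x10FFFF on a wide one. On CPython 3.3 and later the distinction no longer exists and the value is always 0x10FFFF.

```python
import sys


def is_narrow_build():
    """True when the interpreter's largest code point is U+FFFF (narrow build)."""
    return sys.maxunicode == 0xFFFF


def utf16_code_units(ch):
    """Number of UTF-16 code units needed for one character.

    Characters beyond the Basic Multilingual Plane need two (a surrogate
    pair), which is how a narrow build stores them internally.
    """
    return len(ch.encode("utf-16-le")) // 2


if __name__ == "__main__":
    print("maxunicode: 0x%X" % sys.maxunicode)
    print("narrow build:", is_narrow_build())
    print("code units for U+E0030:", utf16_code_units("\U000E0030"))
```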

Re: [sword-devel] usfm2osis.py

2012-08-06 Thread David Haslam
Just pasted the character counts into Excel. FWIW, the non-characters (FDD0-FDEF) since adopted by Chris in the latest edit of the script can be made visible by choosing the Symbol font in Windows. This font messes up everything else, but you can at least distinguish between (most of) these (only
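
As a side note, the FDD0-FDEF range lies inside the Basic Multilingual Plane, so these code points fit in a single 16-bit unit. A minimal sketch of a noncharacter test, based on the Unicode definition (the FDD0..FDEF range plus the last two code points of every plane) rather than on anything in the script:

```python
def is_noncharacter(cp):
    """True if the code point is one of Unicode's 66 noncharacters."""
    # U+FDD0..U+FDEF, plus U+xxFFFE and U+xxFFFF in each of the 17 planes.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


if __name__ == "__main__":
    total = sum(1 for cp in range(0x110000) if is_noncharacter(cp))
    print("noncharacters:", total)
```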

Re: [sword-devel] usfm2osis.py

2012-08-06 Thread David Haslam
Chris writes, "I'll be honest that I don't know what if any use \w...\w* is to us. I think none. I put an tag in, but this will just get passed through Sword without any rendering effect at all. The same applies to the \wg, \wh, and \ndx tags in that section." If and when we can get some action

Re: [sword-devel] usfm2osis.py

2012-08-06 Thread Chris Little
On 08/06/2012 01:40 AM, David Haslam wrote: Chris, We have definitely encountered the glossary tag pair: *\w...\w** yet I noticed that this was listed (so far) as unsupported in your script (line 660). Every tag is already supported in some way*. I just forgot to move the list of supported

Re: [sword-devel] usfm2osis.py

2012-08-06 Thread Chris Little
On 08/05/2012 08:11 PM, Greg Hellings wrote: ActiveState Python also uses 65535 for maxunicode in their 2.7 builds. --Greg So Python is universally hopeless. I'll rewrite in Ruby. --Chris

Re: [sword-devel] usfm2osis.py

2012-08-06 Thread David Haslam
Chris, We have definitely encountered the glossary tag pair: *\w...\w** yet I noticed that this was listed (so far) as unsupported in your script (line 660). It's used in the source files we have for the New Turkish Bible. It's also used in the Shona translation we had from Teus. David

Re: [sword-devel] usfm2osis.py

2012-08-06 Thread David Haslam
Thanks for the detailed observations about the blocks in the Supplementary Multilingual Plane. I stand corrected. David

Re: [sword-devel] usfm2osis.py

2012-08-06 Thread David Haslam
This question takes us back to my comparison of the two C2C conversion tools. - The utility within Mac OS X - BabelPad (latest release 6.1.0.7) When I use the latter to convert Traditional Chinese to Simplified Chinese, the output includes several codepoints in the Supplementary Ideographic Plane,

Re: [sword-devel] usfm2osis.py

2012-08-06 Thread Chris Little
On 8/6/2012 12:01 AM, David Haslam wrote: Further to my last reply, I think we can safely assume that we are more likely to process *Chinese* text than any of the scripts that require characters from the *Supplementary Multilingual Plane*.
Range         Block               Code Points
10000..1007F  Linear B Syl

Re: [sword-devel] usfm2osis.py

2012-08-06 Thread Chris Little
On 8/5/2012 11:56 PM, David Haslam wrote: The text of some *Chinese Bibles* includes CJK ideograms that are in the Supplementary Ideographic Plane.
Range         Block                               Code Points
20000..2A6DF  CJK Unified Ideographs Extension B  42,720
2A6E0..2A6FF                                      32
2A700..2B73F  CJK Unified Id

Re: [sword-devel] usfm2osis.py

2012-08-06 Thread David Haslam
Further to my last reply, I think we can safely assume that we are more likely to process *Chinese* text than any of the scripts that require characters from the *Supplementary Multilingual Plane*.
Range         Block               Code Points
10000..1007F  Linear B Syllabary  128
10080..100FF  Linear B Id

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread David Haslam
The text of some *Chinese Bibles* includes CJK ideograms that are in the Supplementary Ideographic Plane.
Range         Block                               Code Points
20000..2A6DF  CJK Unified Ideographs Extension B  42,720
2A6E0..2A6FF                                      32
2A700..2B73F  CJK Unified Ideographs Extension C  4,160
2B740..2B81F

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread Greg Hellings
ActiveState Python also uses 65535 for maxunicode in their 2.7 builds. --Greg

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread Chris Little
On 8/5/2012 7:40 PM, Robert Hunt wrote: On 06/08/12 14:20, Chris Little wrote: Linux packagers apparently go the UCS-4 route, so I didn't notice any issue with using the Language Tags. But trying the above on Windows shows that the cygwin build and the builds from python.org (2.7 & 3.2) all use

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread Robert Hunt
On 06/08/12 14:20, Chris Little wrote: Linux packagers apparently go the UCS-4 route, so I didn't notice any issue with using the Language Tags. But trying the above on Windows shows that the cygwin build and the builds from python.org (2.7 & 3.2) all use UCS-2. So my script won't work correctl

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread Chris Little
On 8/5/2012 5:28 PM, Greg Hellings wrote: On Sun, Aug 5, 2012 at 7:19 PM, Chris Little wrote: On Aug 5, 2012, at 11:37 AM, David Haslam wrote: FWIW, I just came across this http://www.pythonregex.com/ Python Regular Expression Testing Tool Does Python support the full 21-bit Unicode rang

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread Greg Hellings
On Sun, Aug 5, 2012 at 7:19 PM, Chris Little wrote: > > > On Aug 5, 2012, at 11:37 AM, David Haslam wrote: > >> FWIW, I just came across this http://www.pythonregex.com/ Python Regular >> Expression Testing Tool >> >> Does Python support the full 21-bit Unicode range? >> >> cf. Many other regula

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread Chris Little
On Aug 5, 2012, at 11:37 AM, David Haslam wrote: > FWIW, I just came across this http://www.pythonregex.com/ Python Regular > Expression Testing Tool > > Does Python support the full 21-bit Unicode range? > > cf. Many other regular expression engines only support the Basic > Multilingual Pl

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread David Haslam
See also the http://en.wikipedia.org/wiki/Comparison_of_regular_expression_engines Comparison of regular expression engines on Wikipedia. If the table is not out of date, it would appear that Perl can do some regexp things that Python can't. e.g. Recursion, etc. David

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread David Haslam
FWIW, I just came across this http://www.pythonregex.com/ Python Regular Expression Testing Tool Does Python support the full 21-bit Unicode range? cf. Many other regular expression engines only support the Basic Multilingual Plane. David

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread Chris Little
On 8/5/2012 12:38 AM, David Haslam wrote: Although I haven't done this yet, I understand that it's feasible to install more than one version of Python in the same computer. So (assuming I do get that far), I should be able to install Python 2.7. I'm not certain of this, especially on Windows,

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread Chris Little
On 8/5/2012 12:29 AM, David Haslam wrote: Chris, Thanks for the explanation. Nice to "learn something new each day." It was new to me, and probably also for Peter. However, such tag characters have become deprecated in Unicode 5.1 (2008). See http://en.wikipedia.org/wiki/Unicode_control_chara

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread David Haslam
Chris, Thanks for the explanation. Nice to "learn something new each day." It was new to me, and probably also for Peter. However, such tag characters have become deprecated in Unicode 5.1 (2008). See http://en.wikipedia.org/wiki/Unicode_control_characters#Language_tags http://en.wikipedia.org

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread David Haslam
Although I haven't done this yet, I understand that it's feasible to install more than one version of Python in the same computer. So (assuming I do get that far), I should be able to install Python 2.7. Hmmm! The *Software History* page in the help for Python 3.2.x jumps straight from 2.6.4 to

Re: [sword-devel] usfm2osis.py

2012-08-05 Thread Chris Little
On 08/04/2012 10:10 PM, Robert Hunt wrote: On 05/08/12 00:15, Chris Little wrote: Bug reports are welcome if you try it, but this is still largely untested stuff, so expect bugs. The other script in the above directory can be used to identify all of the USFM tags used in a set of files and wil
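
For readers without access to that companion script, here is a rough sketch of how a USFM tag inventory could be built. It is not the CrossWire script: the function name and the marker pattern (a backslash, an optional "+" for nested markers, lowercase letters, optional digits, and an optional closing "*") are assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed marker shape: \id, \c, \li1, \+bk, and closing forms such as \w*
MARKER = re.compile(r"\\\+?[a-z]+[0-9]*\*?")


def usfm_marker_inventory(texts):
    """Count every USFM marker found in an iterable of USFM strings."""
    counts = Counter()
    for text in texts:
        counts.update(MARKER.findall(text))
    return counts


if __name__ == "__main__":
    sample = "\\id EST\n\\c 1\n\\li1 An item \\+bk Title\\+bk*\n\\w word\\w*"
    for marker, n in sorted(usfm_marker_inventory([sample]).items()):
        print(marker, n)
```

Comparing such an inventory against the "Unhandled USFM tags" message gives a quick view of which markers a given text actually uses.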

Re: [sword-devel] usfm2osis.py

2012-08-04 Thread Robert Hunt
On 05/08/12 00:15, Chris Little wrote: Bug reports are welcome if you try it, but this is still largely untested stuff, so expect bugs. The other script in the above directory can be used to identify all of the USFM tags used in a set of f

Re: [sword-devel] usfm2osis.py

2012-08-04 Thread Chris Little
On 08/04/2012 10:22 AM, David Haslam wrote: Wow! What Peter means is that after all the ASCII stuff (up to the tilde), these are also counted: 0E0030 󠀰 14 TAG DIGIT ZERO 0E0031 󠀱 11 TAG DIGIT ONE 0E0032 󠀲 10 TAG DIGIT TWO 0E0033 󠀳 7 TAG DIGIT THR

Re: [sword-devel] usfm2osis.py

2012-08-04 Thread Chris Little
On 08/04/2012 10:19 AM, Greg Hellings wrote: I'm not at a place where I can check it out right now, but does it cover the functionality that previously was required in xreffix.pl? Since the Perl bindings seem to have gone belly-up on 64-bit machines, it would be great if all of this could be comb

Re: [sword-devel] usfm2osis.py

2012-08-04 Thread Chris Little
On 08/04/2012 07:04 AM, David Haslam wrote: I see after downloading your script that this is already answered. # Target Python 2.7+ (but not 3) David Right. Python 3 is significantly different. I haven't bothered to learn it and don't plan to make usfm2osis.py a Python 3 application at any

Re: [sword-devel] usfm2osis.py

2012-08-04 Thread David Haslam
Wow! What Peter means is that after all the ASCII stuff (up to the tilde), these are also counted: 0E0030 󠀰 14 TAG DIGIT ZERO 0E0031 󠀱 11 TAG DIGIT ONE 0E0032 󠀲 10 TAG DIGIT TWO 0E0033 󠀳 7 TAG DIGIT THREE 0E0034 󠀴 6 TAG DIGIT FOUR 0E0
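
A listing like this can be reproduced with the standard unicodedata module. The sketch below (Python 3; the function name is ours) counts characters from the Tags block, U+E0000..U+E007F, and reports each with its code point and Unicode name. It illustrates the idea and is not the tool David used.

```python
import unicodedata
from collections import Counter

# The Unicode "Tags" block, which holds TAG DIGIT ZERO and its neighbours.
TAG_BLOCK = range(0xE0000, 0xE0080)


def count_tag_characters(text):
    """Return (code point, count, name) for each Tags-block character in text."""
    counts = Counter(ch for ch in text if ord(ch) in TAG_BLOCK)
    return [
        ("%06X" % ord(ch), n, unicodedata.name(ch, "<unnamed>"))
        for ch, n in sorted(counts.items())
    ]


if __name__ == "__main__":
    sample = "abc\U000E0030\U000E0030\U000E0031"
    for row in count_tag_characters(sample):
        print(*row)
```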

Re: [sword-devel] usfm2osis.py

2012-08-04 Thread Greg Hellings
I'm not at a place where I can check it out right now, but does it cover the functionality that previously was required in xreffix.pl? Since the Perl bindings seem to have gone belly-up on 64-bit machines, it would be great if all of this could be combined in a single step (even if it's an optional

Re: [sword-devel] usfm2osis.py

2012-08-04 Thread Peter von Kaehne
On 04/08/12 13:15, Chris Little wrote: > usfm2osis.py is posted now, at > Bug reports are welcome if you try it, but this is still largely > untested stuff, so expect bugs. Is it meant to be that there are some very odd characters in the file? Peter

Re: [sword-devel] usfm2osis.py

2012-08-04 Thread David Haslam
I see after downloading your script that this is already answered. # Target Python 2.7+ (but not 3) David

Re: [sword-devel] usfm2osis.py

2012-08-04 Thread David Haslam
Chris, Do you foresee any issues if I try to run it with Python 3.2.3 x64 in Windows 7? From the readme.txt file "Python 3.x is a new version of the language, which is incompatible with the 2.x line of releases. The language is mostly the same, but many details, especially how built-in object

[sword-devel] usfm2osis.py

2012-08-04 Thread Chris Little
usfm2osis.py is posted now, at http://www.crosswire.org/svn/sword-tools/trunk/modules/python/ It was developed on/for CPython 2.7.3, but 2.6+ should work. PyPy works fine too, but takes more than twice as long to run. And Jython is not supported at all. The utility is not perfect & the code i