Title: signature
I'm breaking my long period of ignoring and avoiding OSIS, and working on building a USFX to MOSIS converter into the open source Haiola software, both into the UI tool and as a stand-alone cross-platform executable. The "M" in "MOSIS" is for "Modified". The only significant modification is a shift in the semantics of <q who="Jesus" sID="somethingunique" marker="">: it is used only in milestone form, only for quotations by Jesus (as an equivalent of the <wj> tag), and, rather than appearing only at the beginning and end of the quotation, it stops and starts at verse boundaries. The proper quotation punctuation for each translation is always in the text of the translation, where almost all translators believe it belongs. The result is not exactly in line with the original intentions of OSIS, but it should validate against the schema fine and actually be easier to display. This is a fairly harmless exception for Sword use, since the result is processed to display on a verse-by-verse basis anyway.
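
As a rough illustration of the milestone form described above, here is a minimal Python sketch (not the C# code in Haiola). The function name, the input shape (a list of text segments per verse, each flagged as spoken by Jesus or not), and the ".wjN" ID scheme are all assumptions made up for the example; only the <q who="Jesus" sID="..." marker=""> form comes from the description above.

```python
from xml.sax.saxutils import escape


def verse_to_osis(osis_id, segments):
    """Render one verse as OSIS text with milestone <q> markers.

    segments is a list of (text, spoken_by_jesus) pairs. Every quotation
    milestone opened in the verse is closed before the verse ends, so no
    <q> pair ever crosses a verse boundary.
    """
    parts = [f'<verse sID="{osis_id}" osisID="{osis_id}"/>']
    for n, (text, spoken_by_jesus) in enumerate(segments, start=1):
        if spoken_by_jesus:
            q_id = f"{osis_id}.wj{n}"  # hypothetical unique-ID scheme
            # marker="" tells the front end not to generate quotation
            # marks; the translation's own punctuation stays in the text.
            parts.append(f'<q who="Jesus" sID="{q_id}" marker=""/>')
            parts.append(escape(text))
            parts.append(f'<q eID="{q_id}" marker=""/>')
        else:
            parts.append(escape(text))
    parts.append(f'<verse eID="{osis_id}"/>')
    return "".join(parts)
```

Calling verse_to_osis("Matt.4.19", [("He said, ", False), ("Follow me.", True)]) would wrap only the second segment in a start/end milestone pair, both inside the verse.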

Anyone who really understands both the source and destination standards knows that it takes more than simple replacement of tags (with awk, for example) to get the conversion right. I'm working in C#, because that is the tool I know best, although other languages could work, too. It is the actual logic implemented that matters.
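
To show why plain tag substitution falls short, here is a small Python sketch (again, not the C# code in Haiola). It assumes a simplified USFX-like fragment in which <v id="..."/> and <ve/> are verse start and end milestones and <wj> is a container that may span a verse boundary. The sample fragment, the class and function names, and the ID scheme are hypothetical, and paragraph markup is simply dropped to keep the sketch short. Renaming <wj> to a <q> container would leave the quotation straddling two verses; the walker below tracks state so that it closes the quotation before each verse end and reopens it in the next verse.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical, simplified USFX-like input: the <wj> span crosses a verse boundary.
SAMPLE = ('<p><v id="1"/>He said, <wj>Come. <ve/><v id="2"/>Follow me.</wj>'
          ' They went.<ve/></p>')


class Walker:
    def __init__(self, prefix):
        self.prefix = prefix    # e.g. "Matt.4"
        self.out = []
        self.verse = None       # osisID of the verse currently open
        self.in_wj = False      # are we inside a <wj> container?
        self.open_q = None      # sID of the <q> milestone currently open
        self.count = 0

    def close_q(self):
        if self.open_q is not None:
            self.out.append(f'<q eID="{self.open_q}" marker=""/>')
            self.open_q = None

    def text(self, s):
        if not s:
            return
        # Open a quotation milestone lazily, only when words of Jesus
        # actually occur inside an open verse.
        if self.in_wj and self.open_q is None and self.verse and s.strip():
            self.count += 1
            self.open_q = f"{self.verse}.wj{self.count}"
            self.out.append(f'<q who="Jesus" sID="{self.open_q}" marker=""/>')
        self.out.append(escape(s))

    def walk(self, el):
        if el.tag == "v":
            self.verse = f"{self.prefix}.{el.get('id')}"
            self.out.append(f'<verse sID="{self.verse}" osisID="{self.verse}"/>')
        elif el.tag == "ve":
            self.close_q()      # never let a quotation cross the verse end
            self.out.append(f'<verse eID="{self.verse}"/>')
            self.verse = None
        was_in_wj = self.in_wj
        if el.tag == "wj":
            self.in_wj = True
        self.text(el.text)
        for child in el:
            self.walk(child)
            self.text(child.tail)
        if el.tag == "wj":
            self.close_q()
            self.in_wj = was_in_wj


def convert_fragment(usfx_xml, book, chapter):
    walker = Walker(f"{book}.{chapter}")
    walker.walk(ET.fromstring(usfx_xml))
    return "".join(walker.out)
```

Running convert_fragment(SAMPLE, "Matt", 4) yields two separate quotation milestone pairs, one per verse, which is the kind of result a one-for-one tag substitution cannot produce.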

Although there is a fair amount of varying interpretation of what USFM markers should mean, and some historical artifacts left over from when other SFM predecessors were in use with different meanings, not to mention intentional variation from the current USFM standard, I have 241 USFX Scripture texts in 237 dialects of 212 languages that are all "clean" enough with respect to markup that I can and did produce web sites from them. They should be clean enough to convert to Sword modules in an automated fashion. Eleven of those are Public Domain. The rest are available under the terms at http://PNGScriptures.org/terms.htm. The remaining exceptions to markup cleanness are generally problems with peripheral materials other than the actual Scriptures, which could be stripped out until such time as someone manually cleans them up.

Some of the metadata expected by OSIS isn't present in raw USFM source, but I have that stored in other XML files in Haiola project configurations, so I'll pull that in for the merge.

I have more texts that can be added to the set of 241 mentioned above, but I haven't cleaned them up and processed them, yet. So much work, so little time... time to pray and code!


On 11/08/2012 12:39 AM, Chris Burrell wrote:
Thanks for all the info. On the last point, I did mean read directly from USFM. I don't know the format well enough, but presumably if other software uses it, then maybe we could have a go at displaying the best we can...
Chris



On 8 November 2012 10:17, Peter von Kaehne <ref...@gmx.net> wrote:
Hi Chris,

> From: Chris Burrell <ch...@burrell.me.uk>

> I've found some instructions on transforming usfm/x to osis on the wiki
> but was wondering how difficult it would be to automate a lot of it?

Several of us have been starting to think and experiment with this too.

Basically it is easy to automate as such. The problem is around cleaning up.

There is a thread earlier this year where some of this was discussed. The basic plan is to use a git repository and git hooks with scripts attached to that. Some infrastructure is up, but not much else has happened yet.
>
> is it such that there is too much manual cleaning up?

Manual/mechanical cleaning up is a huge need, unfortunately. I have not yet encountered a truly clean USFM text, despite all claims by various USFM experts.
>
> also, I was wondering if there's any appetite in developing a driver to
> read such modules, within sword or jsword...

You mean to read directly USFM?

Peter



_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page


--

Aloha,
Michael Johnson

mljohnson.org
PO BOX 5278
KAILUA KONA HI 96745-5278
USA

Phone: +1 808-333-6921
Skype: kahunapule


