Aggregating everything into one large USFM file takes one command line (cat .\*\*.dat > shona.sfm).
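For anyone who would rather not rely on cat, here is a minimal Python sketch of the same join that also adds a final newline to any chapter file that lacks one. The function name join_chapters is my own invention, and it assumes the caller has already read the chapter files into strings in the right order.

```python
def join_chapters(chapter_texts):
    """Concatenate chapter texts so that each one ends with exactly one newline.

    Hypothetical helper: reading the .dat files in book/chapter order
    is left to the caller.
    """
    parts = []
    for text in chapter_texts:
        # Normalise Windows line endings and drop trailing whitespace,
        # then add back the single final newline a file may be missing.
        cleaned = text.replace("\r\n", "\n").rstrip()
        if cleaned:
            parts.append(cleaned + "\n")
    return "".join(parts)


if __name__ == "__main__":
    # The first chapter has no final newline; the second one does.
    chapters = ["\\c 1\n\\v 1 first", "\\c 2\n\\v 1 second\n"]
    print(join_chapters(chapters), end="")
```

Because every chapter is forced to end in a newline, the next chapter's \c marker always starts on its own line.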
Splitting that into standard book files takes one more command (csplit /\\id / shona.sfm). You need to check that each \c tag has its own line, in case the chapter files end abnormally without a final newline/return.

However, you end up with files numbered 001.dat, 002.dat that then need to be renamed. Still trivial, but measured in minutes, not seconds.

On Tue, Aug 29, 2017 at 11:27 AM, David Haslam <dfh...@googlemail.com> wrote:

> Teus has since added all the missing *\toc#* markers to the Shona
> <https://github.com/teusbenschop/shona> repo.
>
> After the last commit, the USFM tag statistics were as follows:
>
> Count SFM tag  Description (updated for USFM 3.0)
> ----- -------- -----------------------------------
> 04948 \add     Translator's added words begin
> 04948 \add*    Translator's added words end
> 01189 \c       Chapter
> 00066 \h       Running header (h=h1)
> 00066 \id      Identification
> 00065 \mt      Major title (mt=mt1)
> 00001 \mt1     Major title (portion 1)
> 00031 \mt2     Major title (portion 2)
> 00009 \nb      No break with previous paragraph
> 06445 \p       Paragraph
> 00066 \rem     Remark
> 01774 \s       Section heading (s=s1)
> 00066 \toc1    Table of contents 1 (Long table of contents text)
> 00066 \toc2    Table of contents 2 (Short table of contents text)
> 00066 \toc3    Table of contents 3 (Book abbreviation)
> 31102 \v       Verse[s]
> 15739 \x       Cross reference element begin
> 15739 \x*      Cross reference element end
>
> Observation:
> The data structure in the GitHub repository is not one USFM file per book,
> but one [USFM] data file per chapter, each in a suitable numbered directory,
> plus a separate data file (in directory 0) for the USFM header lines.
>
> In order to convert the text to OSIS, some preprocessing would be required
> to get the source text to one USFM file per book (as used by ParaTExt).
>
> Best regards,
>
> David
>
> --
> View this message in context:
> http://sword-dev.350566.n4.nabble.com/Module-upload-Shona-tp4657457p4657513.html
> Sent from the SWORD Dev mailing list archive at Nabble.com.
>
> _______________________________________________
> sword-devel mailing list: sword-devel@crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
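The split-and-rename step mentioned at the top of this message could also be done in one pass, so the numbered output files never need renaming by hand. This is only a rough sketch: split_books is a hypothetical name, and it assumes every book begins with an \id line whose first word is the book code (GEN, EXO, ...).

```python
import re


def split_books(usfm_text):
    """Return {book_code: text} for each \\id-delimited book in usfm_text.

    Hypothetical helper: lines before the first \\id line are dropped,
    and a repeated book code would overwrite the earlier entry.
    """
    books = {}
    current = None
    for line in usfm_text.splitlines(keepends=True):
        # An \id line starts a new book; its first word is the book code.
        match = re.match(r"\\id\s+(\S+)", line)
        if match:
            current = match.group(1)
            books[current] = ""
        if current is not None:
            books[current] += line
    return books


if __name__ == "__main__":
    sample = "\\id GEN Genesis\n\\c 1\n\\v 1 a\n\\id EXO Exodus\n\\c 1\n\\v 1 b\n"
    for code, text in split_books(sample).items():
        # Each entry could be written to a file named after the code,
        # e.g. GEN.sfm (the naming scheme is an assumption).
        print(code, len(text.splitlines()))
```

Keying the result by book code means each output file can be named directly from its own \id line.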