Thank you, Michael, for your help! Let me know if you manage to get anything working.
On 13/05/2019 15:57, Michael H wrote:
> Cyrille
>
> LibreOffice Draw attempts to open the PageMaker file, with limited
> success. But it confirms that even in the PageMaker source, the verse
> numbers are a separate text stream. With this source, there is no way
> to copy the text with verse numbers intact. Each book appears to be
> stored in its own text stream in the PageMaker file. LO Draw isn't
> rendering all of the pages, only the first 10, so I've only explored
> Matthew further.
>
> Based on Matthew only, the verses all seem to end with the character
> "-" or ";/", which should aid in the reconstruction. I've looked
> through the PDF, and visually this seems to be the case for all books
> as well. However, this isn't perfect: I find 1107 of these characters
> in Matthew, instead of the expected 1071 verses. But since the text
> stream has a book introduction, this is likely easily explained.
> Hopefully this gets you well down the path to creating a stream with
> verses.
>
> I would NOT start from the PDF file, but from the PageMaker file. The
> PDF almost certainly has a lot of text rearranging and extra
> characters like page numbers and running heads. PageMaker has the
> book text in a single stream, in a form that will convert to Unicode
> relatively easily.
>
> _______________________________________________
> sword-devel mailing list: sword-devel@crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
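
For reference, here is a minimal sketch of the splitting described above. It assumes the book's text stream has already been exported to a plain Unicode string and that the "-" and ";/" terminators appear literally in it. The function name and the sample text are hypothetical, not taken from the actual file. Comparing the chunk count with the expected verse count shows how many chunks still need explaining (for Matthew, 1107 - 1071 = 36, presumably from the book introduction).

```python
import re

# Verse terminators reported for Matthew: "-" or ";/".
# Assumption: they appear literally in the exported text stream.
VERSE_END = re.compile(r";/|-")


def split_verses(stream):
    """Split a book's text stream at each terminator.

    Returns (chunks, remainder): one chunk per terminator found, plus
    whatever text follows the last terminator.
    """
    parts = VERSE_END.split(stream)
    chunks = [p.strip() for p in parts[:-1]]
    remainder = parts[-1].strip()
    return chunks, remainder


# Hypothetical sample: one introduction chunk followed by three verses.
sample = "Intro text- First verse- Second verse;/ Third verse-"
chunks, remainder = split_verses(sample)

expected_verses = 3
surplus = len(chunks) - expected_verses
print(len(chunks), "chunks,", surplus, "more than expected")
```

Any surplus chunks would have to be checked by hand, since a "-" inside the introduction or inside a verse is indistinguishable from a terminator in this simple approach.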