While reading through Hebreux 6 in the French BBB (Go Bible on my K750i) today, I found some inappropriate spaces occurring immediately after a hyphen. A search for "- " found 45 occurrences, of which three turned out to be valid hyphenation. I have done a manual search and replace on the source text, and then rebuilt the FrenchBBB Go Bible.
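A check of this kind could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the manual procedure described above: the function name `find_suspect_hyphens`, the regular expression, and the tiny word list are all hypothetical, and a real word list for the language in question would be needed. Note also that the pattern is narrower than a plain search for "- ", since it only matches when a letter sits directly before the hyphen.

```python
import re

# Hypothetical pattern: word fragment, hyphen, one or more spaces/tabs, word fragment.
# For str patterns in Python 3, \w also matches accented letters.
CANDIDATE = re.compile(r"(\w+)-[ \t]+(\w+)")


def find_suspect_hyphens(text, known_words=None):
    """Return (matched_text, joined_word, verdict) for each 'xx- yy' occurrence.

    verdict is "join" when the joined form is in known_words,
    otherwise "review" (a human still has to decide).
    """
    results = []
    for match in CANDIDATE.finditer(text):
        joined = match.group(1) + match.group(2)
        if known_words is not None and joined.lower() in known_words:
            verdict = "join"
        else:
            verdict = "review"
        results.append((match.group(0), joined, verdict))
    return results


if __name__ == "__main__":
    # Assumed sample data; a real run would read the module's source text.
    sample = "un pe- tit enfant, peut-être"
    french_words = {"petit"}
    for hit in find_suspect_hyphens(sample, french_words):
        print(hit)
```

Anything marked "review" still needs a reader who knows the language, so a script like this can only shorten the list of places to look at.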
I am reporting this because CrossWire also has a SWORD beta module for the FrenchBBB.

Generalising from this, detecting bad hyphenation requires some knowledge of the language; otherwise, how can one distinguish it from valid hyphenation? The instance that caught my eye was "pe- tit", which should be "petit".

The usual rejoinder one gets from CrossWire when even minor source text issues are observed is, "Wait until we get a better source!" From a practical viewpoint, we should admit that this rarely happens, especially for such minor blemishes, which can easily arise from word-wrapping or during OCR.

I don't have a generic solution, but I do wish to start a discussion. Any ideas? What can we do to help our "suppliers" when such "proof-reading errors" are found?

-- David Haslam

sword-devel mailing list: sword-devel@crosswire.org