Eugene Reimer wrote, On 2009-06-23 23:11:
> Thanks Ray. However I'm unable to accept your explanation of those
> "box overlaps no blobs or blobs in multiple rows" messages. The first
> of those in my boxfile occurs for the "." line reproduced here
> together with all its neighbours:
>
> v 2647 5678 2689 5726
> e 2690 5679 2732 5725
>
> s 2638 5577 2675 5624
> . 2676 5577 2694 5593
> s 2727 5575 2762 5621
>
> a 2664 5474 2703 5523
>
> You'll notice that its box does not overlap the box of any neighbour.
>
> Another reason why I'm not convinced that overlapping boxes is what
> the program is complaining about: the distributed training package
> for German (boxtiff-2.01.deu.tar.gz), contains a boxfile for the
> arialbi font which does have outright overlapping adjacent boxes, for
> the adjacent characters "{j", whose lines are:
> { 2759 3073 2777 3111 0
> j 2776 3073 2795 3111 0
> where the box for "{" ends at x:2777, and the one for "j" begins at
> x:2776. And yet that boxfile appears to have been acceptable, and to
> have produced usable training info.
>
> One thing that's unusual about the boxes being complained about is
> that each has a y-upper-bound considerably lower than the other
> characters in the same row. AHA, revising those y-upper-bounds
> upwards to agree with its same-row neighbours gets rid of those
> complaints!! Who would have thought it?
>
>
> Ray Smith wrote, On 2009-06-23 11:37:
>
>> I put the answers to these questions on the training page.
>> Ray.
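
The pattern described above (a box whose y-upper-bound sits well below that of its same-row neighbours) can be looked for mechanically before training. What follows is only a minimal sketch, not Tesseract's own check: it assumes box lines of the form "char left bottom right top [page]" as in the quoted samples, groups consecutive boxes into a row when their vertical extents overlap, and flags any box whose top is more than a tolerance below the tallest box in its row. The names parse_boxes, group_rows and short_boxes, and the 10-pixel tolerance, are made up for illustration.

```python
from typing import NamedTuple


class Box(NamedTuple):
    char: str
    left: int
    bottom: int
    right: int
    top: int


# Sample lines copied from the quoted boxfile excerpt.
SAMPLE = """\
v 2647 5678 2689 5726
e 2690 5679 2732 5725

s 2638 5577 2675 5624
. 2676 5577 2694 5593
s 2727 5575 2762 5621

a 2664 5474 2703 5523
"""


def parse_boxes(text):
    """Parse 'char left bottom right top [page]' lines; skip anything shorter."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue
        left, bottom, right, top = map(int, parts[1:5])
        boxes.append(Box(parts[0], left, bottom, right, top))
    return boxes


def group_rows(boxes):
    """Assumed heuristic: a box joins the current row if it overlaps it vertically."""
    rows = []
    for box in boxes:
        if rows:
            row = rows[-1]
            row_bottom = min(b.bottom for b in row)
            row_top = max(b.top for b in row)
            if box.bottom < row_top and box.top > row_bottom:
                row.append(box)
                continue
        rows.append([box])
    return rows


def short_boxes(boxes, tolerance=10):
    """Return boxes whose top is more than `tolerance` below their row's tallest box."""
    flagged = []
    for row in group_rows(boxes):
        row_top = max(b.top for b in row)
        flagged.extend(b for b in row if row_top - b.top > tolerance)
    return flagged


if __name__ == "__main__":
    for box in short_boxes(parse_boxes(SAMPLE)):
        print(box)
```

On the excerpt above this flags only the "." box, the one the message was reported for. Whether raising its top to match the row is the right fix in every case is a separate question; the sketch only finds candidates, and the tolerance would need adjusting for the scan resolution.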