[tesseract-ocr] v4.1.1 - Segmentation fault on train data generation; all .lstmf files are exactly 1GB

2021-09-20 Thread Sim Tov
Hello, I am using v4.1.1 on Linux (Debian 11) and trying to generate training and evaluation data. The commands I used were: train: usr/share/tesseract-ocr/tesstrain.sh --fonts_dir FontsRashi/Working --lang heb --linedata_only --noextract_font_properties --langdata_dir ./langdata --tessdata_dir /usr/shar

[tesseract-ocr] Tesseract unable to recognize simple text when image is closely cropped

2021-09-20 Thread Teofilis Martisius
Hello, I have tried OCRing this image (hello.png, attached). Results come out empty. It works if I add a border (image attached). I run: $ tesseract hello.png stdout Estimating resolution as 528 Empty page!! Estimating resolution as 528 Empty page!! $ tesseract hello_with_border.png stdout Hel

Re: [tesseract-ocr] Tesseract unable to recognize simple text when image is closely cropped

2021-09-20 Thread Zdenko Podobny
> > Should this issue be reported? Absolutely not. Just follow the documentation and the result will be fine. Zdenko On Mon, 20 Sep 2021 at 18:50, Teofilis Martisius wrote: > Hello, > > I have tried OCRing this image (hello.png, attached). Results come out > empty. It works if I add a border (image attached)
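
The workaround mentioned in the thread (adding a border around a closely cropped image before OCR) can be sketched as follows. This is only an illustration, not code from the thread: the function name `add_border`, the default margin of 10 pixels, and the white fill value of 255 are all assumptions, and the image is modelled as a plain list of grayscale rows so the example has no dependencies. In practice an image library would do the padding on the actual file.

```python
def add_border(pixels, margin=10, fill=255):
    """Return a copy of a grayscale pixel grid padded with `fill` on all sides.

    `pixels` is a list of equal-length rows of integers (0 = black, 255 = white).
    `margin` and `fill` are illustrative defaults, not values from the thread.
    """
    if margin < 0:
        raise ValueError("margin must be non-negative")

    width = len(pixels[0]) if pixels else 0
    new_width = width + 2 * margin

    # Blank rows above the original content.
    padded = [[fill] * new_width for _ in range(margin)]

    # Original rows, each padded on the left and right.
    for row in pixels:
        padded.append([fill] * margin + list(row) + [fill] * margin)

    # Blank rows below the original content.
    padded.extend([fill] * new_width for _ in range(margin))
    return padded


# A 2x3 all-black "image" gains a white frame on every side.
cropped = [[0, 0, 0], [0, 0, 0]]
bordered = add_border(cropped, margin=1)
for row in bordered:
    print(row)
```

The padded grid has `2 * margin` more rows and columns than the input, which gives the text some clear space away from the image edge.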