date:20201010

Re: [tesseract-ocr] Fine-tuning via tesstrain repo gives me poorer results than built-in eng model

2020-10-10 Thread Shree Devi Kumar

What command did you use? It is difficult to help without seeing what training data you used.

On Sat, Oct 10, 2020, 09:31 Fazle Rabbi wrote:
> Hi. I have a similar goal in mind about fine-tuning the 'ben' traineddata
> with the pictures I am working with. The picture will be an ID, so the names
> of pe

[tesseract-ocr] Resize image of the text to 36 pixels high

2020-10-10 Thread koa...@gmail.com

Hi, I read somewhere in this group that the eng model is based on 36-pixel-high text. How do I go about resizing my document image so that the text is 36 pixels high? I am using Python and pytesseract. Thanks.
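One possible approach to the resizing question, as a minimal sketch rather than a confirmed recipe: measure how tall the text currently is in pixels, then scale the whole image by the ratio of the target height to that measurement. The function names and the `measured_text_height` parameter are hypothetical; the 36-pixel target is the figure quoted in the question, not something verified here. Pillow is assumed to be installed for the second function.

```python
def size_for_target_text_height(image_size, measured_text_height, target_height=36):
    """Return the (width, height) that scales text to roughly target_height pixels.

    image_size is (width, height) in pixels. measured_text_height is the
    current text height in pixels, which the caller has to measure or estimate.
    """
    if measured_text_height <= 0:
        raise ValueError("measured_text_height must be positive")
    factor = target_height / measured_text_height
    width, height = image_size
    # Keep at least one pixel in each dimension after rounding.
    return max(1, round(width * factor)), max(1, round(height * factor))


def resize_for_ocr(path, measured_text_height, target_height=36):
    """Open an image with Pillow and resize it so the text is about target_height px."""
    from PIL import Image  # imported here so the size helper works without Pillow

    image = Image.open(path)
    new_size = size_for_target_text_height(image.size, measured_text_height, target_height)
    return image.resize(new_size, Image.LANCZOS)
```

The resized image returned by `resize_for_ocr` can then be passed to `pytesseract.image_to_string`. Whether this improves recognition depends on the document, so it is worth comparing results before and after scaling.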

Re: [tesseract-ocr] Fine-tuning via tesstrain repo gives me poorer results than built-in eng model

2020-10-10 Thread Fazle Rabbi

I did the process manually for 5-6 images. I attached some samples of the line images and ground truth. Then I ran

>> make training MODEL_NAME= START_MODEL=ben TESSDATA=

The resulting .traineddata file seems to have no connection with the original 'ben' file. The OCR produces unreadable text
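When line images and ground truth are prepared by hand, as described here, one quick sanity check before training is to confirm that every line image has a non-empty transcription beside it. The sketch below assumes the tesstrain convention of a `<name>.gt.txt` file next to each `<name>.tif` or `<name>.png`; the function name and directory argument are hypothetical, and other image suffixes are not handled.

```python
from pathlib import Path

# Assumed image types; preprocessed variants such as .bin.png are ignored here.
IMAGE_SUFFIXES = (".tif", ".png")


def unpaired_line_images(ground_truth_dir):
    """Return names of line images that lack a non-empty '<stem>.gt.txt' file."""
    missing = []
    for image in sorted(Path(ground_truth_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        transcript = image.with_name(image.stem + ".gt.txt")
        # A missing or blank transcription gives training nothing to learn from.
        if not transcript.is_file() or not transcript.read_text(encoding="utf-8").strip():
            missing.append(image.name)
    return missing
```

An empty list only means the files are paired; it says nothing about whether the transcriptions are correct or whether the amount of data is enough for fine-tuning.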