[tesseract-ocr] Re: Training Tesseract 4 on real images

Murtuza Dahodwala Fri, 08 Jan 2021 00:32:37 -0800

I would also like to know how we can train on real images that are not 
single lines.


On Thursday, October 8, 2020 at 1:37:02 PM UTC+5:30 smn...@gmail.com wrote:

> Hello,
>
> I would like to train *Tesseract 4* to recognize certain 
> scripts/languages based on real images rather than synthetic ones. Here are 
> my questions:
>
> 1. Is there a tool, preferably a cross-platform (Windows/Linux) GUI, that 
> assists in creating .box files from scanned images? How do I get the 
> coordinates of text lines?
>
> 2. Is there a YouTube/video tutorial describing how to prepare .tiff/.box 
> files from real scans?
>
> 3. What provides better recognition - training on real images or training 
> on synthetic images?
>
> 4. How many textlines of real scans do I need to get proper recognition?
>
> Thank you very much!
> ST
>

