Thank you very much for the link. Can we use non-Unicode fonts as well? I
have attached a Sinhala font that I'm struggling to train with.

Thank you very much

On Thu, Oct 6, 2022 at 11:10 AM Saman Kurdi <saman.uk...@gmail.com> wrote:

> Hello,
>
> This might help.
>
> https://www.mdpi.com/2076-3417/11/20/9752
>
> Regards.
>
> On Thu, Oct 6, 2022 at 07:37 Umanda Dikwatta <abey.u...@gmail.com> wrote:
>
>> Hello,
>>
>> I've been using Tesseract 4.1 for some time. I am using Tesseract with
>> the Sinhala language. I got good results for most of the images I tried. I
>> trained Tesseract with different fonts. But as the documentation says, I
>> had to preprocess my images to obtain good results.
>>
>> Then I tried Tesseract 5, training with line images as .tif files and their
>> labels as .gt.txt files, and used the generated .traineddata file to extract
>> text. That didn't give me good results. I obtained the line images using
>> image-processing segmentation. Is it wrong to obtain line images using
>> Python segmentation?
>>
>> Could someone please explain the possible reason to me?
>>
>> Thank you very much
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to tesseract-ocr+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/40a95c6f-b459-4937-930f-1eb103bc4f82n%40googlegroups.com
>> <https://groups.google.com/d/msgid/tesseract-ocr/40a95c6f-b459-4937-930f-1eb103bc4f82n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>


Attachment: apex_a.pura-042.ttf
Description: Binary data
