I am also new to OCR.
What helped me with a similar issue was changing the page segmentation mode (PSM) that Tesseract uses:
https://tesseract.patagames.com/help/html/T_Patagames_Ocr_Enums_PageSegMode.htm
Perhaps try each of the different PSMs and compare the output.
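
A minimal sketch of trying each PSM in turn, assuming the tesseract command-line tool is installed and on PATH (the original question uses a C#/VB.NET wrapper, so this Python version only illustrates the idea). The helper names build_command and try_all_psms are made up for this example; the --psm option and the "stdout" output base are standard tesseract CLI usage.

```python
import shutil
import subprocess

# Mode 0 is orientation/script detection only and yields no text, so start at 1.
PSM_MODES = range(1, 14)


def build_command(image_path, psm):
    """Return the tesseract CLI arguments for one page segmentation mode."""
    return ["tesseract", image_path, "stdout", "--psm", str(psm)]


def try_all_psms(image_path):
    """Run tesseract once per PSM; map each mode to its text, or None on failure."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    results = {}
    for psm in PSM_MODES:
        proc = subprocess.run(
            build_command(image_path, psm), capture_output=True, text=True
        )
        # Some modes can fail (for example when OSD data is missing), so keep going.
        results[psm] = proc.stdout if proc.returncode == 0 else None
    return results
```

Calling try_all_psms("page.png") (a placeholder file name) returns one result per mode, which makes it easy to see which mode keeps the two columns apart.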

I am not sure how to filter languages, as I use custom traineddata.
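
One thing that may be worth trying, sketched here with the tesseract command-line tool rather than the C# wrapper: load only the English model with -l eng. This assumes eng.traineddata is available; the helper name build_english_command is made up for this example, and tessdata_dir is only needed when the traineddata files live in a non-default folder. Loading only English does not guarantee that the Marathi/Hindi text is skipped, since it may still come out as noise.

```python
def build_english_command(image_path, tessdata_dir=None):
    """Return tesseract CLI arguments that load only the English model."""
    cmd = ["tesseract", image_path, "stdout", "-l", "eng"]
    if tessdata_dir is not None:
        # Point tesseract at a folder containing eng.traineddata.
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, capture_output=True, text=True) to get the recognized text.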

On Fri, 3 Sept 2021 at 23:06, Shailesh Kulkarni <shailesh.k...@gmail.com>
wrote:

> Hi,
>
> I am new to OCR and am using the library in C# and VB.NET.
> I have converted a PDF to an image (the PDF is actually a printout and
> contains an image).
>
> The image contains two languages: English and Marathi/Hindi.
> I only want to process the English data.
>
> Also, when I process images, the output merges text from two columns
> that sit beside one another into single lines.
>
> Can you please guide me on how to go about it?
>
> Thank You, Regards,
> Shailesh
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/57f66a04-dc3d-4ac8-9c4e-689ab903791an%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/57f66a04-dc3d-4ac8-9c4e-689ab903791an%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
