I am also new to OCR. What helped me with a similar issue was changing the page segmentation mode (PSM) that Tesseract uses:
https://tesseract.patagames.com/help/html/T_Patagames_Ocr_Enums_PageSegMode.htm
Perhaps try each of the different PSMs and compare the output.
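One way to try every PSM quickly is to script the Tesseract command line. This is only a rough sketch in Python, not the C#/VB.NET wrapper API: it assumes Tesseract 4 or later is installed and on the PATH (so that `--psm` accepts 0 to 13), and the helper names `build_command` and `try_all_psms` are made up for illustration. It passes `-l eng` so only the English model is loaded; Marathi/Hindi text will probably still show up as garbage characters rather than being skipped.

```python
import subprocess

# Page segmentation modes accepted by the Tesseract 4+ command line (--psm 0..13).
PSM_MODES = range(14)


def build_command(image_path, psm, lang="eng"):
    """Return the argument list for one Tesseract CLI run that prints text to stdout."""
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def try_all_psms(image_path, lang="eng"):
    """Run Tesseract once per PSM and collect the output so the modes can be compared."""
    results = {}
    for psm in PSM_MODES:
        proc = subprocess.run(
            build_command(image_path, psm, lang),
            capture_output=True,
            text=True,
        )
        # Some modes can fail or return no text depending on the installed
        # traineddata (for example mode 0, which only detects orientation and
        # script); in that case keep the error message instead of the text.
        results[psm] = proc.stdout if proc.returncode == 0 else proc.stderr
    return results
```

Calling `try_all_psms("page.png")` (the file name is a placeholder) returns a dictionary keyed by PSM number. For a page with two columns side by side, the automatic layout modes (1 and 3) or the single-column mode (4) are the ones I would look at first, but that is a guess and depends on the scan.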
Not sure how to filter languages, as I use custom traineddata.

On Fri, 3 Sept 2021 at 23:06, Shailesh Kulkarni <shailesh.k...@gmail.com> wrote:

> HI,
>
> I am new to OCR & using library in C# & Vb.net.
> I have converted pdf to image (pdf is actually print out and contains a
> Image.).
>
> Image contain two languages English and Marathi/Hindi.
> I only want to process English Data.
>
> Also when I process images its giving output in single line from two
> different columns which are beside one another.
>
> Can you please guide how to go for it.
>
> Thank You, Regards,
> Shailesh
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/57f66a04-dc3d-4ac8-9c4e-689ab903791an%40googlegroups.com.

--
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAJg03TLsi%3DDP1pS83_EqNkcnAQL36r3pzvJ1tDmaqCaqiS01%3DA%40mail.gmail.com.