Use the hocr option.

On Thu, Mar 24, 2022, 10:52 Muraliraj DK <dk.murali...@gmail.com> wrote:
> I am not sure if you have looked at the image. What I meant by multi-line
> text is that when a sentence wraps to the next line, I would like to extract
> it as a single sentence instead of 2 lines (paragraph).
>
> A single line is a sentence which is not wrapped to the next line.
>
> Tesseract can read line by line, but is there an option that makes it detect
> the content as a paragraph in the image?
>
> Murali
>
> On Tue, Mar 1, 2022 at 3:09 PM Zdenko Podobny <zde...@gmail.com> wrote:
>
>> Hello,
>>
>> First of all: use the recent version of tesseract (5.0.1).
>>
>> Next: I am not sure what you expected / needed. Can you please be more
>> specific? How do you extract single/multi line text?
>> What do you mean by "Default training model"? Fast model?
>>
>> BR,
>>
>> Zdenko
>>
>> On Tue, Mar 1, 2022 at 6:10, Muraliraj DK <dk.murali...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I have a situation where I need to read multi-line text (a paragraph) as
>>> well as single-line text. Which psm can be used to satisfy both cases?
>>> Example images attached.
>>>
>>> Image 1 - Multiline text
>>> Image 2 - Single line text
>>>
>>> I have tried psm 1, 3, and 6. Somehow I managed to extract the single-line
>>> text, but I am struggling with the multi-line text.
>>>
>>> Any help is appreciated.
>>>
>>> Environment
>>> ----------------------
>>> Tesseract version - 4.0
>>> Windows environment
>>> Default training model
>>>
>>> Murali
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-ocr+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/8eb27604-01fd-4fae-b2d5-01f805fb9a9dn%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/8eb27604-01fd-4fae-b2d5-01f805fb9a9dn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
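
P.S. To make the hocr suggestion at the top of this thread a bit more concrete: running something like `tesseract image.png out hocr` should write `out.hocr`, in which lines are grouped under paragraph elements (class `ocr_par`). The sketch below is only an illustration, not an official recipe. It joins all the text inside each `ocr_par` into one string, so a wrapped sentence comes back as a single line. The helper name `paragraphs_from_hocr` and the `SAMPLE` markup are made up for this example; `SAMPLE` is hand-written to resemble hOCR and is not real Tesseract output, so check the structure of your own file first.

```python
from html.parser import HTMLParser


class HocrParagraphs(HTMLParser):
    """Collect the text of each <p class="ocr_par"> as one string."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._chunks = None  # text pieces of the paragraph being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "p" and "ocr_par" in classes:
            self._chunks = []

    def handle_data(self, data):
        if self._chunks is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._chunks is not None:
            # Collapse line breaks and indentation so wrapped lines are joined.
            text = " ".join("".join(self._chunks).split())
            if text:
                self.paragraphs.append(text)
            self._chunks = None


def paragraphs_from_hocr(hocr_text):
    """Return one string per hOCR paragraph (hypothetical helper)."""
    parser = HocrParagraphs()
    parser.feed(hocr_text)
    parser.close()
    return parser.paragraphs


# Hand-written sample that imitates hOCR markup (an assumption, not real output).
SAMPLE = """
<div class='ocr_page' id='page_1'>
 <div class='ocr_carea' id='block_1_1'>
  <p class='ocr_par' id='par_1_1'>
   <span class='ocr_line' id='line_1_1'>
    <span class='ocrx_word' id='word_1_1'>This</span>
    <span class='ocrx_word' id='word_1_2'>sentence</span>
   </span>
   <span class='ocr_line' id='line_1_2'>
    <span class='ocrx_word' id='word_1_3'>wraps.</span>
   </span>
  </p>
  <p class='ocr_par' id='par_1_2'>
   <span class='ocr_line' id='line_1_3'>
    <span class='ocrx_word' id='word_1_4'>Single</span>
    <span class='ocrx_word' id='word_1_5'>line.</span>
   </span>
  </p>
 </div>
</div>
"""

if __name__ == "__main__":
    for paragraph in paragraphs_from_hocr(SAMPLE):
        print(paragraph)
```

For a real file, read it with `open("out.hocr", encoding="utf-8").read()` and pass the text to `paragraphs_from_hocr`. Whether a wrapped sentence and a single line end up in separate paragraphs depends on Tesseract's layout analysis, so the result may still need checking against your images.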