Thanks. I had read it before, but I have seen some examples of searchable PDF files that were generated by Tesseract. I do not know what process was used, so I'm asking here.
Thanks,
Elishai

On Wed, Jan 5, 2022, 21:53 Zdenko Podobny <zde...@gmail.com> wrote:

> Maybe you can start with this reading:
>
> https://github.com/tesseract-ocr/tesseract/issues/238
>
> Zdenko
>
> On Wed, Jan 5, 2022 at 19:30, Elishai Cohen <elishaico...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm focused on generating a searchable PDF file in a right-to-left
>> language (e.g. Hebrew or Arabic).
>>
>> I'm working with Python on Ubuntu and Windows.
>>
>> When I use tesseract or pytesseract, the results come out in the wrong
>> direction (left to right instead of RTL).
>>
>> Should I add a language type or something else? Is there another way
>> to extract the text as ALTO XML or hOCR and then combine it with the JPG
>> file to create a searchable PDF file?
>>
>> Looking forward to your advice,
>>
>> Thanks in advance,
>> Elishai
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to tesseract-ocr+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/22c40308-4200-4f31-bd29-14cff1425c40n%40googlegroups.com
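
For reference, here is a minimal sketch of one way a searchable PDF and an hOCR file are commonly produced with the Tesseract command line from Python. It assumes the `tesseract` executable is on PATH and that the `heb` (or `ara`) language data is installed. The helper names `build_tesseract_command` and `make_searchable_pdf` are hypothetical, chosen only for this illustration. It does not by itself fix the RTL ordering problem discussed in the linked issue; it only shows the basic process.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="heb", configs=("pdf", "hocr")):
    """Build the argument list for: tesseract IMAGE OUTPUTBASE -l LANG CONFIG...

    The trailing config names ("pdf", "hocr") ask Tesseract to write
    OUTPUTBASE.pdf (image with an invisible text layer) and OUTPUTBASE.hocr.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang, *configs]


def make_searchable_pdf(image_path, output_base, lang="heb"):
    """Run Tesseract and return the paths of the PDF and hOCR files it should write.

    Hypothetical helper: assumes Tesseract and the requested language data
    are installed on this machine.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # Tesseract appends the extension to the output base name.
    return Path(str(output_base) + ".pdf"), Path(str(output_base) + ".hocr")
```

Calling `make_searchable_pdf("page.jpg", "page_out", lang="heb")` would be expected to write `page_out.pdf` and `page_out.hocr` next to each other. pytesseract offers a similar route through `pytesseract.image_to_pdf_or_hocr(..., extension="pdf")`, which returns the file contents as bytes instead of writing them to disk.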