No, it is not possible. Tesseract creates the PDF from the same image it runs OCR on, and it takes the position of the text from its own OCR output, so it cannot skip the OCR step and use text from another engine.
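To illustrate what "OCR output for the position of text" involves, here is a minimal sketch of turning word boxes from an external engine into invisible PDF text operators, outside of Tesseract. It assumes the boxes are given as fractions of the page size with a top-left origin (the shape Textract reports for its bounding boxes). The `Word` class, the function names, and the `/F1` font resource name are hypothetical. The sketch only builds the content-stream text; attaching it to a page with a matching font resource, handling non-ASCII text, and scaling words horizontally are left out.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word; box values are fractions of the page, top-left origin."""
    text: str
    left: float
    top: float
    width: float   # kept for completeness; not used for horizontal scaling here
    height: float


def pdf_escape(text: str) -> str:
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def invisible_text_ops(words, page_w_pt, page_h_pt, font="/F1"):
    """Build PDF content-stream operators that place each word invisibly."""
    ops = ["BT", "3 Tr"]  # text rendering mode 3: neither fill nor stroke
    for w in words:
        x = w.left * page_w_pt
        # PDF user space has its origin at the bottom-left, so flip the y axis
        # and use the bottom edge of the box as an approximate baseline.
        y = page_h_pt - (w.top + w.height) * page_h_pt
        size = w.height * page_h_pt  # rough font size taken from the box height
        ops.append(f"{font} {size:.2f} Tf")
        ops.append(f"1 0 0 1 {x:.2f} {y:.2f} Tm")
        ops.append(f"({pdf_escape(w.text)}) Tj")
    ops.append("ET")
    return "\n".join(ops)
```

For example, `invisible_text_ops([Word("Hi", 0.5, 0.25, 0.1, 0.05)], 600, 800)` places the word at x = 300 pt, y = 560 pt with a 40 pt font size. A PDF library would still be needed to merge such a stream over the scanned page.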
Zdenko

On Wed, 5 Jul 2023 at 7:12, lbr <lbr7...@gmail.com> wrote:

> I'm trying to create a searchable pdf out of a scanned one. I want to use
> Textract as an OCR engine instead of Tesseract. Is there a way to make
> libtesseract skip the OCR step and just create the invisible text layer
> (with the extracted chars from Textract) and apply it to the input pdf?
>
> I read that libtesseract is what ocrmypdf uses to create the invisible
> text layer for searchable pdfs.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/08bb441a-6edb-47be-b314-b0638a0bce1an%40googlegroups.com