[tesseract-ocr] TessPDFRenderer outputs invalid PDF file (+gosseract)

'blaumedia' via tesseract-ocr Sun, 21 Nov 2021 00:27:24 -0800

This is already described in the following issue:
https://github.com/tesseract-ocr/tesseract/issues/3652


I'm trying to generate a searchable PDF from a JPG image, but the output 
file is an invalid PDF that no PDF reader can open.

I have added a Docker image to the issue for reproducing the problem; 
here are the shell commands for it:

git clone g...@github.com:dnnspaul/gosseract.git
cd gosseract
git checkout tesseract/bug/3652

docker build -t tessbug .
docker run -it -v $PWD/tmp:/tmp tessbug go run main.go

When I pass the same file to the tesseract CLI, the resulting PDF is 
readable, but I can't find any difference between what the CLI does and 
what my snippet does.
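
One way to narrow this down might be to compare the two files at the byte 
level. Below is a minimal Go sketch that only checks whether a file starts 
with the "%PDF-" header and ends with the "%%EOF" marker. It assumes the 
broken file might be truncated or missing its trailer, which is a guess and 
not something confirmed here. The helper name looksLikeCompletePDF and the 
default path tmp/out.pdf are made up for illustration; this is a cheap 
sanity check, not a real PDF validator.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// looksLikeCompletePDF reports whether data starts with the "%PDF-" header
// and ends (ignoring trailing whitespace) with the "%%EOF" marker.
// Hypothetical helper: it does not parse the xref table or any objects.
func looksLikeCompletePDF(data []byte) bool {
	trimmed := bytes.TrimRight(data, " \t\r\n")
	return bytes.HasPrefix(data, []byte("%PDF-")) &&
		bytes.HasSuffix(trimmed, []byte("%%EOF"))
}

func main() {
	// Hypothetical default path; pass the real output file as the first argument.
	path := "tmp/out.pdf"
	if len(os.Args) > 1 {
		path = os.Args[1]
	}

	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read failed:", err)
		os.Exit(1)
	}

	fmt.Printf("%s: %d bytes, header and EOF marker present: %v\n",
		path, len(data), looksLikeCompletePDF(data))
}
```

Running it on both the CLI output and the file written by the snippet would 
show whether the broken file is simply cut short or damaged in some other 
way.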

Thanks in advance for any help! I'm sorry, I'm more of a Go developer 
than a C++ developer, so I struggle with even the simplest syntax, but 
I tried my best.
