Hi all,

I have published a test / example of how to use the tesseract C-API in
python3 via cffi[1].

I am aware of the pytesseract module[2], which seems to be widely used. It
wraps the tesseract executable, so IMO it could have some limitations, e.g.
in performance (it uses disk operations for input and output).

The example is in the form of a jupyter notebook[3] (github is able to show
it, but not run it ;-)) so you can interactively view what is happening.
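
For those who just want the gist of the approach, here is a minimal sketch of calling the C-API through cffi in ABI mode. It is not the code from the notebook. The C functions are the ones declared in tesseract's capi.h, but the helper names (load_tesseract, ocr_gray), the library lookup via find_library, and the "eng" default language are my assumptions for illustration. On windows you would most likely have to pass the DLL path yourself.

```python
from ctypes.util import find_library

try:
    from cffi import FFI  # pip install cffi
except ImportError:  # keep the sketch importable when cffi is missing
    FFI = None

# Subset of tesseract's capi.h needed for plain-text OCR.
CDEF = """
typedef struct TessBaseAPI TessBaseAPI;
const char *TessVersion(void);
TessBaseAPI *TessBaseAPICreate(void);
int TessBaseAPIInit3(TessBaseAPI *handle, const char *datapath,
                     const char *language);
void TessBaseAPISetImage(TessBaseAPI *handle, const unsigned char *imagedata,
                         int width, int height, int bytes_per_pixel,
                         int bytes_per_line);
char *TessBaseAPIGetUTF8Text(TessBaseAPI *handle);
void TessDeleteText(const char *text);
void TessBaseAPIEnd(TessBaseAPI *handle);
void TessBaseAPIDelete(TessBaseAPI *handle);
"""


def load_tesseract(lib_path=None):
    """Return (ffi, lib), or None if cffi or the shared library is unavailable.

    lib_path is a hypothetical parameter: pass the full path to the
    tesseract DLL / shared object when find_library cannot locate it.
    """
    lib_path = lib_path or find_library("tesseract")
    if FFI is None or lib_path is None:
        return None
    ffi = FFI()
    ffi.cdef(CDEF)
    try:
        return ffi, ffi.dlopen(lib_path)
    except OSError:
        return None


def ocr_gray(ffi, lib, pixels, width, height, lang=b"eng", datapath=None):
    """OCR an 8-bit grayscale image held in memory (bytes, row-major)."""
    api = lib.TessBaseAPICreate()
    try:
        # A NULL datapath lets tesseract fall back to its default tessdata location.
        path = datapath if datapath is not None else ffi.NULL
        if lib.TessBaseAPIInit3(api, path, lang) != 0:
            raise RuntimeError("tesseract initialisation failed")
        buf = ffi.from_buffer(pixels)  # keep a reference while tesseract reads it
        ptr = ffi.cast("const unsigned char *", buf)
        # 1 byte per pixel, so one row is exactly `width` bytes.
        lib.TessBaseAPISetImage(api, ptr, width, height, 1, width)
        text = lib.TessBaseAPIGetUTF8Text(api)
        if text == ffi.NULL:
            return ""
        try:
            return ffi.string(text).decode("utf-8")
        finally:
            lib.TessDeleteText(text)  # the string is owned by tesseract
    finally:
        lib.TessBaseAPIEnd(api)
        lib.TessBaseAPIDelete(api)


# Usage (needs cffi, the tesseract library and the "eng" traineddata):
#   loaded = load_tesseract()
#   if loaded:
#       ffi, lib = loaded
#       print(ffi.string(lib.TessVersion()).decode())
#       print(ocr_gray(ffi, lib, bytes([255]) * (200 * 50), 200, 50))
```

The point of the sketch is that the image goes to tesseract as a buffer in memory and the text comes back as a C string, so no temporary files are involved.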

My aim is not to create a new tesseract python wrapper (I do not have the
time for it, and I am not able to write python code as nice as pytesseract's
:-) ), so it is not robust: I only did it on 64-bit windows, but IMO it
should be possible to use it on Linux and Mac with small modifications. If
needed, I can add 32-bit windows libs...

Personally, I would like to have python tesseract and leptonica modules that
use their APIs directly... I know that James Barlow has already started
wrapping leptonica[4], but it is not (yet?) available as an independent
module (it is part of OCRmyPDF).

Anyway I hope this will help somebody.

[1] https://github.com/zdenop/SimpleTesseractPythonWrapper
[2] https://pypi.org/project/pytesseract/
[3] https://github.com/zdenop/SimpleTesseractPythonWrapper/blob/master/SimpleTesseractPythonWrapper.ipynb
[4] https://github.com/jbarlow83/OCRmyPDF/tree/master/src/ocrmypdf/lib

Zdenko
