Hi,

On 14/02/2023 19:10, Flávio. wrote:
> Sorry, how can I do that? I'm trying to send image binary data, not a path. The goal is to not write a file to disk and use only memory. Could you please write some code that sends the (binary) data to the stdin of tesseract? It can be in Python, Dart or Java :( I've tried ChatGPT but it gets it wrong and gets lost.

Normally I'd say 'left as an exercise to the reader', but I just so happen to have a snippet around that ought to give you a general idea.

This uses io.BytesIO in Python 3 as an in-memory stream to save the image to. The stream ends up holding an uncompressed PNG (compression would just slow things down). It assumes that the variable "pil_image" contains a PIL.Image object.

The code to use just one core in Tesseract is of course entirely optional. I haven't tested this exact version (I modified it a bit from code that works in another setting), but it should work in theory:

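import io
import logging
import os
import subprocess

logger = logging.getLogger(__name__)

with io.BytesIO() as buf:
    # compress_level=0 writes an uncompressed PNG
    pil_image.save(buf, format='PNG', compress_level=0)
    png_data = buf.getvalue()

# Let's just use one core in tesseract
env = os.environ.copy()
env['OMP_THREAD_LIMIT'] = '1'

# First '-' = read the image from stdin, second '-' = write the text to stdout
p = subprocess.Popen(['tesseract', '-', '-'],
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     env=env)
stdout, stderr = p.communicate(png_data)
text = stdout.decode('utf-8')  # the recognized text
stderr = stderr.decode('utf-8')

if stderr:
    logger.warning('tesseract_baselines stderr: %s', stderr)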
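
If you want something reusable, the same idea could be wrapped in a small helper. This is only a rough sketch and, like the snippet above, not tested against tesseract itself. The names ocr_bytes, cmd and threads are made up for illustration, and it uses subprocess.run instead of Popen/communicate. It expects image bytes that are already encoded (for example the PNG held in the BytesIO buffer) and checks the exit code, on the assumption that output on stderr alone does not necessarily mean the run failed.

```python
import os
import subprocess


def ocr_bytes(image_bytes, cmd=('tesseract', '-', '-'), threads=1):
    """Pipe encoded image bytes to a command's stdin, return (text, stderr).

    ocr_bytes, cmd and threads are hypothetical names for this sketch.
    """
    env = os.environ.copy()
    env['OMP_THREAD_LIMIT'] = str(threads)

    # input= feeds the bytes to stdin; capture_output collects both pipes
    result = subprocess.run(list(cmd), input=image_bytes,
                            capture_output=True, env=env)
    stderr = result.stderr.decode('utf-8', errors='replace')
    if result.returncode != 0:
        raise RuntimeError('command failed (%d): %s'
                           % (result.returncode, stderr))
    return result.stdout.decode('utf-8'), stderr
```

Calling it with the PNG bytes from the buffer should then hand back the recognized text plus whatever tesseract printed on stderr. Because cmd is a parameter, extra tesseract options can be appended there, and the helper can be tried out with any other command that reads stdin.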


Regards,
Merlijn
