Thanks, I'm still trying to figure it out. It seems that when the PIL image is 
saved, it unfortunately writes a temp file to disk. My goal is to avoid writing 
to disk, because this application will read a lot of files and I want to 
spare my SSD. My code receives base64-encoded image data from a Dart program 
(I checked that it is correct). So far the .py file looks like this, but I'm 
not getting any output back.

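import base64
import io
import logging
import os
import subprocess
import sys

from PIL import Image

logger = logging.getLogger(__name__)
tesseractPath = 'tesseract'  # placeholder: set this to the actual tesseract binary


def main():
    # The Dart side sends the image as base64 text on stdin.
    image_bytes = base64.b64decode(sys.stdin.read())
    pil_image = Image.open(io.BytesIO(image_bytes))

    # Re-encode as an uncompressed PNG into an in-memory buffer.
    # (This is the step that seems to be using the disk.)
    png_buffer = io.BytesIO()
    pil_image.save(png_buffer, format='PNG', compress_level=0)

    env = os.environ.copy()
    env['OMP_THREAD_LIMIT'] = '1'

    p = subprocess.Popen([tesseractPath, '-', '-', '-l', 'por'],
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE,
                         env=env)
    text, stderr = p.communicate(png_buffer.getvalue())

    # Tesseract can write notices to stderr even when OCR succeeds,
    # so log them but still print the recognized text.
    if stderr:
        logger.warning('tesseract stderr: %s',
                       stderr.decode('utf-8', errors='replace'))
    sys.stdout.write(text.decode('utf-8').strip())


if __name__ == '__main__':
    main()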
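
To test the piping part separately from Pillow and Tesseract, here is a minimal sketch. It assumes a hypothetical helper named run_with_stdin and uses a small Python one-liner as a stand-in child process (ECHO_CMD, also made up for illustration) that copies stdin to stdout and writes a notice to stderr. The point is only that a non-empty stderr does not by itself mean the child failed, so checking the return code is probably the safer test.

```python
import subprocess
import sys


def run_with_stdin(cmd, data, env=None):
    """Send `data` (bytes) to the child's stdin.

    Returns (stdout_bytes, stderr_bytes, returncode). Hypothetical helper,
    not part of any library.
    """
    proc = subprocess.run(cmd, input=data,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE,
                          env=env)
    return proc.stdout, proc.stderr, proc.returncode


# Stand-in for the real OCR command: echoes stdin to stdout and
# writes a notice to stderr while still exiting with status 0.
ECHO_CMD = [sys.executable, '-c',
            'import sys; sys.stderr.write("notice\\n"); '
            'sys.stdout.buffer.write(sys.stdin.buffer.read())']


if __name__ == '__main__':
    out, err, code = run_with_stdin(ECHO_CMD, b'some image bytes')
    # Decide success from the exit status, not from stderr being empty.
    if code == 0:
        sys.stdout.write(out.decode('utf-8'))
    else:
        sys.stderr.write(err.decode('utf-8', errors='replace'))
```

If this round trip works, swapping ECHO_CMD for the tesseract command line from the script above should behave the same way, as long as the bytes sent are a complete image file.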


On Tuesday, February 14, 2023 at 4:11:13 PM UTC-3 Merlijn Wajer wrote:

>
> Hi,
>
> On 14/02/2023 19:10, Flávio. wrote:
> > Sorry, how can I do that?  I'm trying to send image binary data, not a 
> > path. The goal is to not write a file to disk and use only memory. Could 
> > you please write a code that sends the data (binary) to the stdin of 
> > tesseract? it can be in Python, Dart or Java :(  I've tried ChatGPT but 
> > it is wrong and gets lost
>
> Normally I'd say 'left as an exercise to the reader' but I so happen to 
> have a snippet around that ought to give you a general idea.
>
> This uses io.BytesIO in Python 3 to save the image (stream) to, it 
> contains an uncompressed PNG (compression will just slow things down). 
> It assumes that the variable "pil_image" contains a PIL.Image object.
>
> The code to use just one core in Tesseract is of course entirely 
> optional. I didn't *test* this to work (I modified it a bit - it works 
> in another setting), but it should work in theory:
>
> > with io.BytesIO() as output:
> >     pil_image.save(output, format='PNG', compress=0, compress_level=0)
> >     output.seek(0)
> > 
> >     # Let's just use one core in tesseract
> >     env = os.environ.copy()
> >     env['OMP_THREAD_LIMIT'] = '1'
> > 
> >     p = subprocess.Popen(['tesseract', '-', '-'],
> >                          stdin=subprocess.PIPE,
> >                          stdout=subprocess.PIPE,
> >                          stderr=subprocess.PIPE,
> >                          env=env)
> >     output, stderr = p.communicate(output.read())
> >     stderr = stderr.decode('utf-8')
> > 
> >     if stderr:
> >         logger.warning('tesseract_baselines stderr: %s', stderr)
>
>
> Regards,
> Merlijn
>
