Thanks, I'm still trying to figure it out. It seems that when the PIL image is saved, it unfortunately writes a temp file to disk. My goal is to avoid writing to disk, because this application will read a lot of files and I want to spare my SSD. My code receives byte data from a Dart program (I checked that it is correct). So far the .py file looks like this, but I'm not getting anything in return.

def main():
    base64_image = sys.stdin.read()
    image_bytes = base64.b64decode(base64_image)
    with io.BytesIO(image_bytes) as input:
        pil_image = Image.open(input)
        with io.BytesIO() as output:
            pil_image.save(output, format='PNG', compress=0, compress_level=0)  # using disk!
            output.seek(0)
            env = os.environ.copy()
            env['OMP_THREAD_LIMIT'] = '1'
            p = subprocess.Popen([tesseractPath, '-', '-','-l','por'],
                                 stdin=subprocess.PIPE,
                                 stdout=subprocess.PIPE,
                                 stderr=subprocess.PIPE,
                                 env=env)
            output, stderr = p.communicate(output.read())
            stderr = stderr.decode('utf-8')
            if stderr:
                logger.warning('tesseract_baselines stderr: %s', stderr)
            else:
                sys.stdout(output.encode('utf-8').strip())

if __name__ == '__main__':
    main()

On Tuesday, February 14, 2023 at 4:11:13 PM UTC-3 Merlijn Wajer wrote:

> Hi,
>
> On 14/02/2023 19:10, Flávio. wrote:
> > Sorry, how can I do that? I'm trying to send image binary data, not a
> > path. The goal is to not write a file to disk and use only memory. Could
> > you please write a code that sends the data (binary) to the stdin of
> > tesseract? it can be in Python, Dart or Java :( I've tried ChatGPT but
> > it is wrong and gets lost
>
> Normally I'd say 'left as an exercises to the reader' but I so happen to
> have a snippet around that ought to give you a general idea.
>
> This uses io.BytesIO in Python 3 to save the image (stream) to, it
> contains an uncompressed PNG (compression will just slow things down).
> It assumes that the variable "pil_image" contains a PIL.Image object.
>
> The code to use just one core in Tesseract is of course entirely
> optional.
> I didn't *test* this to work (I modified it a bit - it works
> in another setting), but it should work in theory:
>
> > with io.BytesIO() as output:
> >     pil_image.save(output, format='PNG', compress=0, compress_level=0)
> >     output.seek(0)
> >
> >     # Let's just use one core in tesseract
> >     env = os.environ.copy()
> >     env['OMP_THREAD_LIMIT'] = '1'
> >
> >     p = subprocess.Popen(['tesseract', '-', '-'],
> >                          stdin=subprocess.PIPE,
> >                          stdout=subprocess.PIPE,
> >                          stderr=subprocess.PIPE,
> >                          env=env)
> >     output, stderr = p.communicate(output.read())
> >     stderr = stderr.decode('utf-8')
> >
> >     if stderr:
> >         logger.warning('tesseract_baselines stderr: %s', stderr)
>
> Regards,
> Merlijn
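
For comparison, here is a minimal sketch of how the script might be restructured so that the OCR text actually reaches stdout. It is an outline under assumptions, not a confirmed fix: the helper names to_png_bytes and run_with_stdin are made up for illustration, 'tesseract' is assumed to be on PATH, and Pillow is assumed to be installed. Two things differ from the script above: the subprocess output is already bytes, so it is written to sys.stdout.buffer instead of being encoded and passed to sys.stdout, and the text is written even when stderr is non-empty, since Tesseract may print non-fatal notices there.

```python
import base64
import io
import os
import subprocess
import sys


def to_png_bytes(image_bytes):
    """Re-encode image bytes as an uncompressed PNG, using in-memory buffers only."""
    from PIL import Image  # imported lazily; assumes Pillow is installed

    with io.BytesIO(image_bytes) as src, io.BytesIO() as dst:
        # Image.open is lazy, so save while the source buffer is still open.
        Image.open(src).save(dst, format='PNG', compress_level=0)
        return dst.getvalue()


def run_with_stdin(cmd, data, extra_env=None):
    """Pipe `data` to `cmd` on stdin and return (stdout_bytes, stderr_text)."""
    env = os.environ.copy()
    env.update(extra_env or {})
    proc = subprocess.run(cmd, input=data,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE,
                          env=env)
    return proc.stdout, proc.stderr.decode('utf-8', errors='replace')


def main(tesseract_cmd='tesseract', lang='por'):
    # The parent process is assumed to send the image as base64 text on stdin.
    image_bytes = base64.b64decode(sys.stdin.read())
    png = to_png_bytes(image_bytes)

    # '-' '-' asks tesseract to read the image from stdin and write text to stdout.
    out, err = run_with_stdin([tesseract_cmd, '-', '-', '-l', lang], png,
                              {'OMP_THREAD_LIMIT': '1'})
    if err:
        # Report stderr separately, but do not treat it as fatal.
        print(err, file=sys.stderr)

    # `out` is bytes, so write it to the binary buffer of stdout.
    sys.stdout.buffer.write(out.strip())
    sys.stdout.flush()
```

Call main() under the usual if __name__ == '__main__': guard. io.BytesIO is an in-memory buffer, so nothing in this sketch writes the image to disk on the Python side.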