Re: [tesseract-ocr] IF I could make .unicharset by box/tif pairs instead of fonts files by tesstrain.sh?

2018-08-28 Thread WangSiyuan
Hey shree, thank you for the reply. I had noticed the /tmp directory, and I will try this new flag to see how the font files are turned into box/tiff pairs. On Tuesday, August 28, 2018 at 1:49:40 PM UTC+8, shree wrote: > > When using tesstrain.sh, you can add --save_box_tiff to the command line. > > Original tes

RE: [tesseract-ocr] Tesseract 3.x multiprocessing weird behaviour

2018-08-28 Thread Adrian Owen
When multiprocessing using V4 (and TessAPI), I had to make multiple copies of tessdata and give each worker its own copy. Now it works okay. Hope this is helpful. From: tesseract-ocr@googlegroups.com [mailto:tesseract-ocr@googlegroups.com] On Behalf Of ignas...@gmail.com Sent: 28 Aug
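The workaround described in this reply (one tessdata copy per worker) could be prepared along the following lines. This is only a minimal sketch, not the poster's actual code: the function name `make_worker_tessdata`, the `worker_N/tessdata` layout, and the use of a temporary parent directory are all assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def make_worker_tessdata(source_dir, worker_count, base_dir=None):
    """Copy source_dir once per worker and return the list of copies.

    Hypothetical helper: the name and directory layout are illustrative only.
    """
    source = Path(source_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"tessdata directory not found: {source}")
    # Keep every copy under one parent so they can be removed together later.
    root = Path(tempfile.mkdtemp(prefix="tessdata_workers_", dir=base_dir))
    copies = []
    for index in range(worker_count):
        target = root / f"worker_{index}" / "tessdata"
        # copytree creates the missing parent directories of the target.
        shutil.copytree(source, target)
        copies.append(target)
    return copies
```

Each worker would then be started with its own directory from the returned list, for example by passing it as the data path when the worker initializes its Tesseract API instance. How that path is passed depends on the binding in use, so that part is left out here.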

Re: [tesseract-ocr] What i need to do fine tuning for only numbers and specific font?

2018-08-28 Thread Soumik Ranjan Dasgupta
Hey Yasin, sorry to reply so late. As far as I know, Tesseract doesn't work on macOS yet. Maybe you can install a Linux environment inside a VM and make do with it? No, you don't have to create box files manually; tesstrain.sh will do that for you. In fact, it will take care of the entire training

[tesseract-ocr] OCR images with arbitrary foreign language text

2018-08-28 Thread loiodice
Is anyone aware of best practices for recognizing text in an image which could be in English or any other language? - Is configuring Tesseract with all 100+ supported trained datasets and just letting it figure out the best language dataset to use an option? Does anybody have experience w