Re: [tesseract-ocr] Not getting results with numbers and currency simbols in tables

2019-03-24 Thread Zdenko Podobny
Tesseract is an OCR library, i.e. the user is responsible for image preprocessing. Zdenko On Sun, 24 Mar 2019 at 4:12, wrote: > Hi, I am confused about why upscaling works. Actually, Tesseract > also has a step that prescales the image to a height of 36 pixels. > > On Monday, July 30, 2018 at 11:19:23 PM UTC+8, Emili

Re: [tesseract-ocr] Not getting results with numbers and currency simbols in tables

2019-03-23 Thread kotomi . niu
Hi, I am confused about why upscaling works. Actually, Tesseract also has a step that prescales the image to a height of 36 pixels. On Monday, July 30, 2018 at 11:19:23 PM UTC+8, Emiliano Isaza Villamizar wrote: > > Lorenzo, Thank you so much for your help. I did everything step by step > and got a very good resu

Re: [tesseract-ocr] Not getting results with numbers and currency simbols in tables

2018-10-15 Thread Lorenzo Bolzani
Just a small note (in case someone lands on this thread): I recently found out that PSM 7 and others work better than 13. See: https://github.com/tesseract-ocr/tesseract/issues/1778#issuecomment-429527692 On Tue, 31 Jul 2018 at 11:30, Lorenzo Bolzani < l.bolz...@gmail.com> wrote
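For readers who want to try the page segmentation modes mentioned here, the following is a minimal sketch and not code from the thread. The helper names `build_tesseract_config` and `ocr_line` are hypothetical; the flag syntax follows the pytesseract call quoted in the first message of this thread. In Tesseract, PSM 7 treats the image as a single text line and PSM 13 is the raw-line mode.

```python
def build_tesseract_config(psm=7, oem=1, whitelist=None):
    """Build the config string passed to pytesseract (hypothetical helper)."""
    parts = [f"--psm {psm}", f"--oem {oem}"]
    if whitelist:
        # Restricts recognition to the given characters, as in the first message.
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def ocr_line(image, psm=7):
    """Run OCR on one cropped text line; assumes pytesseract and Tesseract are installed."""
    # Imported here so the config helper above works without pytesseract.
    import pytesseract

    return pytesseract.image_to_string(
        image, lang="eng", config=build_tesseract_config(psm=psm)
    )
```

Calling `ocr_line(img, psm=7)` and `ocr_line(img, psm=13)` on the same cropped cell is one way to compare the two modes on your own data.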

Re: [tesseract-ocr] Not getting results with numbers and currency simbols in tables

2018-07-31 Thread Lorenzo Bolzani
I'm happy to hear that and thank you for letting me know. I was wondering if the instructions were just a mess or too long :) Bye Lorenzo 2018-07-30 17:19 GMT+02:00 Emiliano Isaza Villamizar : > Lorenzo, Thank you so much for your help. I did everything step by step > and got a very good resul

Re: [tesseract-ocr] Not getting results with numbers and currency simbols in tables

2018-07-30 Thread Emiliano Isaza Villamizar
Lorenzo, Thank you so much for your help. I did everything step by step and got a very good result. I think what helped me most was upscaling the images. The code I wrote is in Python and is the following, in case anyone is following the thread: *import PIL* *from PIL import Image* *im = Image.open(im
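The code in this message is cut off in the archive, so the following is only a sketch of the upscaling idea with Pillow, not the poster's actual code. The function names, the default factor of 2.0, and the choice of the LANCZOS filter are all assumptions made for illustration.

```python
def upscaled_size(width, height, factor=2.0):
    """Return the (width, height) of an image enlarged by `factor`."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    # Never return a zero-sized dimension, even for very small factors.
    return max(1, round(width * factor)), max(1, round(height * factor))


def upscale_image(src_path, dst_path, factor=2.0):
    """Enlarge the image at src_path, save it to dst_path, return the new size."""
    # Pillow is imported here so the size helper above works without it.
    from PIL import Image

    with Image.open(src_path) as im:
        new_size = upscaled_size(im.width, im.height, factor)
        # LANCZOS is an assumed choice; BICUBIC is a common alternative.
        im.resize(new_size, resample=Image.LANCZOS).save(dst_path)
    return new_size
```

A call such as `upscale_image("table.png", "table_2x.png", factor=2.0)` (hypothetical file names) writes the enlarged copy, which can then be passed to pytesseract. The best factor depends on the source resolution, so it is worth trying a few values.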

Re: [tesseract-ocr] Not getting results with numbers and currency simbols in tables

2018-07-26 Thread Lorenzo Bolzani
First, read this: "Fine Tuning for ± a few characters" Then check the data/unicharset file to see if everything is ok, if there are all the characters you want. Then, 15000 iterations are

[tesseract-ocr] Not getting results with numbers and currency simbols in tables

2018-07-25 Thread Emiliano Isaza Villamizar
Hello, I'm trying to train Tesseract to accurately extract information from a table. Initially, when running with pytesseract: *pytesseract.image_to_string(img, lang='eng', config='--psm 11 --oem 1 -c tessedit_char_whitelist=0123456789')* I get these results: ground truth