[tesseract-ocr] Request to participate in the doctoral study.

2018-10-27 Thread Sushant Mishra
Dear Colleagues, Please fill out the questionnaire: https://docs.google.com/forms/d/1bhsgh3UbrNzU6C8YZsmtktaX3bMRZIPs6El0kVrpW5k/edit?c=0&w=1 The study pertains to the "Mann Ki Baat" programme aired by the Prime Minister, Narendra Modi. The anticipated time for completion of this survey is less

[tesseract-ocr] Re: Unable to identify image for number 5 using Eng trained data

2018-10-27 Thread vinaybabu2909
Here is another image where the text is skewed and Tesseract fails to identify it. On Saturday, October 27, 2018 at 12:22:45 PM UTC+5:30, vinayb...@gmail.com wrote: > > I am using Pytesseract to recognise an image of the number 5 and I'm stunned > that even after applying various filters like Gaussia

Re: [tesseract-ocr] Re: Unable to identify image for number 5 using Eng trained data

2018-10-27 Thread Vinay Babu
Well, it doesn't seem to be a problem with font training. I tried capturing the same image without skew and it worked perfectly. Not sure why Tesseract doesn't work with slightly skewed text in images. On Sat, Oct 27, 2018 at 5:22 PM Vinod Gattani wrote: > It gave "|" as text. > > When re

Re: [tesseract-ocr] Re: Unable to identify image for number 5 using Eng trained data

2018-10-27 Thread Vinod Gattani
It gave "|" as text. When resized to 50*50, the text is "N\". You should check whether the font used in the image is among the fonts on which the English language data was trained. Thanks On Sat, Oct 27, 2018 at 3:46 PM wrote: > Can you try this new attached image for Alphabet "M" ? > > On Saturday, October 2

Re: [tesseract-ocr] Re: Retrain tesseract 4 model from real image (not from text file and tesstrain.sh)

2018-10-27 Thread Lorenzo Bolzani
Check the unicharset file to see if all the characters you want to recognize are there:

    combine_tessdata -u trained_model.traineddata output_dir
    cat output_dir/*unicharset

Otherwise you need to merge the old one with the new one before training. This is how ocrd-train
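The unicharset check described in this message can also be scripted. The following is a minimal sketch, not part of the original post: the function names `unicharset_symbols` and `missing_characters` are hypothetical, and it assumes the usual unicharset layout (an entry count on the first line, then one entry per line whose first space-delimited field is the symbol, with `NULL` standing for the space character). The sample text is abbreviated and made up for illustration. Entries can be longer than one character, so a per-character comparison is only a rough check.

```python
def unicharset_symbols(unicharset_text):
    """Collect the symbols listed in the text of a unicharset file.

    Assumption: line 1 is the entry count; every later line starts
    with the symbol, followed by its properties.
    """
    symbols = set()
    for line in unicharset_text.splitlines()[1:]:
        if not line.strip():
            continue
        token = line.split(" ", 1)[0]
        # Assumption: the space character is stored as the literal "NULL".
        symbols.add(" " if token == "NULL" else token)
    return symbols


def missing_characters(wanted, unicharset_text):
    """Return the characters of `wanted` that have no unicharset entry."""
    return sorted(set(wanted) - unicharset_symbols(unicharset_text))


# Abbreviated, made-up sample; real entries carry more property fields.
SAMPLE = "4\nNULL 0 Common 0\n5 8 Common 1\nM 5 Latin 2\n| 10 Common 3\n"

if __name__ == "__main__":
    print(missing_characters("5M7", SAMPLE))
```

Any character reported as missing would need to be added by merging unicharsets before training, as the message goes on to say.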

[tesseract-ocr] Re: Unable to identify image for number 5 using Eng trained data

2018-10-27 Thread vinaybabu2909
Can you try this new attached image for the letter "M"? On Saturday, October 27, 2018 at 12:22:45 PM UTC+5:30, vinayb...@gmail.com wrote: > > I am using Pytesseract to recognise an image of the number 5 and I'm stunned > that even after applying various filters like GaussianBlur and Threshold > an

[tesseract-ocr] Catalan government: Withdraw the €90,001 fine against Josep Pamies

2018-10-27 Thread joan . inglada
Hello. I have just signed the petition "Gobierno catalán: Retirar multa de 90.001 € a Josep Pamies" (Catalan government: Withdraw the €90,001 fine against Josep Pamies) and thought you might be interested. We are trying to reach 16,243 signatures and need all the support we can get. You can read more and sign the petition here: https://chn.ge/2O81QvO Thank

Re: [tesseract-ocr] Unable to identify image for number 5 using Eng trained data

2018-10-27 Thread Vinod Gattani
I used this command:

    tesseract five_filter_5.jpg ocr.txt --oem 1 --psm 6 -l eng

I used "eng.traineddata" from the tessdata_best repo. It gave "5" in ocr.txt. On Sat, Oct 27, 2018 at 12:22 PM wrote: > I am using Pytesseract to recognise an image of the number 5 and I'm stunned > that even after a
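For anyone who wants to reproduce this from Python without Pytesseract, here is a rough sketch that builds the same command line and runs it. The helper names `build_tesseract_cmd` and `run_ocr` are hypothetical. The flags are the ones quoted in the message. Tesseract treats its second argument as an output base and appends `.txt`, so the sketch passes `ocr` rather than `ocr.txt`.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_cmd(image, output_base, lang="eng", oem=1, psm=6):
    """Return the argument list for the command quoted in the message."""
    return [
        "tesseract", str(image), str(output_base),
        "--oem", str(oem),   # 1 selects the LSTM engine
        "--psm", str(psm),   # 6 treats the image as a single block of text
        "-l", lang,
    ]


def run_ocr(image, output_base="ocr", **options):
    """Run Tesseract and return the recognised text.

    Returns None when no tesseract binary is found on PATH.
    """
    if shutil.which("tesseract") is None:
        return None
    subprocess.run(build_tesseract_cmd(image, output_base, **options), check=True)
    # Tesseract writes its result to "<output_base>.txt".
    return Path(f"{output_base}.txt").read_text(encoding="utf-8").strip()


if __name__ == "__main__":
    print(build_tesseract_cmd("five_filter_5.jpg", "ocr"))
```

Whether this reproduces the "5" reported above depends on the image and on having the tessdata_best `eng.traineddata` installed where Tesseract looks for it.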