What command did you use? Difficult to help without seeing what training data you used.
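
While you gather that, one thing worth ruling out is the problem that shows up in the quoted thread below: ground-truth text that does not match the image because it was generated from a file name carrying a -2 or -3 duplicate suffix. Here is a minimal sketch of a generator that strips such a suffix. The function names are my own, and it assumes PNG files named after their content with the .gt.txt files written next to them, so adjust it to your own layout:

```python
import re
from pathlib import Path

# Assumption: a trailing "-2", "-3", ... only marks a duplicate image of the
# same text and is never part of the text shown in the image.
_DUPLICATE_SUFFIX = re.compile(r"-\d+$")


def ground_truth_for(image_path):
    """Return the text shown in the image, derived from its file name."""
    return _DUPLICATE_SUFFIX.sub("", Path(image_path).stem)


def write_ground_truth(image_dir):
    """Write <name>.gt.txt next to every PNG in image_dir and return the paths."""
    written = []
    for png in sorted(Path(image_dir).glob("*.png")):
        gt_file = png.with_name(png.stem + ".gt.txt")
        gt_file.write_text(ground_truth_for(png), encoding="utf-8")
        written.append(gt_file)
    return written
```

With this, 5,500,000-2.png gets the ground truth 5,500,000 instead of 5,500,000-2. If your file names can legitimately end in a hyphen and digits, this rule is wrong for your data and needs a different duplicate marker.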
On Sat, Oct 10, 2020, 09:31 Fazle Rabbi <akafazlera...@gmail.com> wrote:

> Hi. I have a similar goal in mind: fine-tuning the 'ben' traineddata
> with the pictures I am working with. Each picture is an ID, so the names
> of people have to be recognized correctly. I tried the (line image, ground
> truth) way of fine-tuning the traineddata with a very small number of
> images. The result was not good; I was kinda surprised, as I expected at
> least the performance of the default model. My question: if I have a
> substantial number of images and produce the line images and ground
> truth from them, will that help me improve the detection?
>
> On Sunday, September 27, 2020 at 9:21:17 PM UTC+6 Grad wrote:
>
>> @shree thank you for the advice, it was helpful. I managed to get
>> everything working satisfactorily: after adding additional training images,
>> I now get perfect results (446 pass, 0 fail)! Furthermore, these results
>> come with using the built-in "eng" model. I ended up not needing to
>> re-train or fine-tune Tesseract. The ticket was finding the magic sequence
>> of image processing steps to perform on my source images to prepare them
>> for input to Tesseract OCR.
>>
>> I have battled with this problem since your response and have come close
>> to giving up more than once, thinking that perhaps Tesseract simply isn't
>> up to the task. But the limited character set and the uniformity of the
>> character appearances kept me going -- there just had to be a way to make
>> this work. I'd love to document all the things I tried, and what results
>> they gave, but there is just too much. A quick summary will have to suffice.
>>
>> *What got me close but ultimately didn't work*
>>
>>    - Resized my images so the text was 36px in height. I did this in
>>    Python using OpenCV and (wrongly I think) chose the cv2.INTER_AREA
>>    interpolation method.
>>    - Tried different values for MAX_ITERATIONS in tesstrain's Makefile,
>>    and got varied results but nothing perfect.
>>    - Downloaded
>>    https://github.com/Shreeshrii/tessdata_shreetest/blob/master/digits_comma.traineddata
>>    and used it for the START_MODEL of tesstrain's Makefile (also had to set
>>    TESSDATA for the Makefile)
>>    - Between these things, the best result I ever got was something like
>>    this (input on left, OCR output on right):
>>    21,485,000 -> 21,483,000
>>    21,875,000 -> 21,873,000
>>    24,995 -> 24,999
>>    5,450,000 -> 9,450,000
>>    591,958 -> 9591,958
>>    851 -> 8571
>>    851 -> 8571
>>    Pass: 428
>>    Fail: 7
>>    - So you can see, close, but still some pretty unforgivable errors
>>    (unforgivable to me due to the nature of my application -- these numbers
>>    need to be perfect)
>>
>> *What ultimately did work*
>>
>>    - In an act of desperation, and following a bit of a hunch, I
>>    abandoned trying to train/re-train/fine-tune, and just focused on getting
>>    perfect OCR on one of the images where it failed using the "eng" model
>>    - I chose this file, 1,000,000.png, which produced an empty string
>>    when run through Tesseract
>>    - I used GIMP on Windows and opened 1,000,000.png and began
>>    adjusting/tweaking/filtering the image in various ways, each time
>>    re-trying the OCR to see if the result changed. Using GIMP was crucial
>>    because it allowed me to iterate through trying different image
>>    processing techniques using a GUI, which was much quicker than doing
>>    the same thing in Python using OpenCV.
>>    - Once I found what worked, I implemented it in Python. The magic
>>    steps ended up being:
>>       1. Read the source image as color:
>>       image_to_ocr = cv2.imread(raw_image_file_name, cv2.IMREAD_COLOR)
>>       2. Use only the green channel of the source image. The numbers in
>>       my source images are mostly green tinted and I thought maybe this
>>       would help.
>>       This results in a grayscale image with a dark background and
>>       white text:
>>       b, image_to_ocr, r = cv2.split(image_to_ocr)
>>       3. Enlarge the image by 2x. This resulted in text that is ~20px in
>>       height, and I found this to be necessary and sufficient. I also found
>>       the use of cv2.INTER_CUBIC instead of cv2.INTER_AREA to be crucial
>>       here. I think the resizing (enlarging in my case) of the images was
>>       an absolute must-have. I'm really thankful I posted here and really
>>       thankful to @shree for that little nugget of insight.
>>       image_to_ocr = cv2.resize(image_to_ocr, (image_to_ocr.shape[1] * 2,
>>       image_to_ocr.shape[0] * 2), interpolation = cv2.INTER_CUBIC)
>>       4. Invert the image so that the background is white and the text
>>       is black. I am not sure if this step was necessary.
>>       image_to_ocr = cv2.bitwise_not(image_to_ocr)
>>    - With these steps, 1,000,000.png OCR'd perfectly
>>    - I then re-ran my script to check accuracy on all 400+ source
>>    images, and got the perfect result. I was so nervous while the script was
>>    running; it prints out errors as it goes, and so many times before I'd run
>>    the script with eager anticipation that I'd finally gotten everything
>>    right, only to have an error appear. This time...it ran...seconds go
>>    by...more seconds go by...no errors...I can't look OMG...check back in 30
>>    seconds, 446 pass, 0 fail, I literally stood up and whooped and hollered
>>    with arms raised.
>>
>>
>> On Sunday, September 20, 2020 at 11:09:02 AM UTC-5 shree wrote:
>>
>>> Resize your images so that text is 36 pixels high. That's what is used
>>> for eng models.
>>>
>>> Since you are fine tuning, limit number of iterations to 400 or so (not
>>> 10000 which is default).
>>>
>>> Use debug_interval of -1 during training so that you can see the details
>>> per iteration.
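
An inline note on the resizing advice above: the 2x enlargement and the 36-pixel target are both just scale factors on the original text height, and cv2.resize wants its size as (width, height) while an image's shape is (rows, cols). A small sketch of that arithmetic follows. The helper name and the example crop dimensions are made up for illustration, and the text height has to be measured or estimated by you:

```python
def target_size(shape, text_height_px, target_text_height_px=36):
    """Return (width, height) that scales the text to the target height.

    shape is (rows, cols[, channels]) as in a NumPy image array. The result
    is ordered (width, height), the order cv2.resize expects for its size.
    """
    if text_height_px <= 0:
        raise ValueError("text_height_px must be positive")
    scale = target_text_height_px / text_height_px
    rows, cols = shape[0], shape[1]
    return (max(1, round(cols * scale)), max(1, round(rows * scale)))


# Hypothetical 16-row by 120-column crop whose digits are about 10 px tall.
print(target_size((16, 120), 10))      # scale 3.6, for 36 px text
print(target_size((16, 120), 10, 20))  # scale 2.0, like the 2x enlargement
```

The returned tuple can be passed as the size argument of cv2.resize together with interpolation=cv2.INTER_CUBIC, which is the interpolation that worked in the message above.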
>>>
>>> On Sun, Sep 20, 2020, 00:24 Grad <kes...@gmail.com> wrote:
>>>
>>>> I have fixed my ground-truth file creator script to eliminate the
>>>> badly-formed numbers and have re-run my experiment. Unfortunately, I am
>>>> still seeing really poor results (12 pass, 383 fail), even though the
>>>> training error rates appear to be much smaller this time around:
>>>>
>>>> At iteration 509/10000/10000, Mean rms=0.184%, delta=0.055%, char
>>>> train=0.344%, word train=2.5%, skip ratio=0%, New worst char error = 0.344
>>>> wrote checkpoint.
>>>>
>>>> Finished! Error rate = 0.308
>>>> lstmtraining \
>>>> --stop_training \
>>>> --continue_from data/swtor/checkpoints/swtor_checkpoint \
>>>> --traineddata data/swtor/swtor.traineddata \
>>>> --model_output data/swtor.traineddata
>>>> Loaded file data/swtor/checkpoints/swtor_checkpoint, unpacking...
>>>>
>>>> Full log of "make training" is attached.
>>>>
>>>> When I run Tesseract using the "eng" and "swtor" models on the training
>>>> images, I'm seeing the following types of results:
>>>>
>>>> "eng" model results for 638,997.png:
>>>>
>>>> > tesseract --psm 7 --oem 1 -c tessedit_char_whitelist=',0123456789' 638,997.png out
>>>> Tesseract Open Source OCR Engine v5.0.0-alpha.20200328 with Leptonica
>>>> Warning: Invalid resolution 0 dpi. Using 70 instead.
>>>> > cat .\out.txt
>>>> 638,997
>>>>
>>>> "swtor" model results for 638,997.png:
>>>>
>>>> > tesseract --tessdata-dir -l swtor --psm 7 --oem 1 -c tessedit_char_whitelist=',0123456789' 638,997.png out
>>>> Failed to load any lstm-specific dictionaries for lang swtor!!
>>>> Tesseract Open Source OCR Engine v5.0.0-alpha.20200328 with Leptonica
>>>> Warning: Invalid resolution 0 dpi. Using 70 instead.
>>>> > cat .\out.txt
>>>> 3,9,997
>>>>
>>>> In general, the digits are wrong more often, and there is a
>>>> proliferation of commas.
>>>>
>>>> Do any other ideas come to mind? I appreciate your help Shree!
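
An inline note on the pass/fail numbers: the thread never shows the checking script itself, so the following is only a sketch of how such a check could look, assuming (as described in the thread) that each image's file name is its ground truth. The evaluate function and its ocr parameter are my own names. ocr stands for any callable that takes an image path and returns the recognized text, for example a wrapper around the tesseract command line:

```python
from pathlib import Path


def evaluate(image_paths, ocr, expected_for=lambda p: Path(p).stem):
    """Compare OCR output with the expected text for each image.

    Returns (passes, failures), where failures is a list of
    (path, expected, actual) tuples.
    """
    passes, failures = 0, []
    for path in image_paths:
        expected = expected_for(path)
        actual = ocr(path).strip()  # drop the trailing newline from OCR output
        if actual == expected:
            passes += 1
        else:
            failures.append((str(path), expected, actual))
    return passes, failures


if __name__ == "__main__":
    # Stand-in OCR results for illustration; a real run would call Tesseract.
    fake_results = {"638,997.png": "3,9,997\n", "5,500,000.png": "5,500,000\n"}
    passes, failures = evaluate(fake_results, fake_results.get)
    for _, expected, actual in failures:
        print(f"{expected} -> {actual}")
    print(f"Pass: {passes}")
    print(f"Fail: {len(failures)}")
```

Keeping the OCR call behind a plain callable makes it easy to run the same check against "eng", a fine-tuned model, or a new preprocessing step and compare the counts.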
>>>>
>>>> On Saturday, September 19, 2020 at 12:12:19 PM UTC-5 Grad wrote:
>>>>
>>>>> If it turns out to be that simple, I will feel really relieved and
>>>>> really stupid at the same time. I cannot believe I didn't catch this
>>>>> before posting. Thank you for taking a look, I'll fix my ground-truth
>>>>> file creator script and try again.
>>>>>
>>>>> On Saturday, September 19, 2020 at 12:01:50 PM UTC-5 shree wrote:
>>>>>
>>>>>> You will get better results when you fix your training data (I
>>>>>> deleted all files with names ending in -2 and -3).
>>>>>>
>>>>>> Mean rms=0.145%, delta=0.046%, train=0.214%(1.01%), skip ratio=0%
>>>>>> Iteration 396: GROUND TRUTH : 5,500,000
>>>>>> File data/swtor-ground-truth/5,500,000.lstmf line 0 (Perfect):
>>>>>> Mean rms=0.145%, delta=0.046%, train=0.214%(1.008%), skip ratio=0%
>>>>>> Iteration 397: GROUND TRUTH : 2,000,000
>>>>>> File data/swtor-ground-truth/2,000,000.lstmf line 0 (Perfect):
>>>>>> Mean rms=0.145%, delta=0.045%, train=0.213%(1.005%), skip ratio=0%
>>>>>> Iteration 398: GROUND TRUTH : 6,435
>>>>>> File data/swtor-ground-truth/6,435.lstmf line 0 (Perfect):
>>>>>> Mean rms=0.145%, delta=0.045%, train=0.213%(1.003%), skip ratio=0%
>>>>>> Iteration 399: GROUND TRUTH : 3,750,000
>>>>>> File data/swtor-ground-truth/3,750,000.lstmf line 0 (Perfect):
>>>>>> Mean rms=0.144%, delta=0.045%, train=0.212%(1%), skip ratio=0%
>>>>>> 2 Percent improvement time=4, best error was 100 @ 0
>>>>>> At iteration 4/400/400, Mean rms=0.144%, delta=0.045%, char
>>>>>> train=0.212%, word train=1%, skip ratio=0%, New best char error = 0.212
>>>>>> wrote best model:data/swtor/checkpoints/swtor_0.212_4_400.checkpoint
>>>>>> wrote checkpoint.
>>>>>>
>>>>>> Iteration 400: GROUND TRUTH : 5,222,100
>>>>>> File data/swtor-ground-truth/5,222,100.lstmf line 0 (Perfect):
>>>>>> Mean rms=0.144%, delta=0.045%, train=0.212%(0.998%), skip ratio=0%
>>>>>> Iteration 401: GROUND TRUTH : 696,969
>>>>>> File data/swtor-ground-truth/696,969.lstmf line 0 (Perfect):
>>>>>> Mean rms=0.144%, delta=0.045%, train=0.211%(0.995%), skip ratio=0%
>>>>>> Iteration 402: GROUND TRUTH : 71,000,000
>>>>>> File data/swtor-ground-truth/71,000,000.lstmf line 0 (Perfect):
>>>>>> Mean rms=0.144%, delta=0.045%, train=0.211%(0.993%), skip ratio=0%
>>>>>> Iteration 403: GROUND TRUTH : 64,500
>>>>>> File data/swtor-ground-truth/64,500.lstmf line 0 (Perfect):
>>>>>> Mean rms=0.144%, delta=0.045%, train=0.21%(0.99%), skip ratio=0%
>>>>>> Iteration 404: GROUND TRUTH : 39,500,000
>>>>>> File data/swtor-ground-truth/39,500,000.lstmf line 0 (Perfect):
>>>>>> Mean rms=0.144%, delta=0.045%, train=0.21%(0.988%), skip ratio=0%
>>>>>> Iteration 405: GROUND TRUTH : 4,500,000
>>>>>> File data/swtor-ground-truth/4,500,000.lstmf line 0 (Perfect):
>>>>>> Mean rms=0.143%, delta=0.045%, train=0.209%(0.985%), skip ratio=0%
>>>>>> Iteration 406: GROUND TRUTH : 1,450,000
>>>>>>
>>>>>> On Sat, Sep 19, 2020 at 10:15 PM Shree Devi Kumar <shree...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> > Each of my PNG files has a file name that indicates ground truth,
>>>>>>> > and I have a little script that generates ground-truth TXT files
>>>>>>> > from the PNG file names.
>>>>>>>
>>>>>>> Please review your script. I notice a number of file names ending
>>>>>>> with -2. The gt.txt files for the same also contain -2 while the image
>>>>>>> only has the number.
>>>>>>>
>>>>>>> Example files attached.
>>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> ____________________________________________________________
>>>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to tesseract-oc...@googlegroups.com.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/70e5fed6-3035-4885-965c-0552560ef0f6n%40googlegroups.com
>>>> .