Hi, you did not give all the details, so I have to guess some of them:
1. I guess that you ran something like this:

$ tesseract binarized.jpg content -l deu

but created the box file with this command:

$ tesseract binarized.jpg binarized makebox

If so, the difference is in the language file used: the second command has no -l option, so tesseract falls back to its default language (eng).

2. I ran OCR with the eng language file and then with the deu language file. With eng the URL was OK (see binarized-eng), but some German words were not correct. It looks like the "problem" is in the German language file (dictionary?) and not in the tesseract library. This is just a quick opinion, so maybe I am wrong.

As a workaround you can combine the English and German language files in tesseract 3.02 (see the result in binarized-eng_deu.txt):

$ tesseract binarized.jpg binarized-eng_deu -l eng+deu

Zdenko

On 17.03.2012 21:40, Renard Wellnitz wrote:
> Hi all,
>
> first of all I would like to express my heartfelt thanks for this great
> piece of software which tesseract is. :-)
>
> I am currently making an OCR Android app with tesseract and the
> results I have got so far are very good.
>
> But I encountered a strange issue with tesseract 3.01 and also 3.02.
> When running tesseract on the supplied file, tesseract fails to correctly
> recognize some characters. Especially in line 8 it gives "wwwxegio-bahnde"
> instead of "www.regio-bahn.de".
> I then ran the makebox command to see what was going on. To my surprise I
> found that the boxes and characters were all 100% correct!
> I guess there is no easy fix or config value that I can experiment with?
>
> Cheers
> Renard
>
>
> On Thursday, 2 February 2012 19:55:57 UTC+1, Ray Smith wrote:
>> Tesseract 3.02 is now available in svn for preliminary testing, currently
>> Linux-only.
>>
>> There are now 65 languages and some big improvements in layout analysis
>> and character accuracy.
>> This version will with luck make it into Ubuntu LTS Precise Pangolin, so
>> please test to see if your favorite issue is resolved.
>>
>> Thanks and enjoy!
>>
>> Ray.

Haben Sie noch Fragen? Unsere Mitarbeiter/-innen helfen lhnen gem weiter: KundenCenter Regiobahn An der Regiobahn 13 40822 Mettmann Telefon: 02104 305-400 Telefax: 02104 305-403 www.regio-bahn.de i...@regio-bahn.de Schlaue Nummer 0 180 3/50 40 30 (Festnetzpreis 0,09 €/Minute; mobil max. 0,42 €/Minute) Gute Fahrtwtmscht lhnen lhre REGIOBAHN
Haben Sie noch Fragen? Unsere Mitarbeiter/-innen helfen Ihnen gern weiter: KundenCenter Regiobahn An der Regiobahn 13 40822 Mettmann Telefon: 02104 305-400 Telefax: 02104 305-403 www.regio-bahn.de i...@regio-bahn.de Schlaue Nummer 0 180 3/50 40 30 (Festnetzpreis 0,09 €/Minute; mobil max. 0,42 €/Minute) Gute Fahrt wünscht Ihnen Ihre REGIOBAHN
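
To see exactly which words differ between two OCR results such as the ones above, a word-by-word diff can help. This is only a minimal sketch: the helper name ocr_word_diff is made up for this example and is not part of tesseract, and it assumes the results were saved as plain text files (for instance binarized-eng.txt and binarized-eng_deu.txt).

```shell
#!/bin/sh
# ocr_word_diff: hypothetical helper, not a tesseract tool.
# Prints the words that differ between two OCR result files.
# Usage: ocr_word_diff FILE_A FILE_B
# Exit status is that of diff: 0 = no differences, 1 = differences found.
ocr_word_diff() {
    words_a=$(mktemp) || return 2
    words_b=$(mktemp) || return 2

    # Put one word per line so that diff reports single words
    # instead of whole lines of text.
    tr -s '[:space:]' '\n' < "$1" > "$words_a"
    tr -s '[:space:]' '\n' < "$2" > "$words_b"

    diff "$words_a" "$words_b"
    status=$?

    rm -f "$words_a" "$words_b"
    return "$status"
}
```

After sourcing the function, a call such as ocr_word_diff binarized-eng.txt binarized-eng_deu.txt lists the words from the first file with a "<" prefix and the corresponding words from the second file with a ">" prefix.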