Sorry, I typed the version numbers incorrectly. I actually downloaded the most recent version of the source code (3.03) from Google Drive.
I tested the same image against 3.02 and 3.03.

On Tuesday, March 11, 2014 3:39:06 AM UTC-4, zdenop wrote:
>
> Did you think about using some recent version of tesseract ;-) ?
> Versions 3.0.2 and 3.0.3 seem too old for me (even I am not aware that
> they we released). Current stable version is 3.02.02 and next release will
> be 3.03.
>
> Zdenko
>
>
> On Tue, Mar 11, 2014 at 7:41 AM, <[email protected]> wrote:
>
>> Just made a fresh install of version 3.0.3 and decided to test it out on
>> one the image of this test page
>> http://iupr1.cs.uni-kl.de/~tmb/ocropus-results/g1000p22/
>>
>> Basically, I got better accuracy with 3.0.2 than with 3.0.3 using the
>> default english model.
>>
>>
>> For the test, I'm using this image (single page)
>> http://iupr1.cs.uni-kl.de/~tmb/ocropus-results/g1000p22/0002.nrm.png
>>
>> As you can see on the image below, there really isn't any skew or border.
>> Also changed the dpi from 72 to 300
>>
>>
>> <http://papyrus.jolome.com/300.png>
>>
>>
>> When I run tesseract 3.0.3 on this image using various psm, the most
>> accurate result is :
>>
>> =========
>>
>> POPULAR TALES
>>
>> 0'
>>
>> THE WEST HIGHLANDS.
>>
>> __._
>>
>> XVIII.
>>
>> THE CHEST.
>> Mlanhanhy.
>>
>> BEFOREthhthmwnflnglndhewinhed
>>
>> bloehil son with nwifo bofomhonhould depart
>> Hinnnnidhehadbeﬁorgofornwife;mdhognve
>> himhnlfuhnndmdponndltogothcr. Homtfor-
>> wudtholengthofuday,lndwhanthonightumoho
>> mtinblhahlrytomyinit Homtdmtoa
>> chmbarwithngoodﬁninfrontofhim;mdwhenhe
>> hadgoflenmuhthomofthohouumtdownto
>> hikme Hobldthomofflnhonnmojour-
>> noyonwhichhom Thammofthehountold
>> himhneednotgofuflha; “chasm-little
>> homoppodbtohhdupingchmhugthtdnmm
>> ofthohouuhndthmﬁnodmghmnndifhowonld
>> undinthewindowofhilchmbuinthomoming,
>> Mbowouldmonom-noflmooming
>> hot-alt Thattheymulllihuchothmnnd
>> hoouldnotdisﬁngniahomﬁomhoﬁot,but
>>
>> voun. n \-
>>
>> Eii‘
>>
>> (r.- if-
>>
>>
>>
>>
>> ========
>>
>> In tesseract 3.0.2, I get a more accurate running the same command:
>>
>> POPULAR
>> THE
>> TALES
>> WEST HIGHLANDS.
>>
>> THE CHEST.
>> From In MecGeechy, Ieley.
>> B‘i§°.:‘.‘%..:“;:.*::a: .r::..'..‘::"..%..:“.‘.‘..,.':r.18
>> llie eon said he had better go for 5 wife ; and he gave
>> him lmlf a hundred pounds to get her. He went for-
>> wen! the length of is day, and when the night came he
>> wentintoehoetelrytoeteyinit. Hewentdowntoe
>> chamber with e good fire in front of him; and when he
>> bed gotten meet, the men of the house went down to
>> telk to him. He told the men of the house the jour-
>> no] on which he was The men of the house told
>> himheneednot go further; that therewee e little
>> house opposite to his sleeping chsmber; that the men
>> of the house had three fine daughters; end if he would
>> stand in the window of his chamber in the morning,
>> that he would see one after enother coming ‘to dreee
>> herself. That they were ell like eech other, and that
>> In could not one from the other, but that
>>
>>
>> --
>> --
>> You received this message because you are subscribed to the Google
>> Groups "tesseract-ocr" group.
>> To post to this group, send email to [email protected]
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en
>>
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>
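To put a rough number on "better accuracy" instead of eyeballing the two outputs, something like the sketch below might help. It scores one line from each of the two OCR results in this thread against a reference using Python's standard `difflib`. The reference string is my own hand-typed guess at what the page says (an assumption, not official ground truth), `char_accuracy` is a made-up helper name, and the similarity ratio is only a stand-in for a proper character error rate.

```python
import difflib


def char_accuracy(reference: str, ocr_output: str) -> float:
    """Return a similarity ratio in [0, 1] between reference text and OCR output."""
    # Collapse runs of whitespace so line-break differences are not counted as errors.
    ref = " ".join(reference.split())
    hyp = " ".join(ocr_output.split())
    # autojunk=False disables difflib's heuristic that ignores frequent characters.
    return difflib.SequenceMatcher(None, ref, hyp, autojunk=False).ratio()


# Hand-typed reference for one line of the page (assumption, not verified ground truth).
reference = "stand in the window of his chamber in the morning,"

# The corresponding line as it appears in each OCR result quoted in this thread.
out_302 = "stand in the window of his chamber in the morning,"
out_303 = "undinthewindowofhilchmbuinthomoming,"

score_302 = char_accuracy(reference, out_302)
score_303 = char_accuracy(reference, out_303)

print(f"3.02 similarity: {score_302:.3f}")
print(f"3.03 similarity: {score_303:.3f}")
```

Running the same comparison over the full page text for each psm setting would show whether the gap holds for the whole page or only for some layout modes.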

