How to run UNLV tests on Tesseract

2010-07-20 Thread mahmod Fathi
How do I build the ftk-1.0 binaries on Mac OS? I used precompiled Linux binaries of the analytic tools (http://www.isri.unlv.edu/ISRI/OCRtk#OCR_Frontiers_Toolkit) on Mac OS, and I always get the message "awk: division by zero source line number 2". I need to know how to test Tesseract 2.04 on Mac OS.

Re: Tesseract Reading Issue

2010-07-20 Thread Austin Henderson
As a developer I am cautious about estimating the amount of time a code change will take. I am thrilled to have the code and look forward to enhancements as they are ported to .NET environments. For now I am cleaning up the image in preprocessing steps to remove blobs that are inconsistent with others

Re: Tesseract Reading Issue

2010-07-20 Thread Jimmy O'Regan
On 20 July 2010 02:52, Austin Henderson wrote: > As a developer I am cautious to estimate the amount of time a code change > will take. :D I like you a lot right now. > I am thrilled to have the code and look forward to enhancements > as they are ported to .net environments. Nobody has mentione

Problem using DangAmbigs and user-words files

2010-07-20 Thread caro
I try to complete these files, after looking at errors appearing during the recognition. Typically, I have the following error which occurs very often: tesseract recognizes FESLLTS instead of RESULTS So I had in the file user-word: RESULTS and in the file DangAmbigs: 2 F E 2 R E 2 L L 2 U L 1 F 1

Re: Problem using DangAmbigs and user-words files

2010-07-20 Thread Jimmy O'Regan
On 20 July 2010 15:18, caro wrote: > I try to complete these files, after looking at errors appearing > during the recognition. > Typically, I have the following error which occurs very often: > tesseract recognizes FESLLTS instead of RESULTS > > So I had in the file user-word: RESULTS > and in th

Re: Tesseract Reading Issue

2010-07-20 Thread Taxman
"This bad problem is just about fixing Tesseract to accept the reality that not all text have the same height for all letters because not everything is a book." Only some books have uniform text sizes. Textbooks have a large degree of variability in text size within the same page and probably caus

Re: Tesseract Reading Issue

2010-07-20 Thread patrickq
As I said, we just need Jimmy to find 4-5 hours of his free time to knock this one out :-)! On Jul 20, 11:01 am, Taxman wrote: > "This bad problem is just about fixing Tesseract to accept the reality > that not all text have the same height for all letters because not > everything is a book." > >

Generic comments and questions

2010-07-20 Thread CraigLandrum
We are using tesseract-ocr (2.04) as the built-in OCR option for our document management and workflow client software for both Mac and Windows. The client software supports high-speed scanning from Fujitsu document scanners (and others) to TIFF and PDF documents as well as single-image formats. Im

Re: Generic comments and questions

2010-07-20 Thread Jimmy O'Regan
On 20 July 2010 20:40, CraigLandrum wrote: > - The various "dawg" and tessconfig files ("batch", etc) appear to > have only slight effect on the output - probably because we are not > using them correctly. No. I answered basically this question once already today, and I don't feel like repeating