Re: OCR questions

Rodolfo Medina Sun, 22 Jul 2007 02:41:56 -0700

Rodolfo Medina wrote:

> I tried gocr and the result was quite miserable.  Then I tried with MS Windows
> and it was almost perfect.  Somewhere in the web I read that OCR software under
> Linux is very poor at the moment and that it's better to use MS Windows for
> that: unfortunately my test seems to confirm that.  What do you Debian listers
> think?




[EMAIL PROTECTED] (Bob Proulx) writes:

> I think you should check out these articles.
>
>   http://google-code-updates.blogspot.com/2006/08/announcing-tesseract-ocr.html
>
>   http://code.google.com/p/tesseract-ocr/
>
>   http://www.linux.com/articles/57222
>
>   http://sourceforge.net/projects/tesseract-ocr



I tried tesseract, and I'm sorry to say that, at least with Italian, it works
much better than gocr but still noticeably worse than the MS Windows software
that came with my Canon CanoScan LiDE 25 scanner.  I don't like that, but
unfortunately it is true in my experience.

Here is the installation procedure I followed, building from source:

 1) from: `http://code.google.com/p/tesseract-ocr/downloads/list' I downloaded
    the files tesseract-2.00.tar.gz and tesseract-2.00.ita.tar.gz, put them in
    my ~/tmp directory and unpacked them the usual way: tar xzvf <package-name>;

 2) I copied all the files from ~/tmp/tessdata into
    ~/tmp/tesseract-2.00/tessdata;

 3) $ cd ~/tmp/tesseract-2.00
    $ ./configure
    $ make
    # make install
    $ tesseract document.tif document -l ita
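
The steps above could be scripted roughly as follows.  This is only a sketch:
the helper names `run' and `build_tesseract' and the DRY_RUN variable are made
up for illustration, it assumes both tarballs are already in the given
directory, and it relies on GNU tar's -C option.  By default it only prints the
commands instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Steps 1-3 from the list above, for a directory ($1) that already holds
# tesseract-2.00.tar.gz and tesseract-2.00.ita.tar.gz.
build_tesseract() {
    workdir=$1
    # Step 1: unpack the source and the Italian language data.
    run tar xzvf "$workdir/tesseract-2.00.tar.gz" -C "$workdir"
    run tar xzvf "$workdir/tesseract-2.00.ita.tar.gz" -C "$workdir"
    # Step 2: copy the language files into the source tree's tessdata directory.
    run cp -r "$workdir/tessdata/." "$workdir/tesseract-2.00/tessdata/"
    # Step 3: configure and build; "make install" is left out on purpose
    # because it has to be run as root.
    run sh -c "cd '$workdir/tesseract-2.00' && ./configure && make"
}
```

To actually run the commands rather than print them, call it as
`DRY_RUN=0 build_tesseract "$HOME/tmp"', then do `make install' as root and
run tesseract as in step 3.  The recognized text ends up in document.txt.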

Bye,
Rodolfo


