Re: OCR questions (was: How to acquire text so to edit it?)

Andrew Sackville-West Sat, 21 Jul 2007 11:57:38 -0700

On Sat, Jul 21, 2007 at 08:10:27PM +0200, Bob Proulx wrote:
> Rodolfo Medina wrote:
> > Somewhere in the web I read that OCR software under Linux is very
> > poor at the moment and that it's better to use MS Windows for that:
> > unfortunately my test seems to confirm that.  What do you Debian
> > listers think?
> 
> I think you should check out these articles.
> 
>   http://google-code-updates.blogspot.com/2006/08/announcing-tesseract-ocr.html
> 
>   http://code.google.com/p/tesseract-ocr/
> 
>   http://www.linux.com/articles/57222


Hey, looks pretty good. The linux.com article complains about having
to manually crop out photos and about the limited set of file formats
it accepts (TIFF only), but those are pretty minor. It should be
fairly simple to put wrappers around it that clean up the images and
convert their file format, to get data into the thing without having
to grok the OCR code. IOW, I would expect to see this get used as a
backend in various other existing graphics code bases to make OCR
really viable in OSS.
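
To give an idea of the kind of wrapper meant here, a rough sketch
follows. It is only an illustration under assumptions: it supposes
that ImageMagick's convert and the tesseract command-line tool are
both installed and on the PATH, and the function names build_commands
and ocr_image are made up for this example. It crops (optionally),
converts the input to an uncompressed TIFF, and then runs tesseract
on the result.

```python
import subprocess
from pathlib import Path


def build_commands(image, workdir, crop=None):
    """Return the two command lines (convert, then tesseract) for one image.

    Hypothetical helper: assumes ImageMagick's `convert` and the
    `tesseract` CLI are available on the PATH.
    """
    image = Path(image)
    tiff = Path(workdir) / (image.stem + ".tif")
    out_base = Path(workdir) / image.stem  # tesseract appends ".txt" itself

    convert_cmd = ["convert", str(image)]
    if crop is not None:
        # crop is an ImageMagick geometry string such as "800x600+10+20"
        convert_cmd += ["-crop", crop, "+repage"]
    # Write an uncompressed TIFF, the only format the article says is accepted
    convert_cmd += ["-compress", "none", str(tiff)]

    ocr_cmd = ["tesseract", str(tiff), str(out_base)]
    return convert_cmd, ocr_cmd


def ocr_image(image, workdir=".", crop=None):
    """Convert one image to TIFF, run OCR on it, and return the text."""
    convert_cmd, ocr_cmd = build_commands(image, workdir, crop)
    subprocess.run(convert_cmd, check=True)
    subprocess.run(ocr_cmd, check=True)
    return Path(ocr_cmd[2] + ".txt").read_text()
```

Calling ocr_image("page1.png", crop="800x600+10+20") would then
return the recognized text, provided both tools are present. Keeping
the command construction separate from the execution makes it easy to
inspect what would be run before actually running it.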

A
