On Sun, 3 Jan 2016 14:28:22 +0100 Joost Andrae <joost.and...@gmx.de> wrote:
> Hi there,
>
> I've just played with the open-source OCR engine
> https://code.google.com/p/tesseract-ocr/ and it seems to do its job
> very well on scanned bitmaps.
> As it comes with the Apache License 2.0 and is available as C++ source
> code, why not integrate its functionality into AOO, or build an
> extension that either connects its API to AOO or drives it through
> its command-line arguments?
>
> From my perspective both projects would benefit...
>
> Just my 2 EUR cents....
>
> Kind regards, Joost

My experience with it was that it was very accurate, perhaps close in accuracy to the best commercial products under Windows. I was undertaking a major OCR project (ebook preparation of two out-of-print 220-page books), and I found that using a scan and OCR application under Linux (Linux-Intelligent-Ocr-Solution) made more sense for a project of this size. I later fed the plain text files into OO Writer for detailed spellchecking and reformatting.

I doubt that full integration with OpenOffice would be a good idea. An extension might be possible, although I doubt its general usefulness would be worth the effort of writing it.

-- 
Rory O'Farrell <ofarr...@iol.ie>