Re: Use of scanned documents for text extraction and indexing

2009-02-27 Thread Bastian Buch
[...] interested in this too. --Renaud -----Original Message----- From: Sudarsan, Sithu D. [mailto:sithu.sudar...@fda.hhs.gov] Sent: Thursday, February 26, 2009 8:29 AM To: solr-u...@lucene.apache.org; java-user@lucene.apache.org Subject: Use of scanned documents for text extraction and indexing Hi All: [...]

RE: Use of scanned documents for text extraction and indexing

2009-02-26 Thread Renaud Waldura
[...]apache.org Subject: Use of scanned documents for text extraction and indexing Hi All: Is there any study or research on taking scanned paper documents as images (possibly PDF), extracting the text with OCR or another technique, and measuring the quality of the resulting index? Thanks in advance [...]

Use of scanned documents for text extraction and indexing

2009-02-26 Thread Sudarsan, Sithu D.
Hi All: Is there any study or research on taking scanned paper documents as images (possibly PDF), extracting the text with OCR or another technique, and measuring the quality of the resulting index? Thanks in advance, Sithu D Sudarsan sithu.sudar...@fda.hhs.gov sdsudar...@ualr.edu