On Fri, 9 Jul 2004, martin f krafft wrote:

> also sprach Andrew Perrin <[EMAIL PROTECTED]> [2004.07.09.1752 +0200]:
> > I use an Epson SU1640 Office, which includes a document feeder and
> > can be connected via either USB or SCSI. It works fine under
> > debian, using the SANE backends, although the one put out by epson
> > (the "epkowa" driver) works better than the epson one included
> > with SANE. I wrote a simple script to turn documents into PDF's;
> > it's not exactly perfect, but it does the job:
> > http://www.unc.edu/home/aperrin/tips/src/pdfscan-pl.txt
>
> Since I assume PNM files to be graphic files, `convert`ing them and
> `ps2pdf`ing them will result in PDFs storing image data. These PDFs
> are not going to be searchable. Can you confirm this?
Correct: the PDFs contain only image data, so if you want searchable
text you need an OCR step. I've used gocr with moderate success, but it's
by no means perfect. Others have recommended clara, which is probably
better but requires too much user involvement for my taste!

ap

----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
[EMAIL PROTECTED] * andrew_perrin (at) unc.edu
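
For anyone who wants to try the steps discussed above on a single page, here is a rough sketch in Python. It is not the pdfscan script linked earlier, and I have not checked it against that script; the function names (`build_commands`, `process_page`) and the file naming scheme are made up for illustration. It assumes `convert` (ImageMagick), `ps2pdf` (Ghostscript) and `gocr` are installed and that the page has already been scanned to a PNM file. Note that the OCR output lands in a separate text file next to the PDF; the PDF itself still holds only image data.

```python
from pathlib import Path
import shutil
import subprocess


def build_commands(pnm_path):
    """Return the external commands for one scanned page, in run order.

    Hypothetical helper: output names are derived from the PNM file name.
    """
    pnm = Path(pnm_path)
    ps = pnm.with_suffix(".ps")
    pdf = pnm.with_suffix(".pdf")
    txt = pnm.with_suffix(".txt")
    return [
        # PNM image -> PostScript (ImageMagick)
        ["convert", str(pnm), str(ps)],
        # PostScript -> PDF (Ghostscript wrapper); still image data only
        ["ps2pdf", str(ps), str(pdf)],
        # OCR the original image into a plain-text sidecar file
        ["gocr", "-i", str(pnm), "-o", str(txt)],
    ]


def process_page(pnm_path, dry_run=True):
    """Print the commands (dry run) or execute them one after another."""
    commands = build_commands(pnm_path)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
            continue
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} is not installed")
        # check=True stops the pipeline if any step fails
        subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default, so nothing is executed or overwritten.
    process_page("page001.pnm")
```

With `dry_run=True` the commands are only printed, which makes it easy to check them before running anything. The resulting text file can then be searched with grep or similar, with the usual caveat that gocr's output will need proofreading.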