The Perl application gscan2pdf will probably do what you need: http://gscan2pdf.sourceforge.net/
I use a shell script, "bscan", to scan to PNM and then convert to other formats such as PDF. Since my scanner produces better results in 8-bit grayscale than in 2-bit B&W, I scan in 8-bit grayscale at 300 dpi and then convert to bitonal black and white using djvu wavelet compression (option -BW in my script):

    bscan --mode=8-bit --shades=2 --page=Legal --comp=lzw -BW FILE

Sometimes I may need to use a photo scanner with high optical resolution (e.g. an Epson with 24-bit grayscale). If I need to scan in color, I usually scan to PNM and then convert to djvu using c44, e.g.:

    bscan --mode=color --shades=truecolor --page=Letter -c44 --djvutopdf=25 FILE

http://www.acjlaw.net:8080/~jeremy/Ricoh/usage_bscan.html

I haven't had much luck with any of the open source OCR programs: maybe 90% accuracy at best, and only on straight B/W text with no logos, rules, or underlines, with all text horizontal and of the same font weight and shape.
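
If you don't have bscan, the scan-to-PNM-then-DjVu workflow described above can be roughly approximated with scanimage (SANE), ImageMagick's convert, and the DjVuLibre tools cjb2, c44 and ddjvu. This is only a sketch and not what bscan does internally: the function names, the DRY_RUN switch and the 50% threshold are assumptions made for illustration, and the scanimage mode names and the --output-file option depend on your backend and SANE version.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not a reimplementation of bscan.
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Grayscale scan at 300 dpi, thresholded to bitonal, then encoded with cjb2.
scan_bw() {
    base=$1
    # Mode names such as "Gray" and "Color" vary between SANE backends.
    run scanimage --mode Gray --resolution 300 --format=pnm --output-file="$base.pgm"
    # The 50% threshold is an arbitrary starting point; tune it per document.
    run convert "$base.pgm" -threshold 50% "$base.pbm"
    run cjb2 -dpi 300 "$base.pbm" "$base.djvu"
    run ddjvu -format=pdf "$base.djvu" "$base.pdf"
}

# Color scan at 300 dpi, encoded with the c44 wavelet encoder.
scan_color() {
    base=$1
    run scanimage --mode Color --resolution 300 --format=pnm --output-file="$base.ppm"
    run c44 -dpi 300 "$base.ppm" "$base.djvu"
    run ddjvu -format=pdf "$base.djvu" "$base.pdf"
}
```

Calling `scan_color page` with DRY_RUN=1 just prints the three commands so you can check them first; set DRY_RUN=0 to actually run them. Page size is left at the scanner default here; most backends accept -x and -y (in mm) to set the scan area.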