On Friday 19 December 2008 11:34:09 gobo wrote:
> for some time now i've been using homemade scripts with scanimage and
> scanadf to scan my paper documents. most of my documents are plain
> text. the results have always been poor and marginally acceptable. i'm
> using suse 10.3 and an hp aio j6450 or psc1210xi.
>
> recently i obtained a canon scanner w/adf for use at work where i must
> use windows. to get around the image compatibility issues of microsoft
> document imaging (office 2003) i simply print the scanned image to pdf
> with acrobat. the results obtained with mdi are far superior to
> anything i've ever been able to achieve with sane apps.
>
> i've spent hours fumbling around with scanimage options, imagemagick
> convert to resize the images and ps2pdf to produce the pdf files.
> while i have made some slight improvements over the default settings,
> i've never been able to get even close to the mdi output. in the few
> places where i must have a good scan, i use resolutions of 150 or 300,
> but to get prints of the image becomes a real pain. i must load the
> image in gimp, fiddle around resizing it and then printing.
>
>
> my standard scanimage script would contain:
> scanimage -x 215.9 -y 297 -d
> hpaio:/net/Officejet_J6400_series?ip=192.168.1.103 \
> -pv --mode gray > $FILE
>
>
> pieces from a perl script using the adf:
> # this is the scan device
> @scanr = ("hpaio:/net/Officejet_J6400_series?ip=192.168.1.103");
> # these are the command line options for scanadf
> @opts = ("-x 215.9 -y 297 -v --mode=gray --source ADF --batch-scan=no -e
> 1");
>
> # scan page
> system("scanadf @opts -d @scanr -o $fnamepg");
>
> adding --resolution=150, or 300 does produce a larger image, with less
> artifacting, and much more readable, but difficult to print.
>
> the answer must be one of two things -- either i'm missing something
> real simple about producing hi-res 8.5x11" images (that is right in
> front of my nose) or we are just not there yet with linux scanning.
>
> can someone correct, or put me on a better path?
>
> thanks.

I use two bash scripts for document scanning, bscan and scans2pdf, located at
http://www.acjlaw.net:8080/~jeremy/Ricoh/scripts/
The scripts are based on simpler versions I found on the net (I forget where).

The bscan (batch scan) script acquires pnm images from the scanner using
scanimage and then processes those images into a multipage pdf using pnmtools.

The scans2pdf script takes sequential pnm images from xsane (e.g.
file.%04d.pnm) and converts them into a multipage pdf. The processing logic in
scans2pdf is exactly the same as in bscan. I never got around to replacing the
processing logic in bscan with a call to scans2pdf (it's mainly just a matter
of repackaging arguments to bscan to work with scans2pdf -- e.g. the option
"-gray nshades" both enables grayscale scanning and sets the number of gray
shades to keep in the final processed pdf).

To facilitate one-key scanning it's convenient to define some aliases:

    alias B='bscan -gray 2'
    alias BL='bscan -gray 2 -page Legal'
    alias CL='bscan -color 32 -page Legal'
    alias b='bscan -s 0'
    alias bl='bscan -s 0 -page Legal'
    alias c='bscan -color 32'

Thus, to scan a letter-sized document in grayscale and then convert it to
black+white using adaptive/dynamic thresholding/binarization, I simply use the
command "B -bw filename", which creates filename.pdf.

To scan legal sheets in lineart mode: "bl filename", or in color:
"CL filename".

I have here a 13-page legal-sized document which was scanned in grayscale and
converted to b/w. It is 749K, or about 57K/pg, which is reasonable. I could
have scanned in b/w, but it would not have saved all that much space.

The bscan program accepts many options for changing the default behavior:

SCANNER OPTIONS:
    -d "device name"          e.g. HS2P or SP15C
    -source ADF= Y | N
    -page legal | letter
    -color number_of_colors   (enables color scanning & sets max # colors)
    -gray number_of_gray_shades
    -res resolution
    -duplex                   (enables duplex)
    -s                        (user settings defaults)

PROCESSING OPTIONS:
    -bw                       (convert to black+white using adaptive thresholding)
    -dither                   (e.g. atkinson, see pamdither)
    -color                    (remap colorspace to number_of_colors)
    -gray                     (downsample to nshades of gray)
    -flip r180                (rotates 180 degrees)

OUTPUT OPTION:
    -pnm                      (don't convert to pdf)

Documents which don't have enough contrast to remain readable after conversion
to b/w are simply scanned in gray or color mode: "B filename" or "c filename".
The large filename.pdf can then be reduced in size by conversion to djvu:

    pdf2djvu filename.pdf -o filename.djv
    djview4 filename.djv -> print to ps
    ps2pdf14 filename.ps filename.pdf    (now much smaller, ~1/50 original size)

Some documents may need user interaction to set cropping,
brightness/contrast/gamma, etc. using xsane. The scans (file.0001.pnm,
file.0002.pnm, ...) can then be converted to pdf with "scans2pdf -bw file",
which converts all the file*.pnm images into a single multipage file.pdf
containing b/w images.

It should be straightforward to modify this script to recognize your scanners'
options and device names.

There is also a promising GUI program, gscan2pdf, on sourceforge:
http://gscan2pdf.sourceforge.net/
There was a bug in the program which would not let me change my SP15C
scanner's options. I submitted a bug report to the author, but he hadn't been
able to fix or work around the problem. The program may work for you, though.
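
For anyone who wants to see the shape of the acquire-then-convert flow before
reading the scripts, here is a minimal sketch. It is not the actual bscan
code: the function names scan_pages and pages_to_pdf are made up for
illustration, the --source and --mode values are backend-specific guesses, and
a fixed 50% threshold stands in for the adaptive thresholding described above.
It assumes grayscale scans on letter-sized paper.

```shell
#!/bin/bash
# Illustrative sketch only; scan_pages and pages_to_pdf are hypothetical
# names and are not taken from the bscan or scans2pdf scripts.

# Acquire every page in the feeder as numbered basename.NNNN.pnm files.
#   $1 = output basename, $2 = resolution in dpi, $3 = scan mode
# The --source and --mode values depend on the backend; add -d DEVICE if
# more than one scanner is attached.
scan_pages() {
    scanimage "--batch=$1.%04d.pnm" --format=pnm \
        "--resolution=$2" "--mode=$3" --source=ADF
}

# Threshold each grayscale page to black and white, then bundle all pages
# into one PDF.
#   $1 = basename, $2 = resolution the pages were scanned at
# Passing the scan resolution to pnmtops together with -equalpixels maps
# one image pixel to one device pixel, so each page keeps its physical size.
pages_to_pdf() {
    for page in "$1".[0-9][0-9][0-9][0-9].pnm; do
        # Fixed threshold (assumption); pamtopnm turns the PAM output into PBM.
        pamthreshold -simple -threshold=0.5 "$page" | pamtopnm
    done | pnmtops -dpi="$2" -equalpixels -noturn | ps2pdf - "$1.pdf"
}
```

Usage would be along the lines of "scan_pages doc 300 Gray" followed by
"pages_to_pdf doc 300", giving doc.pdf. Keeping the same dpi value in both
steps is what lets a 150 or 300 dpi scan print at its original size without a
detour through gimp.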