beastmaster...@hotmail.com (jeffrey black) writes:

> I am buried up to my ears in receipts and would like to go paperless
> budget would have to be a maximum of about $400 USD. I need to scan
> everything from 2 inch wide thermal receipts up to full size
> 8 1/2 X 11 inch receipts.

I use a Fujitsu ScanSnap. It retails for about US $400. Double-sided, ADF.

Receipts are often printed on tissue-thin paper and I have to use care to feed them into the scanner. The scanner can handle very long receipts, but the operator has to hold the button for a long time (15 sec?) to tell the scanner not to abort after 20 inches or so.

I wish I had a fully unix/linux workflow, but alas, I have not been able to assemble a reliable auto-crop, auto-rotate, OCR pipeline that functions as well as the Windows drivers/applications that ship with the scanner.

My current setup has W2K (!) running in a VirtualBox VM with the scanner USB forwarded from the unix host into the W2K VM. I installed the Fujitsu-provided driver and OCR software in the VM client. The VM is configured to make a host directory visible to the VM client as a Windows "share". In the VM, ScanSnap and the bundled OCR software are configured to save output to the shared folder, i.e., files end up on the unix host. The bundled OCR software outputs PDF.

The Windows driver maintains a continual conversation with the scanner even when it is not scanning. Excessive latency over USB seems to interrupt this conversation, resulting in the driver declaring the scanner "gone" forevermore until the VM client is rebooted. Setting the VM to "realtime" priority on the host (FreeBSD) seems to have ameliorated the problem.

As for integration with gnucash, I have a clunky homegrown PDF browser tool that more or less runs pdfgrep on the OCR'ed files to guess at transaction fields and then presents them to me for editing alongside a rendering of the image. I edit the description, category, etc. and indicate "done", and the tool generates a QIF transaction record and moves the PDF file to a directory/filename based on the transaction info.

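To illustrate the field-guessing and QIF steps described above, here is a minimal Python sketch. It is not the tool described in this message, whose code is not shown. It assumes the OCR text has already been extracted from the PDF into a string, and the regular expressions, function names, payee and category are all hypothetical and would need tuning against real receipts. The record layout (D, T, P, M, L fields terminated by "^") follows the usual QIF conventions.

```python
import re
from datetime import date
from decimal import Decimal

# Hypothetical patterns: receipt layouts vary widely, so treat these as guesses.
# Matches "TOTAL 12.34" or "Total: $1,234.56" but not "SUBTOTAL".
TOTAL_RE = re.compile(r"(?i)\btotal\b[^\d\n-]*(\d[\d,]*\.\d{2})")
# Matches a masked card number such as "************1234" or "XXXX XXXX 1234".
CARD_RE = re.compile(r"(?:[*xX#]\s*){2,}(\d{4})\b")


def guess_fields(ocr_text):
    """Guess the amount and card suffix from OCR text (None when not found)."""
    totals = TOTAL_RE.findall(ocr_text)
    card = CARD_RE.search(ocr_text)
    return {
        # Use the last "total" line, which is usually the grand total.
        "amount": Decimal(totals[-1].replace(",", "")) if totals else None,
        "card_last4": card.group(1) if card else None,
    }


def qif_record(when, amount, payee, category, memo=""):
    """Format one expense as a QIF transaction record (amount is made negative)."""
    lines = [f"D{when:%m/%d/%Y}", f"T{-amount:.2f}", f"P{payee}"]
    if memo:
        lines.append(f"M{memo}")
    lines += [f"L{category}", "^"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = (
        "ACME HARDWARE\n"
        "SUBTOTAL   11.50\n"
        "TAX         0.84\n"
        "TOTAL      12.34\n"
        "VISA ************1234\n"
    )
    fields = guess_fields(sample)
    # A QIF file for a credit card account starts with this header line.
    print("!Type:CCard")
    print(qif_record(date(2024, 1, 15), fields["amount"],
                     "Acme Hardware", "Expenses:Supplies"), end="")
```

The guesses are only a starting point for manual review: OCR errors in digits are common, so the amount and card suffix should be confirmed against the image before the record is written.
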
After I have done a bunch of receipts this way, I then import the QIF file into gnucash. It is somewhat cumbersome, but being able to process the text from OCR of the receipts enables some automation of the translation into canonical transactions. Some cleverness is required to match up credit card numbers on receipts (which might have only the last 4 digits) or to parse the transaction amount, and so far it needs a lot of manual oversight.

Wouldn't it be nice if there were a standard barcode on receipts that encoded the relevant information? Unfortunately, we are probably too far along into digital/online transactions for any innovation in printed receipts.

-- 
G. Paul Ziemba
FreeBSD unix:
9:21PM up 85 days, 8:09, 21 users, load averages: 0.21, 0.32, 0.37
_______________________________________________
gnucash-user mailing list
gnucash-user@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-user
-----
Please remember to CC this list on all your replies.
You can do this by using Reply-To-List or Reply-All.