On Jul 13, 6:22 pm, Scott David Daniels <scott.dani...@acm.org> wrote:
> DrLeif wrote:
> > I have about 6000 PDF files which have been produced using a scanner,
> > with more being produced each day. The PDF files contain old paper
> > records which have been taking up space. The scanner is set to
> > detect when there is information on the back side of the page (duplex
> > scan). The problem, of course, is that it's not always reliable, and we
> > wind up with a number of PDF files containing blank pages.
> >
> > What I would like to do is have Python detect "blank" pages in a PDF
> > file and remove them. Any suggestions?
>
> I'd check into ReportLab's commercial product; it may well be easily
> capable of that. If no success, you might contact PJ at Groklaw; she
> has dealt with a _lot_ of PDFs (and knows people who deal with PDFs
> in bulk).
>
> --Scott David Daniels
> scott.dani...@acm.org

Thanks, everyone, for the quick replies. I had considered using ReportLab
but was uncertain about its ability to detect a blank page. Scott, I'll
drop an email to ReportLab and PJ.

Thanks again,
DrLeif
--
http://mail.python.org/mailman/listinfo/python-list
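
For what it's worth, here is a rough sketch of the "is this page blank?" test itself. It is not ReportLab's method, just one plausible approach. Since the files are scans, each page is essentially an image, so the sketch assumes every page has already been rendered to a flat sequence of 8-bit grayscale values (0 = black, 255 = white) by some separate tool. That rendering step, and the actual removal of pages from the PDF, are not shown. The function names and both threshold values are made up for illustration and would need tuning against real scans, which usually carry some speckle and bleed-through.

```python
def dark_fraction(pixels, dark_threshold=200):
    """Return the fraction of pixels darker than dark_threshold.

    pixels is a flat sequence of 8-bit grayscale values (0-255).
    """
    total = len(pixels)
    if total == 0:
        return 0.0
    dark = sum(1 for value in pixels if value < dark_threshold)
    return dark / total


def is_blank(pixels, dark_threshold=200, max_dark_fraction=0.005):
    """Treat a page as blank if almost none of its pixels are dark.

    Both thresholds are illustrative guesses, not measured values.
    """
    return dark_fraction(pixels, dark_threshold) <= max_dark_fraction


def pages_to_keep(rendered_pages, **thresholds):
    """Return the indices of the pages that do not look blank."""
    return [
        index
        for index, pixels in enumerate(rendered_pages)
        if not is_blank(pixels, **thresholds)
    ]


# Synthetic stand-ins for rendered pages (10,000 pixels each).
speckled_blank = bytearray([255] * 10000)
speckled_blank[0:10] = bytes(10)      # 0.1% dark: scanner noise

written_page = bytearray([255] * 10000)
written_page[0:500] = bytes(500)      # 5% dark: actual content

keep = pages_to_keep([written_page, speckled_blank, written_page])
```

The indices in `keep` could then be handed to whatever PDF library ends up writing the cleaned file. Using a fraction rather than a test for "all white" is deliberate: a scanned blank page is almost never perfectly white.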