Larry Bates wrote:
> I have a project that I wanted to solicit some advice
> on from this group. I have millions of pages of scanned
> documents with each page in an individual .JPG file.
> When the documents were scanned the people that did
> the scanning put a colored (hot pink) separator page
> between the individual documents. I was wondering if
> there was any way to utilize PIL to scan through the
> individual files, look at some small section on the
> page, and determine if it is a separator page by
> somehow comparing the color to the separator page
> color? I realize that this would be some sort of
> percentage match where 100% would be a perfect match
> and any number lower would indicate that it was less
> likely that it was a coverpage.
>
> Thanks in advance for any thoughts or advice.
>
I suspect the easiest way would be to select a few small patches of
each image, average the color values of the pixels in each patch, and
then convert each average to hue rather than comparing raw RGB.
A hue close enough to the one you want (and you could include
saturation and intensity too, if you felt like it) across several
areas of the page would be a hit for a separator.

regards
 Steve
-- 
Steve Holden
Holden Web LLC/Ltd    http://www.holdenweb.com
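
P.S. A rough sketch of the patch-and-hue idea, in case it helps. It is
untested against real scans and rests on several assumptions of mine:
the reference color SEPARATOR_RGB is just the nominal "hot pink" value
and should be replaced with one measured from an actual separator scan,
the tolerances hue_tol and min_sat are guesses, and the five patch
positions are arbitrary. All function names are made up for
illustration. It uses the standard colorsys module for the hue
conversion and PIL's ImageStat to average each patch.

```python
import colorsys

# ASSUMPTION: nominal "hot pink"; measure a real separator scan instead.
SEPARATOR_RGB = (255, 105, 180)


def hue_sat(rgb):
    """Convert a 0-255 RGB triple to (hue, saturation), each in 0..1."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, _v = colorsys.rgb_to_hsv(r, g, b)
    return h, s


def hue_distance(h1, h2):
    """Distance between two hues; hue is circular, so 0.98 is near 0.02."""
    d = abs(h1 - h2)
    return min(d, 1.0 - d)


def patch_matches(mean_rgb, target_rgb=SEPARATOR_RGB,
                  hue_tol=0.06, min_sat=0.3):
    """True if a patch's average color is close in hue to the target.

    hue_tol and min_sat are guessed tolerances, not tuned values. The
    saturation floor rejects white or grey paper, whose hue is
    meaningless.
    """
    h, s = hue_sat(mean_rgb)
    target_h, _ = hue_sat(target_rgb)
    return s >= min_sat and hue_distance(h, target_h) <= hue_tol


def separator_score(patch_means, **kwargs):
    """Fraction (0.0 to 1.0) of patches whose average color matches."""
    patch_means = list(patch_means)
    hits = sum(patch_matches(m, **kwargs) for m in patch_means)
    return hits / len(patch_means)


def sample_patch_means(path, size=40):
    """Average RGB of five small square patches of the image at `path`."""
    # Imported here so the color helpers above work without PIL installed.
    from PIL import Image, ImageStat

    with Image.open(path) as im:
        im = im.convert("RGB")
        w, h = im.size
        # ASSUMPTION: four quarter points plus the center are representative.
        centers = [(w // 4, h // 4), (3 * w // 4, h // 4), (w // 2, h // 2),
                   (w // 4, 3 * h // 4), (3 * w // 4, 3 * h // 4)]
        half = size // 2
        means = []
        for x, y in centers:
            patch = im.crop((x - half, y - half, x + half, y + half))
            means.append(tuple(ImageStat.Stat(patch).mean))
        return means


def is_separator(path, threshold=0.8):
    """Call a page a separator if most sampled patches look pink."""
    return separator_score(sample_patch_means(path)) >= threshold
```

Something like is_separator("page0001.jpg") (a hypothetical filename)
would then give a yes/no per page, and separator_score gives the
"percentage match" you asked about as a fraction of matching patches.
Since only a few small patches are averaged, this should stay cheap
even over millions of files, though decoding each JPEG will probably
dominate the run time.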