Larry Bates wrote:
> Steve Holden wrote:
>> Larry Bates wrote:
>>> I have a project that I wanted to solicit some advice
>>> on from this group. I have millions of pages of scanned
>>> documents with each page in an individual .JPG file.
>>> When the documents were scanned the people that did
>>> the scanning put a colored (hot pink) separator page
>>> between the individual documents. I was wondering if
>>> there was any way to utilize PIL to scan through the
>>> individual files, look at some small section on the
>>> page, and determine if it is a separator page by
>>> somehow comparing the color to the separator page
>>> color? I realize that this would be some sort of
>>> percentage match where 100% would be a perfect match
>>> and any number lower would indicate that it was less
>>> likely that it was a coverpage.
>>>
>>> Thanks in advance for any thoughts or advice.
>>>
>> I suspect the easiest way would be to select a few small patches of each
>> image and average the color values of the pixels, then normalize to hue
>> rather than RGB.
>>
>> Close enough to the hue you want (and you could include saturation and
>> intensity too, if you felt like it) across several areas of the page
>> would be a hit for a separator.
>>
>> regards
>> Steve
>
> Steve,
>
> I'm completely lost on how to proceed. I don't know how to average color
> values, normalize to hue... Any guidance you could give would be greatly
> appreciated.
>
> Thanks in advance,
> Larry

I'd like to help but I don't have any sample code to hand. Maybe someone
who does could give you more of a clue. Let's hope so, anyway ...

regards
 Steve
-- 
Steve Holden
Holden Web LLC/Ltd http://www.holdenweb.com
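
For anyone picking this thread up later, here is a minimal sketch of the approach described in the quoted text: sample a few small patches, average the color in each, convert the average to hue and saturation, and report the fraction of patches that land near the separator color. It is only an illustration under stated assumptions, not tested against real scans. The target RGB value (a generic "hot pink"), the tolerances, the patch size, the sample positions, and all the function names are guesses made up for the example; measure an actual separator page and tune them. It relies on `ImageStat.Stat(...).mean` from PIL (Pillow) for the averaging and on the standard library's `colorsys` for the hue conversion.

```python
import colorsys

# ASSUMPTION: a generic "hot pink". Replace with the mean color measured
# from one of your own scanned separator pages.
TARGET_RGB = (255, 105, 180)

# ASSUMPTION: patch centres as fractions of page width and height, kept
# away from the edges where scanner borders and shadows tend to appear.
SAMPLE_POINTS = [(0.25, 0.25), (0.75, 0.25), (0.5, 0.5),
                 (0.25, 0.75), (0.75, 0.75)]


def to_hsv(rgb):
    """Convert a 0-255 RGB triple to (hue, saturation, value) in 0.0-1.0."""
    r, g, b = rgb
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)


def hue_distance(h1, h2):
    """Distance between two hues on the color wheel, which wraps at 1.0."""
    d = abs(h1 - h2)
    return min(d, 1.0 - d)


def color_matches(mean_rgb, target_rgb=TARGET_RGB, hue_tol=0.06, min_sat=0.3):
    """True if an averaged color is close in hue to the target.

    The saturation floor rejects white or grey paper, whose hue is
    meaningless. Both thresholds are illustrative guesses.
    """
    hue, sat, _value = to_hsv(mean_rgb)
    target_hue, _, _ = to_hsv(target_rgb)
    return sat >= min_sat and hue_distance(hue, target_hue) <= hue_tol


def separator_score(path, patch=50):
    """Return the fraction (0.0-1.0) of sampled patches matching the target."""
    # Imported here so the pure-Python helpers above work without PIL.
    from PIL import Image, ImageStat

    hits = 0
    with Image.open(path) as img:
        rgb = img.convert("RGB")
        width, height = rgb.size
        for fx, fy in SAMPLE_POINTS:
            # Clamp the patch so it stays inside the page.
            left = max(0, min(width - patch, int(fx * width) - patch // 2))
            top = max(0, min(height - patch, int(fy * height) - patch // 2))
            box = (left, top, min(width, left + patch), min(height, top + patch))
            # ImageStat gives the per-band mean, i.e. the average color.
            mean_rgb = ImageStat.Stat(rgb.crop(box)).mean
            if color_matches(mean_rgb):
                hits += 1
    return hits / len(SAMPLE_POINTS)
```

Calling `separator_score("page0001.jpg")` (a hypothetical file name) gives the "percentage match" asked about in the original question: 1.0 means every sampled patch looked like the separator color, 0.0 means none did. Where to put the cutoff for calling a page a separator is something to settle by running it over a batch of known pages.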