Steve Holden wrote:
> Larry Bates wrote:
>> I have a project that I wanted to solicit some advice
>> on from this group. I have millions of pages of scanned
>> documents with each page in an individual .JPG file.
>> When the documents were scanned the people that did
>> the scanning put a colored (hot pink) separator page
>> between the individual documents. I was wondering if
>> there was any way to utilize PIL to scan through the
>> individual files, look at some small section on the
>> page, and determine if it is a separator page by
>> somehow comparing the color to the separator page
>> color? I realize that this would be some sort of
>> percentage match where 100% would be a perfect match
>> and any number lower would indicate that it was less
>> likely that it was a coverpage.
>>
>> Thanks in advance for any thoughts or advice.
>>
> I suspect the easiest way would be to select a few small patches of each
> image and average the color values of the pixels, then normalize to hue
> rather than RGB.
>
> Close enough to the hue you want (and you could include saturation and
> intensity too, if you felt like it) across several areas of the page
> would be a hit for a separator.
>
> regards
>  Steve

Steve,

I'm completely lost on how to proceed. I don't know how to average
color values, normalize to hue... Any guidance you could give would be
greatly appreciated.

Thanks in advance,
Larry
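
For anyone following the thread, here is a rough sketch of the approach Steve describes: crop a few small patches, average each one, convert the average from RGB to hue/saturation, and count how many patches land near the separator color. It is only an illustration, not a tested solution for these particular scans. The target hue (about 330 degrees, based on the nominal "hot pink" RGB value 255, 105, 180), the tolerances, the patch positions, and the function names are all assumptions; the real values should be measured from an actual scanned separator page.

```python
import colorsys

# Assumed values: nominal "hot pink" (255, 105, 180) has a hue of about
# 330 degrees. Measure a real scanned separator page and adjust these.
TARGET_HUE = 330 / 360.0   # colorsys expresses hue on a 0.0-1.0 scale
HUE_TOLERANCE = 0.06       # how far from the target hue still counts
MIN_SATURATION = 0.35      # rejects white paper and grey text


def hue_distance(h1, h2):
    """Distance between two hues on the color wheel (wraps around at 1.0)."""
    d = abs(h1 - h2) % 1.0
    return min(d, 1.0 - d)


def patch_matches(rgb, target_hue=TARGET_HUE,
                  tolerance=HUE_TOLERANCE, min_saturation=MIN_SATURATION):
    """True if an averaged (r, g, b) color, channels 0-255, looks like the target."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, _v = colorsys.rgb_to_hsv(r, g, b)
    return s >= min_saturation and hue_distance(h, target_hue) <= tolerance


def separator_score(path, patch_size=40):
    """Fraction (0.0-1.0) of sampled patches whose average color matches."""
    # Imported here so the color helpers above work without PIL installed.
    from PIL import Image, ImageStat

    half = patch_size // 2
    with Image.open(path) as im:
        im = im.convert("RGB")
        w, h = im.size
        # Hypothetical sample points: four quarter positions plus the center.
        centers = [(w // 4, h // 4), (3 * w // 4, h // 4), (w // 2, h // 2),
                   (w // 4, 3 * h // 4), (3 * w // 4, 3 * h // 4)]
        hits = 0
        for cx, cy in centers:
            box = (max(cx - half, 0), max(cy - half, 0),
                   min(cx + half, w), min(cy + half, h))
            # ImageStat.Stat(...).mean is the per-band average of the patch.
            mean_rgb = ImageStat.Stat(im.crop(box)).mean
            if patch_matches(mean_rgb):
                hits += 1
    return hits / len(centers)
```

The score plays the role of the "percentage match" in the original question: 1.0 means every sampled patch looked pink, 0.0 means none did. A cutoff such as `separator_score("page0001.jpg") >= 0.8` (file name and cutoff both made up here) would need tuning against a batch of known separator and non-separator pages.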