Thomas W wrote: >How can I use python to find images that looks quite similar? Thought >I'd scale the images down to 32x32 and convert it to use a standard >palette of 256 colors then compare the result pixel for pixel etc, but >it seems as if this would take a very long time to do when processing >lots of images. > >Any hint/clue on this subject would be appreciated. > > This really depends on what is meant by "quite similar".
If you mean "to the human eye, the two pictures are identical", as in the case of a tool to get rid of trivially-different duplications, then you can use the technique you propose. I don't imagine that you can save any time over that process. You'd use something like PIL to do the comparisons, of course -- I suspect you want to do something like: 1) resize both 2) quantize the colors 3) subtract the two images 4) resize to 1x1 5) threshhold the result (i.e. we've used PIL to sum the differences) strictly speaking, it might be more mathematically ideal to take the sum of the difference of the squares of the pixels (i.e. compute chi-square). This of course, avoids the painfully slow process of comparing pixel-by-pixel in a Python loop, which would, of course be painfully slow. This is conceptually equivalent to using an "epsilon" to test "equality" of floating point numbers. The more general case of matching images with similar content (but which would be recognizeably different to the human eye), is a much more challenging cutting-edge AI problem, as has already been mentioned -- but I was going to mention imgSeek myself (I see someone's already given you the link). -- http://mail.python.org/mailman/listinfo/python-list