"Daniel Fetchinson" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]

| The various free tools differ by their chosen optimization paths and
| their degree of specialization. My preference would be,
|
| 1. Doesn't really matter how long it takes to compute the N numbers per image
Your problem here is that there is really no such thing as 'general
features' and, correspondingly, no such thing as 'general similarity of
features'. The features extracted have to have a specific definition; they
represent a severe lossy compression of the original, and what to keep
depends on the application.

Example: classify each pixel as white, black, red, green, or blue. Will
that match your intuitive idea of what matches? To be a bit more
sophisticated, use more color bins and do the binning separately for
multiple areas, such as top, left, center, right, and bottom (or center,
upper right, upper left, lower right, and lower left). I suspect Google
does something like this to match, for instance, pictures with skin tones
in the center, or pictures with blue tops (sky?) and green bottoms
(vegetation?).

| 2. Lookups should be fast, consequently N should not be too large (I guess)
| 3. It should be a generic algorithm working on generic images (everyday photos)

Given feature vectors, there are various ways to calculate a distance or
similarity coefficient. There have been great debates over which is 'best'.

| 4. PIL should be enough for the implementation
|
| So if anyone knows of a good resource that is close to being pseudo
| code I would be very grateful!

If you do not have sufficient insight into your own idea of 'matches', try
something on a test set of perhaps 20 photos, calculate a 'match matrix',
and compare that to your intuition.

Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list
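To make the suggestion above concrete, here is one possible sketch (not
the only way to do it): per-region coarse color binning as the feature
vector, histogram intersection as the similarity coefficient, and a
pairwise 'match matrix' over a test set. The region layout (four quadrants
plus a center crop), the bin count, and the function names are all my own
illustrative choices, and it assumes PIL/Pillow is installed.

```python
from PIL import Image


def region_colour_bins(img, bins=3):
    """Feature vector: a coarse colour histogram for each of five regions
    (upper left, upper right, lower left, lower right, centre).

    Each region contributes bins**3 normalized counts, so the whole
    vector has 5 * bins**3 numbers -- the 'N numbers per image'.
    """
    img = img.convert("RGB").resize((64, 64))
    w, h = img.size
    hw, hh = w // 2, h // 2
    regions = [
        img.crop((0, 0, hw, hh)),                 # upper left
        img.crop((hw, 0, w, hh)),                 # upper right
        img.crop((0, hh, hw, h)),                 # lower left
        img.crop((hw, hh, w, h)),                 # lower right
        img.crop((w // 4, h // 4, 3 * w // 4, 3 * h // 4)),  # centre
    ]
    vector = []
    for region in regions:
        counts = [0] * (bins ** 3)
        pixels = list(region.getdata())
        for r, g, b in pixels:
            # Quantize each channel into `bins` levels, then flatten
            # the (r, g, b) bin triple into a single index.
            idx = ((r * bins // 256) * bins + g * bins // 256) * bins \
                  + b * bins // 256
            counts[idx] += 1
        total = float(len(pixels))
        vector.extend(c / total for c in counts)  # normalize per region
    return vector


def similarity(f1, f2):
    """Histogram intersection, averaged over the five regions.

    1.0 means the two feature vectors are identical; 0.0 means the
    colour distributions do not overlap at all.
    """
    return sum(min(a, b) for a, b in zip(f1, f2)) / 5.0


def match_matrix(images):
    """Pairwise similarities for a test set of images -- compare the
    numbers against your own intuition of which photos 'match'."""
    feats = [region_colour_bins(im) for im in images]
    return [[similarity(a, b) for b in feats] for a in feats]
```

Usage on a small test set would be something like
`matrix = match_matrix([Image.open(p) for p in paths])`; if the high
entries in the matrix do not correspond to the pairs you consider
similar, change the binning or the region layout, not the distance.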