> | The various free tools differ by their chosen optimization paths and
> | their degree of specialization. My preference would be,
> |
> | 1. Doesn't really matter how long it takes to compute the N numbers per
> |    image
>
> Your problem here is that there is really no such thing as 'general
> features' and correspondingly, no such thing as 'general similarity of
> features'.

Yes there are! :) Image manipulation experts have defined dozens of ways of
characterizing what 'similarity' means for images, and all I was asking is
whether anyone here knew of a simple one.

> The features extracted have to have a specific definition. The
> features represent a severe lossy compression of the original. What to
> keep depends on the application.

Yes, and if you know *any* simple but useful (yes, useful, in *any* sense)
definition, I'd be happy to hear it.

> Example: classify each pixel as white, black, red, green, or blue. Will
> that match your intuitive idea of what matches?

Probably not, but thanks for the idea.

> To be a bit more sophisticated, use more color bins and do the binning
> separately for multiple areas, such as top, left, center, right, and bottom
> (or center, upper right, upper left, lower right, and lower left). I
> suspect Google does something like this to match, for instance, pictures
> with skin tones in the center, or pictures with blue tops (sky?) and green
> bottoms (vegetation?).

Now this sounds like a simple and good idea. I'll try this and see how far
I get.

> | 2. Lookups should be fast, consequently N should not be too large (I
> |    guess)
> | 3. It should be a generic algorithm working on generic images (everyday
> |    photos)
>
> Given feature vectors, there are various ways to calculate a distance or
> similarity coefficient. There have been great debates on what is 'best'.

True. As I've said, *any* concrete and useful example would make me happy.

> | 4. PIL should be enough for the implementation
> |
> | So if anyone knows of a good resource that is close to being pseudo
> | code I would be very grateful!
>
> If you do not have sufficient insight into your own idea of 'matches', try
> something on a test set of perhaps 20 photos, calculate a 'match matrix',
> and compare that to your intuition.

Yes, this is what I'll do.

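For anyone following along, here is a minimal sketch of the region-binning
suggestion quoted above, together with the 'match matrix' check. It is only
one possible reading of the idea. The five regions (center plus four
quadrants), the 4 levels per color channel, the Euclidean distance, the 64x64
working size, and all function names are assumptions made for illustration;
none of them come from the thread. PIL is imported only inside the loader, so
the rest runs on plain lists of (r, g, b) tuples.

```python
from math import sqrt

BINS = 4  # levels per channel, so 4**3 = 64 color bins (arbitrary choice)


def regions(width, height):
    """Return five (left, top, right, bottom) boxes: center and four quadrants."""
    hw, hh = width // 2, height // 2
    qw, qh = width // 4, height // 4
    return [
        (qw, qh, qw + hw, qh + hh),  # center
        (0, 0, hw, hh),              # upper left
        (hw, 0, width, hh),          # upper right
        (0, hh, hw, height),         # lower left
        (hw, hh, width, height),     # lower right
    ]


def features(pixels, width, height, bins=BINS):
    """Build one normalized color histogram per region and concatenate them.

    pixels is a flat, row-major list of (r, g, b) tuples with values 0..255.
    """
    vec = []
    for left, top, right, bottom in regions(width, height):
        hist = [0] * bins ** 3
        for y in range(top, bottom):
            for x in range(left, right):
                r, g, b = pixels[y * width + x]
                rb, gb, bb = r * bins // 256, g * bins // 256, b * bins // 256
                hist[(rb * bins + gb) * bins + bb] += 1
        total = float(sum(hist)) or 1.0  # guard against an empty region
        vec.extend(count / total for count in hist)
    return vec


def distance(a, b):
    """Euclidean distance between two feature vectors (0.0 means identical)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_matrix(vectors):
    """Pairwise distances, to be compared by eye against one's intuition."""
    return [[distance(a, b) for b in vectors] for a in vectors]


def load_features(path, size=(64, 64)):
    """Load an image with PIL, shrink it, and compute its feature vector."""
    from PIL import Image  # imported here so the rest works without PIL
    img = Image.open(path).convert("RGB").resize(size)
    raw = img.tobytes()  # r, g, b, r, g, b, ... in row-major order
    pixels = [tuple(raw[i:i + 3]) for i in range(0, len(raw), 3)]
    return features(pixels, size[0], size[1])
```

With a test set of about 20 photos, something like
match_matrix([load_features(p) for p in paths]) gives the table to compare
against intuition. Here N is 5 * 64 = 320 numbers per image, and lowering
BINS shrinks it.
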
The second thing I'll try (after trying your suggestion) is based on this
paper, which I found in the meantime:

http://salesin.cs.washington.edu/abstracts.html#MultiresQuery

In case anyone is interested, it describes a multiresolution querying
algorithm and, best of all, it has pseudo code for the various steps. I
don't know yet how difficult the implementation will be, but so far this
looks the most promising.

Cheers,
Daniel