Daniel Fetchinson wrote:
> Since you seem to know quite a bit about this topic, what is your
> opinion on the apparently 'generic' algorithm described here:
> http://grail.cs.washington.edu/projects/query/ ?
> So far it seems to me that it does what I'm asking for, it does even
> more because it can take a hand drawn sample image and query the
> database for similar photos.
>
> There is even a python implementation for it here:
> http://members.tripod.com/~edcjones/pycode.html
>
> On the histogram method I agree that it won't work partly because of
> what you say and partly because it is terribly slow since it's
> comparing every single pixel.
I'm hardly an expert and can't answer authoritatively, but here's my 2c.

I can't comment on the actual accuracy of the algorithm, since that will depend on your specific data set (your collection of photos). The algorithm is sensitive to spatial layout and to luminance (it works in the YIQ colorspace, where Y is the luminance channel), so there are simple ways in which it can fail. The histogram method uses only color, but has a lot of numbers to compare. You may also find the histogram method insensitive to spatial relations (a landscape with the mountain on the left versus one with the mountain on the right) compared to the wavelet approach; there's a quick sketch of that below.

This is a relatively old paper, and I've seen more recent image-retrieval research using wavelets (in some cases using only the high-frequency wavelets for "texture" information instead of the low-frequency ones this paper uses for "shape"), as well as other information-retrieval research that uses lossily compressed data as the features. If you have time, you may want to look at other research that cites this particular paper.

And just a thought: instead of merely cutting off at the m largest wavelet coefficients, why not apply a quantization matrix to all the values? There's a rough sketch of that idea at the end as well.

Let me know how it works out.
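Here is a rough sketch of the histogram method with PIL, mostly to show what "a lot of numbers" and "insensitive to spatial relations" mean in practice. I'm assuming PIL is installed; the file names are placeholders.

from PIL import Image

def rgb_histogram(path, size=(128, 128)):
    """Normalized 768-bin RGB histogram; all spatial layout is discarded."""
    im = Image.open(path).convert('RGB').resize(size)
    hist = im.histogram()                  # 256 bins each for R, G and B
    total = float(sum(hist))
    return [h / total for h in hist]

def histogram_distance(a, b):
    """L1 distance between two normalized histograms (0.0 = same color mix)."""
    return sum(abs(x - y) for x, y in zip(a, b))

h1 = rgb_histogram('mountain_on_left.jpg')     # placeholder file names
h2 = rgb_histogram('mountain_on_right.jpg')
print(histogram_distance(h1, h2))              # small even though the layouts differ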
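For comparison, here is a very rough, un-optimized sketch of how I read the paper's wavelet signature: a full 2-D Haar decomposition of the luminance channel, keeping only the positions and signs of the m largest-magnitude coefficients. This is not the actual algorithm from the Python implementation you linked (no YIQ channels, no bin weighting, and m = 60 is just a placeholder), so treat it purely as an illustration.

from PIL import Image

def haar_step(vals):
    """One level of a 1-D Haar transform: averages first, then differences."""
    n = len(vals) // 2
    avg = [(vals[2*i] + vals[2*i + 1]) / 2.0 for i in range(n)]
    dif = [(vals[2*i] - vals[2*i + 1]) / 2.0 for i in range(n)]
    return avg + dif

def haar_full(vals):
    """Fully decompose one row or column by repeating the step on the averages."""
    out = list(vals)
    n = len(out)
    while n > 1:
        out[:n] = haar_step(out[:n])
        n //= 2
    return out

def haar_2d(matrix):
    """Standard 2-D decomposition: transform every row, then every column."""
    rows = [haar_full(r) for r in matrix]
    cols = [haar_full(c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def signature(path, size=128, m=60):
    """(position, sign) pairs of the m largest-magnitude coefficients."""
    im = Image.open(path).convert('L').resize((size, size))   # luminance only
    pix = list(im.getdata())
    matrix = [pix[r * size:(r + 1) * size] for r in range(size)]
    coeffs = haar_2d(matrix)
    flat = [(abs(v), (r, c), v)
            for r, row in enumerate(coeffs)
            for c, v in enumerate(row)
            if (r, c) != (0, 0)]            # skip the overall average term
    flat.sort(reverse=True)
    return set((pos, 1 if v > 0 else -1) for _, pos, v in flat[:m])

def similarity(sig_a, sig_b):
    """Crude score: how many (position, sign) pairs the two signatures share."""
    return len(sig_a & sig_b)

As far as I remember, the paper also stores the average color separately and weights matching coefficients by which frequency bin they fall in; the sketch skips both.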
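And to make the quantization thought concrete (this is purely my own suggestion, not something from the paper): divide every coefficient by a step size that grows with frequency, JPEG-style, and keep whatever survives the rounding, so the cut-off adapts to the image instead of being a fixed count m. The step function below is an arbitrary placeholder, and coeffs is the matrix returned by haar_2d() above.

def quantized_signature(coeffs, base_step=8.0):
    """Keep every coefficient that survives a frequency-dependent step size,
    instead of keeping a fixed number m of largest ones."""
    kept = {}
    for r, row in enumerate(coeffs):
        for c, v in enumerate(row):
            if (r, c) == (0, 0):
                continue                           # average term, as before
            step = base_step * (1 + max(r, c))     # placeholder step function
            q = int(round(v / step))
            if q != 0:
                kept[(r, c)] = q
    return kept

Comparing two of these dicts could then score by how close the surviving quantized values are, rather than just by sign agreement.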