From: decoder [mailto:[EMAIL PROTECTED]]
> 
> ...snip...
>
> > Hash the hashes and store them in a suitable tree?
> I explained before that you cannot hash the hashes: a cryptographic
> hash has no tolerance, since any small change in the input yields a
> completely different digest. Fuzzy matching on a hash of the actual
> hash is therefore impossible.

Ok. Just to add more useless thoughts to those already thrown around, may I 
ask a question?

I see that the first part of a hash is made of the file size, image height 
and width, and image color count. Now, since the target is to construct a 
weak key for the image, I don't see the purpose of either the file size or 
the color count: both may change greatly depending on the image "carrier" 
(that is, the file format) and may relate too loosely to the true image 
content to be useful. Why are they included?

Let me also start a bit of brainstorming.

I see in Hashing.pm that an image in the database matches the searched one if 
the difference in each component of its composite key is below a given 
threshold, expressed as a percentage of the value.

This is interesting, because we could use overlapping categories here thanks to 
the threshold percentage.

Let me explain by focusing on a single parameter: the image width "w". Assume 
we have a db field "c" and a monotonic transfer function c = F(W), such that 
F(W - W*threshold) >= c - 1 and F(W + W*threshold) <= c + 1. Then 
abs(w[i] - W) < W*threshold means W - W*threshold < w[i] < W + W*threshold 
and, by the assumed monotonicity of F, 
F(W - W*threshold) <= F(w[i]) <= F(W + W*threshold), that is 
c - 1 <= F(w[i]) <= c + 1. This means that all the images whose width is 
within +-threshold of W fall in at most the categories c-1, c and c+1 
(usually just two of them), so you don't have to browse all the images but 
just the ones matching those values in column "c".

Simply put, you may then "search by index" in the db instead of just browsing 
all the images in it.
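To make the idea concrete, here is a small sketch in Python. The 10% 
threshold, the logarithmic choice of F and the toy width table are all 
illustrative assumptions of mine, not the project's actual code:

```python
import math
from collections import defaultdict

# Sketch: classify image widths with a monotone transfer function F so
# that a fuzzy width match only scans a few adjacent classes instead of
# the whole table.  THRESHOLD and the widths below are made-up values.

THRESHOLD = 0.10  # tolerance t, as a fraction of the value

def F(w, t=THRESHOLD):
    """Map a width to an integer class; widths within +-t of each other
    land in the same class or a neighbouring one."""
    return math.floor(math.log(w) / math.log(1.0 / (1.0 - t)))

def candidate_classes(W, t=THRESHOLD):
    """Classes that can contain any width w with |w - W| < t*W."""
    return {F(W * (1.0 - t), t), F(W, t), F(W * (1.0 + t), t)}

# Toy "database": class -> list of image ids, as an indexed column would.
widths = {1: 640, 2: 660, 3: 1024, 4: 630, 5: 2048}
index = defaultdict(list)
for img_id, w in widths.items():
    index[F(w)].append(img_id)

# Search: scan only the candidate classes, then verify the exact condition.
W = 645
matches = sorted(i for c in candidate_classes(W) for i in index.get(c, [])
                 if abs(widths[i] - W) < THRESHOLD * W)
# matches is [1, 2, 4]: the widths near 645, found without a full scan
```

On a real db the same effect would come from an ordinary index on column 
"c" and a `WHERE c IN (...)` over the candidate classes.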

Please note that, given a W in class c = F(W), you inspect the images in 
class c plus whichever neighbouring class the tolerance window touches: 
class c-1 if W - W*threshold < inv(F, c) (the lower boundary of class c), 
and class c+1 if W + W*threshold >= inv(F, c+1); in the worst case both.

This method could possibly be taken further. If, for example, you extend it 
to the image height as well, you end up inspecting images in a neighborhood 
of 4 image groups. It can't be applied too deeply, however, since you would 
end up inspecting 2^n neighborhoods, with n being the number of classified 
parameters.

Classification may, however, be used for a surrogate key. Take some n key 
components (for example, h*w and the color frequencies) as the coordinates 
of a point in a hypercube (a "component space"). Then choose a distance 
function and categorize the images by the value of their distance from a 
fixed point (the origin, for example). You would end up inspecting images in 
only two clusters.
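A possible sketch of that surrogate key, again with made-up components and a 
made-up 10% threshold (note the components would need normalising in 
practice, since h*w dominates the distance here):

```python
import math

# Sketch: fold several key components into one scalar (the Euclidean
# distance of the component vector from the origin) and classify that
# scalar with the same kind of transfer function, so a fuzzy search
# scans only a couple of clusters.  All values below are illustrative.

THRESHOLD = 0.10

def F(x, t=THRESHOLD):
    return math.floor(math.log(x) / math.log(1.0 / (1.0 - t)))

def distance(components):
    """Euclidean distance of the component vector from the origin."""
    return math.sqrt(sum(c * c for c in components))

def surrogate_key(components):
    return F(distance(components))

# Two images whose components differ by a few percent land in the same
# or an adjacent cluster; a clearly different image lands far away.
a = surrogate_key([640 * 480, 0.30, 0.25])   # h*w plus two color frequencies
b = surrogate_key([660 * 470, 0.31, 0.24])
c = surrogate_key([4096 * 4096, 0.90, 0.05])
```

A search for an image with key k would then scan clusters k-1, k and k+1, 
verifying the full composite-key match only inside those.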

A good integer transfer function, writing t for the threshold, could be:

        F(w) = floor( ln(w) / ln(1/(1 - t)) )

the inverse being:

        W(c) = (1/(1 - t))^c
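As a sanity check (a sketch of mine, assuming t = 0.10), this F does satisfy 
the bracketing property used above, and W(c) inverts it up to float rounding:

```python
import math

T = 0.10  # assumed threshold value

def F(w, t=T):
    return math.floor(math.log(w) / math.log(1.0 / (1.0 - t)))

def W(c, t=T):
    """Lower boundary of class c: the proposed inverse of F."""
    return (1.0 / (1.0 - t)) ** c

ok = True
for width in range(16, 5000):
    c = F(width)
    # any width within +-T of this one stays in classes c-1 .. c+1
    ok = ok and F(width * (1 - T)) >= c - 1
    ok = ok and F(width * (1 + T)) <= c + 1
    # the inverse lands back in class c (nudged past float rounding)
    ok = ok and F(W(c) * (1 + 1e-9)) == c
```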

EOT (End Of Thoughts)

Please don't flame too hard...  :)

Thanks,

        giampaolo

> Chris
