On Jun 19, 2005, at 5:17 AM, [EMAIL PROTECTED] wrote:
Hi there,
i have an index with the following infos in it:
url - keyword - Field("url", this.url, Field.Store.YES,
Field.Index.UN_TOKENIZED);
md5 - keyword - Field("md5", this.url, Field.Store.YES,
Field.Index.UN_TOKENIZED);
alt - Field("alt", this.alt, Field.Store.YES, Field.Index.TOKENIZED);
i use it to index my images.
now it happens that the same image (eg: same md5) is used in different
locations (eg: different urls).
filename: mylogo.gif used in
http://site.com/project1/mylogo.gif and also
http://site.com/project2/some_other_bubu/mylogo.gif
the ALT is different (eg: different text)
now on my image search app when i search mylogo i get "several"
results with the same image.
i would like to reduce the nr of results in that way that the md5 is
unique.
Note: i can't delete from the index the 2nd image cause the ALT might
be different, so in general all the properties put together (md5, url,
alt) compose a different "entity".
It seems you have conflicting goals here. You want (md5, url, alt)
to be unique in one sense, yet you want md5 itself to be unique in
another sense.
i bought "Lucene in Action" book, which is a GREAT book.
Thank you! :)
i was looking into "filters".
i quote: "If all the information needed to perform filtering is in the
index, there is no need to write your own filter because QueryFilter
can handle it."
i can't seem to figure it out, how query filter can help me.
also tried to write my own filter but not that much info on that
direction either.
Filters reduce the search space to a subset of the documents in the
index. Which document would you want returned when there are
multiple documents in the index with the same MD5? Or do you want to
cluster them by MD5?
Do you want to cluster them by MD5 perhaps, but still return multiple
documents back from a search?
I'm not sure if a Filter is the appropriate technique for this
scenario or not.
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]