Hi Miles :)

I'd imagine that if you already apply clustering to your search results, the cluster information could help you identify 'similar' results and reorder the output list accordingly.
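To make the idea concrete, here is a rough sketch of it in Python. It is not Lucene code and not how any particular clustering engine works: the similarity measure (Jaccard overlap of word shingles), the greedy grouping, the 0.5 threshold and the names `shingles`, `jaccard`, `cluster_results` and `reorder` are all assumptions made for illustration. Results are taken as (id, text) pairs in ranked order.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles of a text, lowercased."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_results(results, threshold=0.5, k=3):
    """Greedy one-pass grouping of ranked (doc_id, text) pairs.

    Each result joins the first cluster whose leader (its highest-ranked
    member) is at least `threshold` similar; otherwise it starts a new
    cluster. Returns a list of clusters, each a list of doc ids.
    """
    clusters = []  # pairs of (leader shingles, member ids)
    for doc_id, text in results:
        doc_shingles = shingles(text, k)
        for leader_shingles, members in clusters:
            if jaccard(leader_shingles, doc_shingles) >= threshold:
                members.append(doc_id)
                break
        else:
            clusters.append((doc_shingles, [doc_id]))
    return [members for _, members in clusters]


def reorder(results, threshold=0.5, k=3):
    """Put one leader per cluster first, then the near-duplicates."""
    clusters = cluster_results(results, threshold, k)
    leaders = [members[0] for members in clusters]
    followers = [d for members in clusters for d in members[1:]]
    return leaders + followers
```

Dropping the followers instead of moving them to the end would give the "omitted results" behaviour; a real clustering algorithm would replace the greedy pass.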

Just a thought.

D.


Miles Barr wrote:
Has anyone tried to remove similar documents from their search results?
It looks like Google does some on-the-fly filtering of the results,
hiding pages which it thinks are too similar, i.e. when you see:

"In order to show you the most relevant results, we have omitted some
entries very similar to the 7 already displayed.
If you like, you can repeat the search with the omitted results
included."

at the bottom of the page.

Is there anything in Lucene or one of the contrib packages that compares
two documents?



