Has anyone tried to remove similar documents from their search results? It looks like Google does some on-the-fly filtering of the results, hiding pages which it thinks are too similar, i.e. when you see:
"In order to show you the most relevant results, we have omitted some entries very similar to the 7 already displayed. If you like, you can repeat the search with the omitted results included." at the bottom of the page. Is there anything in Lucene or one of the contrib packages that compares two documents? -- Miles Barr <[EMAIL PROTECTED]> Runtime Collective Ltd. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]