I've been thinking about a similar problem. However, it seems that the
similarity score returned by a search is only relevant within those search
results. You can't compare the similarity scores from two different searches.
I think you will have to compute the similarities yourself using the t
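
For what it's worth, here is a minimal sketch of what computing the
similarities yourself could look like. It is plain Python, not anything
taken from Lucene, and the whitespace tokenizer, the raw term counts, and
the names term_vector and cosine_similarity are my own assumptions for
illustration. Each score depends only on the two documents involved, so
scores for different pairs can be compared with each other.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a raw term-frequency vector (hypothetical helper).

    Assumes simple lowercasing and whitespace tokenization; a real
    setup would reuse whatever analyzer the index was built with.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    docs = [
        "lucene search index",
        "lucene index search engine",
        "cooking pasta recipes",
    ]
    vectors = [term_vector(d) for d in docs]
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            print(i, j, cosine_similarity(vectors[i], vectors[j]))
```

Note that comparing every pair this way is quadratic in the number of
documents (roughly 5 * 10^9 pairs for 100,000 documents), so for a corpus
of that size this is probably only a building block, not a complete
clustering approach.
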
On Thursday 15 June 2006 13:50, Prasenjit Mukherjee wrote:
> I want to do some document clustering on a corpus of ~ 100,000
> documents, with average doc size being ~ 7k. I have looked into carrot2
> but it seems to work only for relatively short documents and has some
> scaling issues for lar