>>most of the body text is the same, but I want to group them all under
>>one result.
I created this analyzer class to identify content that was "mostly
similar" but not necessarily identical.
http://issues.apache.org/jira/browse/LUCENE-725
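
For readers unfamiliar with the idea, here is a rough illustration of one common way to score "mostly similar" text: compare overlapping word sequences (shingles) and measure their overlap with Jaccard similarity. This is a generic sketch and not necessarily what the analyzer attached to the issue above does. The class name, method names, shingle size and threshold are all made up for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: scores two texts by shingle overlap.
class NearDuplicateSketch {

    // Split text into lowercase words, then collect every run of `size`
    // consecutive words ("shingles") into a set.
    static Set<String> shingles(String text, int size) {
        List<String> words = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        Set<String> result = new HashSet<>();
        for (int i = 0; i + size <= words.size(); i++) {
            result.add(String.join(" ", words.subList(i, i + size)));
        }
        return result;
    }

    // Jaccard similarity: size of the intersection divided by size of the union.
    // Two empty sets are treated as identical (1.0) by convention here.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    // Assumed rule of thumb: call two texts "mostly similar" when the
    // score reaches the caller-chosen threshold.
    static boolean mostlySimilar(String x, String y, int size, double threshold) {
        return jaccard(shingles(x, size), shingles(y, size)) >= threshold;
    }

    public static void main(String[] args) {
        String first = "the quick brown fox jumps over the lazy dog";
        String second = "the quick brown fox jumps over the lazy cat";
        System.out.println(jaccard(shingles(first, 3), shingles(second, 3)));
    }
}
```

A score near 1.0 means the two texts share almost all of their word sequences, so a threshold somewhere below 1.0 would group near-identical articles while still separating unrelated ones. The right threshold depends on the data.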
If you feed a small set of documents through it (say y
other articles might be similar after that first hit? Basically, try to
normalize the similarity? Am I off my rocker?
Or, is there possibly a way to use Carrot2 to find related articles for a
given document?
Thanks,
Scott