It's exactly my question:
http://www.mail-archive.com/lucene-u...@jakarta.apache.org/msg04915.html
--- On Mon, 6/29/09, Amir Hossein Jadidinejad wrote:
From: Amir Hossein Jadidinejad
Subject: Doc-Doc Similarity Matrix Construction
To: java-user@lucene.apache.org
Date: Monday, June 29, 20
See MoreLikeThis in the contrib/queries folder. It speeds up
similarity comparisons by using only the most significant words
from a document as search terms.
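To make the idea concrete, here is a minimal plain-Java sketch of the technique described above: weight each term by TF-IDF, keep only the highest-weighted terms, and compare the reduced vectors with cosine similarity. It is not MoreLikeThis's actual code or API. It assumes documents are already tokenized into lists of strings and uses idf = ln(N / df), and the class and method names (SignificantTermsSketch, tfIdf, topTerms, cosine) are hypothetical, chosen only for illustration. It needs Java 8 or later.

```java
import java.util.*;

/** Illustrative only: plain-Java TF-IDF, NOT Lucene's MoreLikeThis implementation. */
final class SignificantTermsSketch {

    /** Counts how often each token occurs in one tokenized document. */
    static Map<String, Integer> termFrequencies(List<String> tokens) {
        Map<String, Integer> tf = new HashMap<>();
        for (String t : tokens) tf.merge(t, 1, Integer::sum);
        return tf;
    }

    /** Counts, for each term, how many documents in the corpus contain it. */
    static Map<String, Integer> documentFrequencies(List<List<String>> corpus) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : corpus)
            for (String t : new HashSet<>(doc)) df.merge(t, 1, Integer::sum);
        return df;
    }

    /** TF-IDF vector for one document; assumes idf = ln(numDocs / df). */
    static Map<String, Double> tfIdf(List<String> doc, Map<String, Integer> df, int numDocs) {
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Integer> e : termFrequencies(doc).entrySet()) {
            int d = df.getOrDefault(e.getKey(), 0);
            if (d > 0) weights.put(e.getKey(), e.getValue() * Math.log((double) numDocs / d));
        }
        return weights;
    }

    /** Keeps only the k highest-weighted terms (order among equal weights is unspecified). */
    static Map<String, Double> topTerms(Map<String, Double> vector, int k) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(vector.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        Map<String, Double> top = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : entries.subList(0, Math.min(k, entries.size())))
            top.put(e.getKey(), e.getValue());
        return top;
    }

    /** Cosine similarity of two sparse vectors; returns 0 if either has zero length. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) dot += e.getValue() * other;
        }
        for (double v : b.values()) normB += v * v;
        return (normA == 0 || normB == 0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }
}
```

Truncating each vector with topTerms before calling cosine trades some accuracy for less work per comparison, which is the kind of shortcut the reply describes. How many terms to keep is a tuning choice that this sketch leaves to the caller.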
On 29 Jun 2009, at 20:14, Amir Hossein Jadidinejad wrote:
Hi,
It's my first experiment with Lucene. Please help me.
I'm going to index a set of documents and create a feature vector for each
of them. This vector contains all the terms that belong to the document,
weighted using TF-IDF.
After that I want to compute the cosine similarity between all documents and