Hello,

No, I'm not using cosine similarity metrics.
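
To make the formula in the quoted message concrete, here is a minimal sketch in plain Java. It does not use the Lucene Similarity or Scorer API; it only shows the ratio itself. The class name ChunkOverlapScore, the helper methods, and the definition of a "chunk" as an overlapping n-gram of lowercased words are assumptions made for illustration, since the original message does not say how chunks are built. The sketch also counts each distinct chunk once, which is another assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene or Solr.
final class ChunkOverlapScore {

    // Assumption: a "chunk" is an overlapping n-gram of lowercased,
    // whitespace-separated words. Duplicates collapse into one entry.
    static Set<String> chunks(String text, int n) {
        Set<String> result = new HashSet<>();
        String trimmed = text.toLowerCase(Locale.ROOT).trim();
        if (trimmed.isEmpty()) {
            return result;
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i + n <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return result;
    }

    // Score = chunks common to both documents / chunks in the suspicious document.
    static double score(String suspicious, String source, int n) {
        Set<String> suspiciousChunks = chunks(suspicious, n);
        if (suspiciousChunks.isEmpty()) {
            return 0.0; // avoid division by zero for empty or too-short input
        }
        Set<String> common = new HashSet<>(suspiciousChunks);
        common.retainAll(chunks(source, n));
        return (double) common.size() / suspiciousChunks.size();
    }

    public static void main(String[] args) {
        String source = "the quick brown fox jumps over the lazy dog";
        String suspicious = "a quick brown fox sleeps";
        // 2 of the 4 two-word chunks of the suspicious text occur in the source.
        System.out.println(score(suspicious, source, 2));
    }
}
```

Note that the score is asymmetric: it is normalized by the suspicious document only, so a short suspicious text copied entirely from a long source scores 1.0 regardless of the source length.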

2010/12/30 Shashi Kant <sk...@sloan.mit.edu>

> Have you considered using document similarity metrics such as Cosine
> Similarity?
>
>
> On Thu, Dec 30, 2010 at 6:05 AM, Amel Fraisse <amel.frai...@gmail.com>
> wrote:
> > Hello,
> >
> > I am using Lucene for plagiarism detection.
> >
> > The goal is this: when I have a new document, I will check the Solr index
> > for a document that contains some common chunks.
> >
> > So to compute the similarity between the query and a source document I
> > would use this formula:
> >
> > Score (suspicious document, source document) = Number of common chunks
> > between source document and suspicious document / Total number of chunks
> > in the suspicious document.
> >
> > So I have to change the scoring formula in the Similarity class.
> >
> > How can I change the scoring formula? (By customizing only the Similarity
> > class? Or the Scorer?)
> >
> > Do you have an example of this use case?
> >
> > Thanks for your help.
> >
> >
>

-- 
--------------
Amel Fraisse