Re: Using Lucene/Solr for Plagiarism detection

2010-12-30 Thread Lance Norskog
The MoreLikeThis feature may be exactly what you want. Try it out.

On Thu, Dec 30, 2010 at 8:28 AM, Amel Fraisse wrote:
> Hello,
>
> No I'm not using cosine similarity metrics.
>
> 2010/12/30 Shashi Kant
>
>> Have you considered using document similarity metrics such as Cosine
>> Similarity?
>
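
A rough sketch of what trying MoreLikeThis could look like from a client. It only builds a request URL for Solr's MoreLikeThis search component and sends nothing. The base URL, the document id `doc42`, and the field name `text` are hypothetical placeholders; `mlt`, `mlt.fl`, `mlt.mintf`, `mlt.mindf`, and `mlt.count` are standard MoreLikeThis parameters. It assumes the suspicious document is already in the index.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, doc_id, field="text", count=5):
    """Build a Solr select URL that asks for documents similar to doc_id.

    base_url, doc_id and field are placeholders (assumptions), not values
    taken from the thread.
    """
    params = {
        "q": f"id:{doc_id}",   # the already-indexed suspicious document
        "mlt": "true",         # enable the MoreLikeThis component
        "mlt.fl": field,       # field(s) to mine for "interesting" terms
        "mlt.mintf": 1,        # minimum term frequency in the source doc
        "mlt.mindf": 1,        # minimum document frequency in the index
        "mlt.count": count,    # number of similar documents to return
    }
    return f"{base_url}/select?{urlencode(params)}"


url = build_mlt_url("http://localhost:8983/solr", "doc42")
print(url)
```

The similar documents come back in a separate section of the response, so the scores still need a threshold before anything is flagged as a candidate source.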

Re: Using Lucene/Solr for Plagiarism detection

2010-12-30 Thread Amel Fraisse
Hello,

No I'm not using cosine similarity metrics.

2010/12/30 Shashi Kant
> Have you considered using document similarity metrics such as Cosine
> Similarity?
>
> On Thu, Dec 30, 2010 at 6:05 AM, Amel Fraisse wrote:
> > Hello,
> >
> > I am using Lucene for plagiarism detection.
> >
> > Th

Re: Using Lucene/Solr for Plagiarism detection

2010-12-30 Thread Shashi Kant
Have you considered using document similarity metrics such as Cosine Similarity?

On Thu, Dec 30, 2010 at 6:05 AM, Amel Fraisse wrote:
> Hello,
>
> I am using Lucene for plagiarism detection.
>
> The goal is that: when I have a new document, I will check on the solr index
> if there is a document
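
For readers unfamiliar with the metric suggested here, a minimal sketch of cosine similarity over raw term-frequency vectors. The whitespace tokenization and the function name are assumptions for illustration; this is not Lucene's scoring code, which also applies IDF weighting and length normalization.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two term-frequency vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Dot product over the terms the two documents share.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


print(cosine_similarity("lucene for plagiarism detection",
                        "plagiarism detection with lucene"))
```

Because the vectors ignore word order, this measures topical overlap across the whole document rather than copied passages.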

Using Lucene/Solr for Plagiarism detection

2010-12-30 Thread Amel Fraisse
Hello, I am using Lucene for plagiarism detection. The goal: when I have a new document, I check the Solr index for any document that contains some common chunks. To compute the similarity between the query and a source document, I would use this formula: Score (suspicious do
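
The formula is cut off in the archived snippet and is not reconstructed here. As a separate illustration of the "common chunk" idea only, the sketch below measures what fraction of the suspicious document's word n-grams (shingles) also occur in a source document. The shingle size `n=3`, the whitespace tokenization, and the function names are assumptions, not the poster's method.

```python
def shingles(text, n=3):
    """Return the set of word n-grams (as tuples) in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(suspicious, source, n=3):
    """Fraction of the suspicious document's shingles found in source."""
    s = shingles(suspicious, n)
    if not s:
        return 0.0  # text shorter than n words has no shingles
    return len(s & shingles(source, n)) / len(s)


score = containment("the quick brown fox jumps",
                    "a quick brown fox sleeps")
print(score)
```

The measure is asymmetric on purpose: a short copied passage scores high against a long source even though the source contains much more text.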