There are lots of parameters you can adjust, but the defaults essentially assume that you have a fairly large corpus and aren't interested in low-frequency terms.

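So, try MoreLikeThis#setMinDocFreq. The default is 5, which means a term is ignored unless it occurs in at least 5 documents. You don't have any terms in your example with a doc freq over 2.

Also, try setMinTermFreq. The default is 2, which means a term is ignored unless it occurs at least twice in the source document. You don't have any terms with a term frequency above 1.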
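
To make the two thresholds concrete, here is a rough sketch in plain Java that counts term and document frequencies for the four titles in your example and applies the same two cut-offs. It does not call Lucene at all: the class name MltThresholdSketch, its methods, and the lowercase-and-split tokenizer are assumptions for illustration only, and the tokenizer does not reproduce what StandardAnalyzer does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain Java, no Lucene calls.
class MltThresholdSketch {

    // Titles copied from the example in the question.
    static final String[] TITLES = {
        "Lucene in Action",
        "Lucene for Dummies",
        "Managing Gigabytes",
        "The Art of Computer Science"
    };

    // Simplified tokenizer (an assumption): lowercase, split on whitespace.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase().split("\\s+"));
    }

    // Term frequency: how often each term occurs within one document.
    static Map<String, Integer> termFreq(String doc) {
        Map<String, Integer> tf = new LinkedHashMap<>();
        for (String term : tokenize(doc)) {
            tf.merge(term, 1, Integer::sum);
        }
        return tf;
    }

    // Document frequency: how many documents contain each term.
    static Map<String, Integer> docFreq(String[] docs) {
        Map<String, Integer> df = new LinkedHashMap<>();
        for (String doc : docs) {
            // A set, so a term counts once per document.
            for (String term : new HashSet<>(tokenize(doc))) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    // Terms of docs[docNum] that meet both minimum thresholds.
    static List<String> surviving(String[] docs, int docNum,
                                  int minTermFreq, int minDocFreq) {
        Map<String, Integer> df = docFreq(docs);
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> e : termFreq(docs[docNum]).entrySet()) {
            if (e.getValue() >= minTermFreq && df.get(e.getKey()) >= minDocFreq) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Thresholds at the default values quoted above (2 and 5).
        System.out.println(surviving(TITLES, 1, 2, 5));
        // Both thresholds lowered to 1.
        System.out.println(surviving(TITLES, 1, 1, 1));
    }
}
```

With the default thresholds no term of document 1 survives, so there is nothing to build a query from. With both thresholds at 1 every term survives. In a real index the analyzer may drop some of those terms as stop words, so treat the second list as an upper bound.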

-- Jack Krupansky

-----Original Message-----
From: Thomas Keller
Sent: Tuesday, January 15, 2013 3:22 PM
To: java-user@lucene.apache.org
Subject: Lucene-MoreLikethis

Hey,

I have a question about "MoreLikeThis" in Lucene (Java). I built an index and want to find similar documents, but I never get any results for my query: the query returned by mlt.like(1) is always empty. Can anyone find my mistake? Here is an example (I use Lucene 4.0).

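import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queries.mlt.MoreLikeThis;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class HelloLucene {
  public static void main(String[] args) throws IOException, ParseException {

    // index four small documents in memory
    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
    Directory index = new RAMDirectory();
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_40, analyzer);

    IndexWriter w = new IndexWriter(index, config);
    addDoc(w, "Lucene in Action", "193398817");
    addDoc(w, "Lucene for Dummies", "55320055Z");
    addDoc(w, "Managing Gigabytes", "55063554A");
    addDoc(w, "The Art of Computer Science", "9900333X");
    w.close();

    // search
    IndexReader reader = DirectoryReader.open(index);
    IndexSearcher searcher = new IndexSearcher(reader);

    // build a query from document 1 and print the number of hits
    MoreLikeThis mlt = new MoreLikeThis(reader);
    Query query = mlt.like(1);
    System.out.println(searcher.search(query, 5).totalHits);
  }

  private static void addDoc(IndexWriter w, String title, String isbn) throws IOException {
    Document doc = new Document();
    doc.add(new TextField("title", title, Field.Store.YES));

    // use a string field for isbn because we don't want it tokenized
    doc.add(new StringField("isbn", isbn, Field.Store.YES));
    w.addDocument(doc);
  }
}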

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
