I have tried the same using Lucene directly, with the following code:

import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.util.Version;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.search.ScoreDoc;

public class LuceneTest {

    public static void main(String[] args) throws Exception {

        // Index two documents into an in-memory directory.
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
        RAMDirectory index = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);
        IndexWriter indexWriter = new IndexWriter(index, config);

        Document doc1 = new Document();
        doc1.add(new Field("searchText", "ABC Takeaway f...@company.com f...@company.com",
                Field.Store.YES, Field.Index.ANALYZED));
        Document doc2 = new Document();
        doc2.add(new Field("searchText", "XYZ Takeaway f...@company.com",
                Field.Store.YES, Field.Index.ANALYZED));

        indexWriter.addDocument(doc1);
        indexWriter.addDocument(doc2);
        indexWriter.close();

        // Search the single field for the term "Takeaway".
        Query q = new QueryParser(Version.LUCENE_35, "searchText", analyzer).parse("Takeaway");
        int hitsPerPage = 10;
        IndexReader reader = IndexReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        // Print the hits in ranked order.
        System.out.println("Found " + hits.length + " hits.");
        for (int i = 0; i < hits.length; ++i) {
            int docId = hits[i].doc;
            Document d = searcher.doc(docId);
            System.out.println((i + 1) + ". " + d.get("searchText"));
        }
    }
}

The output is:

Found 2 hits.
1. XYZ Takeaway f...@company.com
2. ABC Takeaway f...@company.com f...@company.com

On Wed, May 16, 2012 at 9:21 PM, Meeraj Kunnumpurath
<meeraj.kunnumpur...@asyska.com> wrote:

> Thanks Ivan.
>
> I don't use Lucene directly; it is used behind the scenes by the Neo4J
> graph database for full-text indexing. According to their documentation,
> full-text indexes use a whitespace tokenizer in the analyser. Yes, I do
> get Listing 2 first now. Though if I exclude the term "Takeaway" from the
> search string, and just put "f...@company.com", I get Listing 1 first.
>
> Regards
> Meeraj
>
>
> On Wed, May 16, 2012 at 8:49 PM, Ivan Brusic <i...@brusic.com> wrote:
>
>> Use the explain function to understand why the query is producing the
>> results you see.
>>
>> http://lucene.apache.org/core/3_6_0/api/core/org/apache/lucene/search/Searcher.html#explain(org.apache.lucene.search.Query, int)
>>
>> Does your current query return Listing 2 first? That might be because
>> of term frequencies. Which analyzers are you using?
>>
>> http://www.lucidimagination.com/content/scaling-lucene-and-solr#d0e63
>>
>> Cheers,
>>
>> Ivan
>>
>> On Wed, May 16, 2012 at 12:41 PM, Meeraj Kunnumpurath
>> <meeraj.kunnumpur...@asyska.com> wrote:
>> > Hi,
>> >
>> > I am quite new to Lucene. I am trying to use it to index listings of
>> > local businesses. The index has only one field, which stores the
>> > attributes of a listing as well as the email addresses of users who
>> > have rated that business.
>> >
>> > For example,
>> >
>> > Listing 1: "XYZ Takeaway London f...@company.com bar...@company.com
>> > f...@company.com"
>> > Listing 2: "ABC Takeaway London f...@company.com bar...@company.com"
>> >
>> > Now when the user does a search with "Takeaway f...@company.com", how
>> > do I get listing 1 to always come before listing 2, because it has the
>> > term f...@company.com appear twice whereas listing 2 has it only once?
>> >
>> > Regards
>> > Meeraj
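
A note on the term-frequency point raised in the thread. Two factors in Lucene 3.x's default scoring pull in opposite directions here: the tf factor grows with the square root of how often a term occurs in the field, while the length norm shrinks as 1/sqrt(number of tokens in the field). The plain-Java sketch below is not Lucene code; it only models those two factors for the two listings from the original question. It assumes whitespace-only tokenization, uses the addresses exactly as they appear (elided) in the thread as opaque tokens, and leaves out idf, coord, queryNorm and the lossy one-byte norm encoding. The class name TfNormSketch and its methods are made up for illustration. Real scores can therefore differ, and explain() remains the authoritative check.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified model of two scoring factors; hypothetical helper, not part of Lucene. */
class TfNormSketch {

    /** Splits on whitespace only (assumed to approximate a whitespace tokenizer). */
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    /** tf factor: square root of the number of occurrences of the term. */
    static double tf(List<String> tokens, String term) {
        int freq = 0;
        for (String token : tokens) {
            if (token.equals(term)) {
                freq++;
            }
        }
        return Math.sqrt(freq);
    }

    /** Length norm: 1 / sqrt(number of tokens), so longer fields score lower. */
    static double lengthNorm(List<String> tokens) {
        return 1.0 / Math.sqrt(tokens.size());
    }

    /** Sum of tf over the query terms, scaled by the length norm of the document. */
    static double score(String doc, String query) {
        List<String> tokens = tokenize(doc);
        double sum = 0.0;
        for (String term : tokenize(query)) {
            sum += tf(tokens, term);
        }
        return sum * lengthNorm(tokens);
    }

    public static void main(String[] args) {
        // Listings as quoted in the thread; the elided addresses are kept as-is.
        String listing1 = "XYZ Takeaway London f...@company.com bar...@company.com f...@company.com";
        String listing2 = "ABC Takeaway London f...@company.com bar...@company.com";
        String[] queries = {"Takeaway", "f...@company.com", "Takeaway f...@company.com"};
        for (String query : queries) {
            System.out.printf("%-28s listing1=%.4f listing2=%.4f%n",
                    query, score(listing1, query), score(listing2, query));
        }
    }
}
```

In this simplified model, a query for "Takeaway" alone favours the shorter Listing 2, because both listings contain the term once and only the length norm differs. A query that includes the repeated address favours Listing 1, because the higher tf outweighs the extra length. If the repeated address should always win regardless of field length, one option to investigate is indexing the field without norms, but whether Neo4J exposes that setting is not something this thread establishes.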