Re: Search Ranking

2012-05-17 Thread Ivan Brusic
If you read the explain output, you can see where the scores differ. One difference with a noticeable effect is: 1.0 = tf(termFreq(searchText:fred)=1), 0.5 = fieldNorm(field=searchText, doc=1) vs. 1.4142135 = tf(termFreq(searchText:fred)=2), 0.375 = fieldNorm(field=searchText, doc=0). As pred
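
The two pairs of factors quoted above can be compared with a little arithmetic. The following is a minimal sketch, not code from the thread: it assumes the default Lucene 3.x similarity, where tf is the square root of the term frequency, and it takes the fieldNorm values directly from the explain output. The class and method names are hypothetical. The other factors of the term weight (idf and the query weight) are left out because they are the same for both documents.

```java
public class ExplainFactors {
    // Assumption: default Lucene 3.x similarity, tf = sqrt(termFreq).
    public static float tf(int termFreq) {
        return (float) Math.sqrt(termFreq);
    }

    // Product of the two per-document factors quoted in the explain output.
    public static float tfTimesNorm(int termFreq, float fieldNorm) {
        return tf(termFreq) * fieldNorm;
    }

    public static void main(String[] args) {
        // doc=1: "fred" occurs once, fieldNorm 0.5
        float doc1 = tfTimesNorm(1, 0.5f);
        // doc=0: "fred" occurs twice, fieldNorm 0.375
        float doc0 = tfTimesNorm(2, 0.375f);
        System.out.println("doc=1 tf*fieldNorm = " + doc1);
        System.out.println("doc=0 tf*fieldNorm = " + doc0);
    }
}
```

Under these assumptions the product for doc=0 (roughly 0.53) is slightly larger than for doc=1 (0.5), so for this term the extra occurrence outweighs the lower field norm.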

Re: Search Ranking

2012-05-16 Thread Meeraj Kunnumpurath
Also, if I run the query below, Query q = new QueryParser(Version.LUCENE_35, "searchText", analyzer).parse("Takeaway f...@company.com^100"), I get them in reverse order. Do I need to boost the term even if it appears more than once in the document? Regards Meeraj On Wed, May 16, 2012 at 9:52 PM, Meeraj

Re: Search Ranking

2012-05-16 Thread Meeraj Kunnumpurath
This is the output I get from explaining the plan .. Found 2 hits. 1. XYZ Takeaway f...@company.com 0.5148823 = (MATCH) sum of: 0.17162743 = (MATCH) weight(searchText:takeaway in 1), product of: 0.57735026 = queryWeight(searchText:takeaway), product of: 0.5945349 = idf(docFreq=2, maxDo

Re: Search Ranking

2012-05-16 Thread Meeraj Kunnumpurath
The actual query is Query q = new QueryParser(Version.LUCENE_35, "searchText", analyzer).parse("Takeaway f...@company.com"); If I use Query q = new QueryParser(Version.LUCENE_35, "searchText", analyzer).parse(" f...@company.com"); I get them in the reverse order. Regards Meeraj On Wed, May 16

Re: Search Ranking

2012-05-16 Thread Meeraj Kunnumpurath
I have tried the same using Lucene directly with the following code, import org.apache.lucene.store.RAMDirectory; import org.apache.lucene.document.Document; import org.apache.lucene.document.Field; import org.apache.lucene.index.IndexWriterConfig; import org.apache.lucene.util.Version; import org

Re: Search Ranking

2012-05-16 Thread Meeraj Kunnumpurath
Thanks Ivan. I don't use Lucene directly; it is used behind the scenes by the Neo4J graph database for full-text indexing. According to their documentation for full-text indexes, they use a whitespace tokenizer in the analyser. Yes, I do get Listing 2 first now. Though if I exclude the term "Takeaway

Re: Search Ranking

2012-05-16 Thread Ivan Brusic
Use the explain function to understand why the query is producing the results you see. http://lucene.apache.org/core/3_6_0/api/core/org/apache/lucene/search/Searcher.html#explain(org.apache.lucene.search.Query, int) Does your current query return Listing 2 first? That might be because of term fre