If you read the explain output, you can see where the scores differ. One
difference with a noticeable effect is:
1.0 = tf(termFreq(searchText:fred)=1)
0.5 = fieldNorm(field=searchText, doc=1)
vs.
1.4142135 = tf(termFreq(searchText:fred)=2)
0.375 = fieldNorm(field=searchText, doc=0)
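Under Lucene 3.x's DefaultSimilarity, those numbers follow two formulas: tf(freq) = sqrt(freq), and the field norm, which defaults to 1/sqrt(numTerms) before being compressed into a single byte (which is why you see round values like 0.5 and 0.375). A stdlib-only sketch of that arithmetic, no Lucene required; the term counts below are illustrative, not read from your index:

```java
// Sketch of DefaultSimilarity's tf and lengthNorm formulas (Lucene 3.x).
// Stdlib only; the term counts are illustrative assumptions.
public class ScoreSketch {
    // tf(freq) = sqrt(freq)
    static float tf(int freq) {
        return (float) Math.sqrt(freq);
    }

    // lengthNorm(numTerms) = 1 / sqrt(numTerms); Lucene then encodes this
    // into one byte, which rounds it (e.g. ~0.378 is stored as 0.375).
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        System.out.println(tf(1));         // 1.0, as in doc 1
        System.out.println(tf(2));         // 1.4142135, as in doc 0
        System.out.println(lengthNorm(4)); // 0.5: a four-term field
        System.out.println(lengthNorm(7)); // ~0.378, stored as 0.375
    }
}
```

The shorter field gets the larger norm, so the final order depends on how these products combine with idf and coord across all query terms, not on term frequency alone.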
As pred
Also, if I run the following:
Query q = new QueryParser(Version.LUCENE_35, "searchText",
analyzer).parse("Takeaway f...@company.com^100")
I get them in reverse order. Do I need to boost the term, even if it
appears more than once in the document?
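A query-time boost multiplies that term's weight, so `^100` will usually make the boosted term dominate; you should not normally need a boost just because a term repeats, since tf already rewards repetition. A deliberately simplified per-term model (ignoring idf, coord, and queryNorm; numbers are illustrative):

```java
// Simplified per-term contribution: boost * tf * fieldNorm.
// This deliberately ignores idf, coord, and queryNorm, so it only
// illustrates how a ^100 boost swamps any tf difference.
public class BoostSketch {
    static double contribution(int freq, double fieldNorm, double boost) {
        return boost * Math.sqrt(freq) * fieldNorm;
    }

    public static void main(String[] args) {
        // Unboosted: freq=2 in the longer field slightly beats freq=1 in the shorter.
        System.out.println(contribution(1, 0.5, 1.0));     // 0.5
        System.out.println(contribution(2, 0.375, 1.0));   // ~0.530
        // A ^100 boost makes that term dominate regardless of tf.
        System.out.println(contribution(2, 0.375, 100.0)); // ~53.0
    }
}
```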
Regards
Meeraj
On Wed, May 16, 2012 at 9:52 PM, Meeraj
This is the output I get from explaining the plan ..
Found 2 hits.
1. XYZ Takeaway f...@company.com
0.5148823 = (MATCH) sum of:
0.17162743 = (MATCH) weight(searchText:takeaway in 1), product of:
0.57735026 = queryWeight(searchText:takeaway), product of:
0.5945349 = idf(docFreq=2, maxDo
The actual query is
Query q = new QueryParser(Version.LUCENE_35, "searchText",
analyzer).parse("Takeaway f...@company.com");
If I use
Query q = new QueryParser(Version.LUCENE_35, "searchText", analyzer).parse("
f...@company.com");
I get them in the reverse order.
Regards
Meeraj
On Wed, May 16
I have tried the same using Lucene directly, with the following code:
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.util.Version;
import org
Thanks Ivan.
I don't use Lucene directly; it is used behind the scenes by the Neo4J graph
database for full-text indexing. According to their documentation, full-text
indexes use a whitespace tokenizer in the analyser. Yes, I do get
Listing 2 first now. Though if I exclude the term "Takeaway
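One practical consequence of a whitespace tokenizer is that the whole email address stays a single token instead of being split at `@` or `.`. A stdlib-only simulation of that splitting rule, not the actual Lucene WhitespaceAnalyzer, using a placeholder address since the real one is redacted in this thread:

```java
import java.util.Arrays;
import java.util.List;

// Simulates whitespace tokenization: split on runs of whitespace only,
// so an email address survives as a single token (a StandardAnalyzer
// would apply its own grammar-based rules instead).
public class WhitespaceTokenSketch {
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        // "user@example.com" is a placeholder, not the address from the thread.
        System.out.println(tokenize("XYZ Takeaway user@example.com"));
        // [XYZ, Takeaway, user@example.com]
    }
}
```

Note also that a whitespace tokenizer does not lowercase, so matching "Takeaway" is case-sensitive.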
Use the explain function to understand why the query is producing the
results you see.
http://lucene.apache.org/core/3_6_0/api/core/org/apache/lucene/search/Searcher.html#explain(org.apache.lucene.search.Query, int)
Does your current query return Listing 2 first? That might be because
of term fre