If you read the explain output, you can see where the scores are
different. One difference with a noticeable effect is:
1.0 = tf(termFreq(searchText:fred)=1)
0.5 = fieldNorm(field=searchText, doc=1)
vs.
1.4142135 = tf(termFreq(searchText:fred)=2)
0.375 = fieldNorm(field=searchText, doc=0)
As pred
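To make the comparison above concrete, here is a minimal sketch of the arithmetic, assuming the default similarity's tf is the square root of the term frequency (which matches the 1.0 and 1.4142135 values in the explain output). The class and method names are hypothetical and only for illustration; the fieldNorm values are copied from the explain output rather than computed.

```java
// Hypothetical illustration of the tf * fieldNorm part of the score.
// It does not call Lucene; the numbers come from the explain output above.
final class TfNormSketch {

    // Assumption: tf = sqrt(termFreq), as the explain values suggest.
    static float tf(int termFreq) {
        return (float) Math.sqrt(termFreq);
    }

    // Per-document part of the term weight. idf is the same for both
    // documents for a given term, so this product is what differs.
    static float tfTimesNorm(int termFreq, float fieldNorm) {
        return tf(termFreq) * fieldNorm;
    }

    public static void main(String[] args) {
        float doc1 = tfTimesNorm(1, 0.5f);   // termFreq=1, fieldNorm=0.5
        float doc0 = tfTimesNorm(2, 0.375f); // termFreq=2, fieldNorm=0.375
        System.out.println("doc=1: " + doc1);
        System.out.println("doc=0: " + doc0);
    }
}
```

With these inputs the doc=0 product comes out slightly higher (roughly 0.53 against 0.5), so the higher term frequency more than offsets the lower fieldNorm for that term.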
Also, if I do the below
Query q = new QueryParser(Version.LUCENE_35, "searchText",
analyzer).parse("Takeaway f...@company.com^100")
I get them in reverse order. Do I need to boost the term, even if it
appears more than once in the document?
Regards
Meeraj
On Wed, May 16, 2012 at 9:52 PM, Meeraj
This is the output I get from explaining the plan ..
Found 2 hits.
1. XYZ Takeaway f...@company.com
0.5148823 = (MATCH) sum of:
0.17162743 = (MATCH) weight(searchText:takeaway in 1), product of:
0.57735026 = queryWeight(searchText:takeaway), product of:
0.5945349 = idf(docFreq=2, maxDo
The actual query is
Query q = new QueryParser(Version.LUCENE_35, "searchText",
analyzer).parse("Takeaway f...@company.com");
If I use
Query q = new QueryParser(Version.LUCENE_35, "searchText", analyzer).parse("
f...@company.com");
I get them in the reverse order.
Regards
Meeraj
On Wed, May 16
I have tried the same using Lucene directly with the following code,
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.util.Version;
import org
Thanks Ivan.
I don't use Lucene directly; it is used behind the scenes by the Neo4j graph
database for full-text indexing. According to their documentation, full-text
indexes use a whitespace tokenizer in the analyzer. Yes, I do get
Listing 2 first now. Though if I exclude the term "Takeaway
Use the explain function to understand why the query is producing the
results you see.
http://lucene.apache.org/core/3_6_0/api/core/org/apache/lucene/search/Searcher.html#explain(org.apache.lucene.search.Query,
int)
Does your current query return Listing 2 first? That might be because
of term fre
Hi,
I am quite new to Lucene. I am trying to use it to index listings of local
businesses. The index has only one field, which stores the attributes of a
listing as well as email addresses of users who have rated that business.
For example,
Listing 1: "XYZ Takeaway London f...@company.com bar...@
into Lucene search/ranking
I think that would work. But I'm not 100% sure of what you are trying to
achieve.
Just a note:
Sorting results performs poorly if you have a large index; we ran into
severe performance problems with just a couple of million articles, which led us
to m
-Original Message-
From: Tobias Lohr [mailto:[EMAIL PROTECTED]
Sent: 17 January 2008 15:15
To: java-user@lucene.apache.org
Subject: Re: SV: SV: Integrating dynamic data into Lucene search/ranking
Thanks for your hint. If it's possible, I would take a look at the code, but
the approach is interes
ff: SV: SV: Integrating dynamic data into Lucene search/ranking
> In our solution we used a RAMDir for the newest incoming articles and a
> FSDir for older ones. Then we had a limit for the RAMDir, like 10,000
> documents; when that limit was hit we used mergesegments to move the cont
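The two-tier idea quoted above (new documents go into an in-memory index, and are moved to the on-disk index once a size limit is hit) can be sketched roughly as follows. This is only a stand-in under stated assumptions: it uses plain lists instead of RAMDirectory and FSDirectory, a substring match instead of a real query, and every class and method name is hypothetical. It is not the poster's code and does not call Lucene.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a RAM index in front of a disk index.
final class TwoTierIndexSketch {
    private final int ramLimit;                              // e.g. 10,000 in the post
    private final List<String> ramDocs = new ArrayList<>();  // stand-in for the RAMDir
    private final List<String> diskDocs = new ArrayList<>(); // stand-in for the FSDir

    TwoTierIndexSketch(int ramLimit) {
        this.ramLimit = ramLimit;
    }

    // New documents always land in the in-memory tier first.
    void add(String doc) {
        ramDocs.add(doc);
        if (ramDocs.size() >= ramLimit) {
            flush();
        }
    }

    // Move everything from the in-memory tier to the disk tier.
    void flush() {
        diskDocs.addAll(ramDocs);
        ramDocs.clear();
    }

    // Search both tiers, newest (in-memory) documents first.
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        for (String d : ramDocs) {
            if (d.contains(term)) hits.add(d);
        }
        for (String d : diskDocs) {
            if (d.contains(term)) hits.add(d);
        }
        return hits;
    }

    int ramSize() { return ramDocs.size(); }
    int diskSize() { return diskDocs.size(); }
}
```

In a real setup the search side would also need a searcher over both tiers that is reopened after each flush; that part is left out here.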
s
M
-Original Message-
From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]
Sent: 17 January 2008 10:55
To: java-user@lucene.apache.org
Subject: Re: SV: Integrating dynamic data into Lucene search/ranking
Tobias Lohr wrote:
> I'm not really sure, if this approach is possible for
Tobias Lohr wrote:
I'm not really sure, if this approach is possible for working in changes every
- let's say - 30 seconds!?
The conventional wisdom is to use RAMDirectory in such scenarios. I.e.
you commit frequent updates to a RAMDirectory and frequently reopen its
Searcher (which should b
he.org, java-user@lucene.apache.org
> Subject: SV: Integrating dynamic data into Lucene search/ranking
> We did this in our system, indexing a constant flow of news articles, by
> doing as Otis described (reopened the indexsearcher)..
>
> Every third minute we are creating a new indexsearche
@lucene.apache.org
Subject: Re: Integrating dynamic data into Lucene search/ranking
The index contains several tens of thousands of documents, with a field
count of about fifty. The index is going to be rebuilt approx. every
day, but this varies, since the searchable content doesn't change very often.
Now I fac
/ -- Lucene - Solr - Nutch
- Original Message
From: Tobias Lohr <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, January 15, 2008 11:33:56 AM
Subject: Integrating dynamic data into Lucene search/ranking
I have a more architectural question, which is maybe sort of o
ore (e.g. RDBMS, BDB...)
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Tobias Lohr <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, January 15, 2008 11:33:56 AM
Subject: Integrating dynamic data into Lucene search/r
etc.)
for integrating such dynamic information into a search/ranking functionality?
(I already searched at Google, but couldn't find anything useful though.)
Thanks in advance!