Re: Search Ranking

2012-05-17 Thread Ivan Brusic
If you read the explain output, you can see where the scores are different. One difference with a noticeable effect is: 1.0 = tf(termFreq(searchText:fred)=1) 0.5 = fieldNorm(field=searchText, doc=1) vs. 1.4142135 = tf(termFreq(searchText:fred)=2) 0.375 = fieldNorm(field=searchText, doc=0) As pred
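
A small worked check of the two factors quoted above, as a sketch only: it assumes the default Lucene 3.x similarity, where tf is the square root of the term frequency (which matches the 1.4142135 shown for termFreq=2), and it takes the fieldNorm values straight from the explain output. The class and method names are made up for illustration and are not part of Lucene.

```java
public class TfNormSketch {

    // Assumption: default similarity, tf = sqrt(termFreq).
    public static double tf(int termFreq) {
        return Math.sqrt(termFreq);
    }

    // The part of the per-term score that differs between the two documents:
    // tf multiplied by the stored field norm. idf and the query weight are
    // the same for both documents, so they are left out here.
    public static double tfTimesNorm(int termFreq, double fieldNorm) {
        return tf(termFreq) * fieldNorm;
    }

    public static void main(String[] args) {
        // doc=1: the term occurs once, fieldNorm 0.5
        double doc1 = tfTimesNorm(1, 0.5);
        // doc=0: the term occurs twice, fieldNorm 0.375
        double doc0 = tfTimesNorm(2, 0.375);

        System.out.println("doc=1 tf*fieldNorm = " + doc1);
        System.out.println("doc=0 tf*fieldNorm = " + doc0);
        System.out.println("doc=0 ranks higher on this term: " + (doc0 > doc1));
    }
}
```

With these numbers the product is about 0.53 for doc=0 against 0.5 for doc=1, so the extra occurrence slightly outweighs the lower field norm for this one term.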

Re: Search Ranking

2012-05-16 Thread Meeraj Kunnumpurath
Also, if I use the query below, Query q = new QueryParser(Version.LUCENE_35, "searchText", analyzer).parse("Takeaway f...@company.com^100") I get them in reverse order. Do I need to boost the term, even if it appears more than once in the document? Regards Meeraj On Wed, May 16, 2012 at 9:52 PM, Meeraj

Re: Search Ranking

2012-05-16 Thread Meeraj Kunnumpurath
This is the output I get from explaining the plan .. Found 2 hits. 1. XYZ Takeaway f...@company.com 0.5148823 = (MATCH) sum of: 0.17162743 = (MATCH) weight(searchText:takeaway in 1), product of: 0.57735026 = queryWeight(searchText:takeaway), product of: 0.5945349 = idf(docFreq=2, maxDo

Re: Search Ranking

2012-05-16 Thread Meeraj Kunnumpurath
The actual query is Query q = new QueryParser(Version.LUCENE_35, "searchText", analyzer).parse("Takeaway f...@company.com"); If I use Query q = new QueryParser(Version.LUCENE_35, "searchText", analyzer).parse(" f...@company.com"); I get them in the reverse order. Regards Meeraj On Wed, May 16

Re: Search Ranking

2012-05-16 Thread Meeraj Kunnumpurath
I have tried the same using Lucene directly with the following code, import org.apache.lucene.store.RAMDirectory; import org.apache.lucene.document.Document; import org.apache.lucene.document.Field; import org.apache.lucene.index.IndexWriterConfig; import org.apache.lucene.util.Version; import org

Re: Search Ranking

2012-05-16 Thread Meeraj Kunnumpurath
Thanks Ivan. I don't use Lucene directly; it is used behind the scenes by the Neo4J graph database for full-text indexing. According to their documentation for full-text indexes, they use a whitespace tokenizer in the analyser. Yes, I do get Listing 2 first now. Though if I exclude the term "Takeaway

Re: Search Ranking

2012-05-16 Thread Ivan Brusic
Use the explain function to understand why the query is producing the results you see. http://lucene.apache.org/core/3_6_0/api/core/org/apache/lucene/search/Searcher.html#explain(org.apache.lucene.search.Query, int) Does your current query return Listing 2 first? That might be because of term fre

Search Ranking

2012-05-16 Thread Meeraj Kunnumpurath
Hi, I am quite new to Lucene. I am trying to use it to index listings of local businesses. The index has only one field, which stores the attributes of a listing as well as the email addresses of users who have rated that business. For example, Listing 1: "XYZ Takeaway London f...@company.com bar...@

SV: SV: SV: Integrating dynamic data into Lucene search/ranking

2008-01-17 Thread Marcus Falk
into Lucene search/ranking I think that would work. But I'm not 100% sure of what you are trying to achieve. Just a note: sorting results performs poorly if you have a large index. We ran into severe performance problems with just a couple of million articles, which led us to m

SV: SV: SV: Integrating dynamic data into Lucene search/ranking

2008-01-17 Thread Marcus Falk
ngligt meddelande- From: Tobias Lohr [mailto:[EMAIL PROTECTED] Sent: 17 January 2008 15:15 To: java-user@lucene.apache.org Subject: Re: SV: SV: Integrating dynamic data into Lucene search/ranking Thanks for your hint. If it's possible I would take a look into the code, but the approach is interes

Re: SV: SV: Integrating dynamic data into Lucene search/ranking

2008-01-17 Thread Tobias Lohr
ff: SV: SV: Integrating dynamic data into Lucene search/ranking > In our solution we used a RAMDir for the newest incoming articles and a > FSDir for older ones. Then we had a limit for the RAMDir, like 10,000 > documents; when that limit was hit we used mergesegments to move the cont
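
The two-tier idea quoted above (a small in-memory index for new articles, a larger on-disk index for older ones, and a move from one to the other when a document limit is hit) can be sketched without Lucene at all. This is only an illustration under stated assumptions: plain lists stand in for the RAMDir and the FSDir, a substring match stands in for a search, and every name in it is hypothetical. It is not the poster's code and does not show the actual merge call.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoTierIndexSketch {

    // Stand-in for the in-memory index holding the newest documents.
    private final List<String> recent = new ArrayList<>();
    // Stand-in for the on-disk index holding older documents.
    private final List<String> archive = new ArrayList<>();
    // Number of recent documents that triggers a move to the archive.
    private final int limit;

    public TwoTierIndexSketch(int limit) {
        this.limit = limit;
    }

    // Add a document to the recent tier; once the limit is reached,
    // move everything to the archive tier and start again empty.
    public void add(String doc) {
        recent.add(doc);
        if (recent.size() >= limit) {
            archive.addAll(recent);
            recent.clear();
        }
    }

    // Search both tiers, recent tier first, so new documents are
    // visible without touching the archive.
    public List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        for (String doc : recent) {
            if (doc.contains(term)) hits.add(doc);
        }
        for (String doc : archive) {
            if (doc.contains(term)) hits.add(doc);
        }
        return hits;
    }

    public int recentSize() { return recent.size(); }

    public int archiveSize() { return archive.size(); }

    public static void main(String[] args) {
        TwoTierIndexSketch index = new TwoTierIndexSketch(3);
        index.add("election news");
        index.add("sports news");
        index.add("weather report"); // third document hits the limit
        index.add("late news");
        System.out.println("recent=" + index.recentSize()
                + " archive=" + index.archiveSize());
        System.out.println(index.search("news"));
    }
}
```

The point of the split is that frequent updates only touch the small tier, while the expensive move into the large tier happens once per batch.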

SV: SV: Integrating dynamic data into Lucene search/ranking

2008-01-17 Thread Marcus Falk
s M -Original message- From: Andrzej Bialecki [mailto:[EMAIL PROTECTED] Sent: 17 January 2008 10:55 To: java-user@lucene.apache.org Subject: Re: SV: Integrating dynamic data into Lucene search/ranking Tobias Lohr wrote: > I'm not really sure if this approach is possible for

Re: SV: Integrating dynamic data into Lucene search/ranking

2008-01-17 Thread Andrzej Bialecki
Tobias Lohr wrote: I'm not really sure, if this approach is possible for working in changes every - let's say - 30 seconds!? The conventional wisdom is to use RAMDirectory in such scenarios. I.e. you commit frequent updates to a RAMDirectory and frequently reopen its Searcher (which should b

Re: SV: Integrating dynamic data into Lucene search/ranking

2008-01-17 Thread Tobias Lohr
he.org, java-user@lucene.apache.org > Subject: SV: Integrating dynamic data into Lucene search/ranking > We did this in our system, indexing a constant flow of news articles, by > doing as Otis described (reopened the indexsearcher). > > Every third minute we are creating a new indexsearche

SV: Integrating dynamic data into Lucene search/ranking

2008-01-16 Thread Marcus Falk
@lucene.apache.org Subject: Re: Integrating dynamic data into Lucene search/ranking The index contains several tens of thousands of documents, with a field count of about fifty. The index is going to be rebuilt approx. every day, but that varies, since the searchable content doesn't change very often. Now I fac

Re: Integrating dynamic data into Lucene search/ranking

2008-01-16 Thread Tobias Lohr
/ -- Lucene - Solr - Nutch - Original Message From: Tobias Lohr <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Tuesday, January 15, 2008 11:33:56 AM Subject: Integrating dynamic data into Lucene search/ranking I have a more architectural question, which is maybe sort of o

Re: Integrating dynamic data into Lucene search/ranking

2008-01-15 Thread Otis Gospodnetic
ore (e.g. RDBMS, BDB...) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Tobias Lohr <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Tuesday, January 15, 2008 11:33:56 AM Subject: Integrating dynamic data into Lucene search/r

Integrating dynamic data into Lucene search/ranking

2008-01-15 Thread Tobias Lohr
etc.) for integrating such dynamic information into a search/ranking functionality? (I already searched at Google, but couldn't find anything useful though.) Thanks in advance!