Re: lucene algorithm ?

2012-04-26 Thread Ralf Heyde
Hi, I don't know the correct implementation. All I can say is that you should take a look at query optimization in state-of-the-art database systems. The general solution is to evaluate first the part of a query that reduces the result set most. For example you try to search for a term l
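The "most selective part first" idea can be sketched for a conjunctive query as follows. This is a minimal illustration and not Lucene's implementation: the function name, the dictionary-of-lists index layout, and the sample data are all assumptions made for the example.

```python
def intersect_most_selective_first(postings):
    """postings maps term -> sorted list of doc ids (hypothetical layout).

    Returns the doc ids that contain every term.
    """
    # Shortest posting list first: it shrinks the candidate set the most.
    ordered = sorted(postings.values(), key=len)
    if not ordered:
        return []
    result = ordered[0]
    for plist in ordered[1:]:
        members = set(plist)
        result = [doc for doc in result if doc in members]
        if not result:
            break  # nothing left to intersect
    return result


# Made-up sample index for illustration only.
postings = {
    "lucene": [1, 2, 3, 5, 8, 13, 21],
    "rare": [5, 21],
    "index": [2, 5, 8, 21, 34],
}
hits = intersect_most_selective_first(postings)
print(hits)
```

Starting from the two-entry list for "rare" means the longer lists are only probed for two candidates instead of being merged in full.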

Re: Re-indexing a particular field only without re-indexing the entire enclosing document in the index

2012-04-26 Thread Torsten Krah
On Thursday, 26.04.2012, at 09:46 +0530, KARTHIK SHIVAKUMAR wrote: > Then delete the same and insert the same Fresh Document alone. But that is not an "update" in the sense of the question; it is a complete reindex of the original document. The original question was whether updating a field of a doc can

RE: lucene algorithm ?

2012-04-26 Thread Uwe Schindler
Hi, > I read the paper by Doug "Space optimizations for total ranking", > > since it was written a long time ago, I wonder what algorithms lucene uses > (regarding postings list traversal and score calculation, ranking) The algorithms described in that paper are still in use to merge posting li

Re: Re-indexing a particular field only without re-indexing the entire enclosing document in the index

2012-04-26 Thread Andrzej Bialecki
On 25/04/2012 13:58, Erick Erickson wrote: There's no update-in-place, currently you _have_ to re-index the entire document. But to the original question: There is a "limited join" capability you might investigate that would allow you to split up the textual data and metadata into two different

Re: PhoneticFilterFactory 's inject parameter

2012-04-26 Thread Ian Lea
There are useful tips in the FAQ, http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2BAC8_incorrect_hits.3F. I still think you should come up with small self-contained example code. -- Ian. On Wed, Apr 25, 2012 at 4:02 PM, Elmer van Chastelet wrote: > Thanks for your sugg

Re: lucene algorithm ?

2012-04-26 Thread Andrzej Bialecki
On 26/04/2012 09:49, Uwe Schindler wrote: There are possibilities to truncate those lists, but this is not implemented in Lucene core. The main problem with Lucene's segmented index structure is that you cannot exit early, because the very last document in the posting list could be one with a v
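Why a doc-id-ordered posting list forces a full traversal can be shown with a small top-k sketch. It is an illustration under assumptions, not Lucene code: the `top_k` helper, the `(doc_id, score)` tuples, and the scores are invented, and the data is deliberately arranged so that the last posting carries the best score.

```python
import heapq


def top_k(postings, k):
    """postings: iterable of (doc_id, score) in doc-id order (hypothetical).

    Returns (k best hits, best first) and the number of postings visited.
    """
    heap = []  # min-heap of (score, doc_id) holding the best k seen so far
    visited = 0
    for doc_id, score in postings:
        visited += 1
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            # Better than the weakest kept hit: swap it in.
            heapq.heapreplace(heap, (score, doc_id))
    best = sorted(heap, reverse=True)
    return [(doc_id, score) for score, doc_id in best], visited


# Made-up scores; the list is sorted by doc id, not by score.
postings = [(1, 0.4), (2, 0.9), (3, 0.1), (4, 0.3), (5, 2.5)]
best, visited = top_k(postings, 2)
print(best, visited)
```

Because the list is not sorted by score, stopping after the first few postings would have missed doc 5, so every posting has to be visited.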

Re: two fields, the first important than the second

2012-04-26 Thread jake dsouza
Hi, I think what you are looking for is a boost factor that you can use in your score. Take a look at http://lucene.apache.org/core/3_6_0/scoring.html#Score Boosting - Jake On Thu, Apr 26, 2012 at 3:12 PM, Akos Tajti wrote: > Dear List, > > we've been struggling with the following problem for a wh

Re: two fields, the first important than the second

2012-04-26 Thread Akos Tajti
Jake, we're already using index-time boosts and tried query-time boosts earlier. Neither helped. The problem was that if the description contained a part of a multi-term query many, many times, it got a higher score than the ones that contained the terms in their title. So it is hard to set the boo

Re: two fields, the first important than the second

2012-04-26 Thread Ian Lea
If you really mean "must" and "always", you'll probably have to execute 2 searches. First on title alone then on description, or title and description, merging the hit lists as appropriate. -- Ian. On Thu, Apr 26, 2012 at 8:30 PM, Akos Tajti wrote: > Jake, > > we're already using index time b
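The two-search approach could be sketched like this, assuming each search has already returned a best-first list of `(doc_id, score)` pairs. The helper name, the tuple layout, and the sample hits are hypothetical; running the two searches themselves is left out.

```python
def merge_title_first(title_hits, description_hits):
    """Both arguments are best-first lists of (doc_id, score).

    Title hits always precede description-only hits, regardless of score.
    """
    merged = list(title_hits)
    seen = {doc_id for doc_id, _ in title_hits}
    for doc_id, score in description_hits:
        if doc_id not in seen:  # skip docs already ranked via the title
            seen.add(doc_id)
            merged.append((doc_id, score))
    return merged


# Made-up hit lists for illustration.
title_hits = [(7, 1.2), (3, 0.8)]
description_hits = [(9, 5.0), (3, 4.1), (4, 0.6)]
ranked = merge_title_first(title_hits, description_hits)
print(ranked)
```

Scores from the two searches are never compared with each other, which is what guarantees that a description match cannot outrank a title match.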

Re: two fields, the first important than the second

2012-04-26 Thread Li Li
You should describe your ranking strategy more precisely. If the query has 2 terms, "hello" and "world" for example, and your search fields are title and description, there are many possible combinations. Here is my understanding. Both terms should occur in title or desc; the query may be +(title:

Re: two fields, the first important than the second

2012-04-26 Thread Li Li
Sorry for some typos. Original query: +(title:hello desc:hello) +(title:world desc:world). Boosted one: +(title:hello^2 desc:hello) +(title:world^2 desc:world). Last one: +(title:hello desc:hello) +(title:world desc:hello) (+title:hello +title:world)^10 (+desc:hello +desc:world)^5 the example
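The boosted form quoted above can be generated for any list of terms with plain string building. The sketch below only produces query text in the syntax shown in the message; `boosted_query` is a hypothetical helper, the field names `title` and `desc` are taken from the message, and no escaping of special query characters is attempted.

```python
def boosted_query(terms, title_boost=2):
    """Build '+(title:t^boost desc:t)' for each term, joined by spaces."""
    clauses = [f"+(title:{t}^{title_boost} desc:{t})" for t in terms]
    return " ".join(clauses)


query = boosted_query(["hello", "world"])
print(query)
```

The resulting string would still have to be handed to a query parser; how a given boost value affects the final ranking depends on the scoring in use.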