Sorry for the typos. Corrected queries:

original query
+(title:hello desc:hello) +(title:world desc:world)

boosted one
+(title:hello^2 desc:hello) +(title:world^2 desc:world)

last one
+(title:hello desc:hello) +(title:world desc:world)
(+title:hello +title:world)^10 (+desc:hello +desc:world)^5

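For more than two terms, the last query can be generated rather than written by hand. Below is a minimal sketch in plain Java that only builds the query string. It assumes the terms are already analyzed, non-empty, and free of query-syntax characters (no escaping is done). TitleFirstQuery, build and allInField are hypothetical names, not Lucene API, and the boosts 10 and 5 are simply the values from the example above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical helper, not part of Lucene: it only assembles a query string. */
final class TitleFirstQuery {

    // Boost values copied from the two-term example; tuning knobs, not guarantees.
    static final int TITLE_ALL_BOOST = 10;
    static final int DESC_ALL_BOOST = 5;

    /** Every term must match in title or desc; extra boost when all terms hit one field. */
    static String build(List<String> terms) {
        StringJoiner query = new StringJoiner(" ");
        for (String t : terms) {
            // Required clause: the term occurs in at least one of the two fields.
            query.add("+(title:" + t + " desc:" + t + ")");
        }
        // Optional clauses: reward documents where all terms occur in the same field.
        query.add(allInField("title", terms) + "^" + TITLE_ALL_BOOST);
        query.add(allInField("desc", terms) + "^" + DESC_ALL_BOOST);
        return query.toString();
    }

    /** Builds "(+field:t1 +field:t2 ...)". */
    static String allInField(String field, List<String> terms) {
        StringJoiner clause = new StringJoiner(" ", "(", ")");
        for (String t : terms) {
            clause.add("+" + field + ":" + t);
        }
        return clause.toString();
    }

    public static void main(String[] args) {
        System.out.println(build(Arrays.asList("hello", "world")));
        System.out.println(build(Arrays.asList("quick", "brown", "fox")));
    }
}
```

The resulting string can then be handed to whatever query parser you already use. The number of clauses grows linearly with the number of terms, so the query stays manageable.
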
The example has two terms. With more terms, the query becomes too complicated.

On Fri, Apr 27, 2012 at 11:12 AM, Li Li <fancye...@gmail.com> wrote:
> You should describe your ranking strategy more precisely.
> If the query has 2 terms, "hello" and "world" for example, and your
> search fields are title and description, there are many possible
> combinations.
> Here is my understanding.
> Both terms should occur in title or desc, so the
> query may be +(title:hello desc:hello) +(title:world desc:hello)
> The problem is that we need title to weigh more than desc, so we may
> rewrite it to
> +(title:hello^2 desc:hello) +(title:world^2 desc:hello)
> But consider these two scenarios:
> 1. hello hits only in title, world hits only in desc
> 2. hello and world both hit in desc
> Because title is boosted, 1 scores higher than 2.
> But we may think 2 is better than 1 because "hello world" is a phrase.
> We don't want to use a phrase query, though, because it is so strict
> that recall can't meet our needs.
> Our solution is to modify Lucene so the boolean scorer can tell us which
> terms matched; then we use our own collector to boost scenario 2. This
> solution requires modifying Lucene (I have posted a mail, and you can
> patch your DisjunctionSumScorer with
> https://issues.apache.org/jira/browse/LUCENE-2686).
> Another solution I can come up with is a more complicated query:
> +(title:hello desc:hello) +(title:world desc:hello)
> (+title:hello +title:world)^10 (+desc:hello +desc:world)^5
> The must-occur condition is the same as before, but if hello and world
> are both in title, we give the document a boost. Similarly, if hello and
> world are both in desc, we also boost it.
>
> On Fri, Apr 27, 2012 at 3:12 AM, Akos Tajti <akos.ta...@gmail.com> wrote:
>
>> Dear List,
>>
>> we've been struggling with the following problem for a while:
>> we have two fields: title and description. Title is generated from short
>> summaries while description is generated from long texts. We want to
>> search on both fields at the same time, but we'd like to get all
>> documents in which the title matches the search term before all others.
>> For multi-term queries we want to achieve the following: all documents
>> that contain all terms in their title must come before every other
>> document, no matter how many times the description matches the query.
>> Is there a simple way to achieve this?
>>
>> Thanks in advance,
>> Ákos Tajti