+(title:hello title:world desc:hello desc:world) (+title:hello +title:world)^100 (+desc:hello +desc:world)^50 (+title:hello +desc:world)^10 (+desc:hello +title:world)^10
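For more than two terms, a query of this shape is easier to generate than to write by hand. Here is a minimal sketch in Python that only builds the query string; it does not call Lucene. The function name build_query and its parameters are hypothetical, and it assumes the terms are already plain tokens containing no query-syntax characters that would need escaping.

```python
from itertools import product


def build_query(terms, fields=("title", "desc"),
                same_field_boosts=(100, 50), mixed_boost=10):
    """Build a Lucene query-parser string (hypothetical helper).

    Assumes each term is a plain token; no escaping is performed.
    """
    if len(fields) != len(same_field_boosts):
        raise ValueError("need exactly one boost per field")

    # Required clause: at least one term must occur in at least one field.
    any_match = " ".join(f"{field}:{term}" for field in fields for term in terms)
    clauses = [f"+({any_match})"]

    # Optional clauses: all terms occur in the same field.
    for field, boost in zip(fields, same_field_boosts):
        all_in_field = " ".join(f"+{field}:{term}" for term in terms)
        clauses.append(f"({all_in_field})^{boost}")

    # Optional clauses: all terms occur, but spread over different fields.
    for assignment in product(fields, repeat=len(terms)):
        if len(set(assignment)) == 1:
            continue  # same-field case is already covered above
        mixed = " ".join(f"+{field}:{term}"
                         for field, term in zip(assignment, terms))
        clauses.append(f"({mixed})^{mixed_boost}")

    return " ".join(clauses)


if __name__ == "__main__":
    print(build_query(["hello", "world"]))
```

With ["hello", "world"] and the default boosts this reproduces the query above. With n terms and two fields it emits 2^n + 1 clauses, so the string grows quickly as terms are added.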

The boost values (100, 50, 10, 10) should be adjusted carefully. If the
tf of a document is very large, a boost of 10 may not be enough. You can
modify DefaultSimilarity by overriding its methods, such as tf() and
idf(), and constrain them to a controllable range.

On Fri, Apr 27, 2012 at 2:59 PM, Akos Tajti <akos.ta...@gmail.com> wrote:

> Thanks for the detailed explanation. But as I understand it, this query
> will still match only documents that contain both terms (either in the
> same field or in different fields). What if there's a document that
> contains only "hello"? This query will not find it, am I right? But
> finding it is what we want to achieve: the results should list first
> the documents that contain both terms, then those that contain only
> one of them.
>
> Ákos
>
> On Fri, Apr 27, 2012 at 5:17 AM, Li Li <fancye...@gmail.com> wrote:
>
>> Sorry for some typos.
>> original query: +(title:hello desc:hello) +(title:world desc:world)
>> boosted one:    +(title:hello^2 desc:hello) +(title:world^2 desc:world)
>> last one:       +(title:hello desc:hello) +(title:world desc:world)
>>                 (+title:hello +title:world)^10 (+desc:hello +desc:world)^5
>>
>> The example has two terms. If it has more terms, the query will become
>> too complicated.
>>
>> On Fri, Apr 27, 2012 at 11:12 AM, Li Li <fancye...@gmail.com> wrote:
>>
>> > You should describe your ranking strategy more precisely.
>> > If the query has 2 terms, "hello" and "world" for example, and your
>> > search fields are title and description, there are many possible
>> > combinations.
>> > Here is my understanding.
>> > Both terms should occur in title or desc, so the query may be
>> > +(title:hello desc:hello) +(title:world desc:hello)
>> > The problem is that we need title to weigh more than desc, so maybe
>> > we rewrite it to
>> > +(title:hello^2 desc:hello) +(title:world^2 desc:hello)
>> > But consider these two scenarios:
>> > 1. hello hits only in title, world hits only in desc
>> > 2. hello and world both hit in desc
>> > Because title is boosted, 1 scores higher than 2.
>> > But we may think 2 is better than 1 because "hello world" is a phrase.
>> > We don't want to use a phrase query, though, because it is so strict
>> > that the recall can't meet our needs.
>> > Our solution is to modify Lucene so that the boolean scorer can tell
>> > us which terms matched. Then we use our own collector to boost
>> > scenario 2. This solution requires modifying Lucene (I have posted a
>> > mail about it, and you can patch your DisjunctionSumScorer with
>> > https://issues.apache.org/jira/browse/LUCENE-2686).
>> > Another solution I can come up with is using a complicated query:
>> > +(title:hello desc:hello) +(title:world desc:hello)
>> > (+title:hello +title:world)^10 (+desc:hello +desc:world)^5
>> > The must-occur condition is the same as before, but if hello and
>> > world are both in title, we give the document a boost. Similarly, if
>> > hello and world are both in desc, we also boost it.
>> >
>> > On Fri, Apr 27, 2012 at 3:12 AM, Akos Tajti <akos.ta...@gmail.com> wrote:
>> >
>> >> Dear List,
>> >>
>> >> we've been struggling with the following problem for a while:
>> >> we have two fields: title and description. Title is generated from
>> >> short summaries while description is generated from long texts. We
>> >> want to search on both fields at the same time, but we'd like to get
>> >> all documents in which the title matches the search term before all
>> >> others. For multi-term queries we want to achieve the following: all
>> >> documents that contain all terms in their title must come before
>> >> every other document, no matter how many times the description
>> >> matches the query. Is there a simple way to achieve this?
>> >>
>> >> Thanks in advance,
>> >> Ákos Tajti