On Thu, Apr 19, 2012 at 4:21 PM, Robert Muir <rcm...@gmail.com> wrote:
> On Thu, Apr 19, 2012 at 3:49 PM, Benson Margulies <bimargul...@gmail.com> wrote:
>> On Thu, Apr 19, 2012 at 1:34 PM, Robert Muir <rcm...@gmail.com> wrote:
>>> On Thu, Apr 19, 2012 at 1:26 PM, Benson Margulies <bimargul...@gmail.com> wrote:
>>>> I am trying to solve a problem using DisjunctionMaxQuery.
>>>>
>>>> Consider a query like:
>>>>
>>>> a:b OR c:d OR e:f OR ...
>>>> name:richard OR name:dick OR name:dickie OR name:rich ...
>>>>
>>>> At most, one of the richard names matches. So the match score gets
>>>> dragged down by the long list of things that don't match, as the list
>>>> can get quite long.
>>>>
>>>> It seemed to me, upon reading the documentation, that I could cure
>>>> this problem by creating a query tree that used DisjunctionMaxQuery
>>>> around all those nicknames. However, when I built a boolean query that
>>>> had, as a clause, a DisjunctionMaxQuery in the place of a pile of
>>>> these individual Term queries, the score and the explanation did not
>>>> change at all -- in particular, the coord term shows the same number
>>>> of total terms. So it looks as if the children of the disjunction
>>>> still count.
>>>>
>>>> Is there a way to control that term? Or a better way to express this?
>>>> Thinking SQL for a moment, what I'm trying to express is
>>>>
>>>> name IN (richard, dick, dickie, rich)
>>>>
>>>
>>> I think you just want to disable coord() here? You can do this for
>>> that particular boolean query by passing true to the ctor:
>>>
>>> public BooleanQuery(boolean disableCoord)
>>
>> Rob,
>>
>> How do nested queries work with respect to this? If I build a boolean
>> query one of whose clauses is a BooleanQuery with coord turned off,
>> does just the nested query insides get left out of 'coord'?
>>
>> If so, then your answer certainly seems to be what the doctor ordered.
>>
>
> it applies only to that query itself. So if this BQ is a clause to
> another BQ that has coord enabled,
> that would not change the top-level BQ's coord.
>
> Note: if you don't want coord at all, then you can also plug in a
> Similarity that returns 1,
> or pick another Similarity like BM25: in trunk only the vector space
> impl even does anything for coord()....

Robert,

I'm sorry that my density is approaching that of lead.

My problem is that I do want coord, but I want to control which terms
are counted and which are not. I suppose I can accomplish this with my
own scorer. My hope was that there was a way to express "this group of
terms counts as one for coord". In other words, for a subset of the
fields in the query, I want to scale the entire score by the fraction
of them that match.

Another way to think about this, which might be of no use at all: is
there a way to charge a score penalty for failing to match a particular
query term? That would address the underlying effect I'm trying to get
from another direction.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
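
To make the "this group of terms counts as one for coord" idea concrete, here is a minimal plain-Java sketch of the arithmetic only. It does not use or reproduce any Lucene class: CoordSketch, flatCoord and groupedCoord are hypothetical names invented for illustration, and the matched-clause set is made up. The one assumption taken from the thread is that the vector-space coord factor is the number of matching clauses divided by the total number of clauses.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative model only; none of these names exist in Lucene.
public class CoordSketch {

    // Assumed coord arithmetic: matched clauses / total clauses.
    public static float coord(int overlap, int maxOverlap) {
        return maxOverlap == 0 ? 1.0f : overlap / (float) maxOverlap;
    }

    // Flat query: every nickname is its own clause, so each one that
    // fails to match enlarges the denominator.
    public static float flatCoord(List<String> clauses,
                                  List<String> nicknames,
                                  Set<String> matched) {
        int overlap = 0;
        for (String c : clauses) {
            if (matched.contains(c)) overlap++;
        }
        for (String n : nicknames) {
            if (matched.contains(n)) overlap++;
        }
        return coord(overlap, clauses.size() + nicknames.size());
    }

    // Grouped query: the whole nickname list counts as a single unit
    // that matches if any one nickname matches.
    public static float groupedCoord(List<String> clauses,
                                     List<String> nicknames,
                                     Set<String> matched) {
        int overlap = 0;
        for (String c : clauses) {
            if (matched.contains(c)) overlap++;
        }
        boolean groupHit = false;
        for (String n : nicknames) {
            if (matched.contains(n)) groupHit = true;
        }
        if (groupHit) overlap++;
        int total = clauses.size() + (nicknames.isEmpty() ? 0 : 1);
        return coord(overlap, total);
    }

    public static void main(String[] args) {
        List<String> clauses = Arrays.asList("a:b", "c:d", "e:f");
        List<String> nicknames = Arrays.asList(
                "name:richard", "name:dick", "name:dickie", "name:rich");
        // Hypothetical document: matches two field clauses and one nickname.
        Set<String> matched = new HashSet<>(
                Arrays.asList("a:b", "c:d", "name:dick"));

        // 3 matching clauses out of 7
        System.out.println("flat coord    = "
                + flatCoord(clauses, nicknames, matched));
        // 3 matching units out of 4
        System.out.println("grouped coord = "
                + groupedCoord(clauses, nicknames, matched));
    }
}
```

With the sample data the flat form scores 3 of 7 and the grouped form 3 of 4, which is the effect being asked for. Whether a nested query inside a BooleanQuery is counted this way by Lucene is the open question in this thread, so the sketch should not be read as a description of Lucene's behaviour.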