On Thu, Apr 19, 2012 at 5:10 PM, Robert Muir <rcm...@gmail.com> wrote:
> On Thu, Apr 19, 2012 at 5:05 PM, Benson Margulies <bimargul...@gmail.com> wrote:
>> On Thu, Apr 19, 2012 at 4:21 PM, Robert Muir <rcm...@gmail.com> wrote:
>>> On Thu, Apr 19, 2012 at 3:49 PM, Benson Margulies <bimargul...@gmail.com> wrote:
>>>> On Thu, Apr 19, 2012 at 1:34 PM, Robert Muir <rcm...@gmail.com> wrote:
>>>>> On Thu, Apr 19, 2012 at 1:26 PM, Benson Margulies <bimargul...@gmail.com> wrote:
>>>>>> I am trying to solve a problem using DisjunctionMaxQuery.
>>>>>>
>>>>>> Consider a query like:
>>>>>>
>>>>>> a:b OR c:d OR e:f OR ...
>>>>>> name:richard OR name:dick OR name:dickie OR name:rich ...
>>>>>>
>>>>>> At most, one of the richard names matches. So the match score gets
>>>>>> dragged down by the long list of things that don't match, as the list
>>>>>> can get quite long.
>>>>>>
>>>>>> It seemed to me, upon reading the documentation, that I could cure
>>>>>> this problem by creating a query tree that used DisjunctionMaxQuery
>>>>>> around all those nicknames. However, when I built a boolean query that
>>>>>> had, as a clause, a DisjunctionMaxQuery in the place of a pile of
>>>>>> these individual Term queries, the score and the explanation did not
>>>>>> change at all -- in particular, the coord term shows the same number
>>>>>> of total terms. So it looks as if the children of the disjunction
>>>>>> still count.
>>>>>>
>>>>>> Is there a way to control that term? Or a better way to express this?
>>>>>> Thinking SQL for a moment, what I'm trying to express is
>>>>>>
>>>>>> name IN (richard, dick, dickie, rich)
>>>>>>
>>>>>
>>>>> I think you just want to disable coord() here? You can do this for
>>>>> that particular boolean query by passing true to the ctor:
>>>>>
>>>>> public BooleanQuery(boolean disableCoord)
>>>>
>>>> Rob,
>>>>
>>>> How do nested queries work with respect to this? If I build a boolean
>>>> query one of whose clauses is a BooleanQuery with coord turned off,
>>>> does just the nested query insides get left out of 'coord'?
>>>>
>>>> If so, then your answer certainly seems to be what the doctor ordered.
>>>>
>>>
>>> it applies only to that query itself. So if this BQ is a clause to
>>> another BQ that has coord enabled, that would not change the
>>> top-level BQ's coord.
>>>
>>> Note: if you don't want coord at all, then you can also plug in a
>>> Similarity that returns 1, or pick another Similarity like BM25: in
>>> trunk only the vector space impl even does anything for coord()....
>>
>> Robert, I'm sorry that my density is approaching lead. My problem is
>> that I want coord, but I want to control which terms are counted and
>> which are not. I suppose I can accomplish this with my own scorer. My
>> hope was that there was a way to express "This group of terms counts
>> as one for coord".
>
> So just structure your boolean query appropriately?
>
> BQ1(coord=true)
>   BQ2(coord=false): 25 terms
>   BQ3(coord=false): 87 terms
>
> BQ1's coord is based on how many subscorers match (out of 2, BQ2 and
> BQ3). If both match its 2/2 otherwise 1/2.
>
> But in this example BQ2 and BQ3 disable coord themselves, hiding the
> fact they accept 25 and 87 terms respectively and appearing as a
> single sub for coord().
>
> Does this make sense? you can extend this idea to control this however
> you want by structuring the BQ appropriately so your BQ's with
> "synonyms" have coord=0

Robert,

This makes perfect sense, it is what I thought you meant to begin
with. I tried it and thought that it did not work. Or, perhaps, I am
misreading the 'explain' output. Or, more likely, I goofed altogether.
I'll go back and recheck my results and post some explain output if I
can't find my mistake.

--benson

>
> --
> lucidimagination.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
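
For anyone rechecking 'explain' output against the structure discussed in this thread, the coord arithmetic can be worked through with a small toy model. This is only a sketch and not Lucene code: the class CoordSketch and its methods are hypothetical names made up for illustration, and it assumes coord is simply overlap / maxOverlap, with a nested coord-disabled query counting as a single clause of its parent, as described above. The example values (three field clauses that all match, four nicknames of which one matches) are also made up.

```java
/** Toy model of the coord arithmetic discussed in the thread; not Lucene code. */
public class CoordSketch {

    /** Assumed coord formula: fraction of clauses that matched. */
    public static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    /** Number of clauses that matched. */
    public static int countMatches(boolean[] clauses) {
        int n = 0;
        for (boolean matched : clauses) {
            if (matched) {
                n++;
            }
        }
        return n;
    }

    /** Flat query: every nickname is its own top-level clause. */
    public static float flatCoord(boolean[] fields, boolean[] names) {
        int overlap = countMatches(fields) + countMatches(names);
        int maxOverlap = fields.length + names.length;
        return coord(overlap, maxOverlap);
    }

    /**
     * Grouped query: the nicknames sit in one nested query with coord
     * disabled, so the parent sees the whole group as a single clause.
     */
    public static float groupedCoord(boolean[] fields, boolean[] names) {
        int groupMatched = countMatches(names) > 0 ? 1 : 0;
        int overlap = countMatches(fields) + groupMatched;
        int maxOverlap = fields.length + 1;
        return coord(overlap, maxOverlap);
    }

    public static void main(String[] args) {
        boolean[] fields = {true, true, true};          // a:b, c:d, e:f all match
        boolean[] names = {true, false, false, false};  // only name:richard matches

        // Flat: 4 matching clauses out of 7.
        System.out.println("flat coord    = " + flatCoord(fields, names));
        // Grouped: 4 matching clauses out of 4.
        System.out.println("grouped coord = " + groupedCoord(fields, names));
    }
}
```

If the denominator of the coord term in the top-level explanation still equals the total number of nicknames plus the other clauses, the nicknames are probably still being counted as separate top-level clauses rather than as one nested group.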