Hi Michael, I'm looking into implementing a solution.
On Fri, 2013-01-25 at 16:23 -0500, Michael McCandless wrote:
> On Fri, Jan 25, 2013 at 3:48 PM, Nicola Buso <nb...@ebi.ac.uk> wrote:
>
> > if you have experiences in this use case can you share solutions? What
> > is reusable from Lucene 4.x implementation?
>
> Sorry, no experience doing drill sideways w/ Lucene facets ... just in
> a prior life (another search engine).
>
> One conceptual way to get the counts is to do "hold one out" query for
> each dimension you need the sideways counts on. EG if user searched
> for "cameras", then drilled down on "Manufacturer = Sony" and drilled
> down again on "FormFactor = SLR", you could do 3 queries:
>
> * cameras AND Manufacturer=Sony --> sideways counts for "FormFactor"
>
> * cameras AND FormFactor=SLR -> sideways counts for "Manufacturer"
>
> * cameras AND FormFactor=SLR AND Manufacturer=Sony -> query results
>
> I believe that will work? But it's sort of costly ... you could "save
> the facet counts from the last query" to save on one of these query
> executions.

I agree that this retrieves the facet counts, but it is too costly: it
takes Fn+1 queries, where Fn is the number of facets. Assuming 4-6 facets
are shown to the user, that is really too many.

> Conceptually, the query divides all documents into 3 sets: document
> matches (add to drill-down counts), document is a near miss (would
> match except for exactly one of drill-down constraints), document
> doesn't match. The sideways counts amounts to tallying up the near
> miss hits against the dimension that was the near miss.
>
> Like if you could run a query for cameras AND (Manufacturer=Sony OR
> FormFactor=SLR minShouldMatch=N-1 (1 in this case)), which would match
> the "matches" and the near misses, and then during collection
> determine whether it was a hit or a near miss (hmm this isn't so hard:
> use Scorer.getChildren()), and collect and/or tally up the appropriate
> counts (drill down or sideways), then you'd get the right counts I
> think?
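As a rough illustration of the hit / near-miss classification quoted above, here is a minimal sketch in plain Java. It does not use Lucene at all: documents are toy maps, and the class name, the "category" field and the sample data are all assumptions made up for this example. One point it makes explicit is that full hits also have to be tallied into each drilled-down dimension, otherwise the currently selected value (Sony, SLR) would show no count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: plain collections stand in for the index.
class DrillSidewaysSketch {

    // Builds a toy document; the field names are assumptions for this sketch.
    static Map<String, String> doc(String category, String manufacturer, String formFactor) {
        return Map.of("category", category, "Manufacturer", manufacturer, "FormFactor", formFactor);
    }

    static List<Map<String, String>> sampleDocs() {
        return List.of(
            doc("cameras", "Sony", "SLR"),      // hit
            doc("cameras", "Sony", "Compact"),  // near miss on FormFactor
            doc("cameras", "Canon", "SLR"),     // near miss on Manufacturer
            doc("cameras", "Nikon", "SLR"),     // near miss on Manufacturer
            doc("cameras", "Canon", "Compact"), // fails two constraints: ignored
            doc("phones", "Sony", "SLR"));      // fails the base query: ignored
    }

    // One pass over the documents: full hits go to hitsOut, and the returned
    // map holds dimension -> value -> sideways count.
    static Map<String, Map<String, Integer>> sidewaysCounts(
            List<Map<String, String>> docs, String category,
            Map<String, String> drillDowns, List<Map<String, String>> hitsOut) {
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (String dim : drillDowns.keySet()) {
            counts.put(dim, new TreeMap<>());
        }
        for (Map<String, String> d : docs) {
            if (!category.equals(d.get("category"))) {
                continue; // base query does not match
            }
            int failures = 0;
            String failedDim = null;
            for (Map.Entry<String, String> c : drillDowns.entrySet()) {
                if (!c.getValue().equals(d.get(c.getKey()))) {
                    failures++;
                    failedDim = c.getKey();
                }
            }
            if (failures == 0) {
                // Full hit: part of the results and of every dimension's counts.
                hitsOut.add(d);
                for (String dim : drillDowns.keySet()) {
                    tally(counts, dim, d.get(dim));
                }
            } else if (failures == 1) {
                // Near miss: counts only toward the dimension it failed on.
                tally(counts, failedDim, d.get(failedDim));
            }
        }
        return counts;
    }

    static void tally(Map<String, Map<String, Integer>> counts, String dim, String value) {
        if (value != null) {
            counts.get(dim).merge(value, 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        List<Map<String, String>> hits = new ArrayList<>();
        Map<String, String> drillDowns = Map.of("Manufacturer", "Sony", "FormFactor", "SLR");
        System.out.println(sidewaysCounts(sampleDocs(), "cameras", drillDowns, hits));
        System.out.println(hits.size() + " hit(s)");
    }
}
```

With the sample data, the Manufacturer counts come out the same as a separate "cameras AND FormFactor=SLR" query would give, and likewise for FormFactor, but in a single pass. Whether a real Collector can get the same per-clause information cheaply from the scorer tree is exactly the open question in this thread.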
Sorry, I'm quite new to Lucene, so I have some questions.

Are you suggesting that I can use a single query to obtain all the
information? Am I dreaming? :-)

Does BooleanQuery.setMinimumNumberShouldMatch(int min) allow a document to
match while skipping some of the clauses? If so, how can I ensure that it
skips a particular facet clause?

Scorer.getChildren() returns the sub-scorers of a scorer; how are the
scorers organized hierarchically in this class?

Do you think that by writing a Collector that distinguishes between
Scorer.score() and Scorer.getChildren()...score() I could collect the
different result sets needed for facet counting?

Sorry for all these questions; they are just to help me understand Lucene
better.

Nicola.

> I think this could be a reasonable way to do drill sideways!
>
> > Reading some books I just noticed that the expected behaviour for an
> > user filtering by facets is:
> > - facet values in the same facet group (category in lucene) are added in
> > OR
> > - facet values from different facet groups are added in AND
>
> Right.
>
> > - another interesting aspect is how the selection affect the counting in
> > other facets
>
> This is why you have to do the N-1 queries I think.
>
> > Should be interesting to evaluate if some of these facet navigation
> > patterns can be implemented or supported with some utilities in Lucene
> > 4.0
>
> I think drill sideways is possible! Just not implemented yet ...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
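To make the expected selection behaviour from the quoted text concrete (OR within a facet group, AND across groups) and to show how it affects the counts of the other facets, here is a small sketch in plain Java. It is not Lucene code: the class, method names and sample documents are hypothetical, and the documents are assumed to have already matched the user's text query.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only: toy documents, no Lucene involved.
class FacetSelectionSketch {

    // A document matches if, for every group with a selection, its value is
    // one of the selected values (OR within a group, AND across groups).
    // The group named by skipDim (may be null) is left out of the check.
    static boolean matches(Map<String, String> doc,
                           Map<String, Set<String>> selections, String skipDim) {
        for (Map.Entry<String, Set<String>> group : selections.entrySet()) {
            if (group.getKey().equals(skipDim) || group.getValue().isEmpty()) {
                continue;
            }
            String value = doc.get(group.getKey());
            if (value == null || !group.getValue().contains(value)) {
                return false;
            }
        }
        return true;
    }

    // "Hold one out" counts for dim: apply every selection except the one on
    // dim itself, then count the values of dim among the matching documents.
    static Map<String, Integer> holdOneOutCounts(List<Map<String, String>> docs,
                                                 Map<String, Set<String>> selections,
                                                 String dim) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map<String, String> doc : docs) {
            String value = doc.get(dim);
            if (value != null && matches(doc, selections, dim)) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    static List<Map<String, String>> sampleDocs() {
        return List.of(
            Map.of("Manufacturer", "Sony", "FormFactor", "SLR"),
            Map.of("Manufacturer", "Canon", "FormFactor", "SLR"),
            Map.of("Manufacturer", "Nikon", "FormFactor", "SLR"),
            Map.of("Manufacturer", "Sony", "FormFactor", "Compact"),
            Map.of("Manufacturer", "Canon", "FormFactor", "Compact"));
    }

    public static void main(String[] args) {
        Map<String, Set<String>> selections = Map.of(
            "Manufacturer", Set.of("Sony", "Canon"), // Sony OR Canon
            "FormFactor", Set.of("SLR"));            // AND SLR
        System.out.println(holdOneOutCounts(sampleDocs(), selections, "Manufacturer"));
        System.out.println(holdOneOutCounts(sampleDocs(), selections, "FormFactor"));
    }
}
```

Each call to holdOneOutCounts corresponds to one of the extra per-dimension queries discussed above, which is where the cost concern comes from once several facets are displayed.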