Hi Michael,

I'm looking into implementing a solution.

On Fri, 2013-01-25 at 16:23 -0500, Michael McCandless wrote:
> On Fri, Jan 25, 2013 at 3:48 PM, Nicola Buso <nb...@ebi.ac.uk> wrote:
> 
> > if you have experiences in this use case can you share solutions? What
> > is reusable from Lucene 4.x implementation?
> 
> Sorry, no experience doing drill sideways w/ Lucene facets ... just in
> a prior life (another search engine).
> 
> One conceptual way to get the counts is to do "hold one out" query for
> each dimension you need the sideways counts on.  EG if user searched
> for "cameras", then drilled down on "Manufacturer = Sony" and drilled
> down again on "FormFactor = SLR", you could do 3 queries:
> 
>   * cameras AND Manufacturer=Sony --> sideways counts for "FormFactor"
> 
>   * cameras AND FormFactor=SLR -> sideways counts for "Manufacturer"
> 
>   * cameras AND FormFactor=SLR AND Manufacturer=Sony -> query results
> 
> I believe that will work?  But it's sort of costly ... you could "save
> the facet counts from the last query" to save on one of these query
> executions.
I agree this will retrieve the facet counts, but it is too costly: it
needs Fn+1 queries, where Fn is the number of facets. With 4-6 facets to
show to the user, that is really too much.
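
To make the cost concrete, here is a minimal sketch of the "hold one out"
idea as I understand it. It does not use Lucene at all: a document is just
an in-memory map from dimension to value, and the class and method names
(HoldOneOut, matches, sidewaysCounts) are hypothetical, invented for this
illustration. The point is only that each drilled-down dimension needs its
own pass over the base hits.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model, not Lucene: a document maps dimension -> value.
final class HoldOneOut {

    // True if doc satisfies every drill-down except the one on `skip`
    // (pass null to apply all drill-downs).
    static boolean matches(Map<String, String> doc,
                           Map<String, String> drillDowns,
                           String skip) {
        for (Map.Entry<String, String> e : drillDowns.entrySet()) {
            if (e.getKey().equals(skip)) {
                continue; // this is the dimension being "held out"
            }
            if (!e.getValue().equals(doc.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }

    // One full pass over the base hits per drilled dimension:
    // this loop is the "one query per dimension" cost.
    static Map<String, Map<String, Integer>> sidewaysCounts(
            List<Map<String, String>> baseHits,
            Map<String, String> drillDowns) {
        Map<String, Map<String, Integer>> result = new HashMap<>();
        for (String dim : drillDowns.keySet()) {
            Map<String, Integer> counts = new HashMap<>();
            for (Map<String, String> doc : baseHits) {
                String value = doc.get(dim);
                if (value != null && matches(doc, drillDowns, dim)) {
                    counts.merge(value, 1, Integer::sum);
                }
            }
            result.put(dim, counts);
        }
        return result;
    }
}
```

Here baseHits stands for the documents matching "cameras", and drillDowns
for {Manufacturer=Sony, FormFactor=SLR}. The final drill-down query would be
one more pass, using matches(doc, drillDowns, null).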

> 
> Conceptually, the query divides all documents into 3 sets: document
> matches (add to drill-down counts), document is a near miss (would
> match except for exactly one of drill-down constraints), document
> doesn't match.  The sideways counts amounts to tallying up the near
> miss hits against the dimension that was the near miss.
> 
> Like if you could run a query for cameras AND (Manufacturer=Sony OR
> FormFactor=SLR minShouldMatch=N-1 (1 in this case)), which would match
> the "matches" and the near misses, and then during collection
> determine whether it was a hit or a near miss (hmm this isn't so hard:
> use Scorer.getChildren()), and collect and/or tally up the appropriate
> counts (drill down or sideways), then you'd get the right counts I
> think?
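
To check my understanding of the near-miss idea, here is a rough sketch of
the single-pass counting, again over in-memory documents rather than Lucene
scorers. Everything in it (the NearMissCounter class, its collect and tally
methods, the map-based document model) is hypothetical; in Lucene the
"which clause failed" test would presumably come from the sub-scorers, which
I have not tried. Accepting documents with at most one failed drill-down
plays the role of minShouldMatch=N-1. Drill-down counts for dimensions that
are not drilled on are left out to keep it short.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-pass counter, not Lucene: a document maps dimension -> value.
final class NearMissCounter {
    // dimension -> (value -> sideways count)
    final Map<String, Map<String, Integer>> sideways = new HashMap<>();
    // number of documents matching every drill-down
    int hits = 0;

    void collect(List<Map<String, String>> baseHits,
                 Map<String, String> drillDowns) {
        for (Map<String, String> doc : baseHits) {
            int failures = 0;
            String failedDim = null;
            for (Map.Entry<String, String> e : drillDowns.entrySet()) {
                if (!e.getValue().equals(doc.get(e.getKey()))) {
                    failures++;
                    failedDim = e.getKey();
                }
            }
            if (failures == 0) {
                // Full hit: it would also match every hold-one-out query,
                // so it counts toward each drilled dimension.
                hits++;
                for (String dim : drillDowns.keySet()) {
                    tally(dim, doc.get(dim));
                }
            } else if (failures == 1) {
                // Near miss: tally only against the dimension that failed.
                String value = doc.get(failedDim);
                if (value != null) {
                    tally(failedDim, value);
                }
            }
            // Two or more failures: the document does not match at all.
        }
    }

    private void tally(String dim, String value) {
        sideways.computeIfAbsent(dim, k -> new HashMap<>())
                .merge(value, 1, Integer::sum);
    }
}
```

If this is right, one pass over the "cameras" hits gives the same sideways
counts as running one hold-one-out query per drilled dimension.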
Sorry, I'm quite new to Lucene, so I have some questions...

Are you suggesting I can use a single query to obtain all the
information? Am I dreaming? :-)

Does BooleanQuery.setMinimumNumberShouldMatch(int min) let you skip a
number of clauses? If yes, how do I ensure it skips a particular facet
clause?

Scorer.getChildren() returns the sub-scorers of a scorer; how are the
scorers organized hierarchically in this class?

Do you think that by writing a Collector that distinguishes between
Scorer.score() and Scorer.getChildren()...score(), I can collect the
different sets of results needed for facet counting?

Sorry for all these questions; they are just to better understand
Lucene.


Nicola.

> 
> I think this could be a reasonable way to do drill sideways!
> 
> > Reading some books I just noticed that the expected behaviour for an
> > user filtering by facets is:
> > - facet values in the same facet group (category in lucene) are added in
> > OR
> > - facet values from different facet groups are added in AND
> 
> Right.
> 
> > - another interesting aspect is how the selection affect the counting in
> > other facets
> 
> This is why you have to do the N-1 queries I think.
> 
> > Should be interesting to evaluate if some of these facet navigation
> > patterns can be implemented or supported with some utilities in Lucene
> > 4.0
> 
> I think drill sideways is possible!  Just not implemented yet ...
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com


