[ 
https://issues.apache.org/jira/browse/SOLR-17736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-17736:
--------------------------------------
    Attachment: SOLR-17736.patch
        Status: Open  (was: Open)

I haven't fully wrapped my head around everything in the linked GH PR#3316, but 
IIUC the "meat" of the idea is that:
 * IF:
 ** a block join {{parent}} QParser is used
 ** AND that {{parent}} QParser directly wraps a (single) {{KnnXxxVectorQuery}}
 * THEN:
 ** Extract some of the properties from the {{KnnXxxVectorQuery}} (field name, 
vector, topK, etc...)
 ** Ignore the original KNN query and replace it with a new 
{{DiversifyingChildrenXxxKnnVectorQuery}} query

(correct?)
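
A rough sketch of the rewrite as summarized above. It is an illustration only, not code from the PR: {{Query}}, {{KnnQuery}}, {{DiversifyingChildrenKnnQuery}}, {{OtherQuery}} and {{rewriteChildQuery}} are hypothetical stand-ins for the Lucene/Solr types, and the {{allParents}} string stands in for whatever parent filter the real query needs.

```java
import java.util.Arrays;

class ParentRewriteSketch {
    /** Stand-in for a generic query type (NOT Lucene's Query class). */
    interface Query {}

    /** Stand-in for KnnXxxVectorQuery: holds field name, vector, and topK. */
    static class KnnQuery implements Query {
        final String field;
        final float[] vector;
        final int topK;

        KnnQuery(String field, float[] vector, int topK) {
            this.field = field;
            this.vector = vector;
            this.topK = topK;
        }
    }

    /** Stand-in for DiversifyingChildrenXxxKnnVectorQuery. */
    static class DiversifyingChildrenKnnQuery extends KnnQuery {
        final String allParents; // hypothetical: a filter identifying parent docs

        DiversifyingChildrenKnnQuery(String field, float[] vector, int topK, String allParents) {
            super(field, vector, topK);
            this.allParents = allParents;
        }
    }

    /** Stand-in for any non-KNN child query. */
    static class OtherQuery implements Query {}

    /**
     * IF the parent parser directly wraps a KNN query, THEN copy its
     * properties into a new diversifying query and drop the original.
     * Anything else is passed through unchanged.
     */
    static Query rewriteChildQuery(Query wrapped, String allParents) {
        if (wrapped instanceof KnnQuery) {
            KnnQuery knn = (KnnQuery) wrapped;
            return new DiversifyingChildrenKnnQuery(knn.field, knn.vector, knn.topK, allParents);
        }
        return wrapped;
    }

    public static void main(String[] args) {
        Query original = new KnnQuery("vector", new float[] {1.0f, 2.0f}, 10);
        Query rewritten = rewriteChildQuery(original, "doc_type:parent");
        DiversifyingChildrenKnnQuery d = (DiversifyingChildrenKnnQuery) rewritten;
        System.out.println(d.field + " " + Arrays.toString(d.vector) + " topK=" + d.topK);
    }
}
```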

To me, this approach seems kind of "kludgy" and brittle.

In my day job, we have a small plugin that we use with Solr which creates 
instances of {{DiversifyingChildren...}} queries via a simple subclass of the 
existing {{KnnQParserPlugin}}, using a new {{childOf}} local param (modeled 
after the {{of}} param in Solr's {{child}} QParser).

This approach supports several use cases that (AFAICT) the current PR does not:
 * Create {{DiversifyingChildrenXxxKnnVectorQuery}} instances even w/o a 
{{parent}} wrapper
 ** When you want to return diverse child docs w/o joining to the parent doc
 * Wrap {{parent}} queries around a {{BooleanQuery}} containing multiple clauses 
(one or more of which might be {{DiversifyingChildrenXxxKnnVectorQuery}})
 ** When you want the top scoring parents based on child scores using multiple 
vector queries, either because of multiple input vectors, or because of 
multiple vector fields.
 ** Or when you want to return parent docs based on *either* diverse topK 
children *OR* some other non-vector child criteria
 * Wrap {{parent}} queries around plain {{KnnXxxVectorQuery}} against children, 
w/o using {{DiversifyingChildren...}}
 ** When you don't care about the overhead of diversification, perhaps because 
you know each parent has at most one child (of a particular type) with a vector

I'm attaching a patch that adapts my current custom plugin to re-implement it 
as a simple addition to the existing {{KnnQParserPlugin}}, that kicks in if 
and only if a {{childOf}} local param is specified.

To my mind this approach is a lot cleaner and more versatile than the 
"{{parent}} QParser wrapped around {{knn}} QParser should always throw away the 
original query and build its own" approach in the PR. But if folks prefer the 
approach in PR#3316, then can we please at least make it configurable? (maybe 
via a new variant of the {{parent}} QParser?)

As it stands right now, since {{DiversifyingChildrenXxxKnnVectorQuery}} 
extends {{KnnXxxVectorQuery}}, it will become impossible to support some of 
the use cases I listed above (even with a custom plugin), because the Solr 
{{parent}} QParser will start treating _any_ {{KnnXxxVectorQuery}} it wraps 
(including a {{DiversifyingChildrenXxxKnnVectorQuery}} created by a custom 
plugin) as special, throwing it away and creating its own.
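
The subclass problem described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the PR's code: {{Query}}, {{KnnQuery}}, {{DiversifyingKnnQuery}}, the {{builtBy}} tag and the {{rewriteEnabled}} flag are all hypothetical, with the flag standing in for the "make it configurable" request.

```java
class SubclassPitfallSketch {
    /** Stand-in for a generic query type (NOT Lucene's Query class). */
    interface Query {}

    /** Stand-in for KnnXxxVectorQuery. */
    static class KnnQuery implements Query {
        final String field;
        final int topK;

        KnnQuery(String field, int topK) {
            this.field = field;
            this.topK = topK;
        }
    }

    /** Stand-in for DiversifyingChildrenXxxKnnVectorQuery (a subclass). */
    static class DiversifyingKnnQuery extends KnnQuery {
        final String builtBy; // hypothetical tag recording who created the query

        DiversifyingKnnQuery(String field, int topK, String builtBy) {
            super(field, topK);
            this.builtBy = builtBy;
        }
    }

    /**
     * Mimics a parent parser that replaces any KNN query it wraps.
     * The instanceof test is also true for subclasses, so a diversifying
     * query built elsewhere is discarded too, unless rewriting is disabled.
     */
    static Query wrap(Query child, boolean rewriteEnabled) {
        if (rewriteEnabled && child instanceof KnnQuery) {
            KnnQuery knn = (KnnQuery) child;
            return new DiversifyingKnnQuery(knn.field, knn.topK, "parent-parser");
        }
        return child;
    }

    public static void main(String[] args) {
        Query custom = new DiversifyingKnnQuery("vector", 10, "custom-plugin");

        // With the rewrite always on, the custom plugin's query is thrown away.
        DiversifyingKnnQuery replaced = (DiversifyingKnnQuery) wrap(custom, true);
        System.out.println("rewrite on:  kept custom query? " + (replaced == custom));

        // With an opt-out, the custom plugin's query survives untouched.
        Query kept = wrap(custom, false);
        System.out.println("rewrite off: kept custom query? " + (kept == custom));
    }
}
```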

> Introduce support for KNN search on nested vector documents
> -----------------------------------------------------------
>
>                 Key: SOLR-17736
>                 URL: https://issues.apache.org/jira/browse/SOLR-17736
>             Project: Solr
>          Issue Type: New Feature
>          Components: query
>    Affects Versions: 9.8
>            Reporter: Alessandro Benedetti
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: SOLR-17736.patch
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This issue tracks the work of introducing support for KNN search on 
> nested vector documents, surfacing the Lucene implementation from:
> https://github.com/apache/lucene/pull/12434
> This allows both:
> - KNN retrieval of children, applying parent filters with no denormalisation 
> needed
> - KNN retrieval of parents (based on children KNN, children level prefiltering 
> and parent level prefiltering)
> It's one way of having multi-valued vectors per field, per document in Solr.
> More will come soon.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
