[ 
https://issues.apache.org/jira/browse/SOLR-17974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18031903#comment-18031903
 ] 

Chris M. Hostetter edited comment on SOLR-17974 at 10/21/25 9:32 PM:
---------------------------------------------------------------------

A few things to note...

1) Lucene's existing HNSW graph-based field only supports "single valued vector 
fields" -- so when Solr's {{DenseVectorField}} type was added, it modeled itself 
as a "multi-valued numeric" field (either "float" or "byte" based), which is why 
this hasn't been a problem ... yet. (There is work in progress considering what 
multi-valued HNSW fields might look like at the Lucene level.)

2) Lucene 10.3 added a 
[LateInteractionField|https://lucene.apache.org/core/10_3_1/core/org/apache/lucene/document/LateInteractionField.html]
 which *does* support "multi-valued vectors" -- so the quirks of our 
SolrDocument/SolrInputDocument API are not a "future" problem -- they are a 
"currently preventing us from easily adding support for this cool feature" 
problem.

3) I'm attaching a small test-only patch that just tries to demonstrate (some 
of) the existing quirks for folks who may not be familiar with what I'm 
describing -- which means it currently passes, but that doesn't mean the 
behavior is useful.


> Tech-Debt repayment: SolrDocument/SolrInputDocument will merge multiple 
> "list" values
> -------------------------------------------------------------------------------------
>
>                 Key: SOLR-17974
>                 URL: https://issues.apache.org/jira/browse/SOLR-17974
>             Project: Solr
>          Issue Type: Task
>            Reporter: Chris M. Hostetter
>            Priority: Major
>         Attachments: SOLR-17974.tests.patch
>
>
> A long-standing bit of "convenience" logic in SolrDocument (that was later 
> copied/inherited in SolrInputDocument) is that if you "add" a 
> {{java.util.Collection}} of "values" for a field name it will either use that 
> {{java.util.Collection}} as is; or -- if the document already has some values 
> in it for that field name -- it unwraps the (new) {{java.util.Collection}} 
> and adds each of the items in it to whatever existing 
> {{java.util.Collection}} of values it already has for that field name.
> Once upon a time this kind of made life easier for folks - you could call one 
> method on either a single value, or a list of values and Solr would "do what 
> you mean".
> But as we get into a world where "multi-valued vector fields" start being a 
> thing we have to consider, we need to rethink our APIs to ensure that (at a 
> conceptual level, if not in terms of specific {{java.util.Collection}} class 
> names) it's possible to have a "list of floats" as a single "field value" in 
> a "multi-valued" field -- without a user being confused about why adding 
> additional "field values" breaks their existing data.
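For readers unfamiliar with the quirk, here is a minimal sketch of the merge behavior the description talks about. The class {{MergeQuirkSketch}} and its methods are hypothetical stand-ins written only for illustration; this is not the actual SolrDocument/SolrInputDocument code, and it simplifies by always copying the first Collection into a fresh list rather than keeping it as is.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for the merge logic described above.
// This is NOT the real SolrDocument/SolrInputDocument implementation.
class MergeQuirkSketch {
  private final Map<String, Collection<Object>> fields = new LinkedHashMap<>();

  // Adds a value to a field. An incoming Collection is unwrapped and its
  // items are appended individually, so the "list" boundary is lost.
  // (Simplification: the first Collection is copied, not stored as-is.)
  void addField(String name, Object value) {
    Collection<Object> values = fields.computeIfAbsent(name, k -> new ArrayList<>());
    if (value instanceof Collection) {
      values.addAll((Collection<?>) value);
    } else {
      values.add(value);
    }
  }

  Collection<Object> getFieldValues(String name) {
    return fields.get(name);
  }

  public static void main(String[] args) {
    MergeQuirkSketch doc = new MergeQuirkSketch();
    // Intent: two separate 2-dimensional vectors in one multi-valued field.
    doc.addField("vectors", List.of(1.0f, 2.0f));
    doc.addField("vectors", List.of(3.0f, 4.0f));
    // Result: one flat list of four floats, not two vectors.
    System.out.println(doc.getFieldValues("vectors")); // prints [1.0, 2.0, 3.0, 4.0]
  }
}
```

Under these assumptions, the second "add" silently turns what was meant as two vectors into one four-element list, which is the kind of confusion a "list of floats as a single field value" would need to avoid.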


