[ https://issues.apache.org/jira/browse/SOLR-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888367#comment-17888367 ]
Guillaume Jactat edited comment on SOLR-17487 at 10/10/24 5:26 PM:
-------------------------------------------------------------------

I've done a few more tests. I get the same error when I post my documents in XML format, so the problem doesn't seem to be related to the JSON serialization of the input vectors.

The XML document (vector size 384) is also attached to this issue.

was (Author: gjactat):
I've done a few more tests. I get the same error when I post my documents in XML format. The problem doesn't seem to be related to the input vectors JSON serialization though....

> Dimensions disappear from dense vectors when POSTing Solr updates
> -----------------------------------------------------------------
>
>                 Key: SOLR-17487
>                 URL: https://issues.apache.org/jira/browse/SOLR-17487
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: UpdateRequestProcessors
>    Affects Versions: 9.7, 9.6.1
>            Reporter: Guillaume Jactat
>            Priority: Major
>         Attachments: image-2024-10-10-18-05-01-195.png,
> image-2024-10-10-18-07-14-904.png, image-2024-10-10-18-07-19-370.png,
> vector-384.json, vector-768.json
>
>
> Hello,
>
> I'm using Solr 9.7 as a vector database. I've come across something I can't
> explain: I POST my documents as JSON, and they contain a vector field of
> dimension {*}768{*}.
>
> The JSON document I POST has a vector field, which is an array of length 768.
> Each value is a float.
>
> Solr complains that my array is only *767* long...
> I've compared the JSON I POST with the array parsed by Solr and written to the
> logs, and indeed, one of the 768 values has simply disappeared in the
> process.
>
> The problem can easily be reproduced. All you have to do is:
> * In your "schema.xml", declare the following dense vector field type:
> {code:java}
> <fieldType name="knn_vector_768" class="solr.DenseVectorField"
> vectorDimension="768" similarityFunction="cosine"/>{code}
> * In your "schema.xml", declare the following dense vector dynamic field:
> {code:java}
> <dynamicField name="*_vector_768" type="knn_vector_768" indexed="true"
> stored="true"/>{code}
> * Use the Solr Admin UI to post the *attached document* to your Solr core.
> * You should get the following error: {*}"incorrect vector dimension. The
> vector value has size 767 while it is expected a vector with size 768"{*}
>
> * Furthermore, while the POSTed vector has size 768, the vector written to
> the logs has only 767 values: one value is missing. You can easily spot the
> missing value with a simple diff.
>
> Maybe someone will find the reason why this specific vector leads to this
> issue. Of course, I have plenty of other documents that get indexed without
> any issue.
>
> In case it helps, the value that disappears from the 768 vector is
> "0.0335415453". It's the 384th dimension (starting from 1).
>
> !image-2024-10-10-18-07-19-370.png!
>
> Thanks for reading
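
For anyone reproducing this, the "simple diff" between the POSTed vector and the one written to the logs can be sketched in Python roughly as follows. This is only an illustration: the helper name `first_dropped_value` is hypothetical, the demo data is synthetic (it is not the attached vector), and loading the two vectors from vector-768.json and from the Solr log is left out because neither layout is shown in this issue.

```python
from typing import List, Optional, Tuple


def first_dropped_value(
    posted: List[float], logged: List[float]
) -> Optional[Tuple[int, float]]:
    """Return (1-based position, value) of the first element of `posted`
    that does not appear at the same position in `logged`.

    Returns None when `logged` matches `posted` element for element.
    Caveat: if the dropped value is immediately followed by an identical
    value, the reported position is the end of that run of duplicates.
    """
    for index, value in enumerate(posted):
        if index >= len(logged) or logged[index] != value:
            return index + 1, value
    return None


if __name__ == "__main__":
    # Synthetic stand-in for the 768-dimension vector (NOT the attached data).
    posted = [i / 1000 for i in range(768)]
    # Simulate the reported symptom: the 384th value (1-based) goes missing.
    logged = posted[:383] + posted[384:]
    print(len(posted), len(logged))
    print(first_dropped_value(posted, logged))
```

With real data, `posted` would be the array from the JSON document and `logged` the array copied from the Solr log; the function then reports which dimension was lost and its value.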