[ https://issues.apache.org/jira/browse/SOLR-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Guillaume Jactat updated SOLR-17487:
------------------------------------
    Description:
Hello,

I'm using Solr 9.7 as a vector database. I've come across something I can't explain: I POST my documents as JSON and I've got a vector field of dimension {*}768{*}.

The JSON document I POST has a vector field, which is an array of length 768. Each value is a float.

Solr complains that my array is only *767* long...
I've compared the JSON I POST with the array parsed by Solr and written to the logs. And indeed, one of the 768 values has simply disappeared in the process.

I'm pretty sure it must be some kind of JSON array parsing issue but I don't know how to fix this :/

The problem can easily be reproduced. All you have to do is:
 * In your "schema.xml", declare the following dense vector field type:
{code:xml}
<fieldType name="knn_vector_768" class="solr.DenseVectorField" vectorDimension="768" similarityFunction="cosine"/>{code}

 * In your "schema.xml", declare the following dense vector dynamic field:
{code:xml}
<dynamicField name="*_vector_768" type="knn_vector_768" indexed="true" stored="true"/>{code}

 * Use the Solr Admin UI to post the attached document to your Solr core.
 * You should get the following error: {*}"incorrect vector dimension. The vector value has size 767 while it is expected a vector with size 768"{*}
 * Furthermore, while the POSTed vector has size 768, the vector written to the logs has only 767 values. One value is missing. You can easily spot the missing value with a simple diff.

Thanks for reading

  was:
Hello,

I'm using Solr 9.7 as a vector database. I've come across something I can't explain: I POST my documents as JSON and I've got a vector field of dimension {*}768{*}.

The JSON document I POST has a vector field, which is an array of length 768. Each value is a float.

Solr complains that my array is only *767* long...
I've compared the JSON I POST with the array parsed by Solr and written to the logs. And indeed, one of the 768 values has simply disappeared in the process.

I'm pretty sure it is related to some JSON array parsing issue on the Solr side but I don't know how to fix this :/

The problem can easily be reproduced. All you have to do is:
 * In your "schema.xml", declare the following dense vector field type:
{code:xml}
<fieldType name="knn_vector_768" class="solr.DenseVectorField" vectorDimension="768" similarityFunction="cosine"/>{code}

 * In your "schema.xml", declare the following dense vector dynamic field:
{code:xml}
<dynamicField name="*_vector_768" type="knn_vector_768" indexed="true" stored="true"/>{code}

 * Use the Solr Admin UI to post the attached document to your Solr core.
 * You should get the following error: {*}"incorrect vector dimension. The vector value has size 767 while it is expected a vector with size 768"{*}
 * Furthermore, while the POSTed vector has size 768, the vector written to the logs has only 767 values. One value is missing. You can easily spot the missing value with a simple diff.

Thanks for reading


> Dimensions disappear from dense vectors when POSTing Solr updates
> -----------------------------------------------------------------
>
>                 Key: SOLR-17487
>                 URL: https://issues.apache.org/jira/browse/SOLR-17487
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: UpdateRequestProcessors
>    Affects Versions: 9.7, 9.6.1
>            Reporter: Guillaume Jactat
>            Priority: Major
>         Attachments: vector-768.json

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
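
The "simple diff" mentioned in the report could be done with a short script. The following is a minimal Python sketch and is not part of the original report: the helper name `find_dropped_index` and the sample vectors are hypothetical. It assumes the vector Solr logged is the POSTed vector with exactly one element removed, and it compares floats with a tolerance because logged values may be rounded.

```python
import math
from typing import List, Optional


def find_dropped_index(posted: List[float], parsed: List[float],
                       rel_tol: float = 1e-6) -> Optional[int]:
    """Return the first index in `posted` where `parsed` diverges.

    Assumes `parsed` is `posted` with exactly one element removed.
    Returns None if the lengths or the remaining values do not fit that assumption.
    """
    if len(posted) != len(parsed) + 1:
        return None

    def close(a: float, b: float) -> bool:
        return math.isclose(a, b, rel_tol=rel_tol, abs_tol=1e-9)

    for i, (a, b) in enumerate(zip(posted, parsed)):
        if not close(a, b):
            # After skipping posted[i], the two tails must line up again.
            tail_ok = all(close(x, y) for x, y in zip(posted[i + 1:], parsed[i:]))
            return i if tail_ok else None

    # No mismatch in the shared prefix: the last posted value was dropped.
    return len(posted) - 1


# Hypothetical small vectors standing in for the 768-value array.
posted = [0.1, 0.2, 0.3, 0.4]
parsed = [0.1, 0.2, 0.4]
print(find_dropped_index(posted, parsed))  # 2
```

In practice `posted` would be loaded from the attached JSON document and `parsed` copied from the Solr log. When two neighbouring values are equal, the reported index is the first point of divergence, which may be one position after the value that was actually dropped.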