Thanks for discovering this! I saw your message yesterday and went down a rabbit hole for a bit to try to help but got pulled away. If I can help in any way, let me know.
From: Guillaume <gjac...@gmail.com>
Date: Thursday, October 10, 2024 at 4:09 PM
To: users@solr.apache.org <users@solr.apache.org>
Subject: [EXTERNAL] Re: Json dense vector parsing issue

I have opened an issue on JIRA <https://issues.apache.org/jira/browse/SOLR-17487> for this problem.

After a bit of digging, I've found that the root cause wasn't JSON... In fact, Solr kind of "deduplicates" the vector dimensions. So a vector of dimension 384 that contains the very same value twice ends up as a vector of dimension 383. The second occurrence of the value is simply dropped.

On Thu, Oct 10, 2024 at 16:31, Guillaume <gjac...@gmail.com> wrote:

> Hello,
>
> I'm using Solr 9.7 as a vector database. I've come across something I
> can't explain: I POST my documents as JSON and I've got a vector field of
> dimension 768.
>
> The JSON document I POST has a vector field, which is an array of length
> 768. Each value is a float.
>
> Solr complains that my array is only 767 long...
> I've compared the JSON I POST and the array parsed by Solr and written in
> the logs... And indeed, one of the 768 values has simply disappeared in
> the process.
>
> I'm pretty sure it is related to some JSON array parsing issue on the Solr
> side, but I don't know how to fix this :/
>
> Has anyone come across something similar?
>
> Thanks for reading!
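For anyone who wants to check whether their own vectors could hit the behaviour described above, here is a minimal Python sketch. It is not Solr code and does not reproduce Solr's internals. The helper names `repeated_values` and `simulate_dedup` are hypothetical, and `simulate_dedup` only mimics the reported symptom (the second occurrence of a value going missing) so that the resulting length mismatch is easy to see.

```python
from collections import Counter


def repeated_values(vector):
    """Return each value that occurs more than once, mapped to its count."""
    counts = Counter(vector)
    return {value: n for value, n in counts.items() if n > 1}


def simulate_dedup(vector):
    """Mimic the reported symptom: keep only the first occurrence of each
    value, preserving order. Illustration only, not Solr's actual logic."""
    return list(dict.fromkeys(vector))


# A toy 4-dimensional vector in which 0.12 appears twice.
vector = [0.12, -0.5, 0.33, 0.12]

print(repeated_values(vector))                    # {0.12: 2}
print(len(vector), len(simulate_dedup(vector)))   # 4 3
```

Running `repeated_values` on a vector before POSTing it shows whether it contains a repeated float, which is the condition under which the thread reports a dimension going missing.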