The main function for converting search result fields into arrays is “col”: https://solr.apache.org/guide/8_4/vectorization.html#creating-a-vector-with-the-col-function
You may also need “let” to use variables, etc. The rest is just applying the available math functions. But they don’t play well with multivalued fields, which are hard to work with. Multivalued fields look like arrays but are not exactly arrays; they are just a bunch of values stuck together. For example, as far as I know there’s no way to refer to the 1st or 2nd element of a multivalued field. When you enable docValues and use the export handler, those values are returned in ascending order, losing position information.

If, for example, the ratings came from different movie raters, such as imdb, rottentomatoes etc., and every rating were in a different field, it would be much easier to work with, as Solr expects to build arrays and matrices from documents formatted that way.

I’d be happy to learn if someone more knowledgeable has a better answer.

From: Eric Pugh
Sent: Saturday, October 14, 2023 8:05 PM
To: users@solr.apache.org
Subject: Re: Vector math with Streaming Expressions?

By average them, I mean the first version. So at the end, I get a set of numbers that represents the average vector.

Here is an example of the vector:
https://github.com/apache/solr/blob/main/solr/example/films/films.json#L8365

In the existing docs on searching vectors, we make a statement that we have the average vector of three movies:
https://github.com/apache/solr/blob/main/solr/example/films/README.md?plain=1#L154

I’d actually like to figure out how to calculate that vector from data we have in Solr already.

> On Oct 14, 2023, at 12:50 PM, ufuk yılmaz <uyil...@vivaldi.net.INVALID> wrote:
>
> By “average them” do you mean to calculate the simple arithmetic average,
> element by element, of all the returned film ratings? E.g. sum the first
> element of all arrays and divide by the number of arrays, do it again for
> the second element, etc.
>
> Or find the average of the array for each movie, producing a single number
> for each movie?
>
> ~ufuk
>
>> On 14 Oct 2023, at 19:19, Eric Pugh <ep...@opensourceconnections.com> wrote:
>>
>> I’m trying to average three arrays of floats and not quite making the
>> conceptual jump from “I defined an array of numbers” in the way that the
>> https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/vector-math.adoc#element-by-element-vector-math
>> example expects to “I made a query and got back an array of numbers”.
>>
>> I’m using the films example, so: bin/solr start -c -e films
>>
>> Then, I want to get the vectors for three films and average them.
>>
>> The streaming expression grabs the three vectors, but I can’t figure out how
>> to wrap it in something to average them.
>>
>> select(
>>   search(films,
>>     qt="/select",
>>     q="name:\"Finding Nemo\" OR name:\"Bee Movie\" OR name:\"Harry Potter and the Chamber of Secrets\"",
>>     fl="id,name,film_vector"),
>>   film_vector
>> )
>>
>> produces:
>>
>> {
>>   "result-set": {
>>     "docs": [
>>       {
>>         "film_vector": [
>>           "-0.2758314",
>>           "-0.14416906",
>>           "-0.11316811",
>>           "0.2745105",
>>           "0.040616427",
>>           "-4.2628963E-4",
>>           "-0.120363355",
>>           "0.07888852",
>>           "0.036417373",
>>           "-0.29541242"
>>         ]
>>       },
>>       {
>>         "film_vector": [
>>           "-0.11665395",
>>           "0.04247921",
>>           "-0.13233364",
>>           "0.52578413",
>>           "-0.1739291",
>>           "-0.01880563",
>>           "-0.06670809",
>>           "-0.11242808",
>>           "0.09724514",
>>           "-0.11909142"
>>         ]
>>       },
>>       {
>>         "film_vector": [
>>           "-0.14272659",
>>           "0.13051921",
>>           "-0.19087574",
>>           "0.44983688",
>>           "-0.21098459",
>>           "0.0033124345",
>>           "-0.008155139",
>>           "-0.09109363",
>>           "0.12401622",
>>           "-0.12211737"
>>         ]
>>       },
>>       {
>>         "EOF": true,
>>         "RESPONSE_TIME": 24
>>       }
>>     ]
>>   }
>> }
>>
>> Great, now how do I average across them and get the final vector that I
>> expect, which should be similar to:
>>
>> [-0.1784, 0.0096, -0.1455, 0.4167, -0.1148, -0.0053, -0.0651, -0.0415,
>> 0.0859, -0.1789]
>>
>> Thanks!
>>
>> Eric
>>
>> _______________________
>> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 |
>> http://www.opensourceconnections.com | My Free/Busy <http://tinyurl.com/eric-cal>
>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>>
>> This e-mail and all contents, including attachments, is considered to be
>> Company Confidential unless explicitly stated otherwise, regardless of
>> whether attachments are marked as such.
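
For reference, the element-by-element average asked about in the original message can be checked outside Solr. The following is only a minimal client-side sketch in Python, not a streaming-expression answer: it assumes the three "film_vector" arrays have already been fetched and are still strings, as in the response shown in the thread. The names `docs` and `average_vectors` are made up for illustration and are not part of Solr.

```python
# Client-side sketch: element-by-element mean of the three film vectors.
# Assumption: `docs` mirrors the "docs" entries of the search() response
# quoted in the thread (values arrive as strings, EOF marker dropped).
docs = [
    {"film_vector": ["-0.2758314", "-0.14416906", "-0.11316811", "0.2745105",
                     "0.040616427", "-4.2628963E-4", "-0.120363355",
                     "0.07888852", "0.036417373", "-0.29541242"]},
    {"film_vector": ["-0.11665395", "0.04247921", "-0.13233364", "0.52578413",
                     "-0.1739291", "-0.01880563", "-0.06670809",
                     "-0.11242808", "0.09724514", "-0.11909142"]},
    {"film_vector": ["-0.14272659", "0.13051921", "-0.19087574", "0.44983688",
                     "-0.21098459", "0.0033124345", "-0.008155139",
                     "-0.09109363", "0.12401622", "-0.12211737"]},
]


def average_vectors(vectors):
    """Return the element-by-element arithmetic mean of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    length = len(vectors[0])
    if any(len(v) != length for v in vectors):
        raise ValueError("all vectors must have the same length")
    # zip(*vectors) groups the i-th element of every vector together,
    # so position information is preserved.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Parse the string values to floats, keeping their original order.
vectors = [[float(x) for x in doc["film_vector"]] for doc in docs]

average = average_vectors(vectors)
rounded = [round(x, 4) for x in average]
print(rounded)
```

Rounded to four decimals, the result agrees with the expected vector quoted in the thread ([-0.1784, 0.0096, -0.1455, 0.4167, -0.1148, -0.0053, -0.0651, -0.0415, 0.0859, -0.1789]), which suggests the README value is a plain element-by-element mean of the three vectors.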