Hello everyone, I am currently trying to build a system that computes distances between documents based on pre-computed numerical feature vectors.
I set up Solr (cloud) 8.7 and I am using streaming expressions. My documents look like this, with the feature field being a pfloat with multiValued set to true:

    { "id":"1", "feature":[ 0.1, 0.5, 0.6, 1.7] },
    { "id":"2", "feature":[ 0.5, 0.1, 0.7, 0.9] },
    { "id":"3", "feature":[ -0.5, 0.9, 1.5, 0.2] }

I want to create a matrix so that I can then use the distance() function to compute the distances between the columns of that matrix. The documentation provides an example of what I am interested in, defining the vectors on the fly:

    let(a=array(20, 30, 40),
        b=array(21, 29, 41),
        c=array(31, 40, 50),
        d=matrix(a, b, c),
        c=distance(d))

By transposing the matrix I can easily compute the distances between the rows, so I can get what I want.

However, now I want to extract the numerical features from a feature field indexed in Solr. The documentation explains how to create a matrix from numerical values stored in some fields:

    let(a=random(collection1, q="market:A", rows="5000", fl="price_f"),
        b=random(collection1, q="market:B", rows="5000", fl="price_f"),
        c=random(collection1, q="market:C", rows="5000", fl="price_f"),
        d=random(collection1, q="market:D", rows="5000", fl="price_f"),
        e=col(a, price_f),
        f=col(b, price_f),
        g=col(c, price_f),
        h=col(d, price_f),
        i=matrix(e, f, g, h),
        j=sumRows(i))

However, in my case, I already have an array of float values for each document. So I tried it this way:

    let(s1=search(test, q="id:1", fl="feature"),
        f1=col(s1, feature),
        s2=search(test, q="id:2", fl="feature"),
        f2=col(s2, feature),
        s3=search(test, q="id:3", fl="feature"),
        f3=col(s3, feature),
        m=matrix(f1, f2, f3))

But I get this error:

    {
      "result-set": {
        "docs": [
          {
            "EXCEPTION": "Failed to evaluate expression matrix(f1,f2,f3) - Numeric value expected but found type java.util.ArrayList for value [0.1,0.5,0.6,1.7]",
            "EOF": true,
            "RESPONSE_TIME": 5
          }
        ]
      }
    }

When I inspect what I get as f3, I see that I have an array of arrays, which is why I think creating the matrix fails here.
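To make the goal concrete, here is a minimal plain-Python sketch (not Solr) of the result I am after for the three example documents. It assumes Euclidean distance, which as far as I understand is what distance() uses by default; the name pairwise_euclidean is just for illustration.

```python
import math

# The three example feature vectors from the documents above, keyed by id.
features = {
    "1": [0.1, 0.5, 0.6, 1.7],
    "2": [0.5, 0.1, 0.7, 0.9],
    "3": [-0.5, 0.9, 1.5, 0.2],
}


def pairwise_euclidean(vectors):
    """Return {(id_a, id_b): distance} for every ordered pair of ids."""
    return {
        (a, b): math.dist(va, vb)  # math.dist requires Python 3.8+
        for a, va in vectors.items()
        for b, vb in vectors.items()
    }


if __name__ == "__main__":
    # Print the full (symmetric) distance table, one pair per line.
    for (a, b), d in sorted(pairwise_euclidean(features).items()):
        print(a, b, round(d, 4))
```

This is the document-by-document distance table I would like Solr to return directly from the indexed feature field.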
I've been searching a lot for a way to create a matrix from float vectors stored in a field of my documents, and I still cannot find a solution. What I could do is extract the vectors first and then build the arrays and the matrix on the fly in a second expression, but I would like to be able to do it in one request.

Moreover, I find it really curious that I cannot directly create the matrix from the results of a normal search. For instance, I would prefer to do something like this:

    let(s=search(test, q="*", fl="feature,id"),
        m=col(s, feature))

which returns:

    {
      "result-set": {
        "docs": [
          {
            "m": [
              [ 0.1, 0.5, 0.6, 1.7 ],
              [ 0.5, 0.1, 0.7, 0.9 ],
              [ -0.5, 0.9, 1.5, 0.2 ]
            ]
          },
          { "EOF": true, "RESPONSE_TIME": 3 }
        ]
      }
    }

and then be able to use the matrix I obtain here. But again, I was not able to perform matrix operations on "m".

Does anyone know an elegant way to create a matrix from the numerical vectors stored in my feature field?

Thank you.

--
Xavier Favory
Music Technology Group
Universitat Pompeu Fabra
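P.S. For completeness, the two-request workaround I mentioned could look roughly like the Python sketch below. It only builds the expression string from vectors that were already fetched by a first search; the helper name build_distance_expr and the variable names v0, v1, ... are made up for illustration, and sending the string to Solr is left out. It relies on array(), matrix(), transpose() and distance() as I used them above.

```python
def build_distance_expr(vectors):
    """Build a let(...) expression that computes distances between rows.

    `vectors` is a list of equal-length lists of numbers, one per document.
    """
    if not vectors:
        raise ValueError("need at least one vector")
    # One variable per document vector: v0, v1, v2, ...
    names = [f"v{i}" for i in range(len(vectors))]
    arrays = [
        f"{name}=array({', '.join(repr(float(x)) for x in vec)})"
        for name, vec in zip(names, vectors)
    ]
    parts = arrays + [
        # Each vector becomes one row of the matrix.
        f"m=matrix({', '.join(names)})",
        # distance() works on columns, so transpose to compare the rows.
        "d=distance(transpose(m))",
    ]
    return "let(" + ", ".join(parts) + ")"


if __name__ == "__main__":
    print(build_distance_expr([
        [0.1, 0.5, 0.6, 1.7],
        [0.5, 0.1, 0.7, 0.9],
        [-0.5, 0.9, 1.5, 0.2],
    ]))
```

This works in principle, but it needs one request to fetch the vectors and a second one to evaluate the generated expression, which is exactly what I would like to avoid.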