[ 
https://issues.apache.org/jira/browse/SOLR-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817824#comment-17817824
 ] 

Chris M. Hostetter commented on SOLR-17164:
-------------------------------------------

FWIW: I don't have the bandwidth to fully dig into this right now, 
but I hope to in the next few weeks or so.

If someone else wants to tackle this in the meantime, go right ahead.

Here's a rough pseudocode outline of how I think this would be doable in a 
backcompat way (w/o needing a new function name)...
{noformat}
  final String arg1Str = fp.parseArg();
  throw error if arg1Str is null or !fp.hasMoreArguments()
  
  final boolean constVec = '[' == fp.sp.peek();
  final String arg2Str = constVec ? null : fp.parseArg();
  if (fp.hasMoreArguments() && null != arg2Str) {
     parse existing arg1Str/arg2Str to get vectorEncoding/similarityFunction
     then continue with existing vectorSimilarity() valuesource parser logic & return
  }
  
  final SchemaField field1 = ... lookup arg1Str in schema ...
  throw error if field1.getType() is not DenseVectorField

  final VectorEncoding vectorEncoding = field1.getType().get...
  final VectorSimilarityFunction similarityFunction = field1.getType().get...

  final ValueSource v1 = field1.getType().getValueSource(...)
  
  ValueSource v2 = null;
  if (constVec) {
    v2 = fp.parseValueSource( ... use vectorEncoding for flags ... )
  } else {
    final SchemaField field2 = ... lookup arg2Str in schema ...
    throw error if field2.getType() is not DenseVectorField
    throw error if vectorEncoding or similarityFunction don't match field2.getType()

    v2 = field2.getType().getValueSource(...)
  }
  return new $(vectorEncoding)VectorSimilarityFunction(similarityFunction, v1, v2)
{noformat}
...but I may be overlooking some nuances, and we'd obviously want to ensure we 
add a lot of good tests (particularly of the error cases).

(Worst case scenario, a new "function name" could be picked for this two arg 
version ... ala: {{fieldVectorSimilarity(vecField,vecField|constantVec)}} ... 
or something like that.)
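
To make the backcompat dispatch in the outline above a bit more concrete, here is a minimal standalone Java sketch of just the argument-shape decision. It is not Solr code: the class {{VectorSimilarityArgShapes}}, its {{Form}} enum and the {{classify()}} helper are hypothetical names, and it assumes the arguments have already been split into tokens (with a vector literal kept as one token) rather than read incrementally from the function parser as the outline does.

```java
import java.util.List;

/** Illustrative only: a stand-in for the argument-shape check, not Solr's FunctionQParser. */
class VectorSimilarityArgShapes {

  /** The three call shapes discussed in this issue. */
  enum Form { LEGACY_FOUR_ARG, FIELD_AND_CONSTANT, FIELD_AND_FIELD }

  /** Classifies an already-tokenized argument list; a vector literal is a single token. */
  static Form classify(List<String> args) {
    if (args.size() == 4) {
      // Existing form: (vectorEncoding, similarityFunction, vec1, vec2).
      return Form.LEGACY_FOUR_ARG;
    }
    if (args.size() != 2) {
      throw new IllegalArgumentException(
          "vectorSimilarity() takes 2 or 4 arguments, got " + args.size());
    }
    if (isVectorLiteral(args.get(0))) {
      // The proposed 2 arg form needs a field first, so encoding/similarity can be inferred.
      throw new IllegalArgumentException(
          "first argument of the 2 arg form must be a field name");
    }
    return isVectorLiteral(args.get(1)) ? Form.FIELD_AND_CONSTANT : Form.FIELD_AND_FIELD;
  }

  /** Mirrors the outline's peek for '[' at the start of the second argument. */
  private static boolean isVectorLiteral(String arg) {
    return arg.trim().startsWith("[");
  }

  public static void main(String[] argv) {
    System.out.println(classify(List.of("title_float_vec_dim4", "[1.0,2.0,3.0,4.0]")));
    System.out.println(classify(List.of("title_float_vec_dim4", "body_float_vec_dim4")));
    System.out.println(classify(List.of("FLOAT32", "COSINE", "[1,2]", "[3,4]")));
  }
}
```

A real patch would still need the schema lookups and the {{DenseVectorField}} type checks from the outline; this only shows that the 2 arg and 4 arg forms can be told apart without a new function name.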

> Add 2 arg variant of vectorSimilarity() function
> ------------------------------------------------
>
>                 Key: SOLR-17164
>                 URL: https://issues.apache.org/jira/browse/SOLR-17164
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Chris M. Hostetter
>            Priority: Major
>
> Solr's current 4 argument 
> {{vectorSimilarity(vectorEncoding,similarityFunction,vec1,vec2)}} function 
> is really awkward to use for the (seemingly common) situation where you just 
> want to know the similarity between a field and a constant vector, or 
> (probably less common) between two fields of the same type.
> The first two (currently) mandatory arguments to {{vectorSimilarity()}} 
> ({{vectorEncoding}} and {{similarityFunction}}) are already mandatory 
> properties of {{DenseVectorField}}. IIUC the only reason these arguments 
> are required is in the (seemingly uncommon?) situation where you might want 
> to compute the similarity of two vector constants, so the function needs to 
> know what {{vectorEncoding}} and {{similarityFunction}} to use.
>  
> ----
>  
> It would be really nice to support a simplified 2 argument variant of 
> {{vectorSimilarity()}} such that:
>  * the first argument must be the name of a {{DenseVectorField}} field
>  * the second argument must be either:
>  ** A vector constant
>  *** in which case the {{vectorEncoding}} used to parse the constant is 
> inferred from the fieldType properties of the first argument
>  ** Or the name of a second {{DenseVectorField}} field
>  *** in which case the {{vectorEncoding}} and {{similarityFunction}} of the 
> two fields must match
>  * The ValueSource returned should be based on the configured 
> {{vectorEncoding}} & {{similarityFunction}} of the field(s)
> Examples...
> {noformat}
> vectorSimilarity(title_float_vec_dim4, [1.0,2.0,3.0,4.0])
>    ...or...
> vectorSimilarity(title_float_vec_dim4, body_float_vec_dim4)
> {noformat}
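
The two rules in the quoted description (a constant is parsed using the first field's encoding, and two fields must agree on both properties) could be sketched roughly as below. This is a standalone illustration under stated assumptions, not Solr or Lucene code: {{TwoArgVectorSimilarityRules}}, {{VectorFieldConfig}}, {{requireCompatible()}} and {{parseConstant()}} are hypothetical names, and the two enums are local stand-ins listing only a few values, not the real Lucene types.

```java
/** Illustrative only: local stand-ins, not Solr's DenseVectorField or Lucene's enums. */
class TwoArgVectorSimilarityRules {

  enum Encoding { BYTE, FLOAT32 }
  enum Similarity { EUCLIDEAN, DOT_PRODUCT, COSINE }

  /** Hypothetical holder for the two properties a vector field is configured with. */
  static final class VectorFieldConfig {
    final String name;
    final Encoding encoding;
    final Similarity similarity;

    VectorFieldConfig(String name, Encoding encoding, Similarity similarity) {
      this.name = name;
      this.encoding = encoding;
      this.similarity = similarity;
    }
  }

  /** Field-vs-field case: both properties must match, otherwise the call is rejected. */
  static void requireCompatible(VectorFieldConfig a, VectorFieldConfig b) {
    if (a.encoding != b.encoding || a.similarity != b.similarity) {
      throw new IllegalArgumentException(
          "fields " + a.name + " and " + b.name
              + " must use the same vectorEncoding and similarityFunction");
    }
  }

  /** Field-vs-constant case: returns float[] for FLOAT32 and byte[] for BYTE. */
  static Object parseConstant(String literal, Encoding fieldEncoding) {
    String trimmed = literal.trim();
    if (!trimmed.startsWith("[") || !trimmed.endsWith("]")) {
      throw new IllegalArgumentException("not a vector constant: " + literal);
    }
    String[] parts = trimmed.substring(1, trimmed.length() - 1).split(",");
    if (fieldEncoding == Encoding.BYTE) {
      byte[] bytes = new byte[parts.length];
      for (int i = 0; i < parts.length; i++) {
        bytes[i] = Byte.parseByte(parts[i].trim()); // rejects values like "1.5"
      }
      return bytes;
    }
    float[] floats = new float[parts.length];
    for (int i = 0; i < parts.length; i++) {
      floats[i] = Float.parseFloat(parts[i].trim());
    }
    return floats;
  }
}
```

A dimension check against the field would presumably also be wanted, but the description above does not mention it, so it is left out here.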



