cpoerschke opened a new pull request, #4230:
URL: https://github.com/apache/solr/pull/4230
done-ish list:
* [X] skeleton `SemanticHighlightingComponent` class in
`modules/language-models` to avoid `core/HighlightingComponent.java` having
a `modules/language-models` dependency
* [X] created `CustomModel.java` and `custom-model.json` providing
hard-coded mock embeddings for test use without an external model provider
dependency
* [X] skeleton logic to use a language model to compute a score for a
`Passage`
* [X] minimal `SemanticHighlightingComponentTest` class to illustrate usage
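
For illustration only, a minimal sketch of the idea behind the mock embeddings and the passage scoring, in plain Java. The names `MockEmbeddingModel` and `PassageScoring`, the zero-vector fallback, and the `1 / (1 + distance)` mapping are assumptions made for this sketch; they are not the classes or logic in this pull request.

```java
import java.util.Map;

/** Hypothetical stand-in for a language model: returns hard-coded embeddings for known texts. */
final class MockEmbeddingModel {
  private final Map<String, float[]> embeddings;
  private final int dimension;

  MockEmbeddingModel(Map<String, float[]> embeddings, int dimension) {
    this.embeddings = embeddings;
    this.dimension = dimension;
  }

  /** Returns the hard-coded vector for the text, or a zero vector if the text is unknown. */
  float[] embed(String text) {
    float[] vector = embeddings.get(text);
    return vector != null ? vector : new float[dimension];
  }
}

/** Hypothetical helper that turns a vector distance into a passage score. */
final class PassageScoring {
  private PassageScoring() {}

  /** Higher is better: 1.0 for identical vectors, approaching 0 as the distance grows. */
  static float score(MockEmbeddingModel model, float[] queryVector, String passageText) {
    float[] passageVector = model.embed(passageText);
    if (passageVector.length != queryVector.length) {
      throw new IllegalArgumentException("vector dimensions differ");
    }
    double sum = 0;
    for (int i = 0; i < queryVector.length; i++) {
      double diff = queryVector[i] - passageVector[i];
      sum += diff * diff;
    }
    // Euclidean distance, for illustration only.
    return (float) (1.0 / (1.0 + Math.sqrt(sum)));
  }
}
```

With such a mock, a test can check that a passage whose embedding is closer to the query vector scores higher than an unrelated one, without calling an external model provider.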
to-do list, non-exhaustive:
* [ ] consideration of parameter details e.g. how to request semantic
highlighting and the model(s) to use
* [ ] how to obtain the vector against which passages are compared e.g.
from some new parameter directly or by extraction from the `q` or `hl.q`
parameter if a knn parser is used or by some other way?
* [ ] consideration of `Passage` extraction e.g. currently it is term-based
but what if `q` is a vector query, i.e. has no terms?
* [ ] should the `PassageScorer` and/or the `Comparator<Passage>` apply the
language model to candidate passages?
* [ ] how to properly and efficiently compute vector distances? currently
using Euclidean distance for illustration only.
* [ ] tests
* [ ] documentation
* [ ] ???
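
On the vector distance question above, a small plain-Java sketch contrasting Euclidean distance with cosine similarity may help frame the choice. The class name `VectorDistances` and the convention of returning 0 for a zero-length vector are assumptions for this sketch, not code from this pull request; an optimized library routine would likely replace it in practice.

```java
/** Hypothetical helper contrasting two common vector comparison functions. */
final class VectorDistances {
  private VectorDistances() {}

  /** Squared Euclidean distance; enough for ranking because it orders pairs like the true distance. */
  static double squaredEuclidean(float[] a, float[] b) {
    checkSameLength(a, b);
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double diff = a[i] - b[i];
      sum += diff * diff;
    }
    return sum;
  }

  /** Euclidean distance; lower means more similar. */
  static double euclidean(float[] a, float[] b) {
    return Math.sqrt(squaredEuclidean(a, b));
  }

  /** Cosine similarity in [-1, 1]; higher means more similar. Returns 0 if either vector has zero norm. */
  static double cosineSimilarity(float[] a, float[] b) {
    checkSameLength(a, b);
    double dot = 0;
    double normA = 0;
    double normB = 0;
    for (int i = 0; i < a.length; i++) {
      dot += (double) a[i] * b[i];
      normA += (double) a[i] * a[i];
      normB += (double) b[i] * b[i];
    }
    if (normA == 0 || normB == 0) {
      return 0; // assumption: treat a zero vector as unrelated to everything
    }
    return dot / Math.sqrt(normA * normB);
  }

  private static void checkSameLength(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("vector dimensions differ: " + a.length + " vs " + b.length);
    }
  }
}
```

When passages only need to be ordered, the squared distance avoids the square root without changing the ranking. Cosine similarity ignores vector magnitude, which may or may not suit the embeddings a given model produces.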
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]