Hi there,

Is there a way to omit only term frequencies but keep positions? I see from
the field default properties (
https://solr.apache.org/guide/8_5/field-type-definitions-and-properties.html#field-default-properties)
that a field in schema.xml can omit both frequencies and positions
(omitTermFreqAndPositions), or just positions (omitPositions), but we would
like to omit term frequencies only. We wondered whether we could achieve
this with:

omitTermFreqAndPositions="true" // forget term frequencies, forget positions
omitPositions="false" // ... but actually, keep positions
// ==> forget term frequencies only

However, this looks just a bit weird ... and probably isn't how these
options are intended to be combined?

We also understand we can extend DefaultSimilarity (if it's still called
that?) and return a constant, such as 1.0f, for all term frequencies.
However, as far as we recall, this requires creating a plugin and adding it to
Solr's classpath -- which is possible, but is additional work which we'd
rather not have to do if it is already an out of the box option.

Context: our use case is coming from our scientific curators saying:
- term frequencies are "overpowering" the contribution of length
normalisation to the score. E.g., if Document1 has a multivalued field with
the value "Albumin", and someone searches this field for "Albumin", then
they *really* want that document back first. Instead, Document2, whose
field values are ["Albumin D box-binding protein", "Albumin
D-element-binding protein", "2S albumin"], ranks higher.
- they state that they're not actually interested in how many times a term
appears, it's either there or not.
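
To make the curators' complaint concrete, here is a rough sketch of the
arithmetic, not a reproduction of what Solr computes internally. It uses the
textbook BM25 term-frequency component with the usual defaults k1=1.2 and
b=0.75, and leaves idf out because both documents match the same term. The
token counts (1 for Document1, 12 for Document2, with "albumin" occurring 3
times) assume an analyzer that splits on whitespace and hyphens, and the
average field length of 20 tokens is purely hypothetical.

```python
def bm25_tf(freq, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Textbook BM25 term-frequency component (idf omitted: same term for both docs)."""
    length_norm = 1.0 - b + b * doc_len / avg_doc_len
    return freq * (k1 + 1.0) / (freq + k1 * length_norm)


# ASSUMPTION: hypothetical average field length across the index, in tokens.
AVG_DOC_LEN = 20.0

# ASSUMPTION: token counts after splitting on whitespace and hyphens.
doc1 = {"freq": 1, "len": 1}    # ["Albumin"]
doc2 = {"freq": 3, "len": 12}   # the three longer protein names


def score(doc, binary_tf=False):
    """Score one document; binary_tf clamps the frequency to 'present or not'."""
    freq = min(doc["freq"], 1) if binary_tf else doc["freq"]
    return bm25_tf(freq, doc["len"], AVG_DOC_LEN)


if __name__ == "__main__":
    for name, doc in (("Document1", doc1), ("Document2", doc2)):
        print(name,
              "raw tf:", round(score(doc), 3),
              "binary tf:", round(score(doc, binary_tf=True), 3))
```

Under these assumed numbers, Document2 scores higher than Document1 when raw
frequencies are used, and Document1 comes first once the frequency is clamped
to 1. The sketch only illustrates the scoring effect we are after; it says
nothing about how positions are stored in the index.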

Just to clarify from my above waffle, is it possible to omit term
frequencies only, but keep positions?

Many thanks, kind regards and have a nice day,

Edd

PS. I've seen this question asked a few times over the years, but wanted to
ask again in case I've missed a new option in Solr.

--------------------
Edward Turner
