Re: SOLR TF/IDF factor removal.

Vincenzo D'Amore Thu, 14 Apr 2022 06:52:15 -0700

https://github.com/freedev/solr-constant-similarity


This is just an implementation of what Markus was suggesting.

It could be a good idea to add a constant similarity class to the
standard Solr distribution.
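
Where swapping the similarity class is not an option, the ^= modifier
quoted further down in this thread can be applied per keyword instead. A
minimal sketch in Python, assuming each keyword carries a business-defined
boost; the field names, the core name "mycore" and the helper functions are
hypothetical and not part of Solr:

```python
from urllib.parse import urlencode


def constant_score_query(boosts):
    """Build a Solr query where each clause scores exactly its boost.

    `boosts` maps a (field, keyword) pair to the constant score that a
    match on that keyword should contribute (via the `^=` modifier).
    """
    clauses = []
    for (field, keyword), score in boosts.items():
        # Escape backslashes and double quotes inside the quoted phrase.
        escaped = keyword.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'{field}:"{escaped}"^={score}')
    return " OR ".join(clauses)


def max_score(boosts):
    """Highest score a document can reach: every clause matches."""
    return sum(boosts.values())


def select_url(base_url, query):
    """URL-encode the query for a /select request (hypothetical core URL)."""
    params = {"q": query, "fl": "id,score"}
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical fields and boosts, for illustration only.
boosts = {("title", "blue"): 3.0, ("description", "shoes"): 1.5}
query = constant_score_query(boosts)
print(query)
print(select_url("http://localhost:8983/solr/mycore", query))
```

With every clause scored through ^=, a document's score should be the sum
of the boosts of the clauses it matches, so dividing it by max_score(boosts)
gives a match percentage that does not depend on TF/IDF.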

On Thu, Apr 14, 2022 at 3:48 PM Vincenzo D'Amore <v.dam...@gmail.com> wrote:

> Hi,
>
> A long time ago I wrote this to handle cases where there is no need for
> TF/IDF:
>
> https://github.com/freedev/solr-constant-similarity
>
> There are just two simple steps to follow:
>
>    1. Add this line in solrconfig.xml:
>
> <lib dir="../../../dist/" regex="constant-similarity-\d.*\.jar" />
>
>    2. Add this line to schema.xml:
>
> <similarity
> class="it.damore.solr.similarity.ConstantTFSimilarity"></similarity>
>
>
> <https://github.com/freedev/solr-constant-similarity#old-solr-versions-before-54>
>
> On Thu, Apr 14, 2022 at 3:30 PM Jeremy Buckley - IQ-C
> <jeremy.buck...@gsa.gov.invalid> wrote:
>
>> You may be interested in the ^= modifier for queries.  From the reference
>> guide:
>>
>> Constant Score with "^="
>>
>> Constant score queries are created with <query_clause>^=<score>, which
>> sets the entire clause to the specified score for any documents matching
>> that clause. This is desirable when you only care about matches for a
>> particular clause and don’t want other relevancy factors such as term
>> frequency (the number of times the term appears in the field) or inverse
>> document frequency (a measure across the whole index for how rare a term
>> is in a field).
>>
>> Example:
>>
>> (description:blue OR color:blue)^=1.0 text:shoes
>>
>>
>> On Thu, Apr 14, 2022 at 8:52 AM Fiz N <fiznewy...@gmail.com> wrote:
>>
>> > Hello Experts,
>> >
>> > In our project we are using SOLR 8.11.1 in standalone mode on a Windows
>> > server box.
>> >
>> > We have implemented a search mechanism by using pure keyword match and
>> > boosting the keywords as per business needs. For each search result, the
>> > match percentage is derived from the SOLR document score. The SOLR
>> > document score is the sum of the individual keyword scores, which are
>> > derived from the boost factor and the TF/IDF values of the keyword.
>> >
>> > As per requirements, in our case the resultant scores should be
>> > dependent only on the boost factor, whereas the implicit TF/IDF factor
>> > is causing deviation in the expected results and also causing
>> > uncertainty in the resultant ranking.
>> >
>> > So we are looking for better approaches to eliminate/neutralize the SOLR
>> > TF/IDF factor.
>> >
>> > Please do let us know your suggestions for removing the TF/IDF factor,
>> > or any other solution approach that we can consider in this case.
>> >
>> > Thanks
>> > Fiz Fareedh.
>> >
>>
>
>
> --
> Vincenzo D'Amore
>
>

-- 
Vincenzo D'Amore
