For our corpus, term frequency hurts rather than helps the ranking we
want for search results.

I put this in our schema to effectively turn Okapi BM25
<https://en.wikipedia.org/wiki/Okapi_BM25> into BM15:

    <similarity class="solr.BM25SimilarityFactory">
        <float name="b">0</float>
    </similarity>
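
To make the effect of that setting concrete, here is a minimal sketch of the textbook Okapi BM25 per-term score, not Lucene's actual implementation (whose constant factors may differ). The function name, argument names, and the example numbers are assumptions for illustration only; k1=1.2 and b=0.75 are the commonly cited defaults.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, doc_count, doc_freq,
                    k1=1.2, b=0.75):
    """Textbook Okapi BM25 score of one term in one document (illustrative)."""
    # Inverse document frequency: rarer terms get a higher weight.
    idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: b=1 is full normalization, b=0 disables it.
    length_norm = 1 - b + b * doc_len / avg_doc_len
    # Term frequency saturates as tf grows; k1 controls how quickly.
    return idf * tf * (k1 + 1) / (tf + k1 * length_norm)


# Hypothetical corpus statistics, chosen only for the comparison below.
stats = dict(avg_doc_len=100, doc_count=1000, doc_freq=10)

# Default b: a short document outscores a long one with the same tf.
short_default = bm25_term_score(tf=3, doc_len=50, **stats)
long_default = bm25_term_score(tf=3, doc_len=400, **stats)

# b=0 (BM15): document length no longer changes the score.
short_bm15 = bm25_term_score(tf=3, doc_len=50, b=0, **stats)
long_bm15 = bm25_term_score(tf=3, doc_len=400, b=0, **stats)

print(short_default > long_default)
print(math.isclose(short_bm15, long_bm15))
```

In this sketch, b=0 removes only the document-length normalization. Term frequency still contributes to the score, saturating according to k1.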

Thomas



On Wed, 28 Dec 2022 at 14:35, Eric Pugh <
ep...@opensourceconnections.com> wrote:

> For a very long time, that was what folks always said: “the different
> IDF” is going to be an issue.  My opinion is that there are many other
> things that REALLY affect your overall relevance a lot more than unbalanced
> IDF.  Folks worry way too much about IDF, and not enough about “what are
> your crazy synonyms.txt or stopwords.txt doing to you?”.
>
> You should go use a tool like Quepid (www.quepid.com) and set up a
> baseline relevance test case, and just try the experiment, that way instead
> of making decisions based on hunches, you have data!
>
>
>
> > On Dec 28, 2022, at 8:30 AM, Dave <hastings.recurs...@gmail.com> wrote:
> >
> > Eric, that is super clever.  But how does it affect ranking if you do a
> > general search, since each collection has its own IDF etc.?
> > -Dave
> >
> >> On Dec 28, 2022, at 7:03 AM, Eric Pugh <ep...@opensourceconnections.com>
> wrote:
> >>
> >> You may find it an easier path forward to just move to SolrCloud.  You
> can run a single Solr server with multiple collections and use the embedded
> ZK to avoid setting up the full ZK ensemble….
> >>
> >>> On Dec 28, 2022, at 12:04 AM, Mike <mz579...@gmail.com> wrote:
> >>>
> >>> Yes, it should be the same, it works without basic authentication.
> >>>
> >>> Thank you
> >>>
> >>>> On Wed, 28 Dec 2022 at 05:48, Srijan <shree...@gmail.com> wrote:
> >>>>
> >>>>
> >>>>
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/SOLR-15237/comment/17626195
> >>>>
> >>>> Same issue?
> >>>>
> >>>>> On Tue, Dec 27, 2022, 19:59 Mike <mz579...@gmail.com> wrote:
> >>>>
> >>>>> I get a 401 require authentication error when I query with &shards=
> >>>>>
> >>>>> Do you or anyone else have any idea why?
> >>>>>
> >>>>> On Wed, 28 Dec 2022 at 04:10, Shawn Heisey <apa...@elyograg.org> wrote:
> >>>>>
> >>>>>> On 12/27/22 19:50, Mike wrote:
> >>>>>>> The server is not in cloud mode, it is a standalone server.
> >>>>>>> I don't understand where to put the query line: in the URL, with
> >>>>>>> what query parameter (?=)?
> >>>>>>>
> >>>>>>> Do I have to change something in solr.xml or solrconfig?
> >>>>>>
> >>>>>> If you put it in the URL:
> >>>>>>
> >>>>>> &shards=server:port/solr/core1,server:port/solr/core2
> >>>>>>
> >>>>>> The way I did it was to create a special core with no index of its
> >>>>>> own and put the following line in solrconfig.xml, in the defaults
> >>>>>> section of the search handler:
> >>>>>>
> >>>>>> <str name="shards">idxb2.example.com:8981/solr/inclive,idxb1.example.com:8981/solr/s0live,idxb1.example.com:8981/solr/s1live,idxb1.example.com:8981/solr/s2live,idxb2.example.com:8981/solr/s3live,idxb2.example.com:8981/solr/s4live,idxb2.example.com:8981/solr/s5live</str>
> >>>>>>
> >>>>>> Queries never went directly to the cores with data; they only went
> >>>>>> to the special core.  I wrote an indexing system that would ensure
> >>>>>> documents ended up in the correct shard.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Shawn
> >>>>>>
> >>>>>
> >>>>
> >>
> >> _______________________
> >> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467
> | http://www.opensourceconnections.com <
> http://www.opensourceconnections.com/> | My Free/Busy <
> http://tinyurl.com/eric-cal>
> >> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <
> https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>
> >> This e-mail and all contents, including attachments, is considered to
> be Company Confidential unless explicitly stated otherwise, regardless of
> whether attachments are marked as such.
> >>
>
>
>
