Hi everyone,

I have a few questions about how we can improve our Solr query performance,
especially for boosts (bf, bq, boost, etc.).

*System Specs:*
Solr Version: 7.7.x
Heap Size: 31gb
Num Docs: >100M
Shards: 8
Replication Factor: 6
Index is completely mapped into memory


Example query:
{
q=hello world
qf=title description keywords
pf=title^0.5
ps=0
fq=type:P
boost=def(boostFieldA,1) // boostFieldA is a float field with docValues
bf=mul(termfreq(termScoreFieldB,$q),1000.0) // termScoreFieldB is a text field: indexed only, no docValues
rows=500
fl=id,score
}
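
For anyone who wants to reproduce the timings, here is a minimal sketch of how the request above could be assembled, with the two boost parameters toggled on and off. The endpoint URL, the collection name, and the helper names are placeholders I made up for illustration, and defType=edismax is an assumption (it is implied by the qf/pf/boost params but not shown in the query above).

```python
from urllib.parse import urlencode

# Placeholder endpoint: host and collection name are hypothetical.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycollection/select"


def build_params(include_boost=True, include_bf=True):
    """Return the example query as a list of (name, value) pairs."""
    params = [
        ("defType", "edismax"),  # assumption: not stated in the original query
        ("q", "hello world"),
        ("qf", "title description keywords"),
        ("pf", "title^0.5"),
        ("ps", "0"),
        ("fq", "type:P"),
        ("rows", "500"),
        ("fl", "id,score"),
    ]
    if include_boost:
        # Multiplicative boost on a docValues float field.
        params.append(("boost", "def(boostFieldA,1)"))
    if include_bf:
        # Additive boost function on an indexed-only text field.
        params.append(("bf", "mul(termfreq(termScoreFieldB,$q),1000.0)"))
    return params


def build_url(**kwargs):
    """URL-encode the parameters and append them to the select endpoint."""
    return SOLR_SELECT_URL + "?" + urlencode(build_params(**kwargs))


if __name__ == "__main__":
    print(build_url())                     # full query
    print(build_url(include_boost=False))  # without boost
    print(build_url(include_bf=False))     # without bf
```

Each variant can then be sent several times against a warmed index so that the qTime values are compared on an equal footing.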

numFound: >21M
qTime: 800ms

Parameter experiments:

   - When I remove the boost parameter, the qTime drops to 525ms
   - When I remove the bf parameter, the qTime drops to 650ms
   - When I remove both the boost & bf parameters, the qTime drops to 400ms
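
To make the cost of each boost explicit, here is the arithmetic on the timings above as a small sketch. The variable names are mine, and the numbers are single measurements, so small differences may just be noise.

```python
# qTime measurements from the experiments above, in milliseconds.
baseline_ms = 800        # full query with boost and bf
without_boost_ms = 525   # boost removed
without_bf_ms = 650      # bf removed
without_both_ms = 400    # boost and bf removed

# Time attributable to each parameter when removed on its own.
boost_cost_ms = baseline_ms - without_boost_ms    # 275
bf_cost_ms = baseline_ms - without_bf_ms          # 150

# Time attributable to both parameters together.
combined_cost_ms = baseline_ms - without_both_ms  # 400

# The individual costs sum to slightly more than the combined cost.
difference_ms = boost_cost_ms + bf_cost_ms - combined_cost_ms  # 25

# Fraction of the total qTime spent on the two boosts.
boost_share = combined_cost_ms / baseline_ms      # 0.5

if __name__ == "__main__":
    print(f"boost: {boost_cost_ms}ms, bf: {bf_cost_ms}ms, "
          f"both: {combined_cost_ms}ms ({boost_share:.0%} of qTime)")
```

So the two boosts together account for roughly half of the total qTime, with the boost parameter being the more expensive of the two.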


Questions:

   1. Is there any way to improve the performance of the boosts (specific
   field types, etc)?
   2. Will sharding further such that each core only has to score a smaller
   subset of documents help with query performance?
   3. Is there any performance impact when boosting/querying against sparse
   fields, whether indexed=true or docValues=true?
   4. The baseline scoring time (no boosts) is about 400ms, which already
   seems quite high. Is this because the query (hello world) is implicitly
   parsed as (hello OR world), making it more computationally expensive?
   5. Any other advice :) ?


Thanks in advance,

Ash
