dsmiley commented on pull request #2:
URL: https://github.com/apache/solr/pull/2#issuecomment-821772007


   I finally did some benchmarking.  The new implementation appears to be 8% 
faster overall, not counting the optimization I added.  The data set was a 
million docs with a field whose terms follow a Gaussian distribution.  Each 
query had a filter query with a term against that field, which resulted in a 
SortedIntDocSet the vast majority of the time.  The main "q" query was a 
randomly generated phrase query that would always match some subphrase of a 
sentence found in many documents.  The benchmark generated the same 2000 
random queries on every run; I re-ran it about 10 times and took the average 
of the fastest 3 runs.
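
   For illustration, the "average of the fastest 3 runs" step could be 
computed as in the minimal sketch below.  The class and method names 
(FastestRuns, meanOfFastest) are hypothetical and are not taken from the 
actual benchmark harness, which is not shown in this comment.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Solr or the benchmark described above.
final class FastestRuns {
    /** Returns the mean of the k smallest values in elapsedMillis. */
    static double meanOfFastest(long[] elapsedMillis, int k) {
        if (k <= 0 || k > elapsedMillis.length) {
            throw new IllegalArgumentException("k must be in 1.." + elapsedMillis.length);
        }
        long[] sorted = elapsedMillis.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);                   // ascending, so the fastest runs come first
        long sum = 0;
        for (int i = 0; i < k; i++) {
            sum += sorted[i];
        }
        return (double) sum / k;
    }
}
```

   Keeping only the fastest runs discards runs slowed by warm-up or 
unrelated system noise, at the cost of reporting a somewhat optimistic figure.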
   
   I added a small optimization to short-circuit intersect(DocSet) when the 
intersection leaves the set unchanged.  With it, the improvement rose to ~11%.  
In my benchmark, there was another filter query that matched everything, and 
SolrIndexSearcher.getProcessedFilter would intersect that with the cached 
SortedIntDocSet, producing a new SortedIntDocSet every time, so there was 
_never_ any re-use of cachedOrdIdxMap.  Of course this is highly dependent on 
the benchmarking scenario; "YMMV" really applies to benchmarking this stuff 
because so much depends on the app's usage pattern.  I think that, in some 
future JIRA issue, getProcessedFilter would be better off not intersecting 
any SortedIntDocSets; they could simply be added as separate filters in a 
BooleanQuery (dependent on SOLR-14166).
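
   The idea behind the short-circuit can be sketched as follows.  This is a 
simplified stand-in, not the actual Solr code: SortedIntSetSketch is a 
hypothetical class, and it assumes doc IDs are held in a sorted int[] without 
duplicates.  The point is only that returning the same instance when nothing 
was filtered out lets any per-instance cached state survive.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for a sorted doc-ID set; not Solr's SortedIntDocSet.
final class SortedIntSetSketch {
    final int[] docs; // assumed sorted ascending, no duplicates

    SortedIntSetSketch(int[] docs) {
        this.docs = docs;
    }

    SortedIntSetSketch intersect(SortedIntSetSketch other) {
        int[] out = new int[Math.min(docs.length, other.docs.length)];
        int i = 0, j = 0, n = 0;
        // Standard two-pointer merge over the two sorted arrays.
        while (i < docs.length && j < other.docs.length) {
            int a = docs[i], b = other.docs[j];
            if (a == b) {
                out[n++] = a;
                i++;
                j++;
            } else if (a < b) {
                i++;
            } else {
                j++;
            }
        }
        // Short-circuit: every doc of this set survived, so the result equals
        // this set. Return the same instance instead of allocating a copy, so
        // any state cached on it stays usable.
        if (n == docs.length) {
            return this;
        }
        return new SortedIntSetSketch(Arrays.copyOf(out, n));
    }
}
```

   In this sketch, intersecting with a set that matches everything returns 
the original object, which is the case the match-all filter query in the 
benchmark exercised.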
   
   So I think this is ready to merge!

