Are you using HDFSDirectory to serve your indices? I noticed that 
tlogDfsReplication is set, so that's why I'm asking.
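
For reference, an HDFS-backed index is normally configured through the directory factory in solrconfig.xml. A minimal sketch of what that would look like follows; the hdfs://host:port/path value is only a placeholder, not something taken from your setup. If you have nothing like this, the tlogDfsReplication setting is probably not doing anything for you.

```xml
<!-- Sketch only: "hdfs://host:port/path" is a placeholder, not a real location. -->
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://host:port/path</str>
</directoryFactory>
```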

A maxWarmingSearchers value of 8 is very high; typically that value is 2 or
maybe 4. You would know whether this was an issue by looking at your logs.
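
As a sketch of the more typical setting (the value 2 here is illustrative, and I am assuming the rest of your query section stays as it is):

```xml
<query>
  <!-- Illustrative value: 2 is the commonly used setting, 4 at the upper end. -->
  <maxWarmingSearchers>2</maxWarmingSearchers>
</query>
```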

I'm assuming that you had 30 NRT replicas before? If you had fewer, then your
tail latencies might be higher because you're seeing cache misses on the
queries. Do you have metrics on the response times for TLOG vs. PULL? Are they
both slower, or just one?

Mike

On 2021/06/11 12:55:31, Nick Vladiceanu <vladicean...@gmail.com> wrote: 
> hello,
> I’m facing some performance issues when moving from NRT replica types to TLOG 
> + PULL. We’re constantly indexing new data and heavily querying (~2k rps).
> 
> - index size is ~2.5Gi;
> - number of docs is ~4.6M;
> - 2 shards;
> - 7 cores and 14Gi of memory;
> - 30 instances;
> - JVM heap is 12Gi.
> 
> When running on NRT only, the response time is ~150ms at p99 and 40ms at
> p95. After changing to 6 TLOG + 24 PULL replicas, the response time grows
> to ~350ms at p99 and 120ms at p95.
> 
> Here are some fragments from our solrconfig:
> 
>  
> >     <updateHandler class="solr.DirectUpdateHandler2">
> >         <updateLog>
> >             <str name="dir">${solr.data.dir:}</str>
> >             <int name="tlogDfsReplication">${solr.ulog.tlogDfsReplication:3}</int>
> >         </updateLog>
> > 
> >         <autoCommit>
> >             <maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
> >             <maxDocs>${solr.autoCommit.maxDocs:10000}</maxDocs>
> >             <openSearcher>true</openSearcher>
> >         </autoCommit>
> > 
> >         <autoSoftCommit>
> >             <maxTime>${solr.autoSoftCommit.maxTime:300000}</maxTime>
> >         </autoSoftCommit>
> >     </updateHandler>
> 
> >     <query>
> >         <maxBooleanClauses>1000</maxBooleanClauses>
> >         <filterCache class="solr.CaffeineCache"
> >                      size="${filterCache.size:32768}"
> >                      initialSize="${filterCache.initialSize:32768}"
> >                      autowarmCount="20%"/>
> > 
> >         <queryResultCache class="solr.CaffeineCache"
> >                           size="${queryResultCache.size:32768}"
> >                           initialSize="${queryResultCache.initialSize:32768}"
> >                           autowarmCount="0%"/>
> > 
> >         <documentCache class="solr.CaffeineCache"
> >                        size="${documentCache.size:150000}"
> >                        initialSize="${documentCache.initialSize:150000}"
> >                        autowarmCount="0%"/>
> > 
> >         <enableLazyFieldLoading>true</enableLazyFieldLoading>
> >         <useFilterForSortedQuery>true</useFilterForSortedQuery>
> > 
> >         <queryResultWindowSize>160</queryResultWindowSize>
> >         <queryResultMaxDocsCached>300</queryResultMaxDocsCached>
> > 
> >         <listener event="newSearcher" class="solr.QuerySenderListener">
> >         </listener>
> >         <listener event="firstSearcher" class="solr.QuerySenderListener">
> >         </listener>
> > 
> >         <useColdSearcher>false</useColdSearcher>
> >         <maxWarmingSearchers>8</maxWarmingSearchers> 
> >     </query>
> 
> One of my assumptions was that I should reduce maxWarmingSearchers and
> increase the autoCommit maxTime, since soft commits are no longer available
> on TLOG replicas. Is that valid?
> 
> I couldn’t find any documentation on the differences and considerations we
> need to take into account between NRT and TLOG. Could you please help? Thanks
> a lot in advance. Please let me know if anything else is required.
> 
> Best regards,
> Nick Vladiceanu
