[ https://issues.apache.org/jira/browse/SOLR-16561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638026#comment-17638026 ]
Hang Sun commented on SOLR-16561:
---------------------------------

Added PR: https://github.com/apache/solr/pull/1189

> Use autoSoftCommmitMaxTime as preferred poll interval of IndexFetcher
> ---------------------------------------------------------------------
>
>                 Key: SOLR-16561
>                 URL: https://issues.apache.org/jira/browse/SOLR-16561
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication (java)
>    Affects Versions: 8.8.2
>            Reporter: Hang Sun
>            Priority: Minor
>              Labels: replication-performance
>         Attachments: SOLR-16561.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> TLOG/PULL replicas use *IndexFetcher* to fetch segment files from leaders.
> Once new segment files are downloaded and merged into the existing index, a
> new Searcher is opened so the updated data is made available to clients. The
> poll interval is determined by the following code in *ReplicateFromLeader*:
> {code:java}
> if (uinfo.autoCommmitMaxTime != -1) {
>   pollIntervalStr = toPollIntervalStr(uinfo.autoCommmitMaxTime/2);
> } else if (uinfo.autoSoftCommmitMaxTime != -1) {
>   pollIntervalStr = toPollIntervalStr(uinfo.autoSoftCommmitMaxTime/2);
> }{code}
>
> In a typical config for replication using TLOG/PULL replicas, where data
> visibility is less important (a trade-off to avoid NRT replicas), we set a
> short commit time to persist changes and a long soft-commit time to make
> changes visible.
>
> {code:java}
> <autoCommit>
>   <maxTime>15000</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
> <autoSoftCommit>
>   <maxTime>3600000</maxTime>
> </autoSoftCommit>
> {code}
>
> With the above config, the poll interval will be 15/2 = 7 sec. This leads to
> frequent opening of new Searchers, which has a large impact on realtime user
> queries, especially if the new Searcher takes a long time to warm up. It also
> makes changes visible on followers ahead of leaders.
> Polling for new segment files is mostly about visibility, because TLOG
> replicas still get updates to tlog files via UpdateHandler (this is my
> understanding). It therefore seems more appropriate to use
> *autoSoftCommmitMaxTime* as the poll interval.
> I propose the change below, where *autoSoftCommmitMaxTime* is chosen as the
> preferred interval. This will make the poll interval much longer and bring
> the visibility order more in line with the eventual consistency pattern.
>
> {code:java}
> if (uinfo.autoSoftCommmitMaxTime != -1) {
>   pollIntervalStr = toPollIntervalStr(uinfo.autoSoftCommmitMaxTime);
> } else if (uinfo.autoCommmitMaxTime != -1) {
>   pollIntervalStr = toPollIntervalStr(uinfo.autoCommmitMaxTime);
> }
> {code}
> The change has been tried and showed much less impact on realtime queries.
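
For anyone comparing the two rules quoted in the description, here is a minimal standalone sketch that runs both against the example config (autoCommit 15000 ms, autoSoftCommit 3600000 ms). It is not Solr code: the class PollIntervalSketch, its method names, the -1 return value when neither time is set, and the toHms helper are all assumptions made for illustration. The helper assumes the interval is truncated to whole seconds, which matches the "15/2 = 7 sec" figure in the description.

```java
// Hypothetical illustration only; none of these names exist in Solr.
class PollIntervalSketch {

    // Rule quoted from ReplicateFromLeader: prefer the hard-commit time, halved.
    // Returning -1 when neither time is configured is this sketch's own convention.
    static int currentPollMs(int autoCommitMaxTime, int autoSoftCommitMaxTime) {
        if (autoCommitMaxTime != -1) {
            return autoCommitMaxTime / 2;
        } else if (autoSoftCommitMaxTime != -1) {
            return autoSoftCommitMaxTime / 2;
        }
        return -1;
    }

    // Rule proposed in the issue: prefer the soft-commit time, not halved.
    static int proposedPollMs(int autoCommitMaxTime, int autoSoftCommitMaxTime) {
        if (autoSoftCommitMaxTime != -1) {
            return autoSoftCommitMaxTime;
        } else if (autoCommitMaxTime != -1) {
            return autoCommitMaxTime;
        }
        return -1;
    }

    // Assumed formatting: milliseconds truncated to whole seconds, shown as h:m:s.
    static String toHms(int ms) {
        int sec = ms / 1000;
        return (sec / 3600) + ":" + ((sec % 3600) / 60) + ":" + (sec % 60);
    }

    public static void main(String[] args) {
        int hard = 15000;    // <autoCommit><maxTime> from the example config
        int soft = 3600000;  // <autoSoftCommit><maxTime> from the example config
        System.out.println("current:  " + toHms(currentPollMs(hard, soft)));
        System.out.println("proposed: " + toHms(proposedPollMs(hard, soft)));
    }
}
```

With the example config, the first rule polls roughly every 7 seconds and the second once per hour. When only one of the two times is configured, the proposed rule also polls at twice the interval of the current one, since the proposal drops the division by 2.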