ooasis opened a new pull request, #1189:
URL: https://github.com/apache/solr/pull/1189

   # Description
   
   TLOG/PULL replicas use _IndexFetcher_ to fetch segment files from leaders. 
Once new segment files are downloaded and merged into the existing index, a new 
Searcher is opened so that the updated data becomes visible to clients.  The 
poll interval is determined by the following code in _ReplicateFromLeader_:
   
   ```
   if (uinfo.autoCommmitMaxTime != -1) {
      pollIntervalStr = toPollIntervalStr(uinfo.autoCommmitMaxTime/2);
   } else if (uinfo.autoSoftCommmitMaxTime != -1) {
      pollIntervalStr = toPollIntervalStr(uinfo.autoSoftCommmitMaxTime/2);
   }
   ```
   
   In a typical config for replication using TLOG/PULL replicas, where data 
visibility is less important (a trade-off to avoid NRT replicas), we set a 
short commit time to persist changes and a long soft-commit time to make 
changes visible.
   
   ```
   <autoCommit>
     <maxTime>15000</maxTime>
     <openSearcher>false</openSearcher>
   </autoCommit>
   <autoSoftCommit>
     <maxTime>3600000</maxTime>
   </autoSoftCommit>
   
   ``` 
   
   With the above config, the poll interval will be about 7 sec (15 sec / 2).  
This leads to frequent opening of new Searchers, which has a huge impact on 
realtime user queries, especially if the new Searcher takes a long time to 
warm up.  This also makes changes visible on followers ahead of leaders.
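   
   To make the arithmetic concrete, here is a minimal standalone sketch of the 
selection order quoted earlier, applied to the sample config.  It is not the 
Solr code itself: the class and method names are made up for illustration, the 
values are passed as plain ints in milliseconds, and the formatting done by 
_toPollIntervalStr_ is left out.

```java
public class CurrentPollIntervalSketch {

    // Hypothetical helper mirroring the current selection order.
    // A value of -1 means "not configured"; the result is in milliseconds,
    // or -1 when neither commit time is set.
    public static int currentPollIntervalMs(int autoCommitMaxTime, int autoSoftCommitMaxTime) {
        if (autoCommitMaxTime != -1) {
            return autoCommitMaxTime / 2;      // hard commit time wins when present
        } else if (autoSoftCommitMaxTime != -1) {
            return autoSoftCommitMaxTime / 2;  // soft commit time is only a fallback
        }
        return -1;
    }

    public static void main(String[] args) {
        // autoCommit maxTime = 15000, autoSoftCommit maxTime = 3600000
        System.out.println(currentPollIntervalMs(15000, 3600000)); // prints 7500
    }
}
```

   The one-hour soft-commit time is never consulted here, because the 15 sec 
hard-commit time is set and takes precedence.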
   
   Polling for new segment files is mostly about visibility, because TLOG 
replicas still get updates to their tlog files via UpdateHandler (this is my 
understanding).  It therefore seems more appropriate to use 
_autoSoftCommmitMaxTime_ as the poll interval.
   
   # Solution
   
   I propose the change below, where autoSoftCommmitMaxTime is chosen as the 
preferred interval.  Note that the proposed code also drops the division by 2.  
This will make the poll interval much longer and bring the visibility order 
more in line with the eventual consistency pattern.
   
   ```
   if (uinfo.autoSoftCommmitMaxTime != -1) {
       pollIntervalStr = toPollIntervalStr(uinfo.autoSoftCommmitMaxTime);
   } else if (uinfo.autoCommmitMaxTime != -1) {
       pollIntervalStr = toPollIntervalStr(uinfo.autoCommmitMaxTime);
   }
   ```
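   
   As a rough illustration of the effect, the standalone sketch below applies 
the proposed selection order to a 15 sec autoCommit and a one-hour 
autoSoftCommit.  The class and method names are hypothetical and exist only 
for this example; values are plain ints in milliseconds and the 
_toPollIntervalStr_ formatting step is omitted.

```java
public class ProposedPollIntervalSketch {

    // Hypothetical helper mirroring the proposed selection order.
    // A value of -1 means "not configured"; the result is in milliseconds,
    // or -1 when neither commit time is set.
    public static int proposedPollIntervalMs(int autoCommitMaxTime, int autoSoftCommitMaxTime) {
        if (autoSoftCommitMaxTime != -1) {
            return autoSoftCommitMaxTime;  // soft commit time is preferred, not halved
        } else if (autoCommitMaxTime != -1) {
            return autoCommitMaxTime;      // fall back to the hard commit time
        }
        return -1;
    }

    public static void main(String[] args) {
        // Both configured: the one-hour soft commit time is used.
        System.out.println(proposedPollIntervalMs(15000, 3600000)); // prints 3600000
        // Only autoCommit configured: falls back to 15 sec.
        System.out.println(proposedPollIntervalMs(15000, -1));      // prints 15000
    }
}
```

   With both values configured, followers would poll once per hour instead of 
every few seconds, which matches the cadence at which the leader makes changes 
visible.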
   
   # Tests
   
   The difference can only be tested with a proper replication config and 
controlled indexing and user queries.  I have tried the change in my 
environment, and it showed much less impact on realtime queries than in 
previous tests.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `main` branch.
   - [x] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   



