[ https://issues.apache.org/jira/browse/SOLR-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813066#comment-17813066 ]
Alexey Serba commented on SOLR-16879:
-------------------------------------

I think this feature introduced a regression: collections with more than 10 shards can no longer be backed up, because the thread pool rejects the new tasks:

{noformat}
SolrException: Could not backup all shards Task org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda... rejected from org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor[Running, pool size = 5, active threads = 5, queued tasks = 5, completed tasks = 30]"
{noformat}

The {{expensiveExecutor}} thread pool executor is [created|https://github.com/apache/solr/blob/releases/solr/9.4.1/solr/solrj/src/java/org/apache/solr/common/util/ExecutorUtil.java#L170-L174] with 5 max threads and a bounded queue of the same size (5), so it can hold at most 10 tasks at a time (5 running plus 5 queued). Any task submitted beyond that is rejected immediately.

> Throttle concurrent backups/restores per node
> ---------------------------------------------
>
>                 Key: SOLR-16879
>                 URL: https://issues.apache.org/jira/browse/SOLR-16879
>             Project: Solr
>          Issue Type: Improvement
>          Components: Backup/Restore
>   Affects Versions: 9.2.1
>            Reporter: Pierre Salagnac
>            Priority: Minor
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> If the collection is large enough, there very well could be many shards on
> one host and it could saturate the IO. Same issue if we backup many
> collections concurrently.
> We should have a protection mechanism, so a Solr node does not have transient
> failures during a large backup or restore.
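
For anyone who wants to see the rejection outside Solr, here is a minimal sketch using a plain {{java.util.concurrent.ThreadPoolExecutor}}. It does not use Solr's {{ExecutorUtil}}. The sizing (core and max of 5 threads, queue capacity of 5) is an assumption taken from the numbers in the comment, and the class and method names are made up for illustration. Each task blocks on a latch to stand in for a long-running shard backup, so nothing completes while tasks are still being submitted.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Solr.
class BoundedPoolRejectionDemo {

  /** Submits shardCount blocking tasks and returns how many were rejected. */
  static int countRejected(int shardCount) {
    // Assumed sizing from the comment: 5 threads plus a bounded queue of 5 slots.
    ThreadPoolExecutor pool =
        new ThreadPoolExecutor(5, 5, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(5));
    // Tasks wait on this latch, simulating backups that are still in progress.
    CountDownLatch release = new CountDownLatch(1);
    int rejected = 0;
    try {
      for (int i = 0; i < shardCount; i++) {
        try {
          pool.execute(() -> {
            try {
              release.await();
            } catch (InterruptedException e) {
              Thread.currentThread().interrupt();
            }
          });
        } catch (RejectedExecutionException e) {
          // The default AbortPolicy throws once all threads are busy and the queue is full.
          rejected++;
        }
      }
    } finally {
      // Let the accepted tasks finish, then stop the pool.
      release.countDown();
      pool.shutdown();
      try {
        pool.awaitTermination(10, TimeUnit.SECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return rejected;
  }

  public static void main(String[] args) {
    System.out.println("10 shards, rejected: " + countRejected(10)); // prints 0
    System.out.println("12 shards, rejected: " + countRejected(12)); // prints 2
  }
}
```

With 12 simulated shards, the first 5 tasks occupy the threads, the next 5 fill the queue, and the remaining 2 are rejected, which matches the "pool size = 5, queued tasks = 5" state in the error above.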