mlbiscoc commented on code in PR #4023:
URL: https://github.com/apache/solr/pull/4023#discussion_r3157499792
##########
solr/core/src/java/org/apache/solr/handler/RestoreCore.java:
##########
@@ -107,34 +125,140 @@ public boolean doRestore() throws Exception {
DirectoryFactory.DirContext.DEFAULT,
core.getSolrConfig().indexConfig.lockType);
Set<String> indexDirFiles = new HashSet<>(Arrays.asList(indexDir.listAll()));
- // Move all files from backupDir to restoreIndexDir
- for (String filename : repository.listAllFiles()) {
- checkInterrupted();
- try {
- if (indexDirFiles.contains(filename)) {
- Checksum cs = repository.checksum(filename);
- IndexFetcher.CompareResult compareResult;
- if (cs == null) {
- compareResult = new IndexFetcher.CompareResult();
- compareResult.equal = false;
- } else {
- compareResult = IndexFetcher.compareFile(indexDir, filename, cs.size, cs.checksum);
+
+ // Capture directories as final for lambda access
+ final Directory finalIndexDir = indexDir;
+ final Directory finalRestoreIndexDir = restoreIndexDir;
+
+ // Only use an executor for parallel downloads when parallelism > 1
+ // When set to 1, run synchronously to avoid thread-local state issues with CallerRunsPolicy
+ int maxParallelDownloads = DEFAULT_MAX_PARALLEL_DOWNLOADS;
+ ExecutorService executor =
+ maxParallelDownloads > 1
+ ? new ExecutorUtil.MDCAwareThreadPoolExecutor(
+ 0,
+ maxParallelDownloads,
+ 60L,
+ TimeUnit.SECONDS,
+ new SynchronousQueue<>(),
+ new SolrNamedThreadFactory("RestoreCore"),
+ new ThreadPoolExecutor.CallerRunsPolicy())
Review Comment:
Actually, now that I think about it more, a static pool on its own might make
this worse: if it is set to 1, all callers fight over a single thread. So we
should definitely keep what you are doing here, but don't create the pool
inside the call. Make it a static pool with
`ThreadPoolExecutor.CallerRunsPolicy()`. Then, if the global pool gets
saturated, the calling thread takes over the work, and we avoid creating an
executor per call.
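
To illustrate the suggestion, here is a rough, JDK-only sketch of a shared static pool with `CallerRunsPolicy`. It is not the PR's code: the class name `SharedRestorePool`, the method `restoreAll`, the cap of 2, and the daemon thread factory are all hypothetical stand-ins (in Solr this would presumably be `ExecutorUtil.MDCAwareThreadPoolExecutor` with a `SolrNamedThreadFactory`), and the per-file "copy" is just a counter increment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Future;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class; not part of Solr.
final class SharedRestorePool {
  // Assumed cap on worker threads shared by ALL restore calls in the JVM.
  static final int MAX_PARALLEL_DOWNLOADS = 2;

  // Created once and shared. SynchronousQueue means no task is ever queued:
  // a task either gets a worker thread immediately or is rejected, and
  // CallerRunsPolicy then runs the rejected task on the submitting thread.
  static final ThreadPoolExecutor POOL =
      new ThreadPoolExecutor(
          0,
          MAX_PARALLEL_DOWNLOADS,
          60L,
          TimeUnit.SECONDS,
          new SynchronousQueue<>(),
          r -> {
            // Daemon threads so this sketch never blocks JVM exit (assumption).
            Thread t = new Thread(r, "RestoreCore-shared");
            t.setDaemon(true);
            return t;
          },
          new ThreadPoolExecutor.CallerRunsPolicy());

  // Submits one task per file and waits for all of them; returns how many ran.
  static int restoreAll(List<String> files) throws Exception {
    AtomicInteger copied = new AtomicInteger();
    List<Future<?>> futures = new ArrayList<>();
    for (String file : files) {
      // Stand-in for copying `file`; when the pool is saturated this
      // runs inline on the caller thread instead of being dropped.
      futures.add(POOL.submit(() -> { copied.incrementAndGet(); }));
    }
    for (Future<?> f : futures) {
      f.get(); // propagates any failure from a task
    }
    return copied.get();
  }

  public static void main(String[] args) throws Exception {
    int n = restoreAll(List.of("segments_1", "_0.cfs", "_0.cfe", "_0.si"));
    System.out.println("tasks completed: " + n);
    POOL.shutdown();
  }
}
```

With this shape, concurrent callers share at most `MAX_PARALLEL_DOWNLOADS` worker threads, and any caller that finds the pool busy simply does its own work, so no call is starved even if the cap is 1. One thing to keep in mind is that tasks run by the caller see the caller's thread-local state, which is the concern the diff comment raises for the single-thread case.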
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]