[ 
https://issues.apache.org/jira/browse/SOLR-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875448#comment-16875448
 ] 

Lucene/Solr QA commented on SOLR-9961:
--------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} SOLR-9961 does not apply to master. Rebase required? Wrong 
Branch? See 
https://wiki.apache.org/solr/HowToContribute#Creating_the_patch_file for help. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-9961 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12973125/SOLR-9961.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-SOLR-Build/466/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> RestoreCore needs the option to download files in parallel.
> -----------------------------------------------------------
>
>                 Key: SOLR-9961
>                 URL: https://issues.apache.org/jira/browse/SOLR-9961
>             Project: Solr
>          Issue Type: Improvement
>          Components: Backup/Restore
>    Affects Versions: 6.2.1
>            Reporter: Timothy Potter
>            Priority: Major
>         Attachments: SOLR-9961.patch, SOLR-9961.patch, SOLR-9961.patch
>
>
> My backup to cloud storage (Google cloud storage in this case, but I think 
> this is a general problem) takes 8 minutes ... the restore of the same core 
> takes hours. The restore loop in RestoreCore is serial and doesn't allow me 
> to parallelize the expensive part of this operation (the IO from the remote 
> cloud storage service). We need the option to parallelize the download (like 
> distcp). 
> Also, I tried downloading the same directory using gsutil and it was very 
> fast, like 2 minutes. So I know it's not the pipe that's limiting perf here.
> Here's a very rough patch that does the parallelization. We may also want to 
> consider a two-step approach: 1) download in parallel to a temp dir, 2) 
> perform all the of the checksum validation against the local temp dir. That 
> will save round trips to the remote cloud storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to