ozlerhakan created SOLR-16670:
---------------------------------

             Summary: Couldn't restore a backed up config set from S3
                 Key: SOLR-16670
                 URL: https://issues.apache.org/jira/browse/SOLR-16670
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: Backup/Restore
    Affects Versions: 9.1.1, 9.1
            Reporter: ozlerhakan


 
Solr 9.1.x currently doesn't allow me to fully restore a backup whose data and 
config set are stored in an S3 bucket. The error I receive on every run is 
"The specified key does not exist". The full message is:
 
{code:java}
An AmazonServiceException was thrown! [serviceName=S3] [awsRequestId=2C6] 
[httpStatus=404] [s3ErrorCode=NoSuchKey] [message=The specified key does not 
exist.] {code}
 
After investigating further, I found that the path used to check whether an 
entry is a directory in the 
[isDirectory|https://github.com/apache/solr/blob/branch_9_1/solr/modules/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java#L323]
 method causes the 
`[S3Client.headObject|https://github.com/apache/solr/blob/branch_9_1/solr/modules/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java#L328]`
 call to fail. On 
[line|https://github.com/apache/solr/blob/branch_9_1/solr/modules/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java#L324]
 324, the path pointing to a file is turned into a path ending in a slash. 
For example, when the path is 
"path1/path2/backup-name/collection-name/zk_backup_0/configs/config-set-v1/configoverlay.json",
 `sanitizedDirPath` appends a _slash_ "/" character to it, giving 
"path1/path2/backup-name/collection-name/zk_backup_0/configs/config-set-v1/configoverlay.json{color:#ff0000}/{color}".
 Although I'm able to restore the backup if the cluster already has the config 
set definition in ZooKeeper, I cannot restore the backed-up config set files 
into an empty cluster because of this error.
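
To make the suspected failure mode concrete, here is a minimal, self-contained sketch. It does not use the AWS SDK or Solr's actual code: the class `IsDirectorySketch`, the in-memory "bucket", the `NoSuchKeyException` stand-in, and the `application/x-directory` content type are all assumptions made for illustration. It only shows how looking up a file key with a slash appended can produce a "key does not exist" error, and one possible way a caller could treat that as "not a directory".

```java
import java.util.HashMap;
import java.util.Map;

public class IsDirectorySketch {
    // Assumed marker content type for "directory" objects (illustrative only).
    static final String DIR_CONTENT_TYPE = "application/x-directory";

    // Hypothetical stand-in for the S3 NoSuchKey (HTTP 404) error.
    static class NoSuchKeyException extends RuntimeException {
        NoSuchKeyException(String key) {
            super("The specified key does not exist: " + key);
        }
    }

    // Hypothetical in-memory bucket: object key -> content type.
    private final Map<String, String> objects = new HashMap<>();

    void put(String key, String contentType) {
        objects.put(key, contentType);
    }

    // Mimics a HEAD request: fails if the exact key is absent.
    String headContentType(String key) {
        String contentType = objects.get(key);
        if (contentType == null) {
            throw new NoSuchKeyException(key);
        }
        return contentType;
    }

    // Appends a trailing slash when missing, as described in the report.
    static String sanitizedDirPath(String path) {
        return path.endsWith("/") ? path : path + "/";
    }

    // Lets the "no such key" error escape when the path is really a file.
    boolean isDirectoryStrict(String path) {
        String contentType = headContentType(sanitizedDirPath(path));
        return DIR_CONTENT_TYPE.equalsIgnoreCase(contentType);
    }

    // One possible alternative: a missing "path/" key means "not a directory".
    boolean isDirectoryLenient(String path) {
        try {
            return isDirectoryStrict(path);
        } catch (NoSuchKeyException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        IsDirectorySketch bucket = new IsDirectorySketch();
        bucket.put("configs/config-set-v1/", DIR_CONTENT_TYPE);
        bucket.put("configs/config-set-v1/configoverlay.json", "application/json");

        String file = "configs/config-set-v1/configoverlay.json";
        System.out.println("lenient: " + bucket.isDirectoryLenient(file));
        try {
            bucket.isDirectoryStrict(file);
        } catch (NoSuchKeyException e) {
            System.out.println("strict: " + e.getMessage());
        }
    }
}
```

Whether the real `isDirectory` should swallow the 404 or avoid the lookup for file paths altogether is a design decision for the maintainers; the sketch only illustrates the symptom.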
  
For completeness, the rest of my setup is described below.
 
Backup definition:
{code:java}
  <backup>
    <repository name="s3-repo" class="org.apache.solr.s3.S3BackupRepository" default="false">
      <str name="s3.bucket.name">com.dev.bucket.backup.folder</str>
      <str name="s3.region">us-east-2</str>
    </repository>
  </backup> {code}
 
The backup folder structure on S3:
 
{code:java}
.
└── bucket-name
    └── path1
        └── path2
            └── backup-name
                └── collection-name
                    ├── backup_0.properties
                    ├── index ...
                    ├── shard_backup_metadata
                    │   └── md_shard1_0.json
                    └── zk_backup_0
                        ├── collection_state.json
                        └── configs
                            └── config-set-v1
                                ├── configoverlay.json
                                ├── solrconfig.xml
                                ├── stopwords.txt
                                └── synonyms.txt {code}
 

 
The cURL request I use for restore:
{code:java}
curl -i -X POST \
   -H "Content-Type:application/json" \
   -d \
'{
  "restore-collection": {
    "name": "backup-name",
    "collection": "collection-name-restored",
    "location": "path1/path2/",
    "repository": "s3-repo"
  }
}' \
 'http://localhost:8983/api/c' {code}
The same description was originally posted to the mailing list in this 
[question|https://lists.apache.org/thread/krrn3z5q4891fzcxs5dgcgkoohs86ncs].
 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

