Hi All,

Solr 9.1.1 currently doesn't let me fully restore a backup whose data and
config set are stored in an S3 bucket. Every run fails with "The specified
key does not exist". The full message is:

> An AmazonServiceException was thrown! [serviceName=S3] [awsRequestId=2C6]
> [httpStatus=404] [s3ErrorCode=NoSuchKey] [message=The specified key does
> not exist.]


After investigating further, I found that the path checked by the isDirectory
<https://github.com/apache/solr/blob/branch_9_1/solr/modules/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java#L323>
method, which decides whether a path is a directory, causes the
S3Client.headObject
<https://github.com/apache/solr/blob/branch_9_1/solr/modules/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java#L328>
call to fail. On line 324
<https://github.com/apache/solr/blob/branch_9_1/solr/modules/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java#L324>,
a path that points to a file is turned into a path ending in a slash. For
example, when the path is
"path1/path2/backup-name/collection-name/zk_backup_0/configs/config-set-v1/configoverlay.json",
`sanitizedDirPath` appends a slash ("/") to the end, producing
"path1/path2/backup-name/collection-name/zk_backup_0/configs/config-set-v1/configoverlay.json/".
No object is stored under that key, so headObject returns the 404 shown
earlier. I can restore the backup if the cluster already has the config set
in ZooKeeper, but because of this error I cannot restore the backed-up
config set files into an empty cluster.
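
To make the failure mode concrete, here is a minimal Python sketch of the
behavior described above. It is not Solr's code: `FakeS3`,
`sanitized_dir_path`, and both `is_directory` variants are hypothetical
stand-ins, and the sketch assumes directories are stored as keys ending in
"/". The tolerant variant shows one possible way to treat a missing key as
"not a directory"; whether that is the right fix for S3StorageClient is only
a guess.

```python
class NoSuchKey(Exception):
    """Stand-in for the 404 / NoSuchKey error returned by S3."""


class FakeS3:
    """Hypothetical in-memory object store that only knows exact keys."""

    def __init__(self, keys):
        self.keys = set(keys)

    def head_object(self, key):
        # Like a HEAD request: succeeds only if the exact key exists.
        if key not in self.keys:
            raise NoSuchKey(f"The specified key does not exist: {key}")
        return {"key": key}


def sanitized_dir_path(path):
    # Assumed behavior: directory-style paths always end with one slash.
    return path if path.endswith("/") else path + "/"


def is_directory_strict(store, path):
    # Mirrors the reported behavior: the HEAD error propagates to the caller.
    store.head_object(sanitized_dir_path(path))
    return True


def is_directory_tolerant(store, path):
    # Possible alternative: a missing "path/" key means "not a directory".
    try:
        store.head_object(sanitized_dir_path(path))
        return True
    except NoSuchKey:
        return False


CONFIG_DIR = "path1/path2/backup-name/collection-name/zk_backup_0/configs/config-set-v1/"
CONFIG_FILE = CONFIG_DIR + "configoverlay.json"
store = FakeS3([CONFIG_DIR, CONFIG_FILE])
```

With this store, `is_directory_strict(store, CONFIG_FILE)` raises `NoSuchKey`
because the key with the trailing slash was never written, while
`is_directory_tolerant(store, CONFIG_FILE)` simply reports that the path is
not a directory.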

For completeness, here are the other relevant parts of my setup:

Backup definition:

>   <backup>
>     <repository name="s3-repo" class="org.apache.solr.s3.S3BackupRepository" default="false">
>       <str name="s3.bucket.name">com.dev.bucket.backup.folder</str>
>       <str name="s3.region">us-east-2</str>
>     </repository>
>   </backup>


The backup folder structure on S3:

> .
> └── bucket-name
>     └── path1
>         └── path2
>             └── backup-name
>                 └── collection-name
>                     ├── backup_0.properties
>                     ├── index ...
>                     ├── shard_backup_metadata
>                     │   └── md_shard1_0.json
>                     └── zk_backup_0
>                         ├── collection_state.json
>                         └── configs
>                             └── config-set-v1
>                                 ├── configoverlay.json
>                                 ├── solrconfig.xml
>                                 ├── stopwords.txt
>                                 └── synonyms.txt


The cURL request I use for restore:

> curl -i -X POST \
>    -H "Content-Type:application/json" \
>    -d \
> '{
>   "restore-collection": {
>     "name": "backup-name",
>     "collection": "collection-name-restored",
>     "location": "path1/path2/",
>     "repository": "s3-repo"
>   }
> }' \
>  'http://localhost:8983/api/c'
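
The same request can also be built in Python, which has the side benefit
that the JSON body is generated rather than typed by hand. This is only a
sketch: `build_restore_request` is a hypothetical helper, the endpoint is the
one from the cURL call, and the repository name is assumed to be the
`s3-repo` from the backup definition. Nothing is sent until
`urllib.request.urlopen` is called.

```python
import json
import urllib.request


def build_restore_request(base_url, backup_name, collection, location, repository):
    # Body of the "restore-collection" command, as used in the cURL call.
    payload = {
        "restore-collection": {
            "name": backup_name,
            "collection": collection,
            "location": location,
            "repository": repository,
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/api/c",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_restore_request(
    "http://localhost:8983",
    "backup-name",
    "collection-name-restored",
    "path1/path2/",
    "s3-repo",  # assumed to match the repository name in the backup definition
)
# To actually send it: urllib.request.urlopen(request)
```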

Could you please point me in the right direction on this issue?

Thanks!
Hakan
