[jira] [Created] (SOLR-16670) Couldn't restore a backed up config set from S3
ozlerhakan created SOLR-16670:
---

Summary: Couldn't restore a backed up config set from S3
Key: SOLR-16670
URL: https://issues.apache.org/jira/browse/SOLR-16670
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: Backup/Restore
Affects Versions: 9.1.1, 9.1
Reporter: ozlerhakan

Solr 9.1.x currently doesn't allow me to fully restore a backup whose data and config set are stored in an S3 bucket. On every run I receive the error "The specified key does not exist". The full message is:

{code:java}
An AmazonServiceException was thrown! [serviceName=S3] [awsRequestId=2C6] [httpStatus=404] [s3ErrorCode=NoSuchKey] [message=The specified key does not exist.]
{code}

After investigating further, I found that the path the [isDirectory|https://github.com/apache/solr/blob/branch_9_1/solr/modules/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java#L323] method uses to check whether an entry is a directory makes the `[S3Client.headObject|https://github.com/apache/solr/blob/branch_9_1/solr/modules/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java#L328]` call fail. On [line 324|https://github.com/apache/solr/blob/branch_9_1/solr/modules/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java#L324], the path pointing to a file is transformed into a path ending in a slash. For example, when the path is "path1/path2/backup-name/collection-name/zk_backup_0/configs/config-set-v1/configoverlay.json", `sanitizedDirPath` appends a _slash_ "/" character to the end, producing "path1/path2/backup-name/collection-name/zk_backup_0/configs/config-set-v1/configoverlay.json/".

Although I can restore the backup if the cluster already has the config schema definition in ZooKeeper, I cannot restore the backed-up config schema files when creating an empty cluster because of this error.
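To make the failure mode above concrete, here is a minimal, self-contained sketch. It is not the Solr code: the class name, the in-memory BUCKET map, the "application/x-directory" marker, and both isDirectory variants are assumptions made for illustration, and NoSuchElementException merely stands in for the S3 "NoSuchKey" error. It shows how appending a slash to a file path yields a key that does not exist, and one possible way a directory check could treat a missing key as "not a directory" instead of propagating the error.

```java
import java.util.Map;
import java.util.NoSuchElementException;

public class IsDirectorySketch {
    // Assumed content type used to mark "directory" objects in this sketch.
    static final String DIR_CONTENT_TYPE = "application/x-directory";

    // Stand-in for a bucket: object key -> content type.
    static final Map<String, String> BUCKET = Map.of(
        "configs/config-set-v1/", DIR_CONTENT_TYPE,
        "configs/config-set-v1/configoverlay.json", "application/json");

    // Appends a trailing slash when the path does not already end with one.
    static String sanitizedDirPath(String path) {
        return path.endsWith("/") ? path : path + "/";
    }

    // Simulated metadata lookup: throws when the key is absent,
    // standing in for the 404 / NoSuchKey response described in the report.
    static String headObject(String key) {
        String contentType = BUCKET.get(key);
        if (contentType == null) {
            throw new NoSuchElementException("The specified key does not exist: " + key);
        }
        return contentType;
    }

    // Strict variant: a file path becomes "<file>/", which is not a key, so this throws.
    static boolean isDirectoryStrict(String path) {
        return DIR_CONTENT_TYPE.equalsIgnoreCase(headObject(sanitizedDirPath(path)));
    }

    // Lenient variant: a missing key simply means "not a directory".
    static boolean isDirectoryLenient(String path) {
        try {
            return isDirectoryStrict(path);
        } catch (NoSuchElementException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String file = "configs/config-set-v1/configoverlay.json";
        System.out.println(sanitizedDirPath(file));                      // configs/config-set-v1/configoverlay.json/
        System.out.println(isDirectoryLenient("configs/config-set-v1")); // true
        System.out.println(isDirectoryLenient(file));                    // false
    }
}
```

With the strict variant, calling isDirectoryStrict on the file path throws, which mirrors the restore failure reported here; the lenient variant returns false for the same input.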
For completeness, here are the other relevant details.

Backup definition:

{code:java}
com.dev.bucket.backup.folder
us-east-2
{code}

The backup folder structure on S3:

{code:java}
.
└── bucket-name
    └── path1
        └── path2
            └── backup-name
                └── collection-name
                    ├── backup_0.properties
                    ├── index ...
                    ├── shard_backup_metadata
                    │   └── md_shard1_0.json
                    └── zk_backup_0
                        ├── collection_state.json
                        └── configs
                            └── config-set-v1
                                ├── configoverlay.json
                                ├── solrconfig.xml
                                ├── stopwords.txt
                                └── synonyms.txt
{code}

The cURL request I use for the restore:

{code:java}
curl -i -X POST \
  -H "Content-Type:application/json" \
  -d \
'{
  "restore-collection": {
    "name": "backup-name",
    "collection": "collection-name-restored",
    "location": "path1/path2/",
    "repository": "s3-pro"
  }
}' \
  'http://localhost:8983/api/c'
{code}

This is the original [question|https://lists.apache.org/thread/krrn3z5q4891fzcxs5dgcgkoohs86ncs], which gives the same description.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-16670) Couldn't restore a backed up config set from S3
[ https://issues.apache.org/jira/browse/SOLR-16670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17691505#comment-17691505 ]

ozlerhakan commented on SOLR-16670:
---

I'll let you know about the result.

> Couldn't restore a backed up config set from S3
> ---
>
> Key: SOLR-16670
> URL: https://issues.apache.org/jira/browse/SOLR-16670
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Backup/Restore
> Affects Versions: 9.1, 9.1.1
> Reporter: ozlerhakan
> Priority: Major
> Labels: Restore, S3, aws-s3
> Time Spent: 10m
> Remaining Estimate: 0h
[jira] [Commented] (SOLR-16670) Couldn't restore a backed up config set from S3
[ https://issues.apache.org/jira/browse/SOLR-16670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17691535#comment-17691535 ]

ozlerhakan commented on SOLR-16670:
---

It definitely addresses the issue! You could add another assert to the [testDirectory|https://github.com/apache/solr/blob/main/solr/modules/s3-repository/src/test/org/apache/solr/s3/S3PathsTest.java#L46] method to reproduce the exception. For example:

{code:java}
assertFalse("File should not exist in the path", client.isDirectory("/simple-directory/empty.xml"));
{code}