[jira] [Created] (SOLR-16670) Couldn't restore a backed up config set from S3

2023-02-17 Thread ozlerhakan (Jira)
ozlerhakan created SOLR-16670:
-

 Summary: Couldn't restore a backed up config set from S3
 Key: SOLR-16670
 URL: https://issues.apache.org/jira/browse/SOLR-16670
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Backup/Restore
Affects Versions: 9.1.1, 9.1
Reporter: ozlerhakan


 
Solr 9.1.x does not let me fully restore a backup whose data and config set 
are stored in an S3 bucket. Every run fails with the error "The specified key 
does not exist". The full message is:
 
{code:java}
An AmazonServiceException was thrown! [serviceName=S3] [awsRequestId=2C6] 
[httpStatus=404] [s3ErrorCode=NoSuchKey] [message=The specified key does not 
exist.] {code}
 
After investigating further, I found that the path used to check whether a key 
is a directory in the 
[isDirectory|https://github.com/apache/solr/blob/branch_9_1/solr/modules/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java#L323]
 method makes the 
`[S3Client.headObject|https://github.com/apache/solr/blob/branch_9_1/solr/modules/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java#L328]`
 call fail. On 
[line|https://github.com/apache/solr/blob/branch_9_1/solr/modules/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java#L324]
 324, a path that points to a file is turned into a directory path by appending 
a slash. For example, when the path is 
"path1/path2/backup-name/collection-name/zk_backup_0/configs/config-set-v1/configoverlay.json",
 `sanitizedDirPath` appends a slash ("/") to it, producing 
"path1/path2/backup-name/collection-name/zk_backup_0/configs/config-set-v1/configoverlay.json/",
 a key that does not exist in the bucket. I can restore the backup if the 
cluster already has the config set definition in ZooKeeper, but I cannot 
restore the backed-up config set files into an empty cluster because of this 
error.
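
A minimal sketch of the behaviour described above, under stated assumptions. This is not Solr's code and not the fix that was applied. An in-memory key set stands in for the bucket, and `IsDirectorySketch`, `KEYS`, `NoSuchKeyException` and this `headObject` are hypothetical stand-ins. It also assumes directories are stored as marker keys ending in "/". It shows how appending a slash to a file path yields a key that does not exist, and one way a check could treat that case as "not a directory" instead of failing:

```java
import java.util.Set;

// Illustrative only: an in-memory key set stands in for the S3 bucket.
final class IsDirectorySketch {

    // Hypothetical bucket contents: one directory marker key and one file key.
    static final Set<String> KEYS = Set.of(
        "path1/path2/backup-name/collection-name/zk_backup_0/configs/config-set-v1/",
        "path1/path2/backup-name/collection-name/zk_backup_0/configs/config-set-v1/configoverlay.json");

    // Stand-in for the 404 NoSuchKey error returned by S3.
    static final class NoSuchKeyException extends RuntimeException {
        NoSuchKeyException(String key) {
            super("The specified key does not exist: " + key);
        }
    }

    // Appends a trailing slash when it is missing, as described for sanitizedDirPath.
    static String sanitizedDirPath(String path) {
        return path.endsWith("/") ? path : path + "/";
    }

    // Stand-in for a HEAD request: throws when the key is absent.
    static void headObject(String key) {
        if (!KEYS.contains(key)) {
            throw new NoSuchKeyException(key);
        }
    }

    // Treats a missing key as "not a directory" instead of propagating the error.
    static boolean isDirectory(String path) {
        try {
            headObject(sanitizedDirPath(path));
            return true;
        } catch (NoSuchKeyException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String dir = "path1/path2/backup-name/collection-name/zk_backup_0/configs/config-set-v1";
        String file = dir + "/configoverlay.json";
        System.out.println(dir + " -> " + isDirectory(dir));
        System.out.println(file + " -> " + isDirectory(file));
    }
}
```

With the file path, the sanitized key ends in "configoverlay.json/", which is not in the key set, so the lookup fails in the same way as the reported NoSuchKey error.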
  
For completeness, here are the other details:
 
Backup definition:
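{code:xml}
<backup>
  <repository name="s3-pro" class="org.apache.solr.s3.S3BackupRepository" default="false">
    <str name="s3.bucket.name">com.dev.bucket.backup.folder</str>
    <str name="s3.region">us-east-2</str>
  </repository>
</backup> {code}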
 
The backup folder structure on S3:
 
{code:java}
.
└── bucket-name
    └── path1
        └── path2
            └── backup-name
                └── collection-name
                    ├── backup_0.properties
                    ├── index ...
                    ├── shard_backup_metadata
                    │   └── md_shard1_0.json
                    └── zk_backup_0
                        ├── collection_state.json
                        └── configs
                            └── config-set-v1
                                ├── configoverlay.json
                                ├── solrconfig.xml
                                ├── stopwords.txt
                                └── synonyms.txt {code}
 

 
The cURL request I use for restore:
{code:java}
curl -i -X POST \
   -H "Content-Type:application/json" \
   -d \
'{
  "restore-collection": {
    "name": "backup-name",
    "collection": "collection-name-restored",
    "location": "path1/path2/",
    "repository": "s3-pro"
  }
}' \
 'http://localhost:8983/api/c' {code}
This is the original 
[question|https://lists.apache.org/thread/krrn3z5q4891fzcxs5dgcgkoohs86ncs] 
providing the same description.
 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-16670) Couldn't restore a backed up config set from S3

2023-02-21 Thread ozlerhakan (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-16670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17691505#comment-17691505
 ] 

ozlerhakan commented on SOLR-16670:
---

I'll let you know about the result.

> Couldn't restore a backed up config set from S3
> ---
>
> Key: SOLR-16670
> URL: https://issues.apache.org/jira/browse/SOLR-16670
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Backup/Restore
>Affects Versions: 9.1, 9.1.1
>Reporter: ozlerhakan
>Priority: Major
>  Labels: Restore, S3, aws-s3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  






[jira] [Commented] (SOLR-16670) Couldn't restore a backed up config set from S3

2023-02-21 Thread ozlerhakan (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-16670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17691535#comment-17691535
 ] 

ozlerhakan commented on SOLR-16670:
---

It definitely addresses the issue! You could add another assert to the 
[testDirectory|https://github.com/apache/solr/blob/main/solr/modules/s3-repository/src/test/org/apache/solr/s3/S3PathsTest.java#L46]
 method to reproduce the exception. For example:
{code:java}
assertFalse("File should not exist in the path", 
client.isDirectory("/simple-directory/empty.xml"));
{code}



