[
https://issues.apache.org/jira/browse/SOLR-17949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Prateek Singhal updated SOLR-17949:
-----------------------------------
Description:
Currently, Solr lacks native support for backing up and restoring collections
to Azure Blob Storage. Organizations running Solr on Azure infrastructure have
no built-in way to leverage Azure Blob Storage for their backup and restore
operations, forcing them to use either local filesystem repositories (which
don't scale well in cloud environments) or third-party solutions.
This is problematic for Azure-based deployments because:
- Azure Blob Storage is the natural, cost-effective storage solution in Azure
environments
- Lack of native Azure support creates operational complexity
- Users cannot take advantage of Azure's built-in durability, geo-replication,
and lifecycle management
This contribution adds a BlobBackupRepository module that implements Solr's
BackupRepository interface for Azure Blob Storage, following the same patterns
as the existing GCS and S3 backup repositories.
Implementation approach:
- New blob-repository module under solr/modules/
- Support for 4 authentication methods (Connection String, Account Name + Key,
SAS Token, Azure Identity)
- Compatible with Azurite emulator for local development
- Follows Solr's established backup repository patterns
- 76 unit tests covering all authentication methods and backup/restore
operations
- All dependencies use Apache 2.0 compatible licenses
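To illustrate how a repository might choose among the four authentication
methods listed above, here is a minimal Java sketch. The class name, the
configuration keys (connectionString, accountName, accountKey, sasToken), and
the precedence order are all assumptions made for illustration; none of them
are taken from the patch.

```java
import java.util.Map;

// Hypothetical sketch: names, keys, and precedence are NOT from the patch.
final class BlobAuthSelector {

  /** The four authentication methods named in the description. */
  enum AuthMethod { CONNECTION_STRING, ACCOUNT_KEY, SAS_TOKEN, AZURE_IDENTITY }

  /**
   * Picks an authentication method from whichever settings are present.
   * Assumed precedence: connection string, account name + key, SAS token,
   * then Azure Identity as the fallback when no explicit secret is given.
   */
  static AuthMethod resolve(Map<String, String> config) {
    if (hasValue(config, "connectionString")) {
      return AuthMethod.CONNECTION_STRING;
    }
    if (hasValue(config, "accountName") && hasValue(config, "accountKey")) {
      return AuthMethod.ACCOUNT_KEY;
    }
    if (hasValue(config, "sasToken")) {
      return AuthMethod.SAS_TOKEN;
    }
    return AuthMethod.AZURE_IDENTITY;
  }

  // True when the key is present and its value is not blank.
  private static boolean hasValue(Map<String, String> config, String key) {
    String value = config.get(key);
    return value != null && !value.isBlank();
  }

  public static void main(String[] args) {
    // Only a SAS token is configured, so the SAS method is selected.
    System.out.println(resolve(Map.of("sasToken", "placeholder-token")));
  }
}
```

Falling back to Azure Identity when no secret is configured is one possible
design choice: it would let managed-identity deployments work without any
credentials in the Solr configuration.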
This enables Solr users on Azure to perform native backup and restore
operations using Azure Blob Storage, with the same ease of use as S3 and GCS
repositories.
was:
This contribution adds a new backup repository implementation for Azure Blob
Storage, allowing Solr to back up and restore collections to Microsoft Azure.
*Features*
- Full backup/restore functionality to Azure Blob Storage
- Support for 4 authentication methods:
* Connection String (for development)
* Account Name + Key (for simple production)
* SAS Token (recommended for production)
* Azure Identity (Managed Identity, Service Principal, Azure CLI)
- Compatible with local testing using Azurite emulator
- Comprehensive documentation and tests
- Incremental backup support with versioning
- Data integrity verification (checksum validation)
- Tested with collections up to 1GB+ with 100K+ documents
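The checksum validation mentioned in the list above could look roughly like
the following Java sketch. The description does not say which checksum the
module uses; CRC32 and the class and method names here are assumptions chosen
for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch: the checksum algorithm and names are NOT from the patch.
final class BlobChecksumCheck {

  /** Computes the CRC32 checksum of the given bytes. */
  static long crc32(byte[] data) {
    CRC32 crc = new CRC32();
    crc.update(data);
    return crc.getValue();
  }

  /** Returns true when the downloaded bytes match the recorded checksum. */
  static boolean matches(byte[] downloaded, long expectedChecksum) {
    return crc32(downloaded) == expectedChecksum;
  }

  public static void main(String[] args) {
    byte[] original = "segment data".getBytes(StandardCharsets.UTF_8);
    long recorded = crc32(original); // stored alongside the blob at backup time

    // On restore, recompute the checksum and compare before trusting the file.
    byte[] restored = "segment data".getBytes(StandardCharsets.UTF_8);
    System.out.println(matches(restored, recorded));
  }
}
```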
*Implementation Details*
- 8 implementation files
- 8 test files
- 76/76 passing unit tests
- All authentication methods verified
- Integration tested with real Azure Blob Storage
*Dependencies*
All dependencies use Apache-compatible licenses:
- Azure SDK for Java (Storage Blobs) 12.25.0 - Apache 2.0 license
- Azure SDK for Java (Identity) 1.11.0 - Apache 2.0 license
*Testing*
Local Testing with Azurite:
```bash
# Install Azurite
npm install -g azurite
# Start Azurite
azurite --silent --location /tmp/azurite
# Run tests
./gradlew :solr:modules:blob-repository:test
```
*Integration Testing:*
- 76/76 unit tests passing
- Tested with real Azure Blob Storage
- Large collection testing (1GB, 100K documents)
- All 4 authentication methods verified
- Data integrity verified
> Add Azure Blob Storage backup repository module
> -----------------------------------------------
>
> Key: SOLR-17949
> URL: https://issues.apache.org/jira/browse/SOLR-17949
> Project: Solr
> Issue Type: New Feature
> Components: Backup/Restore
> Environment: * Tested with Java 17+
> * Compatible with Solr 10.x
> * Works with Azurite (local) and Azure Blob Storage (production)
> * All major operating systems (macOS, Linux, Windows)
> Reporter: Prateek Singhal
> Priority: Major
> Labels: azure, azureblob, backup, pull-request-available, restore
> Time Spent: 10m
> Remaining Estimate: 0h