[PR] Refactoring retention of backup schedules [cloudstack]

via GitHub Wed, 16 Jul 2025 07:42:40 -0700


bernardodemarco opened a new pull request, #11223:
URL: https://github.com/apache/cloudstack/pull/11223


   ### Description
   
   PR #10017 introduced, among other features, the ability to configure 
retention for backup schedules. This PR aims to fix some minor inconsistencies 
in that feature, refactor redundant workflows and behaviors, and improve code 
readability and maintainability by adding logs and more granular unit tests.
   
   Below, I have listed all the points that this PR addresses.
   
   ---
   
   #### Exposed `scheduleid` parameter
   
   When scheduling a backup creation, the ID of the schedule responsible for 
the backup is passed as a parameter to the `createBackup` API:
   
   
https://github.com/apache/cloudstack/blob/f52e05863e608f53f9e97e93fe47577b8dfe189b/server/src/main/java/org/apache/cloudstack/backup/BackupManagerImpl.java#L1307-L1322
   
   This parameter is intended solely for use by the scheduling workflow. End 
users should neither access nor be aware of it. If a backup is manually created 
while specifying a schedule ID, inconsistencies may occur later in the 
scheduling process. To prevent this, changes were made to ensure the parameter 
is not exposed to end users.
   
   ---
   
   #### Relationship between `cloud.backup` and `cloud.backup_schedule`
   
   To determine whether backups should be deleted to meet the retention 
requirements, the backup creation workflow must be able to identify whether the 
backup is scheduled and which schedule it belongs to. If the backup is 
scheduled, retention validation should be performed. On the other hand, if it 
is a manual backup, retention validation should be skipped. 
   
   Currently, the `cloud.backup` table uses the `backup_interval_type` column 
for this purpose:
   
   
https://github.com/apache/cloudstack/blob/f52e05863e608f53f9e97e93fe47577b8dfe189b/engine/schema/src/main/java/org/apache/cloudstack/backup/BackupVO.java#L91-L92
   
   
   However, the `cloud.backup_schedule` table already contains a 
`schedule_type` column, leading to data redundancy.
   
   
https://github.com/apache/cloudstack/blob/f52e05863e608f53f9e97e93fe47577b8dfe189b/engine/schema/src/main/java/org/apache/cloudstack/backup/BackupScheduleVO.java#L45-L46
   
   Because of this, the backup creation logic requires back-and-forth 
conversions between `DateUtil.IntervalType` and 
`org.apache.cloudstack.backup.Backup.Type`. For example:
   
   
https://github.com/apache/cloudstack/blob/f52e05863e608f53f9e97e93fe47577b8dfe189b/server/src/main/java/org/apache/cloudstack/backup/BackupManagerImpl.java#L819-L840
   
   Moreover, since the relationship is based solely on the interval type, 
deleting backups to comply with a schedule's retention policy may 
unintentionally remove backups from previously deleted schedules of the same 
type. For instance:
   
   1. Create an `HOURLY` schedule  
   2. 10 backups are created from this schedule  
   3. The schedule is deleted  
   4. A new `HOURLY` schedule is created with a retention of 3  
   5. On the next backup execution, 8 backups are deleted  
   
   This happens because the system currently treats all backups with the same 
interval type as belonging to the same schedule.
   
   Therefore, to address this, the relationship between `cloud.backup` and 
`cloud.backup_schedule` was refactored by removing the `schedule_type` from the 
`cloud.backup` table, and adding the `backup_schedule_id` column.
   
   This change makes it possible to associate each backup with a specific 
schedule, eliminating ambiguity and data redundancy. Manual backups will have a 
`NULL` value for `backup_schedule_id`, while scheduled backups will reference 
their respective schedule, from which the interval type can be determined.
   
   ---
   
   #### Removal of maximum allowed retention configurations
   
   The `backup.max.hourly`, `backup.max.daily`, `backup.max.weekly`, and 
`backup.max.monthly` settings are currently used to define the maximum 
retention values that end users can configure. These are validated during 
backup schedule creation, and an exception is thrown if the specified retention 
exceeds the configured maximum.
   
   However, since administrators can already define backup and allocated backup 
storage limits per account and domain, these settings have limited practical 
use. This PR proposes removing them, allowing users to omit retention entirely. 
In this approach, backup limits are enforced solely through account and 
domain-level control limits.
   
   ### Types of changes
   
   - [ ] Breaking change (fix or feature that would cause existing 
functionality to change)
   - [ ] New feature (non-breaking change which adds functionality)
   - [X] Bug fix (non-breaking change which fixes an issue)
   - [X] Enhancement (improves an existing feature and functionality)
   - [X] Cleanup (Code refactoring and cleanup, that may add test cases)
   - [ ] build/CI
   - [ ] test (unit or integration test code)
   
   ### Feature/Enhancement Scale or Bug Severity
   
   #### Feature/Enhancement Scale
   
   - [ ] Major
   - [X] Minor
   
   ### Screenshots (if appropriate):
   
   ### How Has This Been Tested?
   
   - Verified that manual backups are not associated with any schedule
   - Verified that for manual backups, the backup deletion flow is not executed 
after backups are created
   - Verified that when the retention is greater than 0 and the number of 
backups in a schedule is less than the retention, no backup is deleted
   - Verified that when the retention is greater than 0 and the number of 
backups in a schedule is greater than the retention, backups are correctly 
deleted
   - Verified that when the retention is increased, the updated retention value 
is respected
   - Verified that when the retention is reduced, the necessary number of 
backups is deleted to comply with the new retention value
   - Verified that when the backup schedule associated with a backup has a 
retention value of 0, the backup deletion flow is not executed
   - Deleted a schedule, created a new one with the same interval type, and 
verified that backups from the old schedule are not considered in the retention 
calculation
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[PR] Refactoring retention of backup schedules [cloudstack]

Reply via email to