[I] Apache CloudStack Usage Server repeatedly reprocesses historical usage from fixed start date and creates duplicate cloud_usage records [cloudstack]

via GitHub Wed, 06 May 2026 16:21:05 -0700


DaniloMurbach opened a new issue, #13112:
URL: https://github.com/apache/cloudstack/issues/13112


   ### problem
   
   We identified a persistent duplication issue in the Apache CloudStack Usage 
Server that appears to be specific to Network Offering usage accounting 
(usage_type=13).
   
   Every day, the Usage Server reprocesses historical usage records starting 
from 2026-01-13 00:00:37 and inserts duplicate entries into the cloud_usage 
table.
   
   Observed behavior:
   
   - Duplicate cloud_usage rows are inserted repeatedly for the same:
     - vm_instance_id
     - offering_id
     - usage_type
     - start_date
     - end_date
     - raw_usage
   - The duplicate count increases daily. Example:
     - records for 2026-01-14 currently have 112 duplicates
     - records for 2026-01-15 currently have 111 duplicates
     - records for 2026-01-16 currently have 110 duplicates
   
   The pattern strongly suggests that the Usage Server is reprocessing the 
entire historical usage window every day instead of advancing the aggregation 
checkpoint.
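
The counts above can be checked with simple date arithmetic. The sketch below assumes one aggregation run per day and that the most recent run happened on 2026-05-06 (the report date); `expected_row_count` is a hypothetical helper, not part of CloudStack. Under those assumptions, a record that is re-inserted by every daily run after its usage day would have exactly the observed counts.

```python
from datetime import date


def expected_row_count(usage_day: date, last_run_day: date) -> int:
    """Rows expected for usage_day if each daily run after it, up to and
    including last_run_day, inserts one more copy of the same record."""
    return max((last_run_day - usage_day).days, 0)


# Assumption: the most recent run happened on 2026-05-06, the report date.
last_run_day = date(2026, 5, 6)
for usage_day in (date(2026, 1, 14), date(2026, 1, 15), date(2026, 1, 16)):
    print(usage_day, expected_row_count(usage_day, last_run_day))
```

This prints 112, 111, and 110, matching the example counts, which is consistent with one extra copy per day rather than a one-off import error.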
   
   Example duplicate query:
   
   SELECT
     vm_instance_id,
     offering_id,
     usage_type,
     start_date,
     end_date,
     raw_usage,
     COUNT(*) AS duplicates
   FROM cloud_usage
   WHERE usage_type = 13
   GROUP BY
     vm_instance_id,
     offering_id,
     usage_type,
     start_date,
     end_date,
     raw_usage
   HAVING COUNT(*) > 1
   ORDER BY duplicates DESC;
   
   Example result:
   
   vm_instance_id = 456
   offering_id    = 39
   usage_type     = 13
   start_date     = 2026-01-14 00:00:00
   end_date       = 2026-01-14 23:59:59
   raw_usage      = 24
   duplicates     = 112
   
   We also observed repeated failed usage_job entries with inverted processing 
windows, where start_date is later than end_date:
   
   start_date = 2026-05-06 00:00:00
   end_date   = 2026-05-05 23:59:59
   success    = 0
   
   At the same time, successful jobs continuously reprocess the same historical 
range:
   
   start_date = 2026-01-13 00:00:37
   end_date   = 2026-05-05 23:59:59
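
Both symptoms can be flagged from an export of the usage_job table. This is a minimal sketch that works on in-memory rows; the `UsageJob` tuple and the helper names are hypothetical, and only the start_date, end_date, and success columns mentioned in this report are assumed.

```python
from datetime import datetime
from typing import NamedTuple


class UsageJob(NamedTuple):
    start_date: datetime
    end_date: datetime
    success: bool


def inverted_windows(jobs):
    """Jobs whose window starts after it ends."""
    return [j for j in jobs if j.start_date > j.end_date]


def repeated_starts(jobs):
    """Start dates shared by more than one successful job, which suggests
    the checkpoint did not advance between runs."""
    seen, repeated = set(), set()
    for j in jobs:
        if not j.success:
            continue
        if j.start_date in seen:
            repeated.add(j.start_date)
        seen.add(j.start_date)
    return sorted(repeated)


jobs = [
    # Illustrative earlier run, not taken from the report.
    UsageJob(datetime(2026, 1, 13, 0, 0, 37), datetime(2026, 5, 4, 23, 59, 59), True),
    # The two windows quoted in the report.
    UsageJob(datetime(2026, 1, 13, 0, 0, 37), datetime(2026, 5, 5, 23, 59, 59), True),
    UsageJob(datetime(2026, 5, 6, 0, 0, 0), datetime(2026, 5, 5, 23, 59, 59), False),
]

print(inverted_windows(jobs))
print(repeated_starts(jobs))
```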
   
   This causes:
   
   - continuous duplicate billing records
   - inflated usage data
   - incorrect accounting
   - corrupted billing exports/reports
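
To illustrate the size of the inflation, the sketch below sums raw_usage for the example result with and without collapsing identical rows. The helper names are hypothetical, and treating rows that match on all six columns as duplicates is an assumption that holds for this example but should be verified before applying it to real billing data.

```python
def naive_total(rows):
    """Sum raw_usage over every row, duplicates included."""
    return sum(r[-1] for r in rows)


def deduplicated_total(rows):
    """Sum raw_usage once per distinct row."""
    return sum(r[-1] for r in set(rows))


# Columns: vm_instance_id, offering_id, usage_type, start_date, end_date, raw_usage
row = (456, 39, 13, "2026-01-14 00:00:00", "2026-01-14 23:59:59", 24)
rows = [row] * 112  # 112 identical rows, as in the example result

print(naive_total(rows))         # 2688
print(deduplicated_total(rows))  # 24
```

For this single day, a report that sums all rows would bill 2688 hours instead of 24.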
   
   ### versions
   
   Apache CloudStack Version: 4.21.0.0
   Database: 10.6.22-MariaDB-0ubuntu0.22.04.1-log
   Server: Ubuntu 22.04
   
   ### The steps to reproduce the bug
   
   1. Enable Usage Server in Apache CloudStack.
   2. Allow the Usage aggregation jobs to run normally for several days/weeks.
   3. Observe the usage_job table.
   4. Observe that successful jobs repeatedly start from the same historical 
date instead of advancing incrementally.
   5. Observe duplicate rows being inserted repeatedly into cloud_usage.
   6. Run:
   SELECT
     vm_instance_id,
     offering_id,
     usage_type,
     start_date,
     end_date,
     raw_usage,
     COUNT(*) AS duplicates
   FROM cloud_usage
   WHERE usage_type = 13
   GROUP BY
     vm_instance_id,
     offering_id,
     usage_type,
     start_date,
     end_date,
     raw_usage
   HAVING COUNT(*) > 1;
   7. Observe that duplicate counts increase daily.
   
   ### What to do about it?
   
   Potential areas to investigate:
   
   - Usage Server aggregation checkpoint handling
   - Usage job state persistence
   - Reprocessing logic for historical aggregation windows
   - Network Offering usage aggregation (usage_type=13)
   - Failed usage_job recovery logic
   - Validation for invalid aggregation windows where start_date > end_date
   - Deduplication protections before inserting into cloud_usage
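
As a rough illustration of the checkpoint, window validation, and deduplication points, here is a minimal sketch. It is not the Usage Server's actual logic: `next_window` and `insert_once` are hypothetical helpers, and the rule "start one second after the last successful end, stop at the end of the previous day" is an assumption inferred from the windows quoted in this report.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

Window = Tuple[datetime, datetime]


def next_window(last_success_end: datetime, now: datetime) -> Optional[Window]:
    """Window from just after the last successful job to the end of the
    previous day, or None when there is nothing new to process."""
    start = last_success_end + timedelta(seconds=1)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    end = midnight - timedelta(seconds=1)
    if start > end:
        return None  # would be an inverted window; skip instead of running
    return start, end


def insert_once(table: set, key: tuple) -> bool:
    """Add key unless it is already present; return True if inserted."""
    if key in table:
        return False
    table.add(key)
    return True


# A second run on the same day finds nothing to do instead of failing.
print(next_window(datetime(2026, 5, 5, 23, 59, 59), datetime(2026, 5, 6, 16, 0)))

# A re-inserted usage row is rejected by the guard.
table = set()
key = (456, 39, 13, "2026-01-14 00:00:00", "2026-01-14 23:59:59")
print(insert_once(table, key), insert_once(table, key))
```

With the checkpoint at 2026-05-05 23:59:59, the first call computes the same inverted window seen in the failed usage_job entries and returns None for it. In a database, the guard would more naturally be a unique key or an existence check on the same columns.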


