JoaoJandre commented on PR #8737: URL: https://github.com/apache/cloudstack/pull/8737#issuecomment-1992241242
> > > @JoaoJandre We can't run explain analyze with delete commands. Adding the index also helps reduce the load on database while cleaning up the entries. Otherwise the DB will have to go through each record to filter out all the entries which can cause spike in I/O as well. I understand that if the administrator reduces the retention period, it can cause issues but having the index will speed up deletion which runs every minute. > > > > > > @vishesh92 you are proposing to add an index to improve DELETE operations; however, you are presenting results for SELECT operations. Indeed, indexes can improve SELECT operations, which only list the data. However, the DELETE operation works different: it removes data, recalculates statistics, reorganizes indexes and more. Please, present some tests and results for DELETE operations; otherwise, it is not feasible to sustain the reason for adding those indexes. > > @JoaoJandre Here are the results for a delete operation where the filter doesn't matches any rows. > > ```sql > MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01'; > Query OK, 0 rows affected > Time: 0.001s > MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01'; > Query OK, 0 rows affected > Time: 0.001s > ``` > > query plan > > ```sql > +------------------------------------------------------------------------------------------------+ > | EXPLAIN | > |------------------------------------------------------------------------------------------------| > | { | > | "query_block": { | > | "select_id": 1, | > | "table": { | > | "delete": true, | > | "table_name": "vm_stats", | > | "access_type": "range", | > | "possible_keys": [ | > | "temp_idx" | > | ], | > | "key": "temp_idx", | > | "used_key_parts": [ | > | "timestamp" | > | ], | > | "key_length": "5", | > | "ref": [ | > | "const" | > | ], | > | "rows_examined_per_scan": 1, | > | "filtered": "100.00", | > | "attached_condition": "(`test`.`vm_stats`.`timestamp` < TIMESTAMP'1950-01-01 00:00:00')" | > | } | > | } | > | } | > +------------------------------------------------------------------------------------------------+ > ``` > > After dropping the index > > ```sql > MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01'; > Query OK, 0 rows affected > Time: 0.364s > MySQL root@(none):test> DELETE FROM vm_stats WHERE timestamp < '1950-01-01'; > Query OK, 0 rows affected > Time: 0.362s > ``` > > query plan > > ```sql > +------------------------------------------------------------------------------------------------+ > | EXPLAIN | > |------------------------------------------------------------------------------------------------| > | { | > | "query_block": { | > | "select_id": 1, | > | "table": { | > | "delete": true, | > | "table_name": "vm_stats", | > | "access_type": "ALL", | > | "rows_examined_per_scan": 660340, | > | "filtered": "100.00", | > | "attached_condition": "(`test`.`vm_stats`.`timestamp` < TIMESTAMP'1950-01-01 00:00:00')" | > | } | > | } | > | } | > +------------------------------------------------------------------------------------------------+ > ``` > > As you can see, without the index it takes around 0.36 seconds because it has to go through each row in the database. It takes 0.001 seconds with the index because it doesn't need to go through all the rows in the table. @vishesh92, I'm sorry, but I don't understand the point of a DELETE test where no rows are deleted. I've decided to test/compare the performance of deletion on the vm_stats table using a more realistic scenario. Consider the following: We have 1000 VMs, 2 MGMT servers each collecting stats every 30 seconds, and we have a retention time of 7 days, this will give us a table size of around 40 million rows. Thus, I've created a mock vm_stats table with that many rows, using random dates between `2024-03-01 00:00:00` and `2024-03-08 00:00:00` and made the following tests (I've regenerated the table when needed): Without adding the proposed indexes, I deleted rows that had a timestamp lower then `2024-03-01 00:01:00` : ``` MariaDB [teste]> ANALYZE DELETE FROM vm_stats WHERE timestamp < '2024-03-01 00:01:00'; +------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra | +------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+ | 1 | SIMPLE | vm_stats | ALL | NULL | NULL | NULL | NULL | 39379728 | 40198741.00 | 100.00 | 0.01 | Using where | +------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+ 1 row in set (31.688 sec) ``` After, I added the indexes and deleted all the rows with timestamp lower then `2024-03-01 00:02:00` (the ones before 00:01:00 were deleted already by the earlier test): ``` MariaDB [teste]> ANALYZE DELETE FROM vm_stats WHERE timestamp < '2024-03-01 00:02:00'; +------+-------------+----------+-------+---------------+--------+---------+------+------+---------+----------+------------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra | +------+-------------+----------+-------+---------------+--------+---------+------+------+---------+----------+------------+-------------+ | 1 | SIMPLE | vm_stats | range | index1 | index1 | 5 | NULL | 4148 | 4148.00 | 100.00 | 100.00 | Using where | +------+-------------+----------+-------+---------------+--------+---------+------+------+---------+----------+------------+-------------+ 1 row in set (0.819 sec) ``` Great! the indexes made it much faster! roughly 30 times faster! Let's test deleting a whole day's worth of rows then: Without indexes: ``` MariaDB [teste]> ANALYZE DELETE FROM vm_stats WHERE timestamp < '2024-03-02 00:00:00'; +------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra | +------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+ | 1 | SIMPLE | vm_stats | ALL | NULL | NULL | NULL | NULL | 39285187 | 40198741.00 | 100.00 | 14.29 | Using where | +------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+ 1 row in set (2 min 57.537 sec) ``` With indexes: ``` MariaDB [teste]> ANALYZE DELETE FROM vm_stats WHERE timestamp < '2024-03-02 00:00:00'; +------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra | +------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+ | 1 | SIMPLE | vm_stats | ALL | index1 | NULL | NULL | NULL | 39371512 | 40190525.00 | 100.00 | 14.27 | Using where | +------+-------------+----------+------+---------------+------+---------+------+----------+-------------+----------+------------+-------------+ 1 row in set (45 min 50.800 sec) ``` The performance was about 15 times worst this time with indexes. This happens because of what I've explained in this comment: https://github.com/apache/cloudstack/pull/8737#issuecomment-1983608195. When deleting a small amount of the data (0.01% of it), we can find the specific rows we need to delete way faster using the indexes and all the overhead of updating them is worth it. However, when deleting a bigger amount of data (14% of the table) we can see the impact of the indexes' overhead, they make the deletion much slower. The above tests only consider the proposed PR, but I've also tested the performance of the indexes alongside what's proposed on #8740. I've tested making a query to delete a whole day's worth of rows with a limit of 100000 : Without indexes: ``` MariaDB [teste]> ANALYZE DELETE FROM vm_stats WHERE timestamp < '2024-03-02 00:00:00' limit 100000; +------+-------------+----------+------+---------------+------+---------+------+----------+-----------+----------+------------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra | +------+-------------+----------+------+---------------+------+---------+------+----------+-----------+----------+------------+-------------+ | 1 | SIMPLE | vm_stats | ALL | NULL | NULL | NULL | NULL | 39285187 | 697806.00 | 100.00 | 14.33 | Using where | +------+-------------+----------+------+---------------+------+---------+------+----------+-----------+----------+------------+-------------+ 1 row in set (2.075 sec) ``` With indexes: ``` MariaDB [teste]> ANALYZE DELETE FROM vm_stats WHERE timestamp < '2024-03-02 00:00:00' limit 100000; +------+-------------+----------+-------+---------------+--------+---------+------+----------+-----------+----------+------------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra | +------+-------------+----------+-------+---------------+--------+---------+------+----------+-----------+----------+------------+-------------+ | 1 | SIMPLE | vm_stats | range | index1 | index1 | 5 | NULL | 12540998 | 100000.00 | 100.00 | 100.00 | Using where | +------+-------------+----------+-------+---------------+--------+---------+------+----------+-----------+----------+------------+-------------+ 1 row in set (21.133 sec) ``` Deleting without indexes was 10 times faster then with indexes. Looking at the tests that I've done, I can agree with you that, for the most common case (deleting 1 minute worth of data), the index addition is worth it. However, if at any time the user decides that they want to retain less stats, the query to delete the excess stats might take much longer, and eventually timeout, leading to a snowball where ACS is unable to clean the `vm_stats` table and it keeps growing, making the problem worse. Another point is that, while the feature on #8740 is optional, and by default turned off, the indexes proposed here are not optional (at least not without manual DB intervention). Taking this into consideration, I really think that adding the `timestamp` index is not the best approach, as it might lead to the problems that I've described both here and on #8740. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cloudstack.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org