On Mon, Jul 22, 2024 at 2:04 PM Melanie Plageman <melanieplage...@gmail.com> wrote: > > On Mon, Jul 22, 2024 at 2:17 PM Melanie Plageman > <melanieplage...@gmail.com> wrote: > > > > So, I've just gone through all the test failures on master and 17 for > > mamba, gull, mereswine, and copperhead. I wanted to confirm that the > > test was always failing for the same reason and also if it had any > > failures pre-TIDStore. > > > > We've only run tests with this commit on some of the back branches for > > some of these animals. Of those, I don't see any failures so far. So, > > it seems the test instability is just related to trying to get > > multiple passes of index vacuuming reliably with TIDStore. > > > > AFAICT, all the 32bit machine failures are timeouts waiting for the > > standby to catch up (mamba, gull, merswine). Unfortunately, the > > failures on copperhead (a 64 bit machine) are because we don't > > actually succeed in triggering a second vacuum pass. This would not be > > fixed by a longer timeout. > > > > Because of this, I'm inclined to revert the test on 17 and master to > > avoid distracting folks committing other work and seeing those animals > > go red. > > Okay, I reverted this for now on 17 and master. Adding Sawada-san to > the thread to see if he has any ideas for a smaller two-round index > vacuum example. >
+ CREATE TABLE ${table1}(col1 int) + WITH (autovacuum_enabled=false, fillfactor=10); + INSERT INTO $table1 VALUES(7); + INSERT INTO $table1 SELECT generate_series(1, $nrows) % 3; + CREATE INDEX on ${table1}(col1); + UPDATE $table1 SET col1 = 3 WHERE col1 = 0; + INSERT INTO $table1 VALUES(7); These queries make sense to me; these make the radix tree wide and use more nodes, instead of fattening lead nodes (i.e. the offset bitmap). The $table1 has 18182 blocks and the statistics of radix tree shows: max_val = 65535 num_keys = 18182 height = 1, n4 = 0, n16 = 1, n32 = 0, n64 = 0, n256 = 72, leaves = 18182 Which means that the height of the tree is 2 and we use the maximum size node for all nodes except for 1 node. I don't have any great idea to substantially reduce the total number of tuples in the $table1. Probably we can use DELETE instead of UPDATE to make garbage tuples (although I'm not sure it's okay for this test). Which reduces the amount of WAL records from 11MB to 4MB and would reduce the time to catch up. But I'm not sure how much it would help. There might be ideas to trigger a two-round index vacuum with fewer tuples but if the tests are too optimized for the current TidStore, we will have to re-adjust them if the TidStore changes in the future. So I think it's better and reliable to allow maintenance_work_mem to be a lower value or use injection points somehow. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com