> > If I try to select a single unused block this way: > > explain analyze select id from media.block b where nrefs =0 limit 1 > > then it runs for more than 10 minutes (I'm not sure how long, I cancelled > > the query after 10 minutes). > > Are you sure it isn't blocked on a lock? >
Yes, I'm sure. I have created a single database instance from a zfs snapshot and tried the query on that database. It was the only client. > Another theory is that the index contains many thousands of references > to now-dead rows, and the query is vainly searching for a live entry. > Given that EXPLAIN thinks there are only about 2300 live entries, > and yet you say the index is 400MB, this seems pretty plausible. > Nobody ever deleted anything from this table. Since it was created, this has been a write-only table. > Have you disabled autovacuum, or something like that? (REINDEX > could help here, at least till the index gets bloated again.) > I did not disable autovacuum. > > You might think that even so, it shouldn't take that long ... but > indexes on UUID columns are a well known performance antipattern. > The index entry order is likely to have precisely zip to do with > the table's physical order, resulting in exceedingly-random access > to the table, which'll be horribly expensive when the table is so > much bigger than RAM. Can you replace the UUID column with a simple > serial (identity) column? > I'm aware of the problems with random UUID values. I was using this function to create ulids from the beginning: CREATE OR REPLACE FUNCTION public.gen_ulid() RETURNS uuid LANGUAGE sql AS $function$ SELECT (lpad(to_hex(floor(extract(epoch FROM clock_timestamp()) * 1000):: bigint), 12, '0') || encode(gen_random_bytes(10), 'hex'))::uuid; $function$ ; If I order some rows by id values, I can see that their creation times are strictly ascending. I did not write this function, it was taken from this website: https://blog.daveallie.com/ulid-primary-keys They have a benchmark section where they show that these ULID values are slower to generate (at least with this implementation) but much faster to insert. I might be able to replace these with int8 values, I need to check. > > > I believe it is not actually using the index, because reading a single > > (random?) entry from an index should not run for >10 minutes. > > You should believe what EXPLAIN tells you about the plan shape. > (Its rowcount estimates are only estimates, though.) > All of the 40M rows in this table are live. I'm 100% sure about this, because nobody ever deleted rows from this table. I can try to do VACUUM on this table, but I'm limited on resources. I think it will take days to do this. Maybe I can try to dump the whole database and restore it on another machine. Would that eliminate dead rows? (Is there a way to check the number of dead rows?) Regards, Laszlo