Re: Slow query, possibly not using index

Les Sun, 27 Aug 2023 10:40:08 -0700

> > If I try to select a single unused block this way:
> > explain analyze select id from media.block b where nrefs =0 limit 1
> > then it runs for more than 10 minutes (I'm not sure how long, I cancelled
> > the query after 10 minutes).
>
> Are you sure it isn't blocked on a lock?
>


Yes, I'm sure. I have created a single database instance from a zfs
snapshot and tried the query on that database. It was the only client.


> Another theory is that the index contains many thousands of references
> to now-dead rows, and the query is vainly searching for a live entry.
> Given that EXPLAIN thinks there are only about 2300 live entries,
> and yet you say the index is 400MB, this seems pretty plausible.
>

Nobody ever deleted anything from this table. Since it was created, this
has been a write-only table.


> Have you disabled autovacuum, or something like that?  (REINDEX
> could help here, at least till the index gets bloated again.)
>
I did not disable autovacuum.

>
> You might think that even so, it shouldn't take that long ... but
> indexes on UUID columns are a well known performance antipattern.
> The index entry order is likely to have precisely zip to do with
> the table's physical order, resulting in exceedingly-random access
> to the table, which'll be horribly expensive when the table is so
> much bigger than RAM.  Can you replace the UUID column with a simple
> serial (identity) column?
>

I'm aware of the problems with random UUID values. I was using this
function to create ulids from the beginning:

CREATE OR REPLACE FUNCTION public.gen_ulid()

RETURNS uuid

LANGUAGE sql

AS $function$

SELECT (lpad(to_hex(floor(extract(epoch FROM clock_timestamp()) * 1000)::
bigint), 12, '0') || encode(gen_random_bytes(10), 'hex'))::uuid;

$function$

;


If I order some rows by id values, I can see that their creation times are
strictly ascending. I did not write this function, it was taken from this
website:

https://blog.daveallie.com/ulid-primary-keys

They have a benchmark section where they show that these ULID values are
slower to generate (at least with this implementation) but much faster to
insert.

I might be able to replace these with int8 values, I need to check.


>
> > I believe it is not actually using the index, because reading a single
> > (random?) entry from an index should not run for >10 minutes.
>
> You should believe what EXPLAIN tells you about the plan shape.
> (Its rowcount estimates are only estimates, though.)
>

All of the 40M rows in this table are live. I'm 100% sure about this,
because nobody ever deleted rows from this table.

I can try to do VACUUM on this table, but I'm limited on resources. I think
it will take days to do this. Maybe I can try to dump the whole database
and restore it on another machine. Would that eliminate dead rows? (Is
there a way to check the number of dead rows?)

Regards,

    Laszlo

Re: Slow query, possibly not using index

Reply via email to