On Sun, Sep 29, 2024 at 12:22 AM Alena Rybakina
<a.rybak...@postgrespro.ru> wrote:
> Hi! Thank you for your interest in this patch!
>
> I took a very brief look at this and was wondering if it was worth
> having a way to make the per-table vacuum statistics opt-in (like a
> table storage parameter) in order to decrease the shared memory
> footprint of storing the stats.
>
> I'm not sure how users can select tables that enable vacuum statistics
> as I think they basically want to have statistics for all tables, but
> I see your point. Since the size of PgStat_TableCounts approximately
> tripled by this patch (112 bytes to 320 bytes), it might be worth
> considering ways to reduce the number of entries or reducing the size
> of vacuum statistics.
>
> The main purpose of these statistics is to see abnormal behavior of vacuum in 
> relation to a table or the database as a whole.
>
> For example, there may be a situation where vacuum has started to run more 
> often and spends a lot of resources on processing a certain index, but the 
> size of the index does not change significantly. Moreover, the table in which 
> this index is located can be much smaller in size. This may be because the 
> index is bloated and needs to be reindexed.
>
> This is exactly what vacuum statistics can show - we will see that compared 
> to other objects, vacuum processed more blocks and spent more time on this 
> index.
>
> Perhaps the vacuum parameters for the index should be set more aggressively 
> to avoid this in the future.
>
> I suppose that if we turn off statistics collection for a certain object, we
> could miss such a problem. In addition, the user may not enable the parameter
> for the object in time, simply because they forget about it.

I agree with this point.  Additionally, for gathering vacuum
statistics on only some relations to actually save space, we would
need to handle variable-size stat entries.  That would greatly
increase the complexity.
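
To illustrate the trade-off, here is a rough sketch.  The struct names
and field counts are made up (chosen only so the sizes match the 112
and 320 byte figures quoted above); these are not the actual pgstat
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the existing counters: 14 * 8 = 112 bytes. */
typedef struct BaseTableCounts
{
    int64_t     counters[14];
} BaseTableCounts;

/* Hypothetical stand-in for the added vacuum counters: 26 * 8 = 208 bytes. */
typedef struct VacuumExtCounts
{
    int64_t     counters[26];
} VacuumExtCounts;

/* Fixed-size layout: every relation pays for the vacuum counters. */
typedef struct FixedEntry
{
    BaseTableCounts base;
    VacuumExtCounts vacuum;
} FixedEntry;

static_assert(sizeof(FixedEntry) == 320, "sketch should match the quoted size");

/*
 * Opt-in layout: the entry size now depends on a per-relation setting,
 * so any code that allocates, copies, or serializes entries can no
 * longer rely on a single sizeof().
 */
static inline size_t
entry_size(bool vacuum_stats_enabled)
{
    return sizeof(BaseTableCounts) +
        (vacuum_stats_enabled ? sizeof(VacuumExtCounts) : 0);
}
```

With the fixed layout the size is known at compile time; with the
opt-in layout every consumer has to call something like entry_size()
and cope with a relation switching the setting on or off later.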

> As for the second option, now I cannot say which statistics can be removed, 
> to be honest. So far, they all seem necessary.

Yes, but as Masahiko-san pointed out, PgStat_TableCounts almost
triples in size.  That is a huge change: from having no vacuum
statistics to having them in much more detail than anything else we
currently have.  I think a feasible way forward might be to introduce
the most in-demand statistics first and then see how it goes.
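
For a sense of scale, a back-of-the-envelope calculation using the 112
and 320 byte figures quoted above.  This is only a sketch: it counts
just the counters struct and ignores anything else stored per entry,
and the helper name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes as quoted in this thread, in bytes. */
enum
{
    OLD_ENTRY_BYTES = 112,
    NEW_ENTRY_BYTES = 320
};

static_assert(NEW_ENTRY_BYTES > OLD_ENTRY_BYTES, "patched struct is larger");

/* Extra bytes needed for nrels relations: 208 bytes per relation. */
static inline uint64_t
extra_bytes(uint64_t nrels)
{
    return nrels * (uint64_t) (NEW_ENTRY_BYTES - OLD_ENTRY_BYTES);
}
```

For example, extra_bytes(100000) gives 20,800,000 bytes, so roughly
20 MB more for a hypothetical installation with 100,000 relations.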

------
Regards,
Alexander Korotkov
Supabase

