Re: autovac issue with large number of tables

Kasahara Tatsuhito Mon, 30 Nov 2020 20:49:11 -0800

Hi,

On Mon, Nov 30, 2020 at 8:59 PM Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
>
>
>
> On 2020/11/30 10:43, Masahiko Sawada wrote:
> > On Sun, Nov 29, 2020 at 10:34 PM Kasahara Tatsuhito
> > <kasahara.tatsuh...@gmail.com> wrote:
> >>
> >> Hi, Thanks for you comments.
> >>
> >> On Fri, Nov 27, 2020 at 9:51 PM Fujii Masao <masao.fu...@oss.nttdata.com> 
> >> wrote:
> >>>
> >>>
> >>>
> >>> On 2020/11/27 18:38, Kasahara Tatsuhito wrote:
> >>>> Hi,
> >>>>
> >>>> On Fri, Nov 27, 2020 at 1:43 AM Fujii Masao 
> >>>> <masao.fu...@oss.nttdata.com> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 2020/11/26 10:41, Kasahara Tatsuhito wrote:
> >>>>>> On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada 
> >>>>>> <sawada.m...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
> >>>>>>> <kasahara.tatsuh...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada 
> >>>>>>>> <sawada.m...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
> >>>>>>>>> <kasahara.tatsuh...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> >>>>>>>>>> <kasahara.tatsuh...@gmail.com> wrote:
> >>>>>>>>>>>> I wonder if we could have table_recheck_autovac do two probes of 
> >>>>>>>>>>>> the stats
> >>>>>>>>>>>> data.  First probe the existing stats data, and if it shows the 
> >>>>>>>>>>>> table to
> >>>>>>>>>>>> be already vacuumed, return immediately.  If not, *then* force a 
> >>>>>>>>>>>> stats
> >>>>>>>>>>>> re-read, and check a second time.
> >>>>>>>>>>> Does the above mean that the second and subsequent 
> >>>>>>>>>>> table_recheck_autovac()
> >>>>>>>>>>> will be improved to first check using the previous refreshed 
> >>>>>>>>>>> statistics?
> >>>>>>>>>>> I think that certainly works.
> >>>>>>>>>>>
> >>>>>>>>>>> If that's correct, I'll try to create a patch for the PoC
> >>>>>>>>>>
> >>>>>>>>>> I still don't know how to reproduce Jim's troubles, but I was able 
> >>>>>>>>>> to reproduce
> >>>>>>>>>> what was probably a very similar problem.
> >>>>>>>>>>
> >>>>>>>>>> This problem seems to be more likely to occur in cases where you 
> >>>>>>>>>> have
> >>>>>>>>>> a large number of tables,
> >>>>>>>>>> i.e., a large amount of stats, and many small tables need VACUUM at
> >>>>>>>>>> the same time.
> >>>>>>>>>>
> >>>>>>>>>> So I followed Tom's advice and created a patch for the PoC.
> >>>>>>>>>> This patch will enable a flag in the table_recheck_autovac 
> >>>>>>>>>> function to use
> >>>>>>>>>> the existing stats next time if VACUUM (or ANALYZE) has already 
> >>>>>>>>>> been done
> >>>>>>>>>> by another worker on the check after the stats have been updated.
> >>>>>>>>>> If the tables continue to require VACUUM after the refresh, then a 
> >>>>>>>>>> refresh
> >>>>>>>>>> will be required instead of using the existing statistics.
> >>>>>>>>>>
> >>>>>>>>>> I did simple test with HEAD and HEAD + this PoC patch.
> >>>>>>>>>> The tests were conducted in two cases.
> >>>>>>>>>> (I changed few configurations. see attached scripts)
> >>>>>>>>>>
> >>>>>>>>>> 1. Normal VACUUM case
> >>>>>>>>>>      - SET autovacuum = off
> >>>>>>>>>>      - CREATE tables with 100 rows
> >>>>>>>>>>      - DELETE 90 rows for each tables
> >>>>>>>>>>      - SET autovacuum = on and restart PostgreSQL
> >>>>>>>>>>      - Measure the time it takes for all tables to be VACUUMed
> >>>>>>>>>>
> >>>>>>>>>> 2. Anti wrap round VACUUM case
> >>>>>>>>>>      - CREATE brank tables
> >>>>>>>>>>      - SELECT all of these tables (for generate stats)
> >>>>>>>>>>      - SET autovacuum_freeze_max_age to low values and restart 
> >>>>>>>>>> PostgreSQL
> >>>>>>>>>>      - Consumes a lot of XIDs by using txid_curent()
> >>>>>>>>>>      - Measure the time it takes for all tables to be VACUUMed
> >>>>>>>>>>
> >>>>>>>>>> For each test case, the following results were obtained by changing
> >>>>>>>>>> autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
> >>>>>>>>>> Also changing num of tables to 1000, 5000, 10000 and 20000.
> >>>>>>>>>>
> >>>>>>>>>> Due to the poor VM environment (2 VCPU/4 GB), the results are a 
> >>>>>>>>>> little unstable,
> >>>>>>>>>> but I think it's enough to ask for a trend.
> >>>>>>>>>>
> >>>>>>>>>> ===========================================================================
> >>>>>>>>>> [1.Normal VACUUM case]
> >>>>>>>>>>     tables:1000
> >>>>>>>>>>      autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 
> >>>>>>>>>> sec
> >>>>>>>>>>
> >>>>>>>>>>     tables:5000
> >>>>>>>>>>      autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 
> >>>>>>>>>> sec
> >>>>>>>>>>
> >>>>>>>>>>     tables:10000
> >>>>>>>>>>      autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  
> >>>>>>>>>> 153 sec
> >>>>>>>>>>      autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   
> >>>>>>>>>> 98 sec
> >>>>>>>>>>      autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   
> >>>>>>>>>> 78 sec
> >>>>>>>>>>      autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   
> >>>>>>>>>> 66 sec
> >>>>>>>>>>      autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   
> >>>>>>>>>> 56 sec
> >>>>>>>>>>
> >>>>>>>>>>     tables:20000
> >>>>>>>>>>      autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  
> >>>>>>>>>> 339 sec
> >>>>>>>>>>      autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  
> >>>>>>>>>> 229 sec
> >>>>>>>>>>      autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  
> >>>>>>>>>> 191 sec
> >>>>>>>>>>      autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  
> >>>>>>>>>> 147 sec
> >>>>>>>>>>      autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  
> >>>>>>>>>> 113 sec
> >>>>>>>>>>
> >>>>>>>>>> [2.Anti wrap round VACUUM case]
> >>>>>>>>>>     tables:1000
> >>>>>>>>>>      autovacuum_max_workers 1:   (HEAD) 19 sec VS (with patch) 18 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 2:   (HEAD) 14 sec VS (with patch) 15 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 3:   (HEAD) 14 sec VS (with patch) 14 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 5:   (HEAD) 14 sec VS (with patch) 16 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 10:  (HEAD) 16 sec VS (with patch) 14 
> >>>>>>>>>> sec
> >>>>>>>>>>
> >>>>>>>>>>     tables:5000
> >>>>>>>>>>      autovacuum_max_workers 1:   (HEAD) 69 sec VS (with patch) 69 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 2:   (HEAD) 66 sec VS (with patch) 47 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 3:   (HEAD) 59 sec VS (with patch) 37 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 5:   (HEAD) 39 sec VS (with patch) 28 
> >>>>>>>>>> sec
> >>>>>>>>>>      autovacuum_max_workers 10:  (HEAD) 39 sec VS (with patch) 29 
> >>>>>>>>>> sec
> >>>>>>>>>>
> >>>>>>>>>>     tables:10000
> >>>>>>>>>>      autovacuum_max_workers 1:   (HEAD) 139 sec VS (with patch) 
> >>>>>>>>>> 138 sec
> >>>>>>>>>>      autovacuum_max_workers 2:   (HEAD) 130 sec VS (with patch)  
> >>>>>>>>>> 86 sec
> >>>>>>>>>>      autovacuum_max_workers 3:   (HEAD) 120 sec VS (with patch)  
> >>>>>>>>>> 68 sec
> >>>>>>>>>>      autovacuum_max_workers 5:   (HEAD)  96 sec VS (with patch)  
> >>>>>>>>>> 41 sec
> >>>>>>>>>>      autovacuum_max_workers 10:  (HEAD)  90 sec VS (with patch)  
> >>>>>>>>>> 39 sec
> >>>>>>>>>>
> >>>>>>>>>>     tables:20000
> >>>>>>>>>>      autovacuum_max_workers 1:   (HEAD) 313 sec VS (with patch) 
> >>>>>>>>>> 331 sec
> >>>>>>>>>>      autovacuum_max_workers 2:   (HEAD) 209 sec VS (with patch) 
> >>>>>>>>>> 201 sec
> >>>>>>>>>>      autovacuum_max_workers 3:   (HEAD) 227 sec VS (with patch) 
> >>>>>>>>>> 141 sec
> >>>>>>>>>>      autovacuum_max_workers 5:   (HEAD) 236 sec VS (with patch)  
> >>>>>>>>>> 88 sec
> >>>>>>>>>>      autovacuum_max_workers 10:  (HEAD) 309 sec VS (with patch)  
> >>>>>>>>>> 74 sec
> >>>>>>>>>> ===========================================================================
> >>>>>>>>>>
> >>>>>>>>>> The cases without patch, the scalability of the worker has 
> >>>>>>>>>> decreased
> >>>>>>>>>> as the number of tables has increased.
> >>>>>>>>>> In fact, the more workers there are, the longer it takes to 
> >>>>>>>>>> complete
> >>>>>>>>>> VACUUM to all tables.
> >>>>>>>>>> The cases with patch, it shows good scalability with respect to the
> >>>>>>>>>> number of workers.
> >>>>>>>>>
> >>>>>>>>> It seems a good performance improvement even without the patch of
> >>>>>>>>> shared memory based stats collector.
> >>>>>
> >>>>> Sounds great!
> >>>>>
> >>>>>
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Note that perf top results showed that hash_search_with_hash_value,
> >>>>>>>>>> hash_seq_search and
> >>>>>>>>>> pgstat_read_statsfiles are dominant during VACUUM in all patterns,
> >>>>>>>>>> with or without the patch.
> >>>>>>>>>>
> >>>>>>>>>> Therefore, there is still a need to find ways to optimize the 
> >>>>>>>>>> reading
> >>>>>>>>>> of large amounts of stats.
> >>>>>>>>>> However, this patch is effective in its own right, and since there 
> >>>>>>>>>> are
> >>>>>>>>>> only a few parts to modify,
> >>>>>>>>>> I think it should be able to be applied to current (preferably
> >>>>>>>>>> pre-v13) PostgreSQL.
> >>>>>>>>>
> >>>>>>>>> +1
> >>>>>>>>>
> >>>>>>>>> +
> >>>>>>>>> +       /* We might be better to refresh stats */
> >>>>>>>>> +       use_existing_stats = false;
> >>>>>>>>>        }
> >>>>>>>>> +   else
> >>>>>>>>> +   {
> >>>>>>>>>
> >>>>>>>>> -   heap_freetuple(classTup);
> >>>>>>>>> +       heap_freetuple(classTup);
> >>>>>>>>> +       /* The relid has already vacuumed, so we might be better to
> >>>>>>>>> use exiting stats */
> >>>>>>>>> +       use_existing_stats = true;
> >>>>>>>>> +   }
> >>>>>>>>>
> >>>>>>>>> With that patch, the autovacuum process refreshes the stats in the
> >>>>>>>>> next check if it finds out that the table still needs to be 
> >>>>>>>>> vacuumed.
> >>>>>>>>> But I guess it's not necessarily true because the next table might 
> >>>>>>>>> be
> >>>>>>>>> vacuumed already. So I think we might want to always use the 
> >>>>>>>>> existing
> >>>>>>>>> for the first check. What do you think?
> >>>>>>>> Thanks for your comment.
> >>>>>>>>
> >>>>>>>> If we assume the case where some workers vacuum on large tables
> >>>>>>>> and a single worker vacuum on small tables, the processing
> >>>>>>>> performance of the single worker will be slightly lower if the
> >>>>>>>> existing statistics are checked every time.
> >>>>>>>>
> >>>>>>>> In fact, at first I tried to check the existing stats every time,
> >>>>>>>> but the performance was slightly worse in cases with a small number 
> >>>>>>>> of workers.
> >>>>>
> >>>>> Do you have this benchmark result?
> >>>>>
> >>>>>
> >>>>>>>> (Checking the existing stats is lightweight , but at high frequency,
> >>>>>>>>     it affects processing performance.)
> >>>>>>>> Therefore, at after refresh statistics, determine whether autovac
> >>>>>>>> should use the existing statistics.
> >>>>>>>
> >>>>>>> Yeah, since the test you used uses a lot of small tables, if there are
> >>>>>>> a few workers, checking the existing stats is unlikely to return true
> >>>>>>> (no need to vacuum). So the cost of existing stats check ends up being
> >>>>>>> overhead. Not sure how slow always checking the existing stats was,
> >>>>>>> but given that the shared memory based stats collector patch could
> >>>>>>> improve the performance of refreshing stats, it might be better not to
> >>>>>>> check the existing stats frequently like the patch does. Anyway, I
> >>>>>>> think it’s better to evaluate the performance improvement with other
> >>>>>>> cases too.
> >>>>>> Yeah, I would like to see how much the performance changes in other 
> >>>>>> cases.
> >>>>>> In addition, if the shared-based-stats patch is applied, we won't need 
> >>>>>> to reload
> >>>>>> a huge stats file, so we will just have to check the stats on
> >>>>>> shared-mem every time.
> >>>>>> Perhaps the logic of table_recheck_autovac could be simpler.
> >>>>>>
> >>>>>>>> BTW, I found some typos in comments, so attache a  fixed version.
> >>>>>
> >>>>> The patch adds some duplicated codes into table_recheck_autovac().
> >>>>> It's better to make the common function performing them and make
> >>>>> table_recheck_autovac() call that common function, to simplify the code.
> >>>> Thanks for your comment.
> >>>> Hmm.. I've cut out the duplicate part.
> >>>> Attach the patch.
> >>>> Could you confirm that it fits your expecting?
> >>>
> >>> Yes, thanks for updataing the patch! Here are another review comments.
> >>>
> >>> +       shared = pgstat_fetch_stat_dbentry(InvalidOid);
> >>> +       dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
> >>>
> >>> When using the existing stats, ISTM that these are not necessary and
> >>> we can reuse "shared" and "dbentry" obtained before. Right?
> >> Yeah, but unless autovac_refresh_stats() is called, these functions
> >> read the information from the
> >> local hash table without re-read the stats file, so the process is very 
> >> light.
> >> Therefore, I think, it is better to keep the current logic to keep the
> >> code simple.
> >>
> >>>
> >>> +               /* We might be better to refresh stats */
> >>> +               use_existing_stats = false;
> >>>
> >>> I think that we should add more comments about why it's better to
> >>> refresh the stats in this case.
> >>>
> >>> +               /* The relid has already vacuumed, so we might be better 
> >>> to use existing stats */
> >>> +               use_existing_stats = true;
> >>>
> >>> I think that we should add more comments about why it's better to
> >>> reuse the stats in this case.
> >> I added  comments.
> >>
> >> Attache the patch.
> >>
> >
> > Thank you for updating the patch. Here are some small comments on the
> > latest (v4) patch.
> >
> > +    * So if the last time we checked a table that was already vacuumed 
> > after
> > +    * refres stats, check the current statistics before refreshing it.
> > +    */
> >
> > s/refres/refresh/
Thanks! fixed.
Attached the patch.


> >
> > -----
> > +/* Counter to determine if statistics should be refreshed */
> > +static bool use_existing_stats = false;
> > +
> >
> > I think 'use_existing_stats' can be declared within table_recheck_autovac().
> >
> > -----
> > While testing the performance, I realized that the statistics are
> > reset every time vacuumed one table, leading to re-reading the stats
> > file even if 'use_existing_stats' is true. Please refer that vacuum()
> > eventually calls AtEOXact_PgStat() which calls to
> > pgstat_clear_snapshot().
>
> Good catch!
>
>
> > I believe that's why the performance of the
> > method of always checking the existing stats wasn’t good as expected.
> > So if we save the statistics somewhere and use it for rechecking, the
> > results of the performance benchmark will differ between these two
> > methods.
Thanks for you checks.
But, if a worker did vacuum(), that means this worker had determined
need vacuum in the
table_recheck_autovac(). So, use_existing_stats set to false, and next
time, refresh stats.
Therefore I think the current patch is fine, as we want to avoid
unnecessary refreshing of
statistics before the actual vacuum(), right?

> Or it's simpler to make autovacuum worker skip calling
> pgstat_clear_snapshot() in AtEOXact_PgStat()?
Hmm. IMO the side effects are a bit scary, so I think it's fine the way it is.

Best regards,

> Regards,
>
> --
> Fujii Masao
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION



-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

v5_mod_table_recheck_autovac.patch
Description: Binary data

Re: autovac issue with large number of tables

Reply via email to