On Fri, Nov 27, 2020 at 5:22 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> On Fri, Nov 27, 2020 at 1:43 AM Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
> >
> >
> >
> > On 2020/11/26 10:41, Kasahara Tatsuhito wrote:
> > > On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> > >>
> > >> On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
> > >> <kasahara.tatsuh...@gmail.com> wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>> On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> > >>>>
> > >>>> On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
> > >>>> <kasahara.tatsuh...@gmail.com> wrote:
> > >>>>>
> > >>>>> Hi,
> > >>>>>
> > >>>>> On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> > >>>>> <kasahara.tatsuh...@gmail.com> wrote:
> > >>>>>>> I wonder if we could have table_recheck_autovac do two probes of the stats
> > >>>>>>> data. First probe the existing stats data, and if it shows the table to
> > >>>>>>> be already vacuumed, return immediately. If not, *then* force a stats
> > >>>>>>> re-read, and check a second time.
> > >>>>>> Does the above mean that the second and subsequent calls to table_recheck_autovac()
> > >>>>>> will be improved to first check using the previously refreshed statistics?
> > >>>>>> I think that certainly works.
> > >>>>>>
> > >>>>>> If that's correct, I'll try to create a patch for the PoC.
> > >>>>>
> > >>>>> I still don't know how to reproduce Jim's troubles, but I was able to
> > >>>>> reproduce what was probably a very similar problem.
> > >>>>>
> > >>>>> This problem seems to be more likely to occur in cases where you have
> > >>>>> a large number of tables, i.e., a large amount of stats, and many small
> > >>>>> tables need VACUUM at the same time.
> > >>>>>
> > >>>>> So I followed Tom's advice and created a patch for the PoC.
> > >>>>> This patch sets a flag in the table_recheck_autovac function so that
> > >>>>> the existing stats are used next time, if the check after a stats
> > >>>>> refresh finds that VACUUM (or ANALYZE) has already been done by
> > >>>>> another worker.
> > >>>>> If a table still requires VACUUM after the refresh, then another
> > >>>>> refresh will be done instead of using the existing statistics.
> > >>>>>
> > >>>>> I did a simple test with HEAD and HEAD + this PoC patch.
> > >>>>> The tests were conducted in two cases.
> > >>>>> (I changed a few configurations; see attached scripts.)
> > >>>>>
> > >>>>> 1. Normal VACUUM case
> > >>>>>   - SET autovacuum = off
> > >>>>>   - CREATE tables with 100 rows
> > >>>>>   - DELETE 90 rows from each table
> > >>>>>   - SET autovacuum = on and restart PostgreSQL
> > >>>>>   - Measure the time it takes for all tables to be VACUUMed
> > >>>>>
> > >>>>> 2. Anti-wraparound VACUUM case
> > >>>>>   - CREATE blank tables
> > >>>>>   - SELECT all of these tables (to generate stats)
> > >>>>>   - SET autovacuum_freeze_max_age to a low value and restart PostgreSQL
> > >>>>>   - Consume a lot of XIDs by using txid_current()
> > >>>>>   - Measure the time it takes for all tables to be VACUUMed
> > >>>>>
> > >>>>> For each test case, the following results were obtained by changing
> > >>>>> autovacuum_max_workers to 1, 2, 3 (default), 5 and 10,
> > >>>>> and by changing the number of tables to 1000, 5000, 10000 and 20000.
> > >>>>>
> > >>>>> Due to the limited VM environment (2 vCPU / 4 GB), the results are a
> > >>>>> little unstable, but I think they are enough to show the trend.
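[As a rough illustration of the setup steps quoted above (not the scripts actually attached to the original mail), the normal-VACUUM case could be generated like this; the table names, row counts, and output file are made up for the sketch, and feeding the resulting SQL to a test server via psql is left out:]

```shell
#!/bin/sh
# Sketch: generate the SQL for the "Normal VACUUM" benchmark setup --
# many small tables, each with 100 rows of which 90 are then deleted.
NUM_TABLES=1000
OUT=setup.sql

: > "$OUT"
# Autovacuum is disabled for the setup phase (needs a reload/restart to apply).
echo "ALTER SYSTEM SET autovacuum = off;" >> "$OUT"
i=1
while [ "$i" -le "$NUM_TABLES" ]; do
    {
        echo "CREATE TABLE t_$i (id int);"
        echo "INSERT INTO t_$i SELECT generate_series(1, 100);"
        echo "DELETE FROM t_$i WHERE id <= 90;"
    } >> "$OUT"
    i=$((i + 1))
done
echo "generated $NUM_TABLES table setups in $OUT"
```

[After running the generated SQL, autovacuum would be re-enabled and the server restarted, then the time until all tables are vacuumed is measured.]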
> > >>>>>
> > >>>>> ===========================================================================
> > >>>>> [1. Normal VACUUM case]
> > >>>>> tables:1000
> > >>>>>  autovacuum_max_workers 1:  (HEAD) 20 sec VS (with patch) 20 sec
> > >>>>>  autovacuum_max_workers 2:  (HEAD) 18 sec VS (with patch) 16 sec
> > >>>>>  autovacuum_max_workers 3:  (HEAD) 18 sec VS (with patch) 16 sec
> > >>>>>  autovacuum_max_workers 5:  (HEAD) 19 sec VS (with patch) 17 sec
> > >>>>>  autovacuum_max_workers 10: (HEAD) 19 sec VS (with patch) 17 sec
> > >>>>>
> > >>>>> tables:5000
> > >>>>>  autovacuum_max_workers 1:  (HEAD) 77 sec VS (with patch) 78 sec
> > >>>>>  autovacuum_max_workers 2:  (HEAD) 61 sec VS (with patch) 43 sec
> > >>>>>  autovacuum_max_workers 3:  (HEAD) 38 sec VS (with patch) 38 sec
> > >>>>>  autovacuum_max_workers 5:  (HEAD) 45 sec VS (with patch) 37 sec
> > >>>>>  autovacuum_max_workers 10: (HEAD) 43 sec VS (with patch) 35 sec
> > >>>>>
> > >>>>> tables:10000
> > >>>>>  autovacuum_max_workers 1:  (HEAD) 152 sec VS (with patch) 153 sec
> > >>>>>  autovacuum_max_workers 2:  (HEAD) 119 sec VS (with patch) 98 sec
> > >>>>>  autovacuum_max_workers 3:  (HEAD) 87 sec VS (with patch) 78 sec
> > >>>>>  autovacuum_max_workers 5:  (HEAD) 100 sec VS (with patch) 66 sec
> > >>>>>  autovacuum_max_workers 10: (HEAD) 97 sec VS (with patch) 56 sec
> > >>>>>
> > >>>>> tables:20000
> > >>>>>  autovacuum_max_workers 1:  (HEAD) 338 sec VS (with patch) 339 sec
> > >>>>>  autovacuum_max_workers 2:  (HEAD) 231 sec VS (with patch) 229 sec
> > >>>>>  autovacuum_max_workers 3:  (HEAD) 220 sec VS (with patch) 191 sec
> > >>>>>  autovacuum_max_workers 5:  (HEAD) 234 sec VS (with patch) 147 sec
> > >>>>>  autovacuum_max_workers 10: (HEAD) 320 sec VS (with patch) 113 sec
> > >>>>>
> > >>>>> [2. Anti-wraparound VACUUM case]
> > >>>>> tables:1000
> > >>>>>  autovacuum_max_workers 1:  (HEAD) 19 sec VS (with patch) 18 sec
> > >>>>>  autovacuum_max_workers 2:  (HEAD) 14 sec VS (with patch) 15 sec
> > >>>>>  autovacuum_max_workers 3:  (HEAD) 14 sec VS (with patch) 14 sec
> > >>>>>  autovacuum_max_workers 5:  (HEAD) 14 sec VS (with patch) 16 sec
> > >>>>>  autovacuum_max_workers 10: (HEAD) 16 sec VS (with patch) 14 sec
> > >>>>>
> > >>>>> tables:5000
> > >>>>>  autovacuum_max_workers 1:  (HEAD) 69 sec VS (with patch) 69 sec
> > >>>>>  autovacuum_max_workers 2:  (HEAD) 66 sec VS (with patch) 47 sec
> > >>>>>  autovacuum_max_workers 3:  (HEAD) 59 sec VS (with patch) 37 sec
> > >>>>>  autovacuum_max_workers 5:  (HEAD) 39 sec VS (with patch) 28 sec
> > >>>>>  autovacuum_max_workers 10: (HEAD) 39 sec VS (with patch) 29 sec
> > >>>>>
> > >>>>> tables:10000
> > >>>>>  autovacuum_max_workers 1:  (HEAD) 139 sec VS (with patch) 138 sec
> > >>>>>  autovacuum_max_workers 2:  (HEAD) 130 sec VS (with patch) 86 sec
> > >>>>>  autovacuum_max_workers 3:  (HEAD) 120 sec VS (with patch) 68 sec
> > >>>>>  autovacuum_max_workers 5:  (HEAD) 96 sec VS (with patch) 41 sec
> > >>>>>  autovacuum_max_workers 10: (HEAD) 90 sec VS (with patch) 39 sec
> > >>>>>
> > >>>>> tables:20000
> > >>>>>  autovacuum_max_workers 1:  (HEAD) 313 sec VS (with patch) 331 sec
> > >>>>>  autovacuum_max_workers 2:  (HEAD) 209 sec VS (with patch) 201 sec
> > >>>>>  autovacuum_max_workers 3:  (HEAD) 227 sec VS (with patch) 141 sec
> > >>>>>  autovacuum_max_workers 5:  (HEAD) 236 sec VS (with patch) 88 sec
> > >>>>>  autovacuum_max_workers 10: (HEAD) 309 sec VS (with patch) 74 sec
> > >>>>> ===========================================================================
> > >>>>>
> > >>>>> In the cases without the patch, worker scalability decreased as the
> > >>>>> number of tables increased; in fact, the more workers there are, the
> > >>>>> longer it takes to complete VACUUM of all tables.
> > >>>>> The cases with the patch show good scalability with respect to the
> > >>>>> number of workers.
> > >>>>
> > >>>> It seems a good performance improvement even without the
> > >>>> shared-memory-based stats collector patch.
> >
> > Sounds great!
> >
> >
> > >>>>
> > >>>>>
> > >>>>> Note that perf top results showed that hash_search_with_hash_value,
> > >>>>> hash_seq_search and pgstat_read_statsfiles are dominant during VACUUM
> > >>>>> in all patterns, with or without the patch.
> > >>>>>
> > >>>>> Therefore, there is still a need to find ways to optimize the reading
> > >>>>> of large amounts of stats.
> > >>>>> However, this patch is effective in its own right, and since there are
> > >>>>> only a few parts to modify, I think it should be possible to apply it
> > >>>>> to current (preferably pre-v13) PostgreSQL.
> > >>>>
> > >>>> +1
> > >>>>
> > >>>> +
> > >>>> +        /* We might be better off refreshing the stats */
> > >>>> +        use_existing_stats = false;
> > >>>>      }
> > >>>> +    else
> > >>>> +    {
> > >>>>
> > >>>> -        heap_freetuple(classTup);
> > >>>> +        heap_freetuple(classTup);
> > >>>> +        /* The relid has already been vacuumed, so we might be better off using the existing stats */
> > >>>> +        use_existing_stats = true;
> > >>>> +    }
> > >>>>
> > >>>> With that patch, the autovacuum process refreshes the stats in the
> > >>>> next check if it finds out that the table still needs to be vacuumed.
> > >>>> But I guess it's not necessarily true because the next table might be
> > >>>> vacuumed already. So I think we might want to always use the existing
> > >>>> stats for the first check. What do you think?
> > > Thanks for your comment.
> > >
> > > If we assume the case where some workers vacuum large tables
> > > and a single worker vacuums small tables, the processing
> > > performance of the single worker will be slightly lower if the
> > > existing statistics are checked every time.
> > >
> > > In fact, at first I tried to check the existing stats every time,
> > > but the performance was slightly worse in cases with a small number
> > > of workers.
> >
> > Do you have this benchmark result?
>
> FWIW I'd like to share the benchmark results of the same test in my
> environment as Kasahara-san did. In this performance evaluation, I
> measured the execution time of the loop in do_autovacuum(), line 2318
> in autovacuum.c, which takes the majority of the autovacuum time. So it
> checks how much time an autovacuum worker took to process the list of
> all the collected tables, including refreshing and checking the stats,
> vacuuming tables, and checking the existing stats. Since all tables
> are the same size (only 1 page) there is no big difference in the
> execution time between concurrent autovacuum workers. The following
> results show the maximum execution time among the autovacuum workers.
> From left to right: the execution time of the current HEAD,
> Kasahara-san's patch, and the method of always checking the existing
> stats, in seconds.
> The result has a similar trend to what Kasahara-san mentioned.
Thanks! Yes, I think the results are as expected.
> 1000 tables:
> autovac_workers 1 :  13s, 13s, 13s
> autovac_workers 2 :   6s,  4s,  5s
> autovac_workers 3 :   3s,  4s,  4s
> autovac_workers 5 :   3s,  3s,  3s
> autovac_workers 10:   2s,  3s,  3s
>
> 5000 tables:
> autovac_workers 1 :  71s, 71s, 132s
> autovac_workers 2 :  37s, 32s,  48s
> autovac_workers 3 :  29s, 26s,  38s
> autovac_workers 5 :  20s, 19s,  19s
> autovac_workers 10:  13s,  8s,   9s
>
> 10000 tables:
> autovac_workers 1 : 158s, 157s, 290s
> autovac_workers 2 :  80s,  53s, 151s
> autovac_workers 3 :  75s,  67s,  89s
> autovac_workers 5 :  61s,  42s,  53s
> autovac_workers 10:  69s,  26s,  33s
>
> 20000 tables:
> autovac_workers 1 : 379s, 380s, 695s
> autovac_workers 2 : 236s, 232s, 369s
> autovac_workers 3 : 222s, 181s, 238s
> autovac_workers 5 : 212s, 132s, 167s
> autovac_workers 10: 317s,  91s, 117s
>
> I'm benchmarking the performance improvement by the patch on other
> workloads. I'll share that result.

+1
If you would like to try the patch I just posted, it would be very helpful.

Best regards,

> Regards,
>
> --
> Masahiko Sawada
> EnterpriseDB: https://www.enterprisedb.com/

--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com