Hi,

On Fri, Nov 27, 2020 at 1:43 AM Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
>
>
> On 2020/11/26 10:41, Kasahara Tatsuhito wrote:
> > On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> >>
> >> On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
> >> <kasahara.tatsuh...@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> >>>>
> >>>> On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
> >>>> <kasahara.tatsuh...@gmail.com> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> >>>>> <kasahara.tatsuh...@gmail.com> wrote:
> >>>>>>> I wonder if we could have table_recheck_autovac do two probes of the stats
> >>>>>>> data.  First probe the existing stats data, and if it shows the table to
> >>>>>>> be already vacuumed, return immediately.  If not, *then* force a stats
> >>>>>>> re-read, and check a second time.
> >>>>>> Does the above mean that the second and subsequent table_recheck_autovac()
> >>>>>> calls will be improved to first check using the previously refreshed
> >>>>>> statistics?  I think that certainly works.
> >>>>>>
> >>>>>> If that's correct, I'll try to create a patch for the PoC.
> >>>>>
> >>>>> I still don't know how to reproduce Jim's troubles, but I was able to
> >>>>> reproduce what was probably a very similar problem.
> >>>>>
> >>>>> This problem seems to be more likely to occur in cases where you have
> >>>>> a large number of tables, i.e., a large amount of stats, and many small
> >>>>> tables need VACUUM at the same time.
> >>>>>
> >>>>> So I followed Tom's advice and created a patch for the PoC.
> >>>>> This patch adds a flag to the table_recheck_autovac function: if, on a
> >>>>> check made right after the stats have been refreshed, VACUUM (or ANALYZE)
> >>>>> has already been done by another worker, the existing stats will be used
> >>>>> next time.
> >>>>> If the tables continue to require VACUUM after a refresh, then the stats
> >>>>> will be refreshed again instead of using the existing statistics.
> >>>>>
> >>>>> I did a simple test with HEAD and HEAD + this PoC patch.
> >>>>> The tests were conducted in two cases.
> >>>>> (I changed a few configuration parameters; see the attached scripts.)
> >>>>>
> >>>>> 1. Normal VACUUM case
> >>>>>   - SET autovacuum = off
> >>>>>   - CREATE tables with 100 rows
> >>>>>   - DELETE 90 rows from each table
> >>>>>   - SET autovacuum = on and restart PostgreSQL
> >>>>>   - Measure the time it takes for all tables to be VACUUMed
> >>>>>
> >>>>> 2. Anti-wraparound VACUUM case
> >>>>>   - CREATE blank tables
> >>>>>   - SELECT from all of these tables (to generate stats)
> >>>>>   - SET autovacuum_freeze_max_age to a low value and restart PostgreSQL
> >>>>>   - Consume a lot of XIDs by using txid_current()
> >>>>>   - Measure the time it takes for all tables to be VACUUMed
> >>>>>
> >>>>> For each test case, the following results were obtained by changing
> >>>>> autovacuum_max_workers to 1, 2, 3 (default), 5 and 10,
> >>>>> and by changing the number of tables to 1000, 5000, 10000 and 20000.
> >>>>>
> >>>>> Due to the poor VM environment (2 vCPU / 4 GB), the results are a little
> >>>>> unstable, but I think they are enough to show the trend.
> >>>>>
> >>>>> ===========================================================================
> >>>>> [1. Normal VACUUM case]
> >>>>> tables:1000
> >>>>>  autovacuum_max_workers  1: (HEAD)  20 sec VS (with patch)  20 sec
> >>>>>  autovacuum_max_workers  2: (HEAD)  18 sec VS (with patch)  16 sec
> >>>>>  autovacuum_max_workers  3: (HEAD)  18 sec VS (with patch)  16 sec
> >>>>>  autovacuum_max_workers  5: (HEAD)  19 sec VS (with patch)  17 sec
> >>>>>  autovacuum_max_workers 10: (HEAD)  19 sec VS (with patch)  17 sec
> >>>>>
> >>>>> tables:5000
> >>>>>  autovacuum_max_workers  1: (HEAD)  77 sec VS (with patch)  78 sec
> >>>>>  autovacuum_max_workers  2: (HEAD)  61 sec VS (with patch)  43 sec
> >>>>>  autovacuum_max_workers  3: (HEAD)  38 sec VS (with patch)  38 sec
> >>>>>  autovacuum_max_workers  5: (HEAD)  45 sec VS (with patch)  37 sec
> >>>>>  autovacuum_max_workers 10: (HEAD)  43 sec VS (with patch)  35 sec
> >>>>>
> >>>>> tables:10000
> >>>>>  autovacuum_max_workers  1: (HEAD) 152 sec VS (with patch) 153 sec
> >>>>>  autovacuum_max_workers  2: (HEAD) 119 sec VS (with patch)  98 sec
> >>>>>  autovacuum_max_workers  3: (HEAD)  87 sec VS (with patch)  78 sec
> >>>>>  autovacuum_max_workers  5: (HEAD) 100 sec VS (with patch)  66 sec
> >>>>>  autovacuum_max_workers 10: (HEAD)  97 sec VS (with patch)  56 sec
> >>>>>
> >>>>> tables:20000
> >>>>>  autovacuum_max_workers  1: (HEAD) 338 sec VS (with patch) 339 sec
> >>>>>  autovacuum_max_workers  2: (HEAD) 231 sec VS (with patch) 229 sec
> >>>>>  autovacuum_max_workers  3: (HEAD) 220 sec VS (with patch) 191 sec
> >>>>>  autovacuum_max_workers  5: (HEAD) 234 sec VS (with patch) 147 sec
> >>>>>  autovacuum_max_workers 10: (HEAD) 320 sec VS (with patch) 113 sec
> >>>>>
> >>>>> [2. Anti-wraparound VACUUM case]
> >>>>> tables:1000
> >>>>>  autovacuum_max_workers  1: (HEAD)  19 sec VS (with patch)  18 sec
> >>>>>  autovacuum_max_workers  2: (HEAD)  14 sec VS (with patch)  15 sec
> >>>>>  autovacuum_max_workers  3: (HEAD)  14 sec VS (with patch)  14 sec
> >>>>>  autovacuum_max_workers  5: (HEAD)  14 sec VS (with patch)  16 sec
> >>>>>  autovacuum_max_workers 10: (HEAD)  16 sec VS (with patch)  14 sec
> >>>>>
> >>>>> tables:5000
> >>>>>  autovacuum_max_workers  1: (HEAD)  69 sec VS (with patch)  69 sec
> >>>>>  autovacuum_max_workers  2: (HEAD)  66 sec VS (with patch)  47 sec
> >>>>>  autovacuum_max_workers  3: (HEAD)  59 sec VS (with patch)  37 sec
> >>>>>  autovacuum_max_workers  5: (HEAD)  39 sec VS (with patch)  28 sec
> >>>>>  autovacuum_max_workers 10: (HEAD)  39 sec VS (with patch)  29 sec
> >>>>>
> >>>>> tables:10000
> >>>>>  autovacuum_max_workers  1: (HEAD) 139 sec VS (with patch) 138 sec
> >>>>>  autovacuum_max_workers  2: (HEAD) 130 sec VS (with patch)  86 sec
> >>>>>  autovacuum_max_workers  3: (HEAD) 120 sec VS (with patch)  68 sec
> >>>>>  autovacuum_max_workers  5: (HEAD)  96 sec VS (with patch)  41 sec
> >>>>>  autovacuum_max_workers 10: (HEAD)  90 sec VS (with patch)  39 sec
> >>>>>
> >>>>> tables:20000
> >>>>>  autovacuum_max_workers  1: (HEAD) 313 sec VS (with patch) 331 sec
> >>>>>  autovacuum_max_workers  2: (HEAD) 209 sec VS (with patch) 201 sec
> >>>>>  autovacuum_max_workers  3: (HEAD) 227 sec VS (with patch) 141 sec
> >>>>>  autovacuum_max_workers  5: (HEAD) 236 sec VS (with patch)  88 sec
> >>>>>  autovacuum_max_workers 10: (HEAD) 309 sec VS (with patch)  74 sec
> >>>>> ===========================================================================
> >>>>>
> >>>>> In the cases without the patch, worker scalability decreases as the
> >>>>> number of tables increases; in fact, the more workers there are, the
> >>>>> longer it takes to complete VACUUM on all the tables.
> >>>>> The cases with the patch show good scalability with respect to the
> >>>>> number of workers.
> >>>>
> >>>> It seems a good performance improvement even without the patch of
> >>>> shared memory based stats collector.
> > Sounds great!
> >
> >>>>
> >>>>>
> >>>>> Note that perf top results showed that hash_search_with_hash_value,
> >>>>> hash_seq_search and pgstat_read_statsfiles are dominant during VACUUM
> >>>>> in all patterns, with or without the patch.
> >>>>>
> >>>>> Therefore, there is still a need to find ways to optimize the reading
> >>>>> of large amounts of stats.
> >>>>> However, this patch is effective in its own right, and since there are
> >>>>> only a few parts to modify, I think it should be able to be applied to
> >>>>> current (preferably pre-v13) PostgreSQL.
> >>>>
> >>>> +1
> >>>>
> >>>> +
> >>>> +        /* We might be better to refresh stats */
> >>>> +        use_existing_stats = false;
> >>>>      }
> >>>> +    else
> >>>> +    {
> >>>>
> >>>> -    heap_freetuple(classTup);
> >>>> +        heap_freetuple(classTup);
> >>>> +        /* The relid has already been vacuumed, so we might be better to use existing stats */
> >>>> +        use_existing_stats = true;
> >>>> +    }
> >>>>
> >>>> With that patch, the autovacuum process refreshes the stats in the
> >>>> next check if it finds out that the table still needs to be vacuumed.
> >>>> But I guess it's not necessarily true because the next table might be
> >>>> vacuumed already. So I think we might want to always use the existing
> >>>> stats for the first check. What do you think?
> >>> Thanks for your comment.
> >>>
> >>> If we assume the case where some workers vacuum large tables
> >>> and a single worker vacuums small tables, the processing
> >>> performance of the single worker will be slightly lower if the
> >>> existing statistics are checked every time.
> >>>
> >>> In fact, at first I tried to check the existing stats every time,
> >>> but the performance was slightly worse in cases with a small number of
> >>> workers.
> > Do you have this benchmark result?
> >
> >>> (Checking the existing stats is lightweight, but at high frequency,
> >>> it affects processing performance.)
> >>> Therefore, it is only after refreshing the statistics that we determine
> >>> whether autovacuum should use the existing statistics next time.
> >>
> >> Yeah, since the test you used uses a lot of small tables, if there are
> >> a few workers, checking the existing stats is unlikely to return true
> >> (no need to vacuum). So the cost of the existing stats check ends up
> >> being overhead. Not sure how slow always checking the existing stats
> >> was, but given that the shared memory based stats collector patch could
> >> improve the performance of refreshing stats, it might be better not to
> >> check the existing stats frequently like the patch does. Anyway, I
> >> think it's better to evaluate the performance improvement with other
> >> cases too.
> > Yeah, I would like to see how much the performance changes in other cases.
> > In addition, if the shared-memory-based stats patch is applied, we won't
> > need to reload a huge stats file; we will just have to check the stats
> > in shared memory every time.
> > Perhaps the logic of table_recheck_autovac could then be simpler.
> >
> >>> BTW, I found some typos in the comments, so I attach a fixed version.
>
> The patch adds some duplicated code into table_recheck_autovac().
> It would be better to move that code into a common function and have
> table_recheck_autovac() call it, to simplify the code.

Thanks for your comment.

I've moved the duplicated part into a common function; the patch is attached.
Could you confirm that it matches what you expected?
>
> > +             /*
> > +              * Get the applicable reloptions. If it is a TOAST table, try to get the
> > +              * main table reloptions if the toast table itself doesn't have any.
> > +              */
> > +             avopts = extract_autovac_opts(classTup, pg_class_desc);
> > +             if (classForm->relkind == RELKIND_TOASTVALUE &&
> > +                     avopts == NULL && table_toast_map != NULL)
> > +             {
> > +                     av_relation *hentry;
> > +                     bool            found;
> > +
> > +                     hentry = hash_search(table_toast_map, &relid, HASH_FIND, &found);
> > +                     if (found && hentry->ar_hasrelopts)
> > +                             avopts = &hentry->ar_reloptions;
> > +             }
>
> The above is performed both when using the existing stats and
> also when the stats are refreshed. But it's actually required
> only once?

Yeah, right. Fixed.

> -             heap_freetuple(classTup);
> +     heap_freetuple(classTup);
>
> With the patch, heap_freetuple() is not called when either doanalyze
> or dovacuum is true. But it should be called even in that case,
> like it is originally?

Yeah, right. Fixed.

Best regards,

-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com
v3_mod_table_recheck_autovac.patch