On 06 December 2013 11:57 Amit Kapila wrote: > On Fri, Nov 29, 2013 at 6:55 PM, Haribabu kommi > <haribabu.ko...@huawei.com> wrote: > > On 29 November 2013 12:00 Amit Kapila wrote: > >> On Tue, Nov 26, 2013 at 7:26 PM, Haribabu kommi > >> <haribabu.ko...@huawei.com> wrote: > >> Few questions about your latest patch: > >> a. Is there any reason why you are doing estimation of dead tuples > >> only for Autovacuum and not for Vacuum. > > > > No, changed. > > > >> /* clear and get the new stats for calculating proper dead tuples */ > >> pgstat_clear_snapshot(); tabentry = > >> pgstat_fetch_stat_tabentry(RelationGetRelid(onerel)); > >> b. In the above code, to get latest data you are first clearing > >> snapshot and then calling pgstat function. It will inturn perform > I/O > >> (read of stats file) and send/receive message from stats > >> collector to ensure it can read latest data. I think it will add > overhead > >> to Vacuum, especially if 'nkeep' calculated in function > >> lazy_scan_heap() can serve the purpose. In my simple test[1], I > >> observed > >> that value of keep can serve the purpose. > >> > >> Can you please once try the test on 'nkeep' approach patch. > > > > Using the nkeep and snapshot approach, I ran the test for 40 mins > with > > a high analyze_threshold and results are below. > > > > Auto vacuum count Bloat size > > Master 11 220MB > > Patched_nkeep 14 215MB > > Patched_snapshot 18 198MB > > > > Both the approaches are showing good improvement in the test. > > Updated patches, test script and configuration is attached in the > mail. > > I think updating dead tuple count using nkeep is good idea as similar > thing is done for Analyze as well in acquire_sample_rows(). > One minor point, I think it is better to log dead tuples is below error > message: > ereport(LOG, > (errmsg("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n" > "pages: %d removed, %d remain\n" > "tuples: %.0f removed, %.0f remain\n" > > "tuples: %.0f removed, %.0f remain, %.0f dead\n" > > > About your test, how to collect the data by running this script, are > you manually stopping it after 40 mins, because I ran it for more than > an hour, the final result didn't came. > As I mentioned you last time, please simplify your test, for other > person in its current form, it is difficult to make meaning out of it. > Write comments on top of it in steps form to explain what exactly it is > doing and how to take data using it (for example, do I need to wait, > till script ends; how long this test can take to complete).
A simplified test and updated patch by taking care the above comment are attached in the mail. I am not able to reduce the test duration but changed as the test automatically exists after 45 mins run. Please check vacuum_test.sh file more details for running the test. Auto vacuum count Bloat size Master 15 220MB Patched_nkeep 18 213MB Please let me know your suggestions. Regards, Hari babu.
vacuum_fix_v7_nkeep.patch
Description: vacuum_fix_v7_nkeep.patch
test_script.tar.gz
Description: test_script.tar.gz
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers