On Tue, Jan 24, 2017 at 1:49 AM, Claudio Freire <klaussfre...@gmail.com> wrote:
> On Fri, Jan 20, 2017 at 6:24 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
>> On Thu, Jan 19, 2017 at 8:31 PM, Claudio Freire <klaussfre...@gmail.com> wrote:
>>> On Thu, Jan 19, 2017 at 6:33 AM, Anastasia Lubennikova
>>> <a.lubennik...@postgrespro.ru> wrote:
>>>> 28.12.2016 23:43, Claudio Freire:
>>>>
>>>> Attached v4 patches with the requested fixes.
>>>>
>>>> Sorry for being late, but the tests took a lot of time.
>>>
>>> I know. Takes me several days to run my test scripts once.
>>>
>>>> create table t1 as select i, md5(random()::text) from
>>>>     generate_series(0,400000000) as i;
>>>> create index md5_idx ON t1(md5);
>>>> update t1 set md5 = md5((random() * (100 + 500))::text);
>>>> vacuum;
>>>>
>>>> Patched vacuum used 2.9Gb of memory and vacuumed the index in one pass,
>>>> while the old version took three passes (1GB+1GB+0.9GB).
>>>> Vacuum duration results:
>>>>
>>>> vanilla:
>>>>   LOG: duration: 4359006.327 ms  statement: vacuum verbose t1;
>>>> patched:
>>>>   LOG: duration: 3076827.378 ms  statement: vacuum verbose t1;
>>>>
>>>> We can see a 30% vacuum speedup. I should note that this case can be
>>>> considered favorable to vanilla vacuum: the table is not that big, it
>>>> has just one index, and the disk used is a fast FusionIO. We can expect
>>>> even more gain on slower disks.
>>>>
>>>> Thank you again for the patch. Hope to see it in 10.0.
>>>
>>> Cool. Thanks for the review and the tests.
>>
>> I encountered a bug with the following scenario.
>> 1. Create a table and disable autovacuum on that table.
>> 2. Make about 200000 dead tuples on the table.
>> 3. SET maintenance_work_mem TO 1024
>> 4. VACUUM
>>
>> @@ -729,7 +759,7 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
>>            * not to reset latestRemovedXid since we want that value to be
>>            * valid.
>>            */
>> -          vacrelstats->num_dead_tuples = 0;
>> +          lazy_clear_dead_tuples(vacrelstats);
>>            vacrelstats->num_index_scans++;
>>
>>            /* Report that we are once again scanning the heap */
>>
>> I think that we should do vacrelstats->dead_tuples.num_entries = 0 as
>> well in lazy_clear_dead_tuples(). Once the amount of dead tuples
>> reaches maintenance_work_mem, lazy_scan_heap can never finish.
>
> That's right.
>
> I added a test for it in the attached patch set, which uncovered
> another bug in lazy_clear_dead_tuples, and took the opportunity to
> rebase.
>
> On Mon, Jan 23, 2017 at 1:06 PM, Alvaro Herrera
> <alvhe...@2ndquadrant.com> wrote:
>> I pushed this patch after rewriting it rather completely. I added
>> tracing notices to inspect the blocks it was prefetching and observed
>> that the original coding was failing to prefetch the final streak of
>> blocks in the table, which is an important oversight considering that
>> it may very well be that those are the only blocks to read at all.
>>
>> I timed vacuuming a 4000-block table on my laptop (single SSD disk;
>> dropped FS caches after deleting all rows in the table, so that vacuum
>> has to read all blocks from disk); it changes from 387ms without the
>> patch to 155ms with the patch. I didn't measure how long the other
>> steps of the vacuum take, but it's clear that the speedup for the
>> truncation phase is considerable.
>>
>> ¡Thanks, Claudio!
>
> Cool.
>
> Though it wasn't the first time this idea has been floating around, I
> can't take all the credit.
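To make my lazy_clear_dead_tuples() point above concrete, the reset I had
in mind is roughly the following. This is only a sketch written against the
struct excerpts I quote further down, not the code in your attached patch
set, and whether the per-segment allocations should be kept for reuse or
freed here is something I'm glossing over:

static void
lazy_clear_dead_tuples(LVRelStats *vacrelstats)
{
    DeadTuplesMultiArray *arr = &vacrelstats->dead_tuples;
    int         i;

    /* Empty every allocated segment, keeping its allocation for reuse */
    for (i = 0; i < arr->num_segs; i++)
        arr->dead_tuples[i].num_dead_tuples = 0;

    /*
     * Also reset the global counters; if num_entries keeps its old value
     * the array still looks full, and lazy_scan_heap() keeps hitting the
     * maintenance_work_mem limit forever.
     */
    arr->num_entries = 0;
    arr->last_seg = 0;
}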
> On Fri, Jan 20, 2017 at 6:25 PM, Alvaro Herrera
> <alvhe...@2ndquadrant.com> wrote:
>> FWIW, I think this patch is completely separate from the maint_work_mem
>> patch and should have had its own thread and its own commitfest entry.
>> I intend to get a look at the other patch next week, after pushing this
>> one.
>
> That's because it did have it, and was left in limbo due to lack of
> testing on SSDs. I just had to adopt it here because otherwise tests
> took way too long.
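For the archives, my mental model of the truncation prefetch is roughly the
fragment below. It is only an illustration of the idea as I understand it,
not the code that was pushed; the window size, the variable names, and the
use of rel_pages/nonempty_pages here are my assumptions:

    /* Backwards scan over the possibly-truncatable tail of the heap */
    BlockNumber blkno = vacrelstats->rel_pages;
    BlockNumber prefetched_upto = blkno;    /* blocks >= this were prefetched */
#define TRUNCATE_PREFETCH_SIZE 32           /* made-up window size */

    while (blkno > vacrelstats->nonempty_pages)
    {
        blkno--;

        /*
         * Once the scan drops below the prefetched range, issue prefetch
         * requests for the next batch of lower-numbered blocks, making sure
         * the final streak just above nonempty_pages is covered too.
         */
        if (blkno < prefetched_upto)
        {
            BlockNumber start = vacrelstats->nonempty_pages;
            BlockNumber pblkno;

            if (blkno > start + TRUNCATE_PREFETCH_SIZE)
                start = blkno - TRUNCATE_PREFETCH_SIZE;

            for (pblkno = start; pblkno <= blkno; pblkno++)
                PrefetchBuffer(onerel, MAIN_FORKNUM, pblkno);
            prefetched_upto = start;
        }

        /* ... ReadBufferExtended(onerel, MAIN_FORKNUM, blkno, ...) and
         * check whether the page still contains usable tuples ... */
    }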
Thank you for updating the patch!

+   /*
+    * Quickly rule out by lower bound (should happen a lot) Upper bound was
+    * already checked by segment search
+    */
+   if (vac_cmp_itemptr((void *) itemptr, (void *) rseg->dead_tuples) < 0)
+       return false;

I think that if the above result is 0, we can return true, since itemptr
then matches the lower-bound item pointer in rseg->dead_tuples (see the
small sketch at the end of this mail).

+typedef struct DeadTuplesSegment
+{
+   int         num_dead_tuples;    /* # of entries in the segment */
+   int         max_dead_tuples;    /* # of entries allocated in the segment */
+   ItemPointerData last_dead_tuple;    /* Copy of the last dead tuple (unset
+                                        * until the segment is fully
+                                        * populated) */
+   unsigned short padding;
+   ItemPointer dead_tuples;        /* Array of dead tuples */
+} DeadTuplesSegment;
+
+typedef struct DeadTuplesMultiArray
+{
+   int         num_entries;    /* current # of entries */
+   int         max_entries;    /* total # of slots that can be allocated in
+                                * array */
+   int         num_segs;       /* number of dead tuple segments allocated */
+   int         last_seg;       /* last dead tuple segment with data (or 0) */
+   DeadTuplesSegment *dead_tuples;     /* array of num_segs segments */
+} DeadTuplesMultiArray;

It's a matter of personal preference, but having several different fields
all named dead_tuples, each with a different meaning, confused me. If we
want to access the first dead tuple location of the first segment, we need
to write 'vacrelstats->dead_tuples.dead_tuples.dead_tuples'. For example,
'vacrelstats->dead_tuples.dt_segment.dt_array' reads better to me.

+           nseg->num_dead_tuples = 0;
+           nseg->max_dead_tuples = 0;
+           nseg->dead_tuples = NULL;
+           vacrelstats->dead_tuples.num_segs++;
+       }
+       seg = DeadTuplesCurrentSegment(vacrelstats);
+   }
+   vacrelstats->dead_tuples.last_seg++;
+   seg = DeadTuplesCurrentSegment(vacrelstats);

Because seg is always set again afterwards, I think the first
"seg = DeadTuplesCurrentSegment(vacrelstats);" line (the one inside the
braces) is not necessary. Thoughts?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
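P.S. To make my first point above concrete, what I had in mind is roughly
the following (untested, just sketching against the quoted snippet):

    int     cmp = vac_cmp_itemptr((void *) itemptr, (void *) rseg->dead_tuples);

    /*
     * Quickly rule out by lower bound (should happen a lot); the upper
     * bound was already checked by the segment search.
     */
    if (cmp < 0)
        return false;

    /*
     * itemptr is exactly the segment's first entry: it is a known dead
     * tuple, so we can return true without doing the binary search.
     */
    if (cmp == 0)
        return true;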