Re: zheap: a new storage format for PostgreSQL

Amit Kapila Wed, 21 Nov 2018 19:37:20 -0800

On Tue, Nov 20, 2018 at 12:53 PM Darafei "Komяpa" Praliaskouski
<[email protected]> wrote:
>>
>> > In PostGIS workloads, UPDATE table SET geom = ST_CostyFunction(geom, 
>> > magicnumber); is one of biggest time-eaters that happen upon initial load 
>> > and clean up of your data. It is commonly followed by CLUSTER table using 
>> > table_geom_idx; to make sure you're back at full speed and no VACUUM is 
>> > needed, and your table (usually static after that) is more-or-less 
>> > spatially ordered. I see that zheap can remove the need for VACUUM, which 
>> > is a big win already. If you can do something that will allow reorder of 
>> > tuples according to index happen during an UPDATE that rewrites most of 
>> > table, that would be a game changer :)
>> >
>>
>> If the tuples are already in the order of the index, then we would
>> retain the order, otherwise, we might not want to anything special for
>> ordering w.r.t index.  I think this is important as we are not sure of
>> the user's intention and I guess it won't be easy to do such
>> rearrangement during Update statement.
>
>
> User's clustering intention is recorded in existence of CLUSTER index over 
> table. That's not used by anything other than CLUSTER command now though.
>
> When I was looking into current heap implementation it seemed that it's 
> possible to hook in a lookup for a couple blocks with values adjacent to the 
> new value, and prefer them to FSM lookup and "current page", for clustered 
> table. Due to dead tuples, free space is going to end very very soon in usual 
> heap, so it probably doesn't make sense there - you're consuming space with 
> old one in old page and new one in new page.
>
> If I understand correctly, in zheap an update would not result in a dead 
> tuple in old page, so space is not going to end immediately, and this may 
> unblock path for such further developments. That is, if there is a spot where 
> to plug in such or similar logic in code :)
>


Yeah, in zheap the dead tuples will be less or may not be there in
many cases, but I am not sure how much it can help for your use case.

> I've described the business case in [1].
>

I am not sure but maybe you need something like Clustered Index where
heap pages are linked via leaf pages of btree.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: zheap: a new storage format for PostgreSQL

Reply via email to