On Mon, Oct 26, 2015 at 10:52 AM, Kisung Kim <ks...@bitnine.co.kr> wrote:

> Because of the internal implementation of MVCC in PG,
> the update of a row is actually an insertion of a new row version.
> So if the size of a row is huge, then it incurs some overhead compared to
> an in-place update strategy.
>

Yeah, that's how an UPDATE works in Postgres under MVCC. The xmax of the old
row version is set to the updating transaction's XID, and a new row version is
inserted with an xmin equal to that same XID. So if you update tuple fields
one by one, the cost is going to be high.
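
You can watch this happen from psql; here is a minimal sketch (table and
column names are invented for illustration):

CREATE TABLE t (id int PRIMARY KEY, payload text);
INSERT INTO t VALUES (1, 'original');
SELECT xmin, xmax, id FROM t;   -- xmax is 0 while no newer version exists
UPDATE t SET payload = 'changed' WHERE id = 1;
SELECT xmin, xmax, id FROM t;   -- the visible row is a brand-new tuple version;
                                -- its xmin is the XID that ran the UPDATE, and
                                -- the old version's xmax was set to that same XID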


> Let's assume that a table has 200 columns,
> and a user updates one of the columns of one row in the table.
> Then PG will rewrite the whole contents of the updated row
> including the updated columns and not-updated columns.
>

When a table has a large number of columns, I would usually say that you
have a normalization problem, and such a schema could be split into a smaller
set of tables, minimizing the UPDATE cost.
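
For example (just a sketch, all names invented): if only one or two columns
are updated frequently, they can be moved into their own narrow table so that
an UPDATE rewrites a small tuple instead of the whole wide row:

CREATE TABLE item_static (
    item_id int PRIMARY KEY,
    col1    text,
    col2    text
    -- ... the many rarely-updated columns ...
);

CREATE TABLE item_counter (
    item_id int PRIMARY KEY REFERENCES item_static,
    hits    bigint NOT NULL DEFAULT 0
);

-- Only the narrow item_counter tuple gets a new version here:
UPDATE item_counter SET hits = hits + 1 WHERE item_id = 42;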


> I'm not sure about the implementation of Oracle's update.
> But if Oracle can overwrite only the updated column,
> the performance difference between Oracle and PG in that case may be
> significant.
>
> I researched this issue on the mailing list and Google,
> but I've not found anything related to it.
>

What you are looking at here is columnar storage; Alvaro and the 2ndQuadrant
folks have been doing some work in this area recently:
http://www.postgresql.org/message-id/20150831225328.GM2912@alvherre.pgsql
Also, you may want to have a look at cstore_fdw:
https://github.com/citusdata/cstore_fdw.
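
In case it helps, usage of cstore_fdw looks roughly like this (a sketch based
on my reading of its README; the table definition is invented, and the
extension also needs to be loaded via shared_preload_libraries first). As far
as I know it is aimed at analytics-style reads rather than at speeding up
updates:

-- one-time setup
CREATE EXTENSION cstore_fdw;
CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

-- columnar foreign table; a query only reads the columns it references
CREATE FOREIGN TABLE wide_events (
    event_id bigint,
    col1     text,
    col2     text
    -- ... more columns ...
)
SERVER cstore_server
OPTIONS (compression 'pglz');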
Regards,
-- 
Michael
