On 10/26/2015 05:49 AM, Amit Kapila wrote:
On Mon, Oct 26, 2015 at 9:03 AM, Любен Каравелов <karave...@mail.bg
<mailto:karave...@mail.bg>> wrote:
>
>
> ----- Цитат от Kisung Kim (ks...@bitnine.co.kr
<mailto:ks...@bitnine.co.kr>), на 26.10.2015 в 04:36 -----
>
> > However, what I want to know is about the update performance
difference
> > between PG and Oracle if there any.
> > The case I described is for exaggerating the difference between PG and
> > Oracle.
> >
> > I want to explain for our clients that PG's update performance is
> > comparable to Oracle's.
> >
>
> Oracle is also using MVCC but copies the old row in the rollback
segment and
> rewrites the values in-place.
I think Oracle just copies the changed part of old row to rollback
segment.
Also in Redo logs, it just writes the changed column value (both old and
new). So for the case we are discussing in this thread (one changed
column out of 200 columns), Oracle will just write the old value of that
column in Redo and then in rollback segment, and write the new value
in Redo and then do the in-place update in heap row.
IMV, where Oracle is heavily optimized for "most DML transactions will
commit successfully" and "no long-running transactions shall ever
exists" / "not many transactions will have to read previous
snapshots"(based on PI), Postgres does not actually make any such
assumptions.
Hence, for long running transactions / massive concurrency-many
clients reading and writing older snapshots, Postgres will be faster
(less work to do compared to re-constructing rows based on PIs)
Plus, for updates where the size of the NEW row is bigger than the
previous one (think adding text) the overhead is actually greater for
Oracle (plus, they don't compress variable length values by default / no
TOAST )... so here Postgres would be faster.
For text-intensive workloads, Postgres is measurably faster than
Oracle mostly due to this fact (plus much more efficient in it use of
storage/RAM...)
In PostgreSQL, the whole new row is written in heap and diff tuple
(difference
of old and new tuple; this optimization is done in 9.4, commit a3115f0d)
in WAL. I think we can try to optimize and use the same technique for
heap as used for WAL to make PostgreSQL more efficient for such
scenario's,
however as of today, my guess is that PostgreSQL's update would be lagging
in this area.
Yup. But see above for a potential reason where it might not be that
bad, especially after the optimization you mention.
> It is still 2 writes as in Postgres.
The difference is in the amount of data written per write.
Yes, but compressed (for varlena-based datum/data), batched
(group-commit) so mostly sequential, and non-duplicated (WAL vs REDO+UNDO).
So I guess the difference is quite small nowadays, and differences will
be heavily influenced by actual workload.
Just my 2 (euro-) cents.
/ J.L.