On 5/24/21 9:53 AM, Masahiko Sawada wrote:
On Sat, May 22, 2021 at 3:10 AM Tomas Vondra
<tomas.von...@enterprisedb.com> wrote:

On 5/21/21 6:43 PM, Andres Freund wrote:
Hi,

  > ...
  >
Attached are the flame graphs for all three cases. The change in master is
pretty clearly visible, but I don't see any clear difference between old and
patched code :-(

I'm pretty sure it's the additional WAL records?


Not sure. If I understand what you suggested elsewhere in the thread, it
should be fine to modify heap_insert to pass the page recptr to
visibilitymap_set, roughly per the attached patch.

I'm not sure it's correct, but it does eliminate the Heap2/VISIBILITY
records for me (when applied on top of your patch). Funnily enough it
does make it a wee bit slower:

patch #1: 56941.505
patch #2: 58099.788

I wonder if this might be due to -fno-omit-frame-pointer, though, as
without it I get these timings:

0c7d3bb99: 25540.417
master:    31868.236
patch #1:  26566.199
patch #2:  26487.943

So without the frame pointers there's no slowdown, but there's no clear
improvement after removal of the WAL records either :-(
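
For what it's worth, the relative numbers behind these timings can be worked out with a quick sketch like the one below. It assumes 0c7d3bb99 is the pre-change baseline, which is how the comparison reads here; the variable and function names are made up for illustration, and the unit of the timings is not stated in the thread (the ratios do not depend on it).

```python
# Timings copied from the list above (build without -fno-omit-frame-pointer).
# Assumption: 0c7d3bb99 is the baseline the other builds are compared against.
timings = {
    "0c7d3bb99": 25540.417,
    "master": 31868.236,
    "patch #1": 26566.199,
    "patch #2": 26487.943,
}


def slowdown_pct(baseline, value):
    """Return the percentage by which `value` exceeds `baseline`."""
    return (value - baseline) / baseline * 100.0


baseline = timings["0c7d3bb99"]
slowdowns = {name: slowdown_pct(baseline, t) for name, t in timings.items()}

for name, pct in slowdowns.items():
    print(f"{name:10s} {pct:+6.1f}%")
```

By this arithmetic master is roughly 25% slower than 0c7d3bb99, while both patched builds land at about 4%, with only a fraction of a percent between them.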

Can we verify that the additional WAL records are the cause of this
difference by making the matview unlogged by manually updating
relpersistence = 'u'?

Here are the results of benchmarks with unlogged matviews on my environment:

1) head: 22.927 sec
2) head w/ Andres’s patch: 16.629 sec
3) before 39b66a91b commit: 15.377 sec
4) head w/o freezing tuples: 14.551 sec

And here are the results of logged matviews ICYMI:

1) head: 42.397 sec
2) head w/ Andres’s patch: 34.857 sec
3) before 39b66a91b commit: 32.556 sec
4) head w/o freezing tuples: 32.752 sec

There seems to be no difference in the trend between the two, which
suggests the additional WAL is not the culprit?
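
One rough way to compare the two sets of results is to express each run relative to the "before 39b66a91b commit" run of the same kind. This is only a sketch over the numbers quoted above; the dictionary keys and the helper name are invented for the example.

```python
# Results in seconds, copied from the two lists above.
runs = {
    "unlogged": {
        "head": 22.927,
        "head w/ Andres's patch": 16.629,
        "before 39b66a91b": 15.377,
        "head w/o freezing": 14.551,
    },
    "logged": {
        "head": 42.397,
        "head w/ Andres's patch": 34.857,
        "before 39b66a91b": 32.556,
        "head w/o freezing": 32.752,
    },
}


def relative_to_old(results):
    """Percent change of each run versus the pre-39b66a91b run."""
    old = results["before 39b66a91b"]
    return {name: (t - old) / old * 100.0 for name, t in results.items()}


relative = {kind: relative_to_old(results) for kind, results in runs.items()}

for kind, changes in relative.items():
    for name, pct in changes.items():
        print(f"{kind:9s} {name:24s} {pct:+6.1f}%")
```

With Andres's patch the remaining slowdown comes out at about 8% for unlogged and about 7% for logged matviews, so the two cases look similar in relative terms.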


Yeah, I agree the WAL does not seem to be the culprit here.

The patch I posted skips the WAL logging entirely (verified with pg_waldump, although I did not mention that earlier), and there's no clear improvement. (FWIW I'm not sure the patch is 100% correct, but it does eliminate the extra WAL.)

The patch, however, does not skip visibilitymap_set as a whole; it still does the initial error checks. I wonder if that might play a role ...

Another possible explanation is a change in the binary layout. A 5% change is well within the range that could be attributed to this, but it feels very hand-wavy and more like an excuse than real analysis.

Interestingly, my previously proposed patch[1] performed better.
With that patch, we skip all VM-related work on all insertions except
when inserting a tuple into a page for the first time.

logged matviews: 31.591 sec
unlogged matviews: 15.317 sec


Hmmm, thanks for reminding us of that patch. Why did we reject that approach in favor of the current one?

I think at this point we have these two options:

1) Revert the freeze patches, either completely or just the heap_insert part, which is what seems to be causing issues. Then try again in PG15, perhaps using a different approach, allowing freezing to be disabled in refresh, or something like that.

2) Polish and commit the pinning patch from Andres, which does reduce the slowdown quite a bit. Then either call it a day, or continue with the investigation / analysis regarding the remaining ~5% (but I personally have no idea what might be the problem ...).


I'd like to keep the improvement, but I find the 5% regression rather annoying and hard to defend, considering how much we fight for every little improvement.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

