On 5/24/21 9:53 AM, Masahiko Sawada wrote:
> On Sat, May 22, 2021 at 3:10 AM Tomas Vondra
> <tomas.von...@enterprisedb.com> wrote:
>> On 5/21/21 6:43 PM, Andres Freund wrote:
>>> Hi,
>>>
>>> ...
>>>
>>>> Attached are the flame graphs for all three cases. The change in master is
>>>> pretty clearly visible, but I don't see any clear difference between old and
>>>> patched code :-(
>>>
>>> I'm pretty sure it's the additional WAL records?
>> Not sure. If I understand what you suggested elsewhere in the thread, it
>> should be fine to modify heap_insert to pass the page recptr to
>> visibilitymap_set, roughly per the attached patch.
>>
>> I'm not sure it's correct, but it does eliminate the Heap2/VISIBILITY
>> records for me (when applied on top of your patch). Funnily enough it
>> does make it a wee bit slower:
>>
>> patch #1: 56941.505
>> patch #2: 58099.788
>>
>> I wonder if this might be due to -fno-omit-frame-pointer, though, as
>> without it I get these timings:
>>
>> 0c7d3bb99: 25540.417
>> master: 31868.236
>> patch #1: 26566.199
>> patch #2: 26487.943
>>
>> So without the frame pointers there's no slowdown, but there's no clear
>> improvement after removal of the WAL records either :-(
> Can we verify that the additional WAL records are the cause of this
> difference by making the matview unlogged by manually updating
> relpersistence = 'u'?
>
> Here are the results of benchmarks with unlogged matviews on my environment:
>
> 1) head: 22.927 sec
> 2) head w/ Andres’s patch: 16.629 sec
> 3) before 39b66a91b commit: 15.377 sec
> 4) head w/o freezing tuples: 14.551 sec
>
> And here are the results of logged matviews ICYMI:
>
> 1) head: 42.397 sec
> 2) head w/ Andres’s patch: 34.857 sec
> 3) before 39b66a91b commit: 32.556 sec
> 4) head w/o freezing tuples: 32.752 sec
>
> There seems to be no difference in the tendency. Which means the additional
> WAL is not the culprit?
Yeah, I agree the WAL does not seem to be the culprit here.
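(Side note for anyone else reproducing the unlogged-matview numbers: I
assume that's just a direct catalog hack on a scratch cluster, something
like this, with mv_test being a made-up matview name:

    UPDATE pg_class SET relpersistence = 'u' WHERE relname = 'mv_test';

That bypasses the normal DDL path, so it's only safe on a throwaway
instance, but it should be enough to make the subsequent REFRESH build the
new heap as unlogged and skip WAL for it.)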

The patch I posted skips the WAL logging entirely (verified by
pg_waldump, although I have not mentioned that), and there's no clear
improvement. (FWIW I'm not sure the patch is 100% correct, but it does
eliminate the extra WAL.)
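
(A quick way to double-check the WAL volume without reading pg_waldump
output is to measure it from SQL around the refresh. A rough psql sketch,
again using the made-up mv_test:

    SELECT pg_current_wal_lsn() AS before_lsn \gset
    REFRESH MATERIALIZED VIEW mv_test;
    SELECT pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), :'before_lsn'));

That only shows the total amount of WAL generated, not the record types, so
pg_waldump is still needed to confirm the Heap2/VISIBILITY records are gone.)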

The patch does not skip the whole visibilitymap_set call, however - it
still does the initial error checks. I wonder if that might play a role ...

Another option might be changes in the binary layout - a 5% change is well
within the range that could be attributed to this, but it feels very
hand-wavy and more like an excuse than real analysis.

> Interestingly, my previously proposed patch[1] performed better. With
> the patch, we skip all VM-related work on all insertions except for when
> inserting a tuple into a page for the first time.
>
> logged matviews: 31.591 sec
> unlogged matviews: 15.317 sec

Hmmm, thanks for reminding us of that patch. Why did we reject that
approach in favor of the current one?

I think at this point we have these two options:

1) Revert the freeze patches, either completely or just the heap_insert
part, which is what seems to be causing issues, and try again in PG15,
perhaps using a different approach, e.g. allowing freezing to be disabled
during the refresh, or something like that.

2) Polish and commit the pinning patch from Andres, which does reduce the
slowdown quite a bit, and either call it a day or continue with the
investigation / analysis regarding the remaining ~5% (but I personally
have no idea what might be the problem ...).

I'd like to keep the improvement, but I find the 5% regression rather
annoying and hard to defend, considering how much we fight for every
little improvement.

regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company