Hi Tomas,
> I'm a bit confused by the changes to TAP tests. Per the patch summary,
> some .pl files get renamed (not sure why), a new one is added, etc.
I added a new TAP test case, enabled the streaming = true option inside the old stream_* ones, and shifted the streaming test numbers by +2 because of the collision between 009_matviews.pl / 009_stream_simple.pl and 010_truncate.pl / 010_stream_subxact.pl. At least in the previous version of the patch they were under the same numbers. Nothing special, but for simplicity, please find my new TAP test attached separately.
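In case it helps to see what these stream_* cases boil down to, here is a minimal sketch of such a test. It is illustrative only: the node, table and subscription names are made up, and the streaming subscription option and the decoding work_mem GUC are spelled the way I understand them to be in the patch, so they may differ from the attached file and from the renamed tests.

use strict;
use warnings;
use PostgresNode;
use TestLib;
use Test::More tests => 1;

# Publisher keeps logical WAL; lower the decoding memory threshold so a
# moderately sized transaction is streamed before commit (GUC name as in
# the patch, it may differ between patch versions).
my $node_publisher = get_new_node('publisher');
$node_publisher->init(allows_streaming => 'logical');
$node_publisher->append_conf('postgresql.conf',
	'logical_decoding_work_mem = 64kB');
$node_publisher->start;

my $node_subscriber = get_new_node('subscriber');
$node_subscriber->init;
$node_subscriber->start;

# The same table on both sides.
$node_publisher->safe_psql('postgres',
	"CREATE TABLE test_tab (a int PRIMARY KEY, b text)");
$node_subscriber->safe_psql('postgres',
	"CREATE TABLE test_tab (a int PRIMARY KEY, b text)");

# Publication plus a subscription with streaming enabled.
my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
$node_publisher->safe_psql('postgres',
	"CREATE PUBLICATION tap_pub FOR TABLE test_tab");
$node_subscriber->safe_psql('postgres',
	"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' "
	. "PUBLICATION tap_pub WITH (streaming = on)");

# A transaction large enough to exceed the threshold and be streamed.
$node_publisher->safe_psql('postgres',
	"INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(1, 5000) s(i)");
$node_publisher->wait_for_catchup('tap_sub');

my $result = $node_subscriber->safe_psql('postgres',
	"SELECT count(*) FROM test_tab");
is($result, qq(5000), 'streamed transaction applied on subscriber');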
> So I've instead enabled streaming subscriptions in all tests, which with
> this patch produces two failures:
>
> Test Summary Report
> -------------------
> t/004_sync.pl             (Wstat: 7424 Tests: 1 Failed: 0)
>   Non-zero exit status: 29
>   Parse errors: Bad plan. You planned 7 tests but ran 1.
> t/011_stream_ddl.pl       (Wstat: 256 Tests: 2 Failed: 1)
>   Failed test: 2
>   Non-zero exit status: 1
>
> So yeah, there's more stuff to fix. But I can't directly apply your fixes
> because the updated patches are somewhat different.
My fixes should apply cleanly to the previous version of your patch. Also, I am not sure it is a good idea to simply enable streaming subscriptions in all tests (e.g. in t/004_sync.pl, which predates the streaming patch), since then they no longer exercise the non-streaming code path.
>>> Interesting. Any idea where the extra overhead in this particular case
>>> comes from? It's hard to deduce that from the single flame graph, when
>>> I don't have anything to compare it with (i.e. the flame graph for the
>>> "normal" case).
>>
>> I guess that the bottleneck is in disk operations. You can check the
>> logical_repl_worker_new_perf.svg flame graph: disk reads (~9%) and
>> writes (~26%) take around 35% of CPU time combined. To compare, please
>> see the attached flame graph for the following transaction:
>>
>> INSERT INTO large_text
>>     SELECT (SELECT string_agg('x', ',') FROM generate_series(1, 2000))
>>     FROM generate_series(1, 1000000);
>>
>> Execution Time: 44519.816 ms
>> Time: 98333,642 ms (01:38,334)
>>
>> where disk I/O is only ~7-8% in total. So we get very roughly the same
>> ~x4-5 performance drop here. JFYI, I am using a machine with an SSD for
>> these tests.
>>
>> Therefore, you could probably write changes on the receiver in bigger
>> chunks, not each change separately.
>
> Possibly, I/O is certainly a possible culprit, although we should be
> using buffered I/O and there certainly are not any fsyncs here. So I'm
> not sure why it would be cheaper to do the writes in batches.
>
> BTW does this mean you see the overhead on the apply side? Or are you
> running this on a single machine, and it's difficult to decide?
I run this on a single machine, but the walsender and the apply worker each utilize almost 100% of a CPU core all the time, and on the apply side I/O syscalls take about 1/3 of the CPU time. I am still not sure, but to me this result links the performance drop to problems on the receiver side.
Writing in batches was just a hypothesis; to validate it I performed a test with a large transaction consisting of a smaller number of wide rows. That transaction was streamed too, yet it does not exhibit any significant performance drop, so the hypothesis seems to hold. Anyway, I do not have other reasonable ideas besides that right now.
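For what it's worth, a driver along the following lines could be used to repeat both kinds of transactions against a streaming subscription. It is only a sketch: the node and publication names are made up, and the row count and width in the few_wide workload are placeholders rather than the exact values from my runs.

use strict;
use warnings;
use PostgresNode;
use Time::HiRes qw(time);

my $publisher = get_new_node('perf_publisher');
$publisher->init(allows_streaming => 'logical');
$publisher->start;

my $subscriber = get_new_node('perf_subscriber');
$subscriber->init;
$subscriber->start;

# Same table on both sides, as in the flame graph test above.
$publisher->safe_psql('postgres', "CREATE TABLE large_text (t text)");
$subscriber->safe_psql('postgres', "CREATE TABLE large_text (t text)");

my $connstr = $publisher->connstr . ' dbname=postgres';
$publisher->safe_psql('postgres',
	"CREATE PUBLICATION perf_pub FOR TABLE large_text");
$subscriber->safe_psql('postgres',
	"CREATE SUBSCRIPTION perf_sub CONNECTION '$connstr' "
	. "PUBLICATION perf_pub WITH (streaming = on)");

# many_narrow: 1M rows of ~4 kB each (the case with the big slowdown).
# few_wide: fewer but much wider rows (placeholder sizes).
my %workloads = (
	many_narrow => "INSERT INTO large_text "
		. "SELECT (SELECT string_agg('x', ',') FROM generate_series(1, 2000)) "
		. "FROM generate_series(1, 1000000)",
	few_wide => "INSERT INTO large_text "
		. "SELECT (SELECT string_agg('x', ',') FROM generate_series(1, 200000)) "
		. "FROM generate_series(1, 10000)",
);

foreach my $name (sort keys %workloads)
{
	$publisher->safe_psql('postgres', "TRUNCATE large_text");
	my $t0 = time();
	$publisher->safe_psql('postgres', $workloads{$name});
	# NB: wait_for_catchup polls with a default timeout that very large
	# transactions can exceed.
	$publisher->wait_for_catchup('perf_sub');
	printf "%s: %.1f s until the subscriber caught up\n",
		$name, time() - $t0;
}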
Regards

--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company
Attachment: 0xx_stream_tough_ddl.pl (Perl program)