Hi,

Here are the parallel backup performance test results with and without the patch "parallel_backup_v15" in an AWS cloud environment. Two "t2.xlarge" machines were used, one for the Postgres server and the other for pg_basebackup, both with the machine configuration shown below.

Machine configuration:
    Instance Type      : t2.xlarge
    Volume type        : io1
    Memory             : 16GB
    vCPU #             : 4
    Architecture       : x86_64
    IOP                : 6000
    Database Size (GB) : 108

Performance test results:

without patch:
    real 18m49.346s
    user 1m24.178s
    sys  7m2.966s

1 worker with patch:
    real 18m43.201s
    user 1m55.787s
    sys  7m24.724s

2 worker with patch:
    real 18m47.373s
    user 2m22.970s
    sys  11m23.891s

4 worker with patch:
    real 18m46.878s
    user 2m26.791s
    sys  13m14.716s

As required, I didn't have pgbench running in parallel as we did in the previous benchmark.
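
To make the comparison easier to read, here is a minimal Python sketch that converts the `time` output above into seconds and compares each run against the unpatched one. The helper names (`to_seconds`, `summarize`) are made up for illustration and are not part of the patch or of any test tooling; the numbers are copied from the results above.

```python
import re

# Timings copied from the results above, as (real, user, sys).
RESULTS = {
    "without patch": ("18m49.346s", "1m24.178s", "7m2.966s"),
    "1 worker": ("18m43.201s", "1m55.787s", "7m24.724s"),
    "2 workers": ("18m47.373s", "2m22.970s", "11m23.891s"),
    "4 workers": ("18m46.878s", "2m26.791s", "13m14.716s"),
}


def to_seconds(stamp):
    """Convert a `time`-style value such as '18m49.346s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def summarize(results, baseline="without patch"):
    """Return (label, real_s, cpu_s, speedup) rows relative to the baseline."""
    base_real = to_seconds(results[baseline][0])
    rows = []
    for label, (real, user, sys_) in results.items():
        real_s = to_seconds(real)
        # CPU time is user + sys as reported by `time` on the client side.
        cpu_s = to_seconds(user) + to_seconds(sys_)
        rows.append((label, real_s, cpu_s, base_real / real_s))
    return rows


if __name__ == "__main__":
    for label, real_s, cpu_s, speedup in summarize(RESULTS):
        print(f"{label:<14} real={real_s:8.1f}s "
              f"cpu={cpu_s:7.1f}s speedup={speedup:.3f}x")
```

With these numbers the elapsed time stays within about 1% of the unpatched run for every worker count, while user + sys time grows from roughly 507 s (unpatched) to roughly 942 s (4 workers).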

The perf report files for both the Postgres server and pg_basebackup sides are attached.
The files are listed below, i.e. without the patch, and with the patch using 1, 2, and 4 workers.

perf report on Postgres server side:
    perf.data-postgres-without-parallel_backup_v15.txt
    perf.data-postgres-with-parallel_backup_v15-j1.txt
    perf.data-postgres-with-parallel_backup_v15-j2.txt
    perf.data-postgres-with-parallel_backup_v15-j4.txt

perf report on pg_basebackup side:
    perf.data-pg_basebackup-without-parallel_backup_v15.txt
    perf.data-pg_basebackup-with-parallel_backup_v15-j1.txt
    perf.data-pg_basebackup-with-parallel_backup_v15-j2.txt
    perf.data-pg_basebackup-with-parallel_backup_v15-j4.txt

If any more information is required, please let me know.

On 2020-04-21 7:12 a.m., Amit Kapila wrote:
On Tue, Apr 21, 2020 at 5:26 PM Ahsan Hadi <ahsan.h...@gmail.com> wrote:
On Tue, Apr 21, 2020 at 4:50 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
On Tue, Apr 21, 2020 at 5:18 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
On Tue, Apr 21, 2020 at 1:00 PM Asif Rehman <asifr.reh...@gmail.com> wrote:

I did some tests a while back, and here are the results. The tests were done to simulate a live database environment using pgbench.

Machine configuration used for this test:
    Instance Type      : t2.xlarge
    Volume Type        : io1
    Memory (MiB)       : 16384
    vCPU #             : 4
    Architecture       : X86_64
    IOP                : 16000
    Database Size (GB) : 102

The setup consists of 3 machines:
- one for database instances
- one for pg_basebackup client and
- one for pgbench with some parallel workers, simulating SELECT loads.

                      basebackup | 4 workers | 8 workers | 16 workers
Backup Duration(Min):   69.25    |   20.44   |   19.86   |   20.15
(pgbench running with 50 parallel clients simulating SELECT load)

Backup Duration(Min):  154.75    |   49.28   |   45.27   |   20.35
(pgbench running with 100 parallel clients simulating SELECT load)

Thanks for sharing the results, these show nice speedup! However, I think we should try to find what exactly causes this speedup. If you see the recent discussion on another thread related to this topic, Andres pointed out that he doesn't think that we can gain much by having multiple connections[1]. It might be due to some internal limitations (like small buffers) [2] that we are seeing these speedups. It might help if you can share the perf reports of the server side and the pg_basebackup side.

Just to be clear, we need perf reports both with and without the patch-set.

These tests were done a while back. I think it would be good to run the benchmark again with the latest patches of parallel backup and share the results and perf reports.

Sounds good. I think we should also run the test with 1 worker.
The reason it would be good to see the results with 1 worker is that we can then tell whether the technique of sending file by file, as done in this patch, is better or worse than the current HEAD code. So it would be good to see the results for unpatched code, 1 worker, 2 workers, 4 workers, etc.
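
For reference, the speedups implied by the pgbench-loaded durations quoted above can be worked out with a small Python sketch like the following. The function name `speedups` and the dictionary layout are illustrative assumptions; the durations are copied from the quoted table, reading the "basebackup" column as the unpatched run.

```python
# Backup durations in minutes, keyed by number of pgbench clients.
DURATIONS_MIN = {
    50: {"basebackup": 69.25, "4 workers": 20.44,
         "8 workers": 19.86, "16 workers": 20.15},
    100: {"basebackup": 154.75, "4 workers": 49.28,
          "8 workers": 45.27, "16 workers": 20.35},
}


def speedups(durations, baseline="basebackup"):
    """Return baseline duration divided by each other duration."""
    base = durations[baseline]
    return {
        label: base / minutes
        for label, minutes in durations.items()
        if label != baseline
    }


if __name__ == "__main__":
    for clients, durations in DURATIONS_MIN.items():
        for label, factor in speedups(durations).items():
            print(f"{clients:>3} clients, {label:<10}: {factor:.2f}x")
```

These ratios only restate the quoted table; they say nothing about where the time went, which is what the perf reports are for.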

--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
<<attachment: perf-report-parallel_backup_v15.zip>>