On Mon, Apr 1, 2024 at 9:46 PM Tomas Vondra <tomas.von...@enterprisedb.com> wrote:
>
> Hi,
>
> I've been running some benchmarks and experimenting with various stuff,
> trying to improve the poor performance on ZFS, and the regression on XFS
> when using copy_file_range. And oh boy, did I find interesting stuff ...
[..]

Congratulations on great results!

> 4) after each pg_combinebackup run to pg_verifybackup, start the cluster
> to finish recovery, run pg_checksums --check (to check the patches don't
> produce something broken)

I've performed some follow-up small testing on all patches mentioned here
(1..7), with the earlier developed nano-incremental-backup-tests that
helped detect some issues for Robert earlier during original development.
They all went fine in both cases:
- no special options when using pg_combinebackup
- using pg_combinebackup --copy-file-range --manifest-checksums=NONE

Those were:
test_across_wallevelminimal.sh
test_full_pri__incr_stby__restore_on_pri.sh
test_full_pri__incr_stby__restore_on_stby.sh
test_full_stby__incr_stby__restore_on_pri.sh
test_full_stby__incr_stby__restore_on_stby.sh
test_incr_after_timelineincrease.sh
test_incr_on_standby_after_promote.sh
test_many_incrementals_dbcreate_duplicateOID.sh
test_many_incrementals_dbcreate_filecopy_NOINCR.sh
test_many_incrementals_dbcreate_filecopy.sh
test_many_incrementals_dbcreate.sh
test_many_incrementals.sh
test_multixact.sh
test_pending_2pc.sh
test_reindex_and_vacuum_full.sh
test_repro_assert_RP.sh
test_repro_assert.sh
test_standby_incr_just_backup.sh
test_stuck_walsum.sh
test_truncaterollback.sh
test_unlogged_table.sh

> Now to the findings ....
>
>
> 1) block alignment

[..]

> And I think we probably want to do this now, because this affects all
> tools dealing with incremental backups - even if someone writes a custom
> version of pg_combinebackup, it will have to deal with misaligned data.
> Perhaps there might be something like pg_basebackup that "transforms"
> the data received from the server (and also the backup manifest), but
> that does not seem like a great direction.

If anything is on the table, then I think in the far future
pg_refresh_standby_using_incremental_backup_from_primary would be the only
other tool using the format ?

> 2) prefetch
> -----------

[..]

> I think this means we may need a "--prefetch" option, that'd force
> prefetching, probably both before pread and copy_file_range. Otherwise
> people on ZFS are doomed and will have poor performance.

Right, we could optionally cover in the docs later on the various options
to get the performance (on XFS use $this, but without $that and so on).
It's kind of madness dealing with all those performance variations.

Another idea: remove the hardcoded 128 for posix_fadvise() in 0002 and
add a getopt variant like: --prefetch[=HOWMANY], with 128 as the default?

-J.
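
PS. To make the --prefetch[=HOWMANY] idea a bit more concrete, here is a
minimal standalone sketch using getopt_long() with optional_argument. It is
not taken from the 0002 patch: the function name parse_prefetch_option, the
DEFAULT_PREFETCH_BLOCKS macro, and the "option absent means 0, i.e. no
forced prefetch" behaviour are all assumptions made for illustration.

```c
#include <errno.h>
#include <getopt.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical name; 128 mirrors the value currently hardcoded in 0002. */
#define DEFAULT_PREFETCH_BLOCKS 128

/*
 * Parse an optional "--prefetch[=HOWMANY]" switch.
 *
 * On success returns 0 and sets *prefetch to:
 *   0        if --prefetch was not given (assumed: no forced prefetch)
 *   128      if --prefetch was given without a value
 *   HOWMANY  if --prefetch=HOWMANY was given
 * Returns -1 on an unknown option or a malformed/negative value.
 */
int
parse_prefetch_option(int argc, char **argv, int *prefetch)
{
	static const struct option long_options[] = {
		{"prefetch", optional_argument, NULL, 1},
		{NULL, 0, NULL, 0}
	};
	int			c;

	*prefetch = 0;
	optind = 1;				/* restart scanning if called more than once */

	while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1)
	{
		if (c != 1)
			return -1;		/* unknown option */

		if (optarg == NULL)
			*prefetch = DEFAULT_PREFETCH_BLOCKS;
		else
		{
			char	   *end;
			long		val;

			errno = 0;
			val = strtol(optarg, &end, 10);
			if (errno != 0 || end == optarg || *end != '\0' ||
				val < 0 || val > INT_MAX)
				return -1;	/* not a valid non-negative integer */
			*prefetch = (int) val;
		}
	}
	return 0;
}
```

One caveat of optional_argument: the value has to be attached with "=",
so "--prefetch=32" works, while in "--prefetch 32" the 32 is treated as a
separate non-option argument and the default of 128 would be used.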