Hi,

This thread started at
https://www.postgresql.org/message-id/20220213021746.GM31460%40telsasoft.com
but is mostly independent, so I split the thread off.

On 2022-02-12 20:17:46 -0600, Justin Pryzby wrote:
> On Sat, Feb 12, 2022 at 06:00:44PM -0800, Andres Freund wrote:
> > I bet using COW file copies would speed up our own regression tests
> > noticeably - on slower systems we spend a fair bit of time and space
> > creating template0 and postgres, with the bulk of the data never changing.
> >
> > Template databases are also fairly commonly used by application developers
> > to avoid the cost of rerunning all the setup DDL & initial data loading for
> > different tests. Making that measurably cheaper would be a significant win.
>
> +1
>
> I ran into this last week and was still thinking about proposing it.
>
> Would this help CI

It could theoretically help Linux - but currently I think the filesystem for
CI is ext4, which doesn't support FICLONE. I assume it'd help macOS, but I
don't know the performance characteristics of copyfile(). I don't think any of
the other OSs have working reflink / file clone support.

You could prototype it for CI on macOS by using the "template initdb" patch
and passing -c to cp.

On Linux it might be worth using copy_file_range(), if supported, where file
cloning isn't available. But that's kind of an even more separate topic...

> or any significant fraction of buildfarm ?

Not sure how many are on new enough Linux / macOS to benefit and use a
suitable filesystem. There are a few animals with slow-ish storage but running
fairly new Linux. I don't think we can see which filesystem they use. Those
would likely benefit the most.

> Or just tests run locally on supporting filesystems.

Probably depends on your storage subsystem. If it's not that fast, and you're
running tests concurrently, it'd likely help.

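To make the fallback order mentioned above concrete (clone if the filesystem
allows it, otherwise copy_file_range(), otherwise a plain copy), here is a
rough, Linux-only sketch. It is Python for brevity and is not PostgreSQL code:
copy_file() and the strategy names it returns are made up for illustration,
and the FICLONE request number is hard-coded from <linux/fs.h> as an
assumption about the target platform.

```python
import fcntl
import os
import shutil

# FICLONE ioctl request number from <linux/fs.h>: _IOW(0x94, 9, int).
# Hard-coded here as an assumption; Python does not export it.
FICLONE = 0x40049409


def copy_file(src: str, dst: str) -> str:
    """Copy src to dst and return the name of the strategy that worked."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # 1. COW clone: shares extents, so no data is actually copied.
        #    Fails with OSError on filesystems without reflink support
        #    (for example ext4) or when src and dst are on different mounts.
        try:
            fcntl.ioctl(fout.fileno(), FICLONE, fin.fileno())
            return "clone"
        except OSError:
            pass

        # 2. In-kernel copy: avoids bouncing the data through userspace.
        if hasattr(os, "copy_file_range"):
            try:
                remaining = os.fstat(fin.fileno()).st_size
                while remaining > 0:
                    copied = os.copy_file_range(
                        fin.fileno(), fout.fileno(), remaining
                    )
                    if copied == 0:
                        break  # unexpected EOF, fall through to step 3
                    remaining -= copied
                if remaining == 0:
                    return "copy_file_range"
            except OSError:
                pass

        # 3. Plain read/write copy. Rewind and truncate first in case an
        #    earlier attempt copied part of the file before failing.
        fin.seek(0)
        fout.seek(0)
        fout.truncate()
        shutil.copyfileobj(fin, fout)
        return "read/write"
```

Calling copy_file("base/1/1259", "base/16384/1259") (paths purely
illustrative) would return "clone" on something like btrfs or reflink-enabled
XFS, and one of the other two names elsewhere. Which strategy wins on a given
machine depends on kernel and filesystem, so the return value is mainly useful
for checking what a test run actually exercised.
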
On my workstation, with lots of cores and very fast storage, using the initdb
caching patch modified to do cp --reflink=never / always yields the following
time for concurrent check-world (-j40 PROVE_FLAGS=-j4):

cp --reflink=never:
96.64user 61.74system 1:04.69elapsed 244%CPU (0avgtext+0avgdata 97544maxresident)k
0inputs+34124296outputs (2584major+7247038minor)pagefaults 0swaps
pcheck-world-success

cp --reflink=always:
91.79user 56.16system 1:04.21elapsed 230%CPU (0avgtext+0avgdata 97716maxresident)k
189328inputs+16361720outputs (2674major+7229696minor)pagefaults 0swaps
pcheck-world-success

Seems roughly stable across three runs.

Just comparing the time for cp -r of a fresh initdb'd cluster:

cp -a --reflink=never
real    0m0.043s
user    0m0.000s
sys     0m0.043s

cp -a --reflink=always
real    0m0.021s
user    0m0.004s
sys     0m0.018s

so that's a pretty nice win.

> Note that pg_upgrade already supports copy/link/clone. (Obviously, link
> wouldn't do anything desirable for CREATE DATABASE).

Yea. We'd likely have to move relevant code into src/port.

Greetings,

Andres Freund