Hi,
On 2020-05-04 14:04:32 -0400, Robert Haas wrote:
> OK, thanks. Let me see if I can summarize here. On the strength of
> previous experience, you'll probably tell me that some parts of this
> summary are wildly wrong or at least "not quite correct" but I'm going
> to try my best.
> - Server-si
On Sun, May 3, 2020 at 1:49 PM Andres Freund wrote:
> > > The run-to-run variations between the runs without cache control are
> > > pretty large. So these are probably not the end-all-be-all numbers. But I
> > > think the trends are pretty clear.
> >
> > Could you be explicit about what you think t
Hi,
On 2020-05-03 09:12:59 -0400, Robert Haas wrote:
> On Sat, May 2, 2020 at 10:36 PM Andres Freund wrote:
> > I changed Robert's test program to optionally fallocate,
> > sync_file_range(WRITE), posix_fadvise(DONTNEED), to avoid a large
> > footprint in the page cache. The performance
> > differences are quite substantial:
On Sat, May 2, 2020 at 10:36 PM Andres Freund wrote:
> I changed Robert's test program to optionally fallocate,
> sync_file_range(WRITE), posix_fadvise(DONTNEED), to avoid a large
> footprint in the page cache. The performance
> differences are quite substantial:
>
> gcc -Wall -ggdb ~/tmp/write_and
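The compile line above is truncated and the test program itself isn't
reproduced in this archive. A minimal sketch of the technique being
described, assuming Linux (sync_file_range() and fallocate() are
Linux-specific) and with illustrative chunk and file sizes, not Andres's
actual program:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK (128 * 1024)           /* bytes per write() */
#define TOTAL (1024L * 1024 * 1024)  /* 1 GiB */

int
main(void)
{
    char   *buf = malloc(CHUNK);
    int     fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    long    off;

    if (buf == NULL || fd < 0)
    {
        perror("setup");
        return 1;
    }
    memset(buf, 'x', CHUNK);

    /* reserve space up front so no block allocation happens mid-loop */
    if (fallocate(fd, 0, 0, TOTAL) != 0)
        perror("fallocate");

    for (off = 0; off < TOTAL; off += CHUNK)
    {
        if (write(fd, buf, CHUNK) != CHUNK)
        {
            perror("write");
            return 1;
        }
        /* start writeback of the chunk just written... */
        sync_file_range(fd, off, CHUNK, SYNC_FILE_RANGE_WRITE);
        /* ...and drop the previous, hopefully clean, chunk from cache */
        if (off >= CHUNK)
            posix_fadvise(fd, off - CHUNK, CHUNK, POSIX_FADV_DONTNEED);
    }

    fsync(fd);
    close(fd);
    return 0;
}

posix_fadvise(DONTNEED) can only evict pages that have already been
written back, which is why writeback is kicked off with
sync_file_range(WRITE) first; the page-cache footprint then stays around
one chunk instead of growing to the whole file.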
Hi,
On 2020-05-01 16:32:15 -0400, Robert Haas wrote:
> On Thu, Apr 30, 2020 at 6:06 PM Robert Haas wrote:
> > On Thu, Apr 30, 2020 at 3:52 PM Andres Freund wrote:
> > > Why 8kb? That's smaller than what we currently do in pg_basebackup,
> > > afaict, and you're actually going to be bottlenecked
On Thu, Apr 30, 2020 at 6:06 PM Robert Haas wrote:
> On Thu, Apr 30, 2020 at 3:52 PM Andres Freund wrote:
> > Why 8kb? That's smaller than what we currently do in pg_basebackup,
> > afaict, and you're actually going to be bottlenecked by syscall
> > overhead at that point (unless you disable / don't have the whole intel
> > security mitigation stuff).
On Thu, Apr 30, 2020 at 3:52 PM Andres Freund wrote:
> Why 8kb? That's smaller than what we currently do in pg_basebackup,
> afaict, and you're actually going to be bottlenecked by syscall
> overhead at that point (unless you disable / don't have the whole intel
> security mitigation stuff).
I j
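A back-of-envelope on the syscall-overhead point, using an assumed
rather than measured per-call cost: if each write() takes on the order
of 1 µs with the security mitigations enabled, then 8 kB per call bounds
throughput at roughly 8 kB / 1 µs = 8 GB/s before any actual copying or
checksumming happens, whereas 128 kB per call lifts that ceiling
sixteen-fold, making the syscall cost effectively negligible.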
Hi,
On 2020-04-30 14:50:34 -0400, Robert Haas wrote:
> On Mon, Apr 20, 2020 at 4:19 PM Andres Freund wrote:
> > One question I have not really seen answered well:
> >
> > Why do we want parallelism here? Or to be more precise: What do we hope
> > to accelerate by making what part of creating a base backup parallel?
On Mon, Apr 20, 2020 at 4:19 PM Andres Freund wrote:
> One question I have not really seen answered well:
>
> Why do we want parallelism here? Or to be more precise: What do we hope
> to accelerate by making what part of creating a base backup
> parallel? There's several potential bottlenecks, and I think it's
> important to know the design priorities
On Wed, Apr 22, 2020 at 3:03 PM Andres Freund wrote:
> The 7zip format, perhaps. Does have format-level support to address what
> we were discussing earlier: "Support for solid compression, where
> multiple files of like type are compressed within a single stream, in
> order to exploit the combine
Hi,
On 2020-04-22 14:40:17 -0400, Robert Haas wrote:
> > Oh? I find it *extremely* exciting here. This is pretty close to the
> > worst case compressibility-wise, and zstd takes only ~22% of the time as
> > gzip does, while still delivering better compression. A nearly 5x
> > improvement in compr
On Wed, Apr 22, 2020 at 2:06 PM Andres Freund wrote:
> I also can see a case for using N backends and one connection, but I
> think that'll be too complicated / too much bound by locking around the
> socket etc.
Agreed.
> Oh? I find it *extremely* exciting here. This is pretty close to the
> worst case compressibility-wise, and zstd takes only ~22% of the time as
> gzip does, while still delivering better compression.
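For reference, a minimal sketch of one-shot compression with libzstd
(the level here is an arbitrary illustration, not necessarily the
setting behind the quoted numbers; build with -lzstd):

#include <stdio.h>
#include <stdlib.h>
#include <zstd.h>

/* One-shot compression of a buffer; returns malloc'd output or NULL. */
static void *
compress_buffer(const void *src, size_t src_size, size_t *dst_size)
{
    size_t  bound = ZSTD_compressBound(src_size);
    void   *dst = malloc(bound);

    if (dst == NULL)
        return NULL;
    /* level 3 is zstd's default; chosen arbitrarily here */
    *dst_size = ZSTD_compress(dst, bound, src, src_size, 3);
    if (ZSTD_isError(*dst_size))
    {
        fprintf(stderr, "zstd: %s\n", ZSTD_getErrorName(*dst_size));
        free(dst);
        return NULL;
    }
    return dst;
}

int
main(void)
{
    const char  src[] = "hello hello hello hello hello";
    size_t      dst_size;
    void       *dst = compress_buffer(src, sizeof(src), &dst_size);

    if (dst == NULL)
        return 1;
    printf("%zu -> %zu bytes\n", sizeof(src), dst_size);
    free(dst);
    return 0;
}

libzstd can also parallelize a single compression stream internally
(ZSTD_c_nbWorkers via the advanced API, when built with threading
support), which bears on the one-connection-multiple-workers question
discussed below.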
Hi,
On 2020-04-22 12:12:32 -0400, Robert Haas wrote:
> On Wed, Apr 22, 2020 at 11:24 AM Andres Freund wrote:
> > *My* gut feeling is that you're going to have a harder time using CPU
> > time efficiently when doing parallel compression via multiple processes
> > and independent connections. You're e.g. going to have a lot more
> > context switches, I think.
On Wed, Apr 22, 2020 at 12:20 PM Peter Eisentraut wrote:
> On 2020-04-20 22:36, Robert Haas wrote:
> > My suspicion is that it has mostly to do with adequately utilizing the
> > hardware resources on the server side. If you are network-constrained,
> > adding more connections won't help, unless there's something shaping
> > the traffic
On 2020-04-20 22:36, Robert Haas wrote:
My suspicion is that it has mostly to do with adequately utilizing the
hardware resources on the server side. If you are network-constrained,
adding more connections won't help, unless there's something shaping
the traffic which can be gamed by having multi
On Wed, Apr 22, 2020 at 11:24 AM Andres Freund wrote:
> *My* gut feeling is that you're going to have a harder time using CPU
> time efficiently when doing parallel compression via multiple processes
> and independent connections. You're e.g. going to have a lot more
> context switches, I think. A
Hi,
On 2020-04-22 09:52:53 -0400, Robert Haas wrote:
> On Tue, Apr 21, 2020 at 6:57 PM Andres Freund wrote:
> > I agree that trying to make backups very fast is a good goal (or well, I
> > think "not very slow" would be a good descriptor for the current
> > situation). I am just trying to make sure we tackle the right problems
> > for that.
On Tue, Apr 21, 2020 at 6:57 PM Andres Freund wrote:
> I agree that trying to make backups very fast is a good goal (or well, I
> think "not very slow" would be a good descriptor for the current
> situation). I am just trying to make sure we tackle the right problems
> for that. My gut feeling is th
Hi,
On 2020-04-21 17:09:50 -0400, Robert Haas wrote:
> On Tue, Apr 21, 2020 at 4:14 PM Andres Freund wrote:
> > It was local TCP. The speeds I can reach are faster than the 10GiB/s
> > (unidirectional) I can do between the laptop & workstation, so testing
> > it over "actual" network isn't inform
On Tue, Apr 21, 2020 at 4:14 PM Andres Freund wrote:
> It was local TCP. The speeds I can reach are faster than the 10GiB/s
> (unidirectional) I can do between the laptop & workstation, so testing
> it over "actual" network isn't informative - I basically can reach line
> speed between them with a
Hi,
On 2020-04-21 14:01:28 -0400, Robert Haas wrote:
> On Tue, Apr 21, 2020 at 11:36 AM Andres Freund wrote:
> > It's all CRC overhead. I don't see a difference with
> > --manifest-checksums=none anymore. We really should look for a better
> > "fast" checksum.
>
> Hmm, OK. I'm wondering exactly what you tested here.
On Tue, Apr 21, 2020 at 11:36 AM Andres Freund wrote:
> It's all CRC overhead. I don't see a difference with
> --manifest-checksums=none anymore. We really should look for a better
> "fast" checksum.
Hmm, OK. I'm wondering exactly what you tested here. Was this over
> your 20GiB/s connection between
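The checksum under discussion is the manifest's CRC-32C. A minimal
sketch of a hardware CRC-32C loop, assuming an x86-64 target with SSE4.2
(compile with -msse4.2; illustrative, not PostgreSQL's pg_crc32c code):

#include <nmmintrin.h>      /* SSE4.2 crc32 intrinsics */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* CRC-32C (Castagnoli) via the hardware crc32 instruction, 8 bytes per
 * step. Note the loop-carried dependency on "crc": each instruction
 * waits on the previous one, which caps single-stream throughput. */
static uint32_t
crc32c_hw(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t    crc = 0xFFFFFFFF;

    while (len >= 8)
    {
        uint64_t    chunk;

        memcpy(&chunk, p, 8);
        crc = _mm_crc32_u64(crc, chunk);
        p += 8;
        len -= 8;
    }
    while (len-- > 0)
        crc = _mm_crc32_u8((uint32_t) crc, *p++);
    return (uint32_t) crc ^ 0xFFFFFFFF;
}

int
main(void)
{
    /* standard CRC-32C check value for "123456789" is e3069283 */
    printf("%08x\n", crc32c_hw("123456789", 9));
    return 0;
}

Even at one crc32 instruction per 8 bytes, that serial dependency on the
running crc value limits throughput unless multiple streams are
interleaved, which is one reason hash-style checksums such as xxHash can
be substantially faster.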
Hi,
On 2020-04-21 07:18:20 -0400, Robert Haas wrote:
> On Tue, Apr 21, 2020 at 2:44 AM Andres Freund wrote:
> > FWIW, I just tested pg_basebackup locally.
> >
> > Without compression and a stock postgres I get:
> > unix        tcp         tcp+ssl:
> > 1.74GiB/s   1.02GiB/s   699MiB/s
On Tue, Apr 21, 2020 at 2:44 AM Andres Freund wrote:
> FWIW, I just tested pg_basebackup locally.
>
> Without compression and a stock postgres I get:
> unix        tcp         tcp+ssl:
> 1.74GiB/s   1.02GiB/s   699MiB/s
>
> That turns out to be bottlenecked by the
Hi,
On 2020-04-20 22:31:49 -0700, Andres Freund wrote:
> On 2020-04-21 10:20:01 +0530, Amit Kapila wrote:
> > It is quite likely that compression can benefit more from parallelism
> > as compared to the network I/O as that is mostly a CPU-intensive
> > operation, but I am not sure if we can just ignore the benefit of
> > utilizing the network bandwidth.
Hi,
On 2020-04-21 10:20:01 +0530, Amit Kapila wrote:
> It is quite likely that compression can benefit more from parallelism
> as compared to the network I/O as that is mostly a CPU-intensive
> operation, but I am not sure if we can just ignore the benefit of
> utilizing the network bandwidth. In
On Tue, Apr 21, 2020 at 2:40 AM Andres Freund wrote:
>
> On 2020-04-20 16:36:16 -0400, Robert Haas wrote:
>
> > If a backup client - either current or hypothetical - is compressing
> > and encrypting, then it doesn't have either a filesystem I/O or a
> > network I/O in progress while it's doing so
Hi,
On 2020-04-20 16:36:16 -0400, Robert Haas wrote:
> My suspicion is that it has mostly to do with adequately utilizing the
> hardware resources on the server side. If you are network-constrained,
> adding more connections won't help, unless there's something shaping
> the traffic which can be g
On Mon, Apr 20, 2020 at 4:19 PM Andres Freund wrote:
> Why do we want parallelism here? Or to be more precise: What do we hope
> to accelerate by making what part of creating a base backup
> parallel? There's several potential bottlenecks, and I think it's
> important to know the design priorities
Thanks for your thoughts.
On Mon, Apr 20, 2020 at 4:02 PM Peter Eisentraut wrote:
> That would clearly be a good goal. Non-parallel backup should ideally
> be parallel backup with one worker.
Right.
> But it doesn't follow that the proposed design is wrong. It might just
> be that the design
Hi,
On 2020-04-15 11:57:29 -0400, Robert Haas wrote:
> Over at
> http://postgr.es/m/CADM=JehKgobEknb+_nab9179HzGj=9eitzwmod2mpqr_rif...@mail.gmail.com
> there's a proposal for a parallel backup patch which works in the way
> that I have always thought parallel backup would work: instead of
> having a monolithic command that returns a series of tarballs,
On 2020-04-15 17:57, Robert Haas wrote:
Over at
http://postgr.es/m/CADM=JehKgobEknb+_nab9179HzGj=9eitzwmod2mpqr_rif...@mail.gmail.com
there's a proposal for a parallel backup patch which works in the way
that I have always thought parallel backup would work: instead of
having a monolithic command that returns a series of tarballs,
On Mon, Apr 20, 2020 at 8:50 AM Amit Kapila wrote:
> It is not apparent how you are envisioning this division on the
> server side. I think in the currently proposed patch, each worker on
> the client side requests the specific files. So, how are workers going
> to request such numbered files and
On Wed, Apr 15, 2020 at 9:27 PM Robert Haas wrote:
>
> Over at
> http://postgr.es/m/CADM=JehKgobEknb+_nab9179HzGj=9eitzwmod2mpqr_rif...@mail.gmail.com
> there's a proposal for a parallel backup patch which works in the way
> that I have always thought parallel backup would work: instead of
> having a monolithic command that returns a series of tarballs,
Hi,
Over at
http://postgr.es/m/CADM=JehKgobEknb+_nab9179HzGj=9eitzwmod2mpqr_rif...@mail.gmail.com
there's a proposal for a parallel backup patch which works in the way
that I have always thought parallel backup would work: instead of
having a monolithic command that returns a series of tarballs,
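A heavily simplified, hypothetical sketch of the client side of that
per-file design: N worker threads, each with its own libpq connection,
pulling file names from a shared queue. The SEND_FILE command and the
connection string are invented for illustration only (the actual patch's
replication-protocol grammar differs), and a real client would drain
each file's contents in COPY OUT mode with PQgetCopyData():

#include <libpq-fe.h>
#include <pthread.h>
#include <stdio.h>

#define NWORKERS 4

/* Hypothetical work list; a real client would obtain this from an
 * initial "list the files" exchange with the server. */
static const char *files[] = {"base/1/1259", "base/1/2619", "base/1/1249", NULL};
static int  next_file = 0;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

static void *
worker(void *conninfo)
{
    PGconn *conn = PQconnectdb(conninfo);

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return NULL;
    }

    for (;;)
    {
        const char *file = NULL;
        char        command[1024];
        PGresult   *res;

        /* pull the next file name off the shared queue */
        pthread_mutex_lock(&queue_lock);
        if (files[next_file] != NULL)
            file = files[next_file++];
        pthread_mutex_unlock(&queue_lock);
        if (file == NULL)
            break;

        /* SEND_FILE is a made-up command name; the file's contents
         * would arrive in COPY OUT mode and be drained with
         * PQgetCopyData() (elided here). */
        snprintf(command, sizeof(command), "SEND_FILE '%s'", file);
        res = PQexec(conn, command);
        PQclear(res);
    }

    PQfinish(conn);
    return NULL;
}

int
main(void)
{
    pthread_t   threads[NWORKERS];
    static char conninfo[] = "replication=true";    /* illustrative */
    int         i;

    for (i = 0; i < NWORKERS; i++)
        pthread_create(&threads[i], NULL, worker, conninfo);
    for (i = 0; i < NWORKERS; i++)
        pthread_join(threads[i], NULL);
    return 0;
}

Build with something like: cc client.c -lpq -lpthread. Whether this
per-file division should live on the client or on the server is exactly
the design question the thread is debating.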