Re: Improving connection scalability: GetSnapshotData()

2021-03-01 Thread Konstantin Knizhnik
On 27.02.2021 20:40, AJG wrote: Hi, Greatly appreciate if you could please reply to the following questions as time allows. I have seen previous discussion/patches on a built-in connection pooler. How does this scalability improvement, particularly idle connection improvements etc, affect th

Re: Improving connection scalability: GetSnapshotData()

2021-03-01 Thread luis . roberto
- Original message - > From: "AJG" > To: "Pg Hackers" > Sent: Saturday, 27 February 2021 14:40:58 > Subject: Re: Improving connection scalability: GetSnapshotData() > Hi, > Greatly appreciate if you could please reply to the following

Re: Improving connection scalability: GetSnapshotData()

2021-03-01 Thread AJG
Hi, Greatly appreciate if you could please reply to the following questions as time allows. I have seen previous discussion/patches on a built-in connection pooler. How does this scalability improvement, particularly idle connection improvements etc, affect that built-in pooler need, if any? Sa

Re: Improving connection scalability: GetSnapshotData()

2020-10-09 Thread Andrew Dunstan
On 10/5/20 10:33 PM, Andres Freund wrote: > Hi, > > On 2020-10-01 19:21:14 -0400, Andrew Dunstan wrote: >> On 10/1/20 4:22 PM, Andres Freund wrote: >>> # Note: on Windows, IPC::Run seems to convert \r\n to \n in program >>> output >>> # if we're using native Perl, but not if we're using

Re: Improving connection scalability: GetSnapshotData()

2020-10-05 Thread Andres Freund
Hi, On 2020-10-01 19:21:14 -0400, Andrew Dunstan wrote: > On 10/1/20 4:22 PM, Andres Freund wrote: > > # Note: on Windows, IPC::Run seems to convert \r\n to \n in program > > output > > # if we're using native Perl, but not if we're using MSys Perl. So do > > it > > # by hand in the

Re: Improving connection scalability: GetSnapshotData()

2020-10-01 Thread Ian Barwick
On 2020/10/02 3:26, Andres Freund wrote: Hi Ian, Andrew, All, On 2020-09-30 15:43:17 -0700, Andres Freund wrote: Attached is an updated version of the test (better utility function, stricter regexes, bailing out instead of failing just the current when psql times out). I'm leaving it in this t

Re: Improving connection scalability: GetSnapshotData()

2020-10-01 Thread Andrew Dunstan
On 10/1/20 4:22 PM, Andres Freund wrote: > Hi, > > On 2020-10-01 16:00:20 -0400, Andrew Dunstan wrote: >> My strong suspicion is that we're getting unwanted CRs. Note the >> presence of numerous instances of this in PostgresNode.pm: > >> $stdout =~ s/\r\n/\n/g if $Config{osname} eq 'msys'; >>

Re: Improving connection scalability: GetSnapshotData()

2020-10-01 Thread Andres Freund
Hi, On 2020-10-01 16:44:03 -0400, Andrew Dunstan wrote: > > I assume it's not, as the comment says > > # Note: on Windows, IPC::Run seems to convert \r\n to \n in program > > output > > # if we're using native Perl, but not if we're using MSys Perl. So do > > it > > # by hand in th

Re: Improving connection scalability: GetSnapshotData()

2020-10-01 Thread Andrew Dunstan
On 10/1/20 4:22 PM, Andres Freund wrote: > Hi, > > On 2020-10-01 16:00:20 -0400, Andrew Dunstan wrote: >> My strong suspicion is that we're getting unwanted CRs. Note the >> presence of numerous instances of this in PostgresNode.pm: > >> $stdout =~ s/\r\n/\n/g if $Config{osname} eq 'msys'; >>

Re: Improving connection scalability: GetSnapshotData()

2020-10-01 Thread Andres Freund
Hi, On 2020-10-01 16:00:20 -0400, Andrew Dunstan wrote: > My strong suspicion is that we're getting unwanted CRs. Note the > presence of numerous instances of this in PostgresNode.pm: > $stdout =~ s/\r\n/\n/g if $Config{osname} eq 'msys'; > > So you probably want something along those lines

Re: Improving connection scalability: GetSnapshotData()

2020-10-01 Thread Andrew Dunstan
On 10/1/20 2:26 PM, Andres Freund wrote: > Hi Ian, Andrew, All, > > On 2020-09-30 15:43:17 -0700, Andres Freund wrote: >> Attached is an updated version of the test (better utility function, >> stricter regexes, bailing out instead of failing just the current when >> psql times out). I'm leaving

Re: Improving connection scalability: GetSnapshotData()

2020-10-01 Thread Andres Freund
Hi Ian, Andrew, All, On 2020-09-30 15:43:17 -0700, Andres Freund wrote: > Attached is an updated version of the test (better utility function, > stricter regexes, bailing out instead of failing just the current when > psql times out). I'm leaving it in this test for now, but it's fairly > easy to

Re: Improving connection scalability: GetSnapshotData()

2020-10-01 Thread Craig Ringer
On Tue, 15 Sep 2020 at 07:17, Andres Freund wrote: > > Hi, > > On 2020-09-09 17:02:58 +0900, Ian Barwick wrote: > > Attached, though bear in mind I'm not very familiar with parts of this, > > particularly 2PC stuff, so consider it educated guesswork. > > Thanks for this, and the test case! > > You

Re: Improving connection scalability: GetSnapshotData()

2020-09-30 Thread Andres Freund
Hi, On 2020-09-14 16:17:18 -0700, Andres Freund wrote: > I've also included a quite heavily revised version of your test. Instead > of using dblink I went for having a long-running psql that I feed over > stdin. The main reason for not liking the previous version is that it > seems fragile, with t

Re: Improving connection scalability: GetSnapshotData()

2020-09-14 Thread Andres Freund
Hi, On 2020-09-15 11:56:24 +0900, Michael Paquier wrote: > On Mon, Sep 14, 2020 at 05:42:51PM -0700, Andres Freund wrote: > > My test uses IPC::Run - although I'm indirectly 'use'ing, which I guess > > isn't pretty. Just as 013_crash_restart.pl already did (even before > > psql/t/010_tab_completio

Re: Improving connection scalability: GetSnapshotData()

2020-09-14 Thread Michael Paquier
On Mon, Sep 14, 2020 at 05:42:51PM -0700, Andres Freund wrote: > My test uses IPC::Run - although I'm indirectly 'use'ing, which I guess > isn't pretty. Just as 013_crash_restart.pl already did (even before > psql/t/010_tab_completion.pl). I am mostly wondering whether we could > avoid copying the

Re: Improving connection scalability: GetSnapshotData()

2020-09-14 Thread Andres Freund
Hi, On 2020-09-14 20:14:48 -0400, Tom Lane wrote: > Andres Freund writes: > > I think the approach of having a long running psql session is really > > useful, and probably would speed up some tests. Does anybody have a good > > idea for how to best, and without undue effort, to integrate this int

Re: Improving connection scalability: GetSnapshotData()

2020-09-14 Thread Tom Lane
Andres Freund writes: > I think the approach of having a long running psql session is really > useful, and probably would speed up some tests. Does anybody have a good > idea for how to best, and without undue effort, to integrate this into > PostgresNode.pm? I don't really have a great idea, so

Re: Improving connection scalability: GetSnapshotData()

2020-09-14 Thread Andres Freund
Hi, On 2020-09-09 17:02:58 +0900, Ian Barwick wrote: > Attached, though bear in mind I'm not very familiar with parts of this, > particularly 2PC stuff, so consider it educated guesswork. Thanks for this, and the test case! Your commit fixes the issues, but not quite correctly. Multixacts should

Re: Improving connection scalability: GetSnapshotData()

2020-09-09 Thread Ian Barwick
On 2020/09/08 13:23, Ian Barwick wrote: On 2020/09/08 13:11, Andres Freund wrote: Hi, On 2020-09-08 13:03:01 +0900, Ian Barwick wrote: (...) I wonder if it's possible to increment "xactCompletionCount" during replay along these lines: *** a/src/backend/access/transam/xact.c --- b/s

Re: Improving connection scalability: GetSnapshotData()

2020-09-08 Thread Ian Barwick
On 2020/09/09 2:53, Andres Freund wrote: Hi, On 2020-09-08 16:44:17 +1200, Thomas Munro wrote: On Tue, Sep 8, 2020 at 4:11 PM Andres Freund wrote: At first I was very confused as to why none of the existing tests have found this significant issue. But after thinking about it for a minute that

Re: Improving connection scalability: GetSnapshotData()

2020-09-08 Thread Andres Freund
Hi, On 2020-06-07 11:24:50 +0300, Michail Nikolaev wrote: > Hello, hackers. > Andres, nice work! > > Sorry for the off-topic. > > Some of my work [1] related to the support of index hint bits on > standby is highly interfering with this patch. > Is it safe to consider it committed and start rebasi

Re: Improving connection scalability: GetSnapshotData()

2020-09-08 Thread Andres Freund
Hi, On 2020-09-08 16:44:17 +1200, Thomas Munro wrote: > On Tue, Sep 8, 2020 at 4:11 PM Andres Freund wrote: > > At first I was very confused as to why none of the existing tests have > > found this significant issue. But after thinking about it for a minute > > that's because they all use psql, a

Re: Improving connection scalability: GetSnapshotData()

2020-09-08 Thread Konstantin Knizhnik
On 07.09.2020 23:45, Andres Freund wrote: Hi, On Mon, Sep 7, 2020, at 07:20, Konstantin Knizhnik wrote: And which pgbench database scale factor did you use? 200 Another thing you could try is to run 2-4 pgbench instances in different databases. I tried to reinitialize the database with scal

Re: Improving connection scalability: GetSnapshotData()

2020-09-07 Thread Thomas Munro
On Tue, Sep 8, 2020 at 4:11 PM Andres Freund wrote: > At first I was very confused as to why none of the existing tests have > found this significant issue. But after thinking about it for a minute > that's because they all use psql, and largely separate psql invocations > for each query :(. Which

Re: Improving connection scalability: GetSnapshotData()

2020-09-07 Thread Ian Barwick
On 2020/09/08 13:11, Andres Freund wrote: Hi, On 2020-09-08 13:03:01 +0900, Ian Barwick wrote: (...) I wonder if it's possible to increment "xactCompletionCount" during replay along these lines: *** a/src/backend/access/transam/xact.c --- b/src/backend/access/transam/xact.c ***

Re: Improving connection scalability: GetSnapshotData()

2020-09-07 Thread Andres Freund
Hi, On 2020-09-08 13:03:01 +0900, Ian Barwick wrote: > On 2020/09/03 17:18, Michael Paquier wrote: > > On Sun, Aug 16, 2020 at 02:26:57PM -0700, Andres Freund wrote: > > > So we get some buildfarm results while thinking about this. > > > > Andres, there is an entry in the CF for this thread: > > h

Re: Improving connection scalability: GetSnapshotData()

2020-09-07 Thread Ian Barwick
On 2020/09/03 17:18, Michael Paquier wrote: On Sun, Aug 16, 2020 at 02:26:57PM -0700, Andres Freund wrote: So we get some buildfarm results while thinking about this. Andres, there is an entry in the CF for this thread: https://commitfest.postgresql.org/29/2500/ A lot of work has been committe

Re: Improving connection scalability: GetSnapshotData()

2020-09-07 Thread Andres Freund
Hi, On Mon, Sep 7, 2020, at 07:20, Konstantin Knizhnik wrote: > >> And which pgbench database scale factor did you use? > > 200 > > > > Another thing you could try is to run 2-4 pgbench instances in different > > databases. > I tried to reinitialize the database with scale 200 but there was no > si

Re: Improving connection scalability: GetSnapshotData()

2020-09-07 Thread Konstantin Knizhnik
On 06.09.2020 21:52, Andres Freund wrote: Hi, On 2020-09-05 16:58:31 +0300, Konstantin Knizhnik wrote: On 04.09.2020 21:53, Andres Freund wrote: I also used huge_pages=on / configured them on the OS level. Otherwise TLB misses will be a significant factor. As far as I understand there shou

Re: Improving connection scalability: GetSnapshotData()

2020-09-06 Thread Andres Freund
Hi, On 2020-09-06 14:05:35 +0300, Konstantin Knizhnik wrote: > On 04.09.2020 21:53, Andres Freund wrote: > > > > > Maybe it is because of the more complex architecture of my server? > > Think we'll need profiles to know... > > This is "perf top" of pgbench -c 100 -j 100 -M prepared -S > > 12.16%

Re: Improving connection scalability: GetSnapshotData()

2020-09-06 Thread Andres Freund
Hi, On 2020-09-05 16:58:31 +0300, Konstantin Knizhnik wrote: > On 04.09.2020 21:53, Andres Freund wrote: > > > > I also used huge_pages=on / configured them on the OS level. Otherwise > > TLB misses will be a significant factor. > > As far as I understand there should not be any TLB misses be

Re: Improving connection scalability: GetSnapshotData()

2020-09-06 Thread Konstantin Knizhnik
On 04.09.2020 21:53, Andres Freund wrote: Maybe it is because of the more complex architecture of my server? Think we'll need profiles to know... This is "perf top" of pgbench -c 100 -j 100 -M prepared -S 12.16% postgres [.] PinBuffer 11.92% postgres

Re: Improving connection scalability: GetSnapshotData()

2020-09-05 Thread Konstantin Knizhnik
On 04.09.2020 21:53, Andres Freund wrote: I also used huge_pages=on / configured them on the OS level. Otherwise TLB misses will be a significant factor. As far as I understand there should not be any TLB misses because the size of the shared buffers (8Mb) is several orders of magnitude smal

Re: Improving connection scalability: GetSnapshotData()

2020-09-04 Thread Michael Paquier
On Fri, Sep 04, 2020 at 10:35:19AM -0700, Andres Freund wrote: > I think it's best to close the entry. There's substantial further wins > possible, in particular not acquiring ProcArrayLock in GetSnapshotData() > when the cache is valid improves performance substantially. But it's > non-trivial eno

Re: Improving connection scalability: GetSnapshotData()

2020-09-04 Thread Andres Freund
On 2020-09-04 11:53:04 -0700, Andres Freund wrote: > There's a separate benchmark that I found to be quite revealing that's > far less dependent on scheduler behaviour. Run two pgbench instances: > > 1) With a very simple script '\sleep 1s' or such, and many connections > (e.g. 100, 1000, 5000).

Re: Improving connection scalability: GetSnapshotData()

2020-09-04 Thread Andres Freund
Hi, On 2020-09-04 18:24:12 +0300, Konstantin Knizhnik wrote: > Reported results look very impressive. > But I tried to reproduce them and didn't observe similar behavior. > So I am wondering what can be the difference and what I am doing wrong. That is odd - I did reproduce it on quite a few sy

Re: Improving connection scalability: GetSnapshotData()

2020-09-04 Thread Andres Freund
Hi, On 2020-09-03 17:18:29 +0900, Michael Paquier wrote: > On Sun, Aug 16, 2020 at 02:26:57PM -0700, Andres Freund wrote: > > So we get some buildfarm results while thinking about this. > > Andres, there is an entry in the CF for this thread: > https://commitfest.postgresql.org/29/2500/ > > A lot

Re: Improving connection scalability: GetSnapshotData()

2020-09-04 Thread Konstantin Knizhnik
On 03.09.2020 11:18, Michael Paquier wrote: On Sun, Aug 16, 2020 at 02:26:57PM -0700, Andres Freund wrote: So we get some buildfarm results while thinking about this. Andres, there is an entry in the CF for this thread: https://commitfest.postgresql.org/29/2500/ A lot of work has been committ

Re: Improving connection scalability: GetSnapshotData()

2020-09-03 Thread Michael Paquier
On Sun, Aug 16, 2020 at 02:26:57PM -0700, Andres Freund wrote: > So we get some buildfarm results while thinking about this. Andres, there is an entry in the CF for this thread: https://commitfest.postgresql.org/29/2500/ A lot of work has been committed with 623a9ba, 73487a6, 5788e25, etc. Now tha

Re: Improving connection scalability: GetSnapshotData()

2020-08-17 Thread Alvaro Herrera
On 2020-Aug-16, Peter Geoghegan wrote: > On Sun, Aug 16, 2020 at 2:11 PM Andres Freund wrote: > > For the first, one issue is that there's no obviously good candidate for > > an uninitialized xid. We could use something like FrozenTransactionId, > > which may never be in the procarray. But it's n

Re: Improving connection scalability: GetSnapshotData()

2020-08-16 Thread Andres Freund
Hi, On 2020-08-16 17:28:46 -0400, Tom Lane wrote: > Andres Freund writes: > > For the first, one issue is that there's no obviously good candidate for > > an uninitialized xid. We could use something like FrozenTransactionId, > > which may never be in the procarray. But it's not exactly pretty. >

Re: Improving connection scalability: GetSnapshotData()

2020-08-16 Thread Tom Lane
Andres Freund writes: > For the first, one issue is that there's no obviously good candidate for > an uninitialized xid. We could use something like FrozenTransactionId, > which may never be in the procarray. But it's not exactly pretty. Huh? What's wrong with using InvalidTransactionId?

Re: Improving connection scalability: GetSnapshotData()

2020-08-16 Thread Peter Geoghegan
On Sun, Aug 16, 2020 at 2:11 PM Andres Freund wrote: > For the first, one issue is that there's no obviously good candidate for > an uninitialized xid. We could use something like FrozenTransactionId, > which may never be in the procarray. But it's not exactly pretty. Maybe it would make sense to

Re: Improving connection scalability: GetSnapshotData()

2020-08-16 Thread Andres Freund
Hi, On 2020-08-16 14:11:46 -0700, Andres Freund wrote: > On 2020-08-16 13:52:58 -0700, Andres Freund wrote: > > On 2020-08-16 13:31:53 -0700, Andres Freund wrote: > > Gna, I think I see the problem. In at least one place I wrongly > > accessed the 'dense' array of in-progress xids using the 'pgpr

Re: Improving connection scalability: GetSnapshotData()

2020-08-16 Thread Andres Freund
Hi, On 2020-08-16 13:52:58 -0700, Andres Freund wrote: > On 2020-08-16 13:31:53 -0700, Andres Freund wrote: > > I now luckily have a rr trace of the problem, so I hope I can narrow it > > down to the original problem fairly quickly. > > Gna, I think I see the problem. In at least one place I wro

Re: Improving connection scalability: GetSnapshotData()

2020-08-16 Thread Andres Freund
Hi, On 2020-08-16 13:31:53 -0700, Andres Freund wrote: > I now luckily have a rr trace of the problem, so I hope I can narrow it > down to the original problem fairly quickly. Gna, I think I see the problem. In at least one place I wrongly accessed the 'dense' array of in-progress xids using the

Re: Improving connection scalability: GetSnapshotData()

2020-08-16 Thread Andres Freund
Hi, On 2020-08-16 16:17:23 -0400, Tom Lane wrote: > I wrote: > > It seems entirely likely that there's a timing component in this, for > > instance autovacuum coming along at just the right time. > > D'oh. The attached seems to make it 100% reproducible. Great! It interestingly didn't work as

Re: Improving connection scalability: GetSnapshotData()

2020-08-16 Thread Tom Lane
I wrote: > It seems entirely likely that there's a timing component in this, for > instance autovacuum coming along at just the right time. D'oh. The attached seems to make it 100% reproducible. regards, tom lane diff --git a/src/test/isolation/specs/freeze-the-dead.spec

Re: Improving connection scalability: GetSnapshotData()

2020-08-16 Thread Andres Freund
On 2020-08-16 14:30:24 -0400, Tom Lane wrote: > Andres Freund writes: > > 690 successful runs later, it didn't trigger for me :(. Seems pretty > > clear that there's another variable than pure chance, otherwise it seems > > like that number of runs should have hit the issue, given the number of >

Re: Improving connection scalability: GetSnapshotData()

2020-08-16 Thread Tom Lane
Andres Freund writes: > 690 successful runs later, it didn't trigger for me :(. Seems pretty > clear that there's another variable than pure chance, otherwise it seems > like that number of runs should have hit the issue, given the number of > bf hits vs bf runs. It seems entirely likely that the

Re: Improving connection scalability: GetSnapshotData()

2020-08-16 Thread Andres Freund
Hi, On 2020-08-15 09:42:00 -0700, Andres Freund wrote: > On 2020-08-15 11:10:51 -0400, Tom Lane wrote: > > We have two essentially identical buildfarm failures since these patches > > went in: > > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=damselfly&dt=2020-08-15%2011%3A27%3A32 >

Re: Improving connection scalability: GetSnapshotData()

2020-08-15 Thread Andres Freund
Hi, On 2020-08-15 11:10:51 -0400, Tom Lane wrote: > We have two essentially identical buildfarm failures since these patches > went in: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=damselfly&dt=2020-08-15%2011%3A27%3A32 > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=perip

Re: Improving connection scalability: GetSnapshotData()

2020-08-15 Thread Tom Lane
We have two essentially identical buildfarm failures since these patches went in: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=damselfly&dt=2020-08-15%2011%3A27%3A32 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=peripatus&dt=2020-08-15%2003%3A09%3A14 They're both in the same

Re: Improving connection scalability: GetSnapshotData()

2020-08-11 Thread Thomas Munro
On Wed, Aug 12, 2020 at 12:19 PM Andres Freund wrote: > On 2020-07-29 19:20:04 +1200, Thomas Munro wrote: > > On Wed, Jul 29, 2020 at 6:15 PM Thomas Munro wrote: > > > +static inline FullTransactionId > > > +FullXidViaRelative(FullTransactionId rel, TransactionId xid) > > > > > > I'm struggling t

Re: Improving connection scalability: GetSnapshotData()

2020-08-11 Thread Andres Freund
Hi, On 2020-07-29 19:20:04 +1200, Thomas Munro wrote: > On Wed, Jul 29, 2020 at 6:15 PM Thomas Munro wrote: > > +static inline FullTransactionId > > +FullXidViaRelative(FullTransactionId rel, TransactionId xid) > > > > I'm struggling to find a better word for this than "relative". > > The best I

Re: Improving connection scalability: GetSnapshotData()

2020-07-31 Thread Daniel Gustafsson
> On 24 Jul 2020, at 03:11, Andres Freund wrote: > I've done that in the attached. As this is actively being reviewed but time is running short, I'm moving this to the next CF. cheers ./daniel

Re: Improving connection scalability: GetSnapshotData()

2020-07-29 Thread Thomas Munro
On Wed, Jul 29, 2020 at 6:15 PM Thomas Munro wrote: > +static inline FullTransactionId > +FullXidViaRelative(FullTransactionId rel, TransactionId xid) > > I'm struggling to find a better word for this than "relative". The best I've got is "anchor" xid. It is an xid that is known to limit nextFul

Re: Improving connection scalability: GetSnapshotData()

2020-07-28 Thread Thomas Munro
On Fri, Jul 24, 2020 at 1:11 PM Andres Freund wrote: > On 2020-07-15 21:33:06 -0400, Alvaro Herrera wrote: > > On 2020-Jul-15, Andres Freund wrote: > > > It could make sense to split the conversion of > > > VariableCacheData->latestCompletedXid to FullTransactionId out from 0001 > > > into its own

Re: Improving connection scalability: GetSnapshotData()

2020-07-25 Thread Ranier Vilela
On Fri, 24 Jul 2020 at 21:00, Andres Freund wrote: > On 2020-07-24 18:15:15 -0300, Ranier Vilela wrote: > > On Fri, 24 Jul 2020 at 14:16, Andres Freund > > wrote: > > > > > On 2020-07-24 14:05:04 -0300, Ranier Vilela wrote: > > > > Latest Postgres > > > > Windows 64 bits >

Re: Improving connection scalability: GetSnapshotData()

2020-07-24 Thread Andres Freund
On 2020-07-24 18:15:15 -0300, Ranier Vilela wrote: > On Fri, 24 Jul 2020 at 14:16, Andres Freund > wrote: > > > On 2020-07-24 14:05:04 -0300, Ranier Vilela wrote: > > > Latest Postgres > > > Windows 64 bits > > > msvc 2019 64 bits > > > > > > Patches applied v12-0001 to v12-0007: > > >

Re: Improving connection scalability: GetSnapshotData()

2020-07-24 Thread Ranier Vilela
On Fri, 24 Jul 2020 at 14:16, Andres Freund wrote: > On 2020-07-24 14:05:04 -0300, Ranier Vilela wrote: > > Latest Postgres > > Windows 64 bits > > msvc 2019 64 bits > > > > Patches applied v12-0001 to v12-0007: > > > > C:\dll\postgres\contrib\pgstattuple\pgstatapprox.c(74,28): warnin

Re: Improving connection scalability: GetSnapshotData()

2020-07-24 Thread Andres Freund
On 2020-07-24 14:05:04 -0300, Ranier Vilela wrote: > Latest Postgres > Windows 64 bits > msvc 2019 64 bits > > Patches applied v12-0001 to v12-0007: > > C:\dll\postgres\contrib\pgstattuple\pgstatapprox.c(74,28): warning C4013: > 'GetOldestXmin' undefined; assuming extern returning int > [C:\d

Re: Improving connection scalability: GetSnapshotData()

2020-07-24 Thread Ranier Vilela
Latest Postgres Windows 64 bits msvc 2019 64 bits Patches applied v12-0001 to v12-0007: C:\dll\postgres\contrib\pgstattuple\pgstatapprox.c(74,28): warning C4013: 'GetOldestXmin' undefined; assuming extern returning int [C:\dll\postgres C:\dll\postgres\contrib\pg_visibility\pg_visibility.c(569

Re: Improving connection scalability: GetSnapshotData()

2020-07-15 Thread Alvaro Herrera
On 2020-Jul-15, Andres Freund wrote: > It could make sense to split the conversion of > VariableCacheData->latestCompletedXid to FullTransactionId out from 0001 > into its own commit. Not sure... +1, the commit is large enough and that change can be had in advance. Note you forward-declare struct

Re: Improving connection scalability: GetSnapshotData()

2020-07-01 Thread Daniel Gustafsson
This patch no longer applies to HEAD, please submit a rebased version. I've marked it Waiting on Author in the meantime. cheers ./daniel

Re: Improving connection scalability: GetSnapshotData()

2020-06-07 Thread Michail Nikolaev
Hello, hackers. Andres, nice work! Sorry for the off-topic. Some of my work [1] related to the support of index hint bits on standby is highly interfering with this patch. Is it safe to consider it committed and start rebasing on top of the patches? Thanks, Michail. [1]: https://www.postgresql.o

Re: Improving connection scalability: GetSnapshotData()

2020-04-08 Thread Michael Paquier
On Wed, Apr 08, 2020 at 03:17:41PM -0700, Andres Freund wrote: > On 2020-04-08 09:26:42 -0400, Jonathan S. Katz wrote: >> Lastly, with the ongoing world events, perhaps time that could have been >> dedicated to this and other patches likely affected their completion. I >> know most things in my lif

Re: Improving connection scalability: GetSnapshotData()

2020-04-08 Thread Bruce Momjian
On Wed, Apr 8, 2020 at 03:25:34PM -0700, Andres Freund wrote: > Hi, > > On 2020-04-08 18:06:23 -0400, Bruce Momjian wrote: > > If we don't commit this, where does this leave us with the > > old_snapshot_threshold feature? We remove it in back branches and have > > no working version in PG 13? T

Re: Improving connection scalability: GetSnapshotData()

2020-04-08 Thread Andres Freund
Hi, On 2020-04-08 18:06:23 -0400, Bruce Momjian wrote: > If we don't commit this, where does this leave us with the > old_snapshot_threshold feature? We remove it in back branches and have > no working version in PG 13? That seems kind of bad. I don't think this patch changes the situation for

Re: Improving connection scalability: GetSnapshotData()

2020-04-08 Thread Andres Freund
Hi, On 2020-04-08 09:44:16 -0400, Robert Haas wrote: > Moreover, shakedown time will be minimal because we're so late in the > release cycle My impression increasingly is that there's very little actual shakedown before beta :(. As e.g. evidenced by the fact that 2PC did basically not work for se

Re: Improving connection scalability: GetSnapshotData()

2020-04-08 Thread Andres Freund
Hi, On 2020-04-08 09:26:42 -0400, Jonathan S. Katz wrote: > On 4/8/20 8:59 AM, Alexander Korotkov wrote: > > On Wed, Apr 8, 2020 at 3:43 PM Andres Freund wrote: > >> Realistically it's still 2-3 hours of proof-reading. > >> > >> This makes me sad :( > > > > Can we ask RMT to extend feature freeze f

Re: Improving connection scalability: GetSnapshotData()

2020-04-08 Thread Bruce Momjian
On Wed, Apr 8, 2020 at 09:44:16AM -0400, Robert Haas wrote: > I don't know what the right thing to do is. I agree with everyone who > says this is a very important problem, and I have the highest respect > for Andres's technical ability. On the other hand, I have been around > here long enough to

Re: Improving connection scalability: GetSnapshotData()

2020-04-08 Thread Andres Freund
Hi, On 2020-04-08 09:24:13 -0400, Robert Haas wrote: > On Tue, Apr 7, 2020 at 4:27 PM Andres Freund wrote: > > The main reason is that we want to be able to cheaply check the current > > state of the variables (mostly when checking a backend's own state). We > > can't access the "dense" ones with

Re: Improving connection scalability: GetSnapshotData()

2020-04-08 Thread Robert Haas
On Wed, Apr 8, 2020 at 9:27 AM Jonathan S. Katz wrote: > One of the features of RMT responsibilities[1] is to be "hands off" as > much as possible, so perhaps a reverse ask: how would people feel about > this patch going into PG13, knowing that the commit would come after the > feature freeze date

Re: Improving connection scalability: GetSnapshotData()

2020-04-08 Thread Jonathan S. Katz
On 4/8/20 8:59 AM, Alexander Korotkov wrote: > On Wed, Apr 8, 2020 at 3:43 PM Andres Freund wrote: >> Realistically it's still 2-3 hours of proof-reading. >> >> This makes me sad :( > > Can we ask RMT to extend feature freeze for this particular patchset? > I think it's reasonable assuming extreme

Re: Improving connection scalability: GetSnapshotData()

2020-04-08 Thread Robert Haas
On Tue, Apr 7, 2020 at 4:27 PM Andres Freund wrote: > The main reason is that we want to be able to cheaply check the current > state of the variables (mostly when checking a backend's own state). We > can't access the "dense" ones without holding a lock, but we e.g. don't > want to make ProcArray

Re: Improving connection scalability: GetSnapshotData()

2020-04-08 Thread Alexander Korotkov
On Wed, Apr 8, 2020 at 3:43 PM Andres Freund wrote: > Realistically it's still 2-3 hours of proof-reading. > > This makes me sad :( Can we ask the RMT to extend the feature freeze for this particular patchset? I think it's reasonable given the extreme importance of this patchset. -- Alexander Korotkov

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Andres Freund
Hi, On 2020-04-07 16:13:07 -0400, Robert Haas wrote: > On Tue, Apr 7, 2020 at 3:24 PM Andres Freund wrote: > > > + ProcGlobal->xids[pgxactoff] = InvalidTransactionId; > > > > > > Apparently this array is not dense in the sense that it excludes > > > unused slots, but comments elsewhere don't seem

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Andres Freund
Hi, On 2020-04-07 15:26:36 -0400, Robert Haas wrote: > 0008 - > > Here again, I greatly dislike putting Copy in the name. It makes > little sense to pretend that either is the original and the other is > the copy. You just have the same data in two places. If one of them is > more authoritative,

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Robert Haas
On Tue, Apr 7, 2020 at 3:24 PM Andres Freund wrote: > > 0007 - > > > > + TransactionId xidCopy; /* this backend's xid, a copy of this proc's > > +ProcGlobal->xids[] entry. */ > > > > Can we please NOT put Copy into the name like that? Pretty please? > > Do you have a suggested naming scheme? S

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Robert Haas
On Tue, Apr 7, 2020 at 3:31 PM Andres Freund wrote: > Well, it *is* only a vague test :). It shouldn't ever have a false > positive, but there's plenty chance for false negatives (if wrapped > around far enough). Sure, but I think you get my point. Asserting that something "might be" true isn't m

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Andres Freund
Hi, On 2020-04-07 15:03:46 -0400, Robert Haas wrote: > On Tue, Apr 7, 2020 at 1:51 PM Andres Freund wrote: > > > ComputedHorizons seems like a fairly generic name. I think there's > > > some relationship between InvisibleToEveryoneState and > > > ComputedHorizons that should be brought out more c

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Andres Freund
Hi, On 2020-04-07 14:51:52 -0400, Robert Haas wrote: > On Tue, Apr 7, 2020 at 2:28 PM Andres Freund wrote: > > Does that make some sense? Do you have a better suggestion for a name? > > I think it makes sense. I have two basic problems with the name. The > first is that "on disk" doesn't seem to

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Robert Haas
0008 - Here again, I greatly dislike putting Copy in the name. It makes little sense to pretend that either is the original and the other is the copy. You just have the same data in two places. If one of them is more authoritative, the place to explain that is in the comments, not by elongating th

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Andres Freund
Hi, On 2020-04-07 14:28:09 -0400, Robert Haas wrote: > More review, since it sounds like you like it: > > 0006 - Boring. But I'd probably make this move both xmin and xid back, > with related comment changes; see also next comment. > > 0007 - > > + TransactionId xidCopy; /* this backend's xid, a c

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Robert Haas
On Tue, Apr 7, 2020 at 1:51 PM Andres Freund wrote: > > ComputedHorizons seems like a fairly generic name. I think there's > > some relationship between InvisibleToEveryoneState and > > ComputedHorizons that should be brought out more clearly by the naming > > and the comments. > > I don't like th

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Robert Haas
On Tue, Apr 7, 2020 at 2:28 PM Andres Freund wrote: > Does that make some sense? Do you have a better suggestion for a name? I think it makes sense. I have two basic problems with the name. The first is that "on disk" doesn't seem to be a very clear way of describing what you're actually checking

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Peter Geoghegan
On Tue, Apr 7, 2020 at 11:28 AM Andres Freund wrote: > There is a lot of code that is pretty unsafe around wraparounds... They > are getting easier and easier to hit on a regular schedule in production > (plenty of databases that hit wraparounds multiple times a week). And I > don't think we as PG

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Andres Freund
Hi, On 2020-04-07 10:51:12 -0700, Andres Freund wrote: > > +void AssertTransactionIdMayBeOnDisk(TransactionId xid) > > > > Formatting. > > > > + * Assert that xid is one that we could actually see on disk. > > > > I don't know what this means. The whole purpose of this routine is > > very uncle

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Robert Haas
More review, since it sounds like you like it: 0006 - Boring. But I'd probably make this move both xmin and xid back, with related comment changes; see also next comment. 0007 - + TransactionId xidCopy; /* this backend's xid, a copy of this proc's +ProcGlobal->xids[] entry. */ Can we please

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Andres Freund
Hi, Thanks for the review! On 2020-04-07 12:41:07 -0400, Robert Haas wrote: > In 0002, the comments in SnapshotSet() are virtually incomprehensible. > There's no commit message so the reasons for the changes are unclear. > But mostly looks unproblematic. I was planning to drop that patch pre-co

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Robert Haas
Comments: In 0002, the comments in SnapshotSet() are virtually incomprehensible. There's no commit message so the reasons for the changes are unclear. But mostly looks unproblematic. 0003 looks like a fairly unrelated bug fix that deserves to be discussed on the thread related to the original pat

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Jonathan S. Katz
On 4/7/20 8:15 AM, Andres Freund wrote: > I think this is pretty close to being committable. > > > But: This patch came in very late for v13, and it took me much longer to > polish it up than I had hoped (partially distraction due to various bugs > I found (in particular snapshot_too_old), parti

Re: Improving connection scalability: GetSnapshotData()

2020-04-07 Thread Andres Freund
On 2020-04-07 05:15:03 -0700, Andres Freund wrote: > Attached is a substantially polished version of my patches. Note that > the first three patches, as well as the last, are not intended to be > committed at this time / in this form - they're there to make testing > easier. I didn't actually atta

Re: Improving connection scalability: GetSnapshotData()

2020-04-06 Thread Andres Freund
Hi, On 2020-04-06 06:39:59 -0700, Andres Freund wrote: > > 3) Plain pgbench read-write (you already did it for sure) > > -s 100 -M prepared -T 700 > > autovacuum=off, fsync on: > clients tps master tps pgxact > 1 474 479 > 16 4356 4476 > 40

Re: Improving connection scalability: GetSnapshotData()

2020-04-06 Thread Andres Freund
Hi, On 2020-04-06 06:39:59 -0700, Andres Freund wrote: > These benchmarks are on my workstation. The larger VM I used in the last > round wasn't currently available. One way to reproduce the problem at smaller connection counts / smaller machines is to take more snapshots. Doesn't fully reproduce

Re: Improving connection scalability: GetSnapshotData()

2020-04-06 Thread Andres Freund
Hi, These benchmarks are on my workstation. The larger VM I used in the last round wasn't currently available. HW: 2 x Intel(R) Xeon(R) Gold 5215 (each 10 cores / 20 threads) 192GB Ram. data directory is on a Samsung SSD 970 PRO 1TB A bunch of terminals, emacs, mutt are open while the benchmark

Re: Improving connection scalability: GetSnapshotData()

2020-04-05 Thread David Rowley
On Sun, 1 Mar 2020 at 21:47, Andres Freund wrote: > On 2020-03-01 00:36:01 -0800, Andres Freund wrote: > > conns tps master tps pgxact-split > > > > 1 26842.492845 26524.194821 > > 10 246923.158682 249224.782661 > > 50 695956.539704 70983

Re: Improving connection scalability: GetSnapshotData()

2020-03-31 Thread Andres Freund
Hi, On 2020-03-31 13:04:38 -0700, Andres Freund wrote: > I'm still fighting with snapshot_too_old. The feature is just badly > undertested, underdocumented, and there's lots of other oddities. I've > now spent about as much time on that feature as on the whole rest of > the patchset. To expand
