procarray.c)

Ranier Vilela Fri, 27 May 2022 17:37:00 -0700

Em sex., 27 de mai. de 2022 às 18:22, Andres Freund <and...@anarazel.de>
escreveu:


> Hi,
>
> On 2022-05-27 10:35:08 -0300, Ranier Vilela wrote:
> > Em qui., 26 de mai. de 2022 às 22:30, Tomas Vondra <
> > tomas.von...@enterprisedb.com> escreveu:
> >
> > > On 5/27/22 02:11, Ranier Vilela wrote:
> > > >
> > > > ...
> > > >
> > > > Here the results with -T 60:
> > >
> > > Might be a good idea to share your analysis / interpretation of the
> > > results, not just the raw data. After all, the change is being proposed
> > > by you, so do you think this shows the change is beneficial?
> > >
> > I think so, but the expectation has diminished.
> > I expected that the more connections, the better the performance.
> > And for both patch and head, this doesn't happen in tests.
> > Performance degrades with a greater number of connections.
>
> Your system has four CPUs. Once they're all busy, adding more connections
> won't improve performance. It'll just add more and more context switching,
> cache misses, and make the OS scheduler do more work.
>
conns       tps head
10      82365.634750
50      74593.714180
80      69219.756038
90      67419.574189
100     66613.771701
Yes it is quite disappointing that with 100 connections, tps loses to 10
connections.


>
>
>
> > GetSnapShowData() isn't a bottleneck?
>
> I'd be surprised if it showed up in a profile on your machine with that
> workload in any sort of meaningful way. The snapshot reuse logic will
> always
> work - because there are no writes - and thus the only work that needs to
> be
> done is to acquire the ProcArrayLock briefly.  And because there is only a
> small number of cores, contention on the cacheline for that isn't a
> problem.
>
Thanks for sharing this.


>
>
> > > These results look much saner, but IMHO it also does not show any clear
> > > benefit of the patch. Or are you still claiming there is a benefit?
> > >
> > We agree that they are micro-optimizations.  However, I think they
> should be
> > considered micro-optimizations in inner loops, because all in
> procarray.c is
> > a hotpath.
>
> As explained earlier, I don't agree that they optimize anything - you're
> making some of the scalability behaviour *worse*, if it's changed at all.
>
>
> > The first objective, I believe, was achieved, with no performance
> > regression.
> > I agree, the gains are small, by the tests done.
>
> There are no gains.
>
IMHO, I must disagree.


>
>
> > But, IMHO, this is a good way, small gains turn into big gains in the
> end,
> > when applied to all code.
> >
> > Consider GetSnapShotData()
> > 1. Most of the time the snapshot is not null, so:
> > if (snaphost == NULL), will fail most of the time.
> >
> > With the patch:
> > if (snapshot->xip != NULL)
> > {
> >     if (GetSnapshotDataReuse(snapshot))
> >      return snapshot;
> > }
> >
> > Most of the time the test is true and GetSnapshotDataReuse is not called
> > for new
> > snapshots.
> > count, subcount and  suboverflowed, will not be initialized, for all
> > snapshots.
>
> But that's irrelevant. There's only a few "new" snapshots in the life of a
> connection. You're optimizing something irrelevant.
>
IMHO, when GetSnapShotData() is the bottleneck, all is relevant.


>
>
> > 2. If snapshot is taken during recoverys
> > The pgprocnos and ProcGlobal->subxidStates are not touched unnecessarily.
>
> That code isn't reached when in recovery?
>
Currently it is reached *even* when not in recovery.
With the patch, *only* is reached when in recovery.


>
> > 3. Calling GetSnapshotDataReuse() without first acquiring ProcArrayLock.
> > There's an agreement that this would be fine, for now.
>
> There's no such agreement at all. It's not correct.
>
Ok, but there is a chance it will work correctly.


>
> > Consider ComputeXidHorizons()
> > 1. ProcGlobal->statusFlags is touched before the lock.
>
> Hard to believe that'd have a measurable effect.
>
IMHO, anything you take out of the lock is a benefit.


>
>
> > 2. allStatusFlags[index] is not touched for all numProcs.
>
> I'd be surprised if the compiler couldn't defer that load on its own.
>
Better be sure of that, no?

regards,
Ranier Vilela

Re: Improving connection scalability (src/backend/storage/ipc/procarray.c)

Reply via email to