On Thu, Sep 27, 2012 at 07:09:28AM +0200, Mike Galbraith wrote:
> > The way I understand it is, you either want to share L2 with a process,
> > because, for example, both working sets fit in the L2 and/or there's
> > some sharing which saves you moving everything over the L3. This is
> > where selecting a core on the same L2 is actually a good thing.
>
> Yeah, and if the wakee can't get to the L2 hot data instantly, it may be
> better to let wakee drag the data to an instantly accessible spot.
Yep, then moving it to another L2 is the same.

[ … ]

> > A crazy thought: one could go and sample tasks while running their
> > timeslices with the perf counters to know exactly what type of workload
> > we're looking at. I.e., do I have a large number of L2 evictions? Yes,
> > then spread them out. No, then select the other core on the L2. And so
> > on.
>
> Hm. That sampling better be really cheap. Might help...

Yeah, that's why I said sampling rather than running the perf counters
during every timeslice. But if you count the proper events, you should
be able to know exactly what the workload is doing (compute-bound,
IO-bound, contention, etc.)

> but how does that affect pgbench and ilk that must spread regardless
> of footprints.

Well, how do you measure latency of the 1 process in the 1:N case?
Maybe pipeline stalls of the 1, along with some way to recognize that it
is the 1 in the 1:N case. Hmm.

--
Regards/Gruss,
Boris.