On 12/06/2011 10:20 PM, Robert Haas wrote:
> EXPLAIN ANALYZE is extremely
> expensive mostly because it's timing entry and exit into every plan
> node, and the way our executor works, those are very frequent
> operations.

The plan for the query I was timing looks like this:

Aggregate  (cost=738.00..738.01 rows=1 width=0) (actual time=3.045..3.045 rows=1 loops=1)
  ->  Seq Scan on customers  (cost=0.00..688.00 rows=20000 width=0) (actual time=0.002..1.700 rows=20000 loops=1)

That's then 20000 * 2 = 40000 timing calls for the Seq Scan dominating the runtime. On the system with the fast TSC clocksource, the fastest execution without timing was 1.478ms, while the slowest with timing was 2.945ms. That's 1.467ms of total timing overhead, worst-case, or approximately 37ns per timing call. If you're executing something that only ever hits data in shared_buffers, you can measure that overhead; in any other case, probably not.

Picking apart the run on the system with slow timing calls, my laptop, the fastest execution without timing is 5.52ms, and the fastest with timing is 57.959ms. That makes for a minimum of 1311ns per timing call, best-case.
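To be explicit about where those numbers come from: (2.945 - 1.478)ms / 40000 calls gives the ~37ns figure, and (57.959 - 5.52)ms / 40000 calls gives the ~1311ns one. If anyone wants a quick estimate of their own hardware without going through EXPLAIN ANALYZE, a standalone loop hammering gettimeofday() gives a similar number. A minimal sketch of that (this isn't the instr_time path the executor actually uses, just an approximation of its cost):

/* timeit.c: rough estimate of per-call timing overhead.
 * Build with: cc -O2 -o timeit timeit.c
 */
#include <stdio.h>
#include <sys/time.h>

int
main(void)
{
    struct timeval start, end, t;
    long        i;
    long        calls = 1000000;
    double      elapsed_us;

    gettimeofday(&start, NULL);
    for (i = 0; i < calls; i++)
        gettimeofday(&t, NULL);
    gettimeofday(&end, NULL);

    elapsed_us = (end.tv_sec - start.tv_sec) * 1e6 +
                 (end.tv_usec - start.tv_usec);
    printf("%.1f ns per gettimeofday() call\n",
           elapsed_us * 1000.0 / calls);
    return 0;
}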

> I'm not sure about buffer I/Os - on a big sequential scan, you might
> do quite a lot of those in a pretty tight loop.

To put this into perspective relative to the number of EXPLAIN timing calls, there are 488 pages in the relation my test is executing against.

I think we need to be careful to keep timing calls from happening at every buffer allocation. I wouldn't expect that sprinkling one around every buffer miss would be a problem on a system with a fast clocksource. And that is what was shown by the testing Ants Aasma did before submitting the "add timing of buffer I/O requests" patch; his results make more sense to me now. He estimated 22ns per gettimeofday on the system with fast timing calls (presumably using TSC, and possibly faster than I saw because his system had fewer cores than mine to worry about). He got 990ns on his slower system, and a worst case there of 3% overhead.

Whether people on one of these slower timing call systems would be willing to pay 3% overhead is questionable. But I now believe Ants's claim that it's below the noise level on systems with a good TSC-driven timer. I got a 35:1 ratio between fast and slow clock sources; he got 45:1. Scaling his 3% worst case down by that >30:1 ratio gives ~3% / >30 = <0.1%, which is too small to measure. I'd just leave that on all the time on a good TSC-driven system. You couldn't afford to time buffer hits and tuple-level operations, but just about anything else would be fine.
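If you want to check which of those two cases your hardware is in before trusting these numbers, on Linux the kernel exposes the active clocksource in sysfs; a trivial check (assuming the usual single clocksource0 device) looks like this:

/* Print the kernel's active clocksource (Linux only).  "tsc" is the fast
 * case discussed above; "hpet" or "acpi_pm" are the slow ones.
 */
#include <stdio.h>

int
main(void)
{
    const char *path =
        "/sys/devices/system/clocksource/clocksource0/current_clocksource";
    char        buf[64];
    FILE       *f = fopen(path, "r");

    if (f == NULL)
    {
        perror(path);
        return 1;
    }
    if (fgets(buf, sizeof(buf), f) != NULL)
        printf("current clocksource: %s", buf);
    fclose(f);
    return 0;
}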

> One random thought: I wonder if there's a way for us to just time
> every N'th event or something like that, to keep the overhead low.

I'm predicting we'll see a lot of future demand for instrumentation features like this, where we want to make them available but would like to keep them from firing too often when the system is busy. Tossing a percentage of them might work; there's a sketch of that sampling idea below. Caching them in a queue somewhere for processing by a background process, and not collecting the data if that queue fills up, is another idea I've been thinking about recently. I'm also working on some ideas for making "is the server busy?" something you can usefully ask the background writer. There are a number of things that become practical for that process to do once it's decoupled from the checkpoint sync job, since its worst-case response time should tighten up.
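To make the every-N'th-event idea a bit more concrete, here's a rough sketch of what sampled timing could look like. All of the names and the fixed 1-in-16 interval are made up for illustration; nothing like this exists in the tree:

/* Hypothetical sketch of sampled instrumentation: only pay for the clock
 * reads on every Nth event, then scale the measured time back up by the
 * sampling interval to estimate the total.  Not an existing PostgreSQL API.
 */
#include <stdio.h>
#include <sys/time.h>

#define SAMPLE_INTERVAL 16      /* time 1 out of every 16 events */

typedef struct SampledTimer
{
    long        events;         /* total events seen */
    long        sampled;        /* events actually timed */
    double      sampled_usecs;  /* time accumulated for sampled events */
    struct timeval start;
} SampledTimer;

static void
sampled_timer_begin(SampledTimer *t)
{
    if (t->events % SAMPLE_INTERVAL == 0)
        gettimeofday(&t->start, NULL);
}

static void
sampled_timer_end(SampledTimer *t)
{
    if (t->events % SAMPLE_INTERVAL == 0)
    {
        struct timeval now;

        gettimeofday(&now, NULL);
        t->sampled_usecs += (now.tv_sec - t->start.tv_sec) * 1e6 +
                            (now.tv_usec - t->start.tv_usec);
        t->sampled++;
    }
    t->events++;
}

/* Estimated total time across all events, extrapolated from the sample */
static double
sampled_timer_estimate_usecs(SampledTimer *t)
{
    if (t->sampled == 0)
        return 0;
    return t->sampled_usecs * ((double) t->events / t->sampled);
}

int
main(void)
{
    SampledTimer t = {0};
    long        i;

    for (i = 0; i < 1000000; i++)
    {
        sampled_timer_begin(&t);
        /* ... the event being instrumented would go here ... */
        sampled_timer_end(&t);
    }
    printf("estimated total: %.1f usec over %ld events (%ld timed)\n",
           sampled_timer_estimate_usecs(&t), t.events, t.sampled);
    return 0;
}

The open question is how to pick the interval; a fixed one is simple but can alias with periodic workloads, which is one argument for tossing a random percentage instead.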

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


