On Wed, Dec 7, 2011 at 9:40 AM, Greg Smith <g...@2ndquadrant.com> wrote:
> He estimated 22ns per gettimeofday on the system with fast timing
> calls--presumably using TSC, and possibly faster than I saw because his
> system had less cores than mine to worry about. He got 990 ns on his
> slower system, and a worst case there of 3% overhead.
Robert's comment about sequential scans performing lots of reads in a tight loop made me think of a worse worst case: a count(*) with wide rows and/or lots of bloat. I created a test table with one tuple per page like this:

    CREATE TABLE io_test WITH (fillfactor=10) AS
        SELECT repeat('x', 1000) FROM generate_series(1,30000);

I then timed SELECT COUNT(*) FROM io_test; with track_iotiming on and off. Averages of 1000 executions, differences significant according to a t-test:

    timer | iotiming=off | iotiming=on |   diff
    hpet  |     86.10 ms |   147.80 ms | 71.67%
    tsc   |     85.86 ms |    87.66 ms |  2.10%

The attached test program (test_gettimeofday_monotonic) shows that one test loop iteration takes 34ns with tsc and 1270ns with hpet.

I also managed to run the test program on a couple of two-socket Solaris 10 machines. The one with a Xeon X5570 had an iteration time of 220ns, the one with a Xeon E5620 270ns. I'm not sure yet whether this is due to Solaris gettimeofday just being slower or some hardware issue.

I also tested a more reasonable case: count(*) on pgbench_accounts with scale factor 50 (61 tuples per page). With tsc, timing was actually 1% faster, though not statistically significantly so; with hpet the overhead was 5.6%.

Scaling the overhead for the Solaris machines, it seems that the worst case for timing all buffer reads would be about 20% slower there. Count(*) on medium-length tuples for a dataset that is out of shared buffers but in OS cache would have overhead between 1 and 2%.

>> One random thought: I wonder if there's a way for us to just time
>> every N'th event or something like that, to keep the overhead low.

This would work only for cases where there's a reasonably uniform distribution of times or really long sampling periods; otherwise the variability will make the result pretty much useless. For example, in the I/O case, a pretty typical load can have 1% of timings be 3 orders of magnitude longer than the median.

--
Ants Aasma
#include <stdio.h>
#include <sys/time.h>

typedef long long int64;
typedef unsigned long long uint64;
typedef long int32;
typedef unsigned long uint32;

#define TIMEDIFF(a, b) \
	(((a).tv_sec - (b).tv_sec)*1000000 + ((a).tv_usec - (b).tv_usec))

int main(int argc, char **argv, char **arge)
{
	struct timeval start, end, prev, cur;
	int64 i;
	int64 time_elapsed = 0;
	int32 duration;
	uint64 loop_count = 0;
	static int64 timings[32];

	if (argc != 2 || !sscanf(argv[1], "%ld", &duration) || duration < 0)
	{
		printf("Usage: test_gettimeofday_monotonic seconds\n");
		return 1;
	}

	uint64 end_time = 1000000ll*duration;

	gettimeofday(&start, 0);
	cur = start;
	while (time_elapsed < end_time)
	{
		prev = cur;
		gettimeofday(&cur, 0);
		long diff = TIMEDIFF(cur, prev);

		/* gettimeofday must never go backwards */
		if (diff < 0)
		{
			printf("Time warp: %ld < 0 => %ld.%ld < %ld.%ld\n",
				   diff,
				   cur.tv_sec, cur.tv_usec,
				   prev.tv_sec, prev.tv_usec);
			return -1;
		}

		/* bucket the iteration time into a power-of-two histogram */
		int bits = 0;
		while (diff)
		{
			diff >>= 1;
			bits++;
		}
		timings[bits]++;
		loop_count++;
		time_elapsed = TIMEDIFF(cur, start);
	}
	gettimeofday(&end, 0);

	printf("Per loop: %0.2f ns\n",
		   ((double) TIMEDIFF(end, start))*1000/loop_count);
	printf("%9s: %10s %9s\n", "usec", "count", "percent");
	int found = 0;
	for (i = 31; i >= 0; i--)
	{
		if (found || timings[i])
		{
			found = 1;
			printf("%9ld: %10lld %8.5f%%\n",
				   1l<<i>>1, timings[i],
				   ((double)timings[i])*100/loop_count);
		}
	}
	return 0;
}
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers