Hi, attached is a v4 of the patch. There are not many changes, mostly some simple tidying up, except for handling the Windows.
After a bit more investigation, it seems to me that we can't really get the same behavior as in other systems - basically the timestamp is unavailable so we can't log the interval start. So it seems to me the best we can do is to disable this option on Windows (and this is done in the patch). So when you try to use "--aggregate-interval" on Win it will complain it's an unknown option. Now that I think of it, there might be a better solution that would not mimic the Linux/Unix behaviour exactly - we do have elapsed time since the start of the benchmark, so maybe we should use this instead of the timestamp? I mean on systems with reasonable gettimeofday we'd get 1345828501 5601 1542744 483552416 61 2573 1345828503 7884 1979812 565806736 60 1479 1345828505 7208 1979422 567277552 59 1391 1345828507 7685 1980268 569784714 60 1398 1345828509 7073 1979779 573489941 236 1411 but on Windows we'd get 0 5601 1542744 483552416 61 2573 2 7884 1979812 565806736 60 1479 4 7208 1979422 567277552 59 1391 6 7685 1980268 569784714 60 1398 8 7073 1979779 573489941 236 1411 i.e. the first column is "seconds since start of the test". Hmmmm, and maybe we could call 'gettimeofday' at the beginning, to get the timestamp of the test start (AFAIK the issue on Windows is the resolution, but it should be enough). Then we could add it up with the elapsed time and we'd get the same output as on Linux. And maybe this could be done in regular pgbench execution too? But I really need help from someone who knows Windows and can test this. Comments regarding Pavel's reviews are below: On 2.10.2012 20:08, Pavel Stehule wrote: > Hello > > I did review of pgbench patch > http://archives.postgresql.org/pgsql-hackers/2012-09/msg00737.php > > * this patch is cleanly patched > > * I had a problem with doc > > make[1]: Entering directory `/home/pavel/src/postgresql/doc/src/sgml' > openjade -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . > -c /usr/share/sgml/docbook/dsssl-stylesheets/catalog -d stylesheet.dsl > -t sgml -i output-html -V html-index postgres.sgml > openjade:pgbench.sgml:767:8:E: document type does not allow element > "TITLE" here; missing one of "ABSTRACT", "AUTHORBLURB", "MSGSET", > "CALLOUTLIST", "ITEMIZEDLIST", "ORDEREDLIST", "SEGMENTEDLIST", > "VARIABLELIST", "CAUTION", "IMPORTANT", "NOTE", "TIP", "WARNING", > "FORMALPARA", "BLOCKQUOTE", "EQUATION", "EXAMPLE", "FIGURE", "TABLE", > "PROCEDURE", "SIDEBAR", "QANDASET", "REFSECT3" start-tag > make[1]: *** [HTML.index] Error 1 > make[1]: *** Deleting file `HTML.index' > make[1]: Leaving directory `/home/pavel/src/postgresql/doc/src/sgml Hmmm, I've never got the docs to build at all, all I do get is a heap of some SGML/DocBook related issues. Can you investigate a bit more where's the issue? I don't remember messing with the docs in a way that might cause this ... mostly copy'n'paste edits. > * pgbench is compiled without warnings > > * there is a few issues in source code > > + > + /* should we aggregate the results or not? */ > + if (use_log_agg) > + { > + > + /* are we still in the same interval? if yes, > accumulate the > + * values (print them otherwise) */ > + if (agg->start_time + use_log_agg >= > INSTR_TIME_GET_DOUBLE(now)) > + { > + Errr, so what are the issues here? > > * a real time range for aggregation is dynamic (pgbench is not > realtime application), so I am not sure if following constraint has > sense > > + if ((duration > 0) && (use_log_agg > 0) && (duration % use_log_agg != > 0)) { > + fprintf(stderr, "duration (%d) must be a multiple of aggregation > interval (%d)\n", duration, use_log_agg); > + exit(1); > + } I'm not sure what might be the issue here either. If the test duration is set (using "-T" option), then this kicks in and says that the duration should be a multiple of aggregation interval. Seems like a sensible assumption to me. Or do you think this is check should be removed? > * it doesn't flush last aggregated data (it is mentioned by Tomas: > "Also, I've realized the last interval may not be logged at all - I'll > take a look into this in the next version of the patch."). And it can > be significant for higher values of "use_log_agg" Yes, and I'm still not sure how to fix this in practice. But I do have this on my TODO. > > * a name of variable "use_log_agg" is not good - maybe "log_agg_interval" ? I've changed it to agg_interval. regards Tomas
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index 090c210..2093edc 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -150,6 +150,7 @@ char *index_tablespace = NULL; #define naccounts 100000 bool use_log; /* log transaction latencies to a file */ +int agg_interval; /* log aggregates instead of individual transactions */ bool is_connect; /* establish connection for each transaction */ bool is_latencies; /* report per-command latencies */ int main_pid; /* main process id used in log filename */ @@ -245,6 +246,18 @@ typedef struct char *argv[MAX_ARGS]; /* command word list */ } Command; +typedef struct +{ + + long start_time; /* when does the interval start */ + int cnt; /* number of transactions */ + double min_duration; /* min/max durations */ + double max_duration; + double sum; /* sum(duration), sum(duration^2) - for estimates */ + double sum2; + +} AggVals; + static Command **sql_files[MAX_FILES]; /* SQL script files */ static int num_files; /* number of script files */ static int num_commands = 0; /* total number of Command structs */ @@ -377,6 +390,10 @@ usage(void) " -l write transaction times to log file\n" " --sampling-rate NUM\n" " fraction of transactions to log (e.g. 0.01 for 1%% sample)\n" +#ifndef WIN32 + " --aggregate-interval NUM\n" + " aggregate data over NUM seconds\n" +#endif " -M simple|extended|prepared\n" " protocol for submitting queries to server (default: simple)\n" " -n do not run VACUUM before tests\n" @@ -830,9 +847,22 @@ clientDone(CState *st, bool ok) return false; /* always false */ } +static +void agg_vals_init(AggVals * aggs, instr_time start) +{ + aggs->cnt = 0; + aggs->sum = 0; + aggs->sum2 = 0; + + aggs->min_duration = 3600 * 1000000.0; /* one hour */ + aggs->max_duration = 0; + + aggs->start_time = INSTR_TIME_GET_DOUBLE(start); +} + /* return false iff client should be disconnected */ static bool -doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile) +doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVals * agg) { PGresult *res; Command **commands; @@ -889,7 +919,7 @@ top: instr_time now; instr_time diff; double usec; - + /* * write the log entry if this row belongs to the random sample, * or no sampling rate was given which means log everything. @@ -902,17 +932,64 @@ top: diff = now; INSTR_TIME_SUBTRACT(diff, st->txn_begin); usec = (double) INSTR_TIME_GET_MICROSEC(diff); - + + /* should we aggregate the results or not? */ + if (agg_interval > 0) + { + + /* are we still in the same interval? if yes, accumulate the + * values (print them otherwise) */ + if (agg->start_time + agg_interval >= INSTR_TIME_GET_DOUBLE(now)) + { + + /* accumulate */ + agg->cnt += 1; + + agg->min_duration = (usec < agg->min_duration) ? usec : agg->min_duration; + agg->max_duration = (usec > agg->max_duration) ? usec : agg->max_duration; + + agg->sum += usec; + agg->sum2 += usec * usec; + + } + else + { + + /* This is a non-Windows branch, so we don't */ + /* This is more than we really ought to know about instr_time */ + fprintf(logfile, "%ld %d %.0f %.0f %.0f %.0f\n", + agg->start_time, agg->cnt, agg->sum, agg->sum2, + agg->min_duration, agg->max_duration); + + /* and now reset the values (include the current) */ + agg->cnt = 1; + agg->min_duration = usec; + agg->max_duration = usec; + + agg->sum = usec; + agg->sum2 = usec * usec; + + /* move to the next interval (there may be transactions longer than + * the desired interval */ + while (agg->start_time + agg_interval < INSTR_TIME_GET_DOUBLE(now)) + agg->start_time = agg->start_time + agg_interval; + } + + } + else + { + /* no, print raw transactions */ #ifndef WIN32 - /* This is more than we really ought to know about instr_time */ - fprintf(logfile, "%d %d %.0f %d %ld %ld\n", - st->id, st->cnt, usec, st->use_file, - (long) now.tv_sec, (long) now.tv_usec); + /* This is more than we really ought to know about instr_time */ + fprintf(logfile, "%d %d %.0f %d %ld %ld\n", + st->id, st->cnt, usec, st->use_file, + (long) now.tv_sec, (long) now.tv_usec); #else - /* On Windows, instr_time doesn't provide a timestamp anyway */ - fprintf(logfile, "%d %d %.0f %d 0 0\n", - st->id, st->cnt, usec, st->use_file); + /* On Windows, instr_time doesn't provide a timestamp anyway */ + fprintf(logfile, "%d %d %.0f %d 0 0\n", + st->id, st->cnt, usec, st->use_file); #endif + } } } @@ -1943,6 +2020,9 @@ main(int argc, char **argv) {"tablespace", required_argument, NULL, 2}, {"unlogged-tables", no_argument, &unlogged_tables, 1}, {"sampling-rate", required_argument, NULL, 4}, +#ifndef WIN32 + {"aggregate-interval", required_argument, NULL, 5}, +#endif {NULL, 0, NULL, 0} }; @@ -2156,6 +2236,16 @@ main(int argc, char **argv) exit(1); } break; +#ifndef WIN32 + case 5: + agg_interval = atoi(optarg); + if (agg_interval <= 0) + { + fprintf(stderr, "invalid number of seconds for aggregation: %d\n", agg_interval); + exit(1); + } + break; +#endif default: fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname); exit(1); @@ -2198,6 +2288,28 @@ main(int argc, char **argv) exit(1); } + /* --sampling-rate may must not be used with --aggregate-interval */ + if (sample_rate > 0.0 && agg_interval > 0) + { + fprintf(stderr, "log sampling (--sampling-rate) and aggregation (--aggregate-interval) can't be used at the same time\n"); + exit(1); + } + + if (agg_interval > 0 && (! use_log)) { + fprintf(stderr, "log aggregation is allowed only when actually logging transactions\n"); + exit(1); + } + + if ((duration > 0) && (agg_interval > duration)) { + fprintf(stderr, "number of seconds for aggregation (%d) must not be higher that test duration (%d)\n", agg_interval, duration); + exit(1); + } + + if ((duration > 0) && (agg_interval > 0) && (duration % agg_interval != 0)) { + fprintf(stderr, "duration (%d) must be a multiple of aggregation interval (%d)\n", duration, agg_interval); + exit(1); + } + /* * is_latencies only works with multiple threads in thread-based * implementations, not fork-based ones, because it supposes that the @@ -2457,7 +2569,10 @@ threadRun(void *arg) int remains = nstate; /* number of remaining clients */ int i; + AggVals aggs; + result = pg_malloc(sizeof(TResult)); + INSTR_TIME_SET_ZERO(result->conn_time); /* open log file if requested */ @@ -2492,6 +2607,8 @@ threadRun(void *arg) INSTR_TIME_SET_CURRENT(result->conn_time); INSTR_TIME_SUBTRACT(result->conn_time, thread->start_time); + agg_vals_init(&aggs, thread->start_time); + /* send start up queries in async manner */ for (i = 0; i < nstate; i++) { @@ -2500,7 +2617,7 @@ threadRun(void *arg) int prev_ecnt = st->ecnt; st->use_file = getrand(thread, 0, num_files - 1); - if (!doCustom(thread, st, &result->conn_time, logfile)) + if (!doCustom(thread, st, &result->conn_time, logfile, &aggs)) remains--; /* I've aborted */ if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND) @@ -2602,7 +2719,7 @@ threadRun(void *arg) if (st->con && (FD_ISSET(PQsocket(st->con), &input_mask) || commands[st->state]->type == META_COMMAND)) { - if (!doCustom(thread, st, &result->conn_time, logfile)) + if (!doCustom(thread, st, &result->conn_time, logfile, &aggs)) remains--; /* I've aborted */ } @@ -2629,7 +2746,6 @@ done: return result; } - /* * Support for duration option: set timer_exceeded after so many seconds. */ diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml index 91530ab..3600208 100644 --- a/doc/src/sgml/pgbench.sgml +++ b/doc/src/sgml/pgbench.sgml @@ -335,6 +335,21 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</> </varlistentry> <varlistentry> + <term><option>--aggregate-interval</option> <replaceable>seconds</></term> + <listitem> + <para> + Length of aggregation interval (in seconds). May be used only together + with <application>-l</application> - with this option, the log contains + per-interval summary (number of transactions, min/max latency and two + additional fields useful for variance estimation). + </para> + <para> + This option is not available on Windows. + </para> + </listitem> + </varlistentry> + + <varlistentry> <term><option>-M</option> <replaceable>querymode</></term> <listitem> <para> @@ -730,8 +745,9 @@ END; <title>Per-Transaction Logging</title> <para> - With the <option>-l</> option, <application>pgbench</> writes the time - taken by each transaction to a log file. The log file will be named + With the <option>-l</> option but without the <option>--aggregate-interval</option>, + <application>pgbench</> writes the time taken by each transaction + to a log file. The log file will be named <filename>pgbench_log.<replaceable>nnn</></filename>, where <replaceable>nnn</> is the PID of the pgbench process. If the <option>-j</> option is 2 or higher, creating multiple worker @@ -769,11 +785,50 @@ END; 0 202 2038 0 1175850569 2663 </screen></para> +<<<<<<< HEAD <para> When running a long test on hardware that can handle a lot of transactions, the log files can become very large. The <option>--sampling-rate</> option can be used to log only a random sample of transactions. </para> +======= + <title>Aggregated Logging</title> + + <para> + With the <option>--aggregate-interval</option> option, the logs use a bit different format: + +<synopsis> +<replaceable>interval_start</> <replaceable>num_of_transactions</> <replaceable>latency_sum</> <replaceable>latency_2_sum</> <replaceable>min_latency</> <replaceable>max_latency</> +</synopsis> + + where <replaceable>interval_start</> is the start of the interval (UNIX epoch + format timestamp), <replaceable>num_of_transactions</> is the number of transactions + within the interval, <replaceable>latency_sum</replaceable> is a sum of latencies + (so you can compute average latency easily). The following two fields are useful + for variance estimation - <replaceable>latency_sum</> is a sum of latencies and + <replaceable>latency_2_sum</> is a sum of 2nd powers of latencies. The last two + fields are <replaceable>min_latency</> - a minimum latency within the interval, and + <replaceable>max_latency</> - maximum latency within the interval. A transaction is + counted into the interval when it was committed. + </para> + + <para> + Here is example outputs: +<screen> +1345828501 5601 1542744 483552416 61 2573 +1345828503 7884 1979812 565806736 60 1479 +1345828505 7208 1979422 567277552 59 1391 +1345828507 7685 1980268 569784714 60 1398 +1345828509 7073 1979779 573489941 236 1411 +</screen></para> + + <para> + Notice that while the plain (unaggregated) log file contains index + of the custom script files, the aggregated log does not. Therefore if + you need per script data, you need to aggregate the data on your own. + </para> + +>>>>>>> pgbench log aggregation with custom interval </refsect2> <refsect2>
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers