Record currently wakes up based on watermarks to read events from the
mmaps and write them out to the file. The result is a file that can have
large blocks of events per mmap before a finished round event is added to
the stream. This in turn affects the quantity of events that have to be
passed through the ordered events queue before results can be displayed
to the user. For commands like perf-script this can lead to unnecessarily
long delays before a user gets output. Large systems (e.g., 1024 cpus)
further compound this effect. I have seen instances where I have to wait
45 minutes for perf-script to process a 5GB file before any events are
shown.
This patch adds an option to perf-record to allow a user to specify the
poll timeout in msec. For example, using 100 msec timeouts, similar to
perf-top, means the mmaps are traversed much more frequently, leading to
smoother processing on the analysis side.

Signed-off-by: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Adrian Hunter <[email protected]>
---
 tools/perf/Documentation/perf-record.txt | 6 ++++++
 tools/perf/builtin-record.c              | 5 ++++-
 tools/perf/perf.h                        | 1 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 355c4f5569b5..7010c363fdd1 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -250,6 +250,12 @@ is off by default.
 --running-time::
 Record running and enabled time for read events (:S)
 
+--poll=::
+Polling interval in msec. Defaults to infinite which means record relies on
+watermarks to wakeup and read events from each mmap. Setting poll helps smooth
+the event collection across mmaps and the subsequent processing of the data
+file. For example perf-top uses a 100 msec polling interval.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5a2ff510b75b..091868288d29 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -485,7 +485,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		if (hits == rec->samples) {
 			if (done || draining)
 				break;
-			err = perf_evlist__poll(rec->evlist, -1);
+			err = perf_evlist__poll(rec->evlist, opts->poll_timeout);
 			/*
 			 * Propagate error, only if there's any. Ignore positive
 			 * number of returned events and interrupt error.
@@ -734,6 +734,7 @@ static struct record record = {
 		.user_freq = UINT_MAX,
 		.user_interval = ULLONG_MAX,
 		.freq = 4000,
+		.poll_timeout = -1,
 		.target = {
 			.uses_mmap = true,
 			.default_per_cpu = true,
@@ -841,6 +842,8 @@ struct option __record_options[] = {
 		    "Sample machine registers on interrupt"),
 	OPT_BOOLEAN(0, "running-time", &record.opts.running_time,
 		    "Record running/enabled time of read (:S) events"),
+	OPT_INTEGER(0, "poll", &record.opts.poll_timeout,
+		    "poll interval in ms (defaults to infinite)"),
 	OPT_END()
 };
 
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 1caa70a4a9e1..ee847c8af668 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -62,6 +62,7 @@ struct record_opts {
 	u64 user_interval;
 	bool sample_transaction;
 	unsigned initial_delay;
+	int poll_timeout;
 };
 
 struct option;
-- 
2.2.1
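
For reference, the timeout semantics the new option relies on can be
sketched outside of perf. This is only an illustration, assuming
perf_evlist__poll() hands its timeout argument through to poll(2)
unchanged. The helper names wait_for_data() and demo_poll_timeout() are
hypothetical, and a pipe stands in for the perf mmap file descriptors.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper, not perf code: wait on one fd with poll(2).
 * timeout_ms < 0 blocks until the fd is readable (the old fixed -1
 * behaviour); timeout_ms >= 0 returns after at most that many msec.
 * Returns the number of ready fds, 0 on timeout, -1 on error.
 */
static int wait_for_data(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };

	assert(fd >= 0);
	return poll(&pfd, 1, timeout_ms);
}

/*
 * Exercise both cases on a pipe. Returns 0 when the observed
 * behaviour matches poll(2) semantics, -1 otherwise.
 */
static int demo_poll_timeout(void)
{
	int fds[2];
	int idle, ready;

	if (pipe(fds) < 0)
		return -1;

	/* Nothing written yet: the 100 msec timeout expires. */
	idle = wait_for_data(fds[0], 100);

	/* One byte pending: poll returns at once, even with -1. */
	if (write(fds[1], "x", 1) != 1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	ready = wait_for_data(fds[0], -1);

	close(fds[0]);
	close(fds[1]);

	return (idle == 0 && ready == 1) ? 0 : -1;
}
```

With a finite timeout the caller regains control even when no fd has
reached its wakeup condition, which is what lets the record loop go back
and traverse the mmaps more often.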

