Re: [PATCH v2 3/5] perf report: Sort by sampled cycles percent per block for stdio

Jin, Yao Wed, 16 Oct 2019 03:51:50 -0700



On 10/16/2019 6:15 PM, Jiri Olsa wrote:

On Tue, Oct 15, 2019 at 10:53:18PM +0800, Jin, Yao wrote:

SNIP

+static struct block_header_column{
+       const char *name;
+       int width;
+} block_columns[PERF_HPP_REPORT__BLOCK_MAX_INDEX] = {
+       [PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_COV] = {
+               .name = "Sampled Cycles%",
+               .width = 15,
+       },
+       [PERF_HPP_REPORT__BLOCK_LBR_CYCLES] = {
+               .name = "Sampled Cycles",
+               .width = 14,
+       },
+       [PERF_HPP_REPORT__BLOCK_CYCLES_PCT] = {
+               .name = "Avg Cycles%",
+               .width = 11,
+       },
+       [PERF_HPP_REPORT__BLOCK_AVG_CYCLES] = {
+               .name = "Avg Cycles",
+               .width = 10,
+       },
+       [PERF_HPP_REPORT__BLOCK_RANGE] = {
+               .name = "[Program Block Range]",
+               .width = 70,
+       },
+       [PERF_HPP_REPORT__BLOCK_DSO] = {
+               .name = "Shared Object",
+               .width = 20,
+       }
   };


so we already have support for multiple columns,
why don't you add those as 'struct sort_entry' objects?


For 'struct sort_entry' objects, do you mean I should reuse the "sort_dso"
which has been implemented yet in util/sort.c?

For other columns, it looks we can't reuse the existing sort_entry objects.


I did not mean reuse, just add new sort entries
to current sort framework


Does it seem like what the c2c does?

For example, my new update is like:

struct block_dimension {
       const char              *header;
       int                     idx;
       int                     width;
       struct sort_entry       *se;
       int64_t (*cmp)(struct perf_hpp_fmt *,
                      struct hist_entry *, struct hist_entry *);
       int64_t (*sort)(struct perf_hpp_fmt *,
                       struct hist_entry *, struct hist_entry *);
       int   (*entry)(struct perf_hpp_fmt *, struct perf_hpp *,
                      struct hist_entry *);
};

struct block_fmt {
       struct perf_hpp_fmt     fmt;
       struct report           *rep;
       struct block_dimension  *dim;
};

static struct block_dimension dim_total_cycles_pct {
       .header = "Sampled Cycles%",
       .idx = PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT,
       .width = 15,
       .cmp = block_info__cmp;
       .sort = block_cycles_cov_sort;
       .entry = block_cycles_cov_entry;
};

......

Is above new update correct?

SNIP

+{
+       struct block_hist *bh = &rep->block_hist;
+
+       get_block_hists(hists, bh, rep);
+       symbol_conf.report_individual_block = true;
+       hists__fprintf(&bh->block_hists, true, 0, 0, 0,
+                      stdout, true);
+       hists__delete_entries(&bh->block_hists);
+       return 0;
+}
+
   static int perf_evlist__tty_browse_hists(struct evlist *evlist,
                                         struct report *rep,
                                         const char *help)
@@ -500,6 +900,12 @@ static int perf_evlist__tty_browse_hists(struct evlist 
*evlist,
                        continue;
                hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
+
+               if (rep->total_cycles) {
+                       hists__fprintf_all_blocks(hists, rep);


so this call kicks all the block info setup/count/print, right?


Yes, all in this call.

I thingk it shouldn't be in the output code, but in the code before..
from what I see you could count block_info counts during the sample
processing, no?


In sample processing, we just get all symbols and account the cycles per
symbol. We need to create/count the block_info at some points after the
sample processing.


understand, but it needs to be outside display function

OK, I will move the code for block_info collection outside of displayfunction.

also, can't you gather the block_info data gradually
during the sample processing?

Looks we have to gather the block_info after the sample processing.That's because in each sample processing, we will callhist__account_cycles(). Then finally __symbol__account_cycles() getscalled for accounting one basic block in this symbol.


ch[offset].num_aggr++;
ch[offset].cycles_aggr += cycles;

So actually, after the sample processing, we will get the num_aggr andcycles_aggr for this basic block and compute the average cycles for thisblock (cycles_aggr / num_aggr).

That's why I think we should gather the block_info after the sampleprocessing.


Thanks
Jin Yao

jirka


Maybe it's not very good to put block info setup/count/print in a call, but
it's really not easy to process the block_info during the sample processing.

Thanks
Jin Yao

jirka

Re: [PATCH v2 3/5] perf report: Sort by sampled cycles percent per block for stdio

Reply via email to