1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (RET cycles:9)
Jin Yao (5):
perf/core: Define the common branch type classification
perf/x86/intel: Record branch type
perf record: Create a new option save_type in --branch-filter
perf
record the branch
type.
Signed-off-by: Jin Yao
---
include/uapi/linux/perf_event.h | 37 ++-
tools/include/uapi/linux/perf_event.h | 37 ++-
2 files changed, 72 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux
the
branches cross 4K or 2MB areas. It's an approximate computation of
whether a branch crosses a 4K page or a 2MB page.
Signed-off-by: Jin Yao
---
arch/x86/events/intel/lbr.c | 106 +++-
1 file changed, 105 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/lb
The option tells the kernel to save the branch type during sampling.
One example:
perf record -g --branch-filter any,save_type
Signed-off-by: Jin Yao
---
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/util/parse-branch-options.c | 1 +
2 files changed, 2 insertions(+)
diff
g. We don't know if the area is 4K or
2MB, so always compute both.
To keep the output simple, if a branch crosses a 2M area, CROSS_4K
will not be incremented.
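
A minimal sketch of the user-space computation described above, assuming only that the from and to addresses of a branch are available. The struct, its field names, and count_branch() are hypothetical and chosen for illustration; they are not taken from the patch. The area check is approximate in the same sense as the text: the real page size is unknown, so both granularities are tested, and a 2M crossing suppresses the 4K count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AREA_4K (UINT64_C(1) << 12)	/* 4 KiB */
#define AREA_2M (UINT64_C(1) << 21)	/* 2 MiB */

/* Hypothetical counters; the names are illustrative only. */
struct branch_stat_sketch {
	uint64_t jcc_fwd;
	uint64_t jcc_bwd;
	uint64_t cross_4k;
	uint64_t cross_2m;
};

/* True when the two addresses fall into different size-aligned areas. */
static bool cross_area(uint64_t a, uint64_t b, uint64_t size)
{
	return (a & ~(size - 1)) != (b & ~(size - 1));
}

/* Update the counters for one branch given its from and to addresses. */
static void count_branch(struct branch_stat_sketch *st,
			 uint64_t from, uint64_t to, bool is_jcc)
{
	assert(st);

	/* Direction only matters for conditional jumps. */
	if (is_jcc) {
		if (to > from)
			st->jcc_fwd++;
		else
			st->jcc_bwd++;
	}

	/* A 2M crossing takes priority, so CROSS_4K is not also counted. */
	if (cross_area(from, to, AREA_2M))
		st->cross_2m++;
	else if (cross_area(from, to, AREA_4K))
		st->cross_4k++;
}
```

For example, a conditional jump from 0x2000 to 0x1ff0 would count as JCC backward and CROSS_4K, while a branch from 0x1ff000 to 0x200000 would count as CROSS_2M only.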
Signed-off-by: Jin Yao
---
tools/perf/builtin-report.c | 212
tools/perf/util/event.h
(JCC forward cycles:1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (RET cycles:9)
Signed-off-by: Jin Yao
---
tools/perf/util/callchain.c | 221
tools/perf/util/callchain.h | 20
2 files changed, 182
d/backward computing to user-space, though
it makes the user-space code more complicated.
Thanks
Jin Yao
_random random.c:297 (JCC forward cycles:1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (RET cycles:9)
Jin Yao (5):
perf/core: Defi
disassemble the branch instruction and record the branch
type.
Signed-off-by: Jin Yao
---
include/uapi/linux/perf_event.h | 29 -
tools/include/uapi/linux/perf_event.h | 29 -
2 files changed, 56 insertions(+), 2 deletions(-)
diff --git a
Perf already has support for disassembling the branch instruction
and using the branch type for filtering. The patch just records
the branch type in perf_branch_entry.
Before recording, the patch converts the x86 branch classification
to the common branch classification.
Signed-off-by: Jin Yao
The option tells the kernel to save the branch type during sampling.
One example:
perf record -g --branch-filter any,save_type
Signed-off-by: Jin Yao
---
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/util/parse-branch-options.c | 1 +
2 files changed, 2 insertions(+)
diff
g. We don't know if the area is 4K or
2MB, so always compute both.
To keep the output simple, if a branch crosses a 2M area, CROSS_4K
will not be incremented.
Signed-off-by: Jin Yao
---
tools/perf/builtin-report.c | 70 +
tools/perf/util/event.
forward CROSS_4K cycles:1)
__random random.c:295 (JCC backward CROSS_2M cycles:1)
__random random.c:295 (JCC forward CROSS_4K cycles:1)
__random random.c:295 (CROSS_2M RET cycles:9)
Signed-off-by: Jin Yao
---
tools/perf/util/callchain.c | 195
ch patch's description?
That's fine, I can add and resend this patch.
Thanks
Jin Yao
On 4/11/2017 4:35 PM, Peter Zijlstra wrote:
On Tue, Apr 11, 2017 at 04:11:21PM +0800, Jin, Yao wrote:
On 4/11/2017 3:52 PM, Peter Zijlstra wrote:
This is still a completely inadequate changelog. I really will not
accept patches like this.
Hi,
The changelog is added in the cover-letter
On 4/11/2017 4:18 PM, Peter Zijlstra wrote:
On Tue, Apr 11, 2017 at 09:52:19AM +0200, Peter Zijlstra wrote:
On Tue, Apr 11, 2017 at 06:56:30PM +0800, Jin Yao wrote:
@@ -960,6 +1006,11 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
cpuc->lbr_entries[i].from
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (JCC forward cycles:1)
__random random.c:295 (RET cycles:9)
Jin Yao (5):
perf/core: Define the common branch type classification
perf
. Remove the "cross" field in perf_branch_entry. The cross page
computing will be done later in userspace.
Signed-off-by: Jin Yao
---
include/uapi/linux/perf_event.h | 29 -
tools/include/uapi/linux/perf_event.h | 29
are:
1. Use a lookup table to convert x86 branch type to common branch
type.
2. Move the JCC forward/JCC backward and cross page computing to
user space.
3. Initialize branch type to 0 in intel_pmu_lbr_read_32 and
intel_pmu_lbr_read_64
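
As an illustration of item 1, here is a small sketch of a table-driven conversion. All DEMO_ names and values are hypothetical stand-ins, not the kernel's definitions. The sketch assumes, as the review discussion of this series indicates, that the arch-specific type is a bit mask whose two lowest bits are privilege flags to be skipped, and that the first remaining set bit indexes the table.

```c
#include <assert.h>

/* Illustrative common types; not the values from perf_event.h. */
enum demo_common_type {
	DEMO_BR_UNKNOWN = 0,
	DEMO_BR_CALL,
	DEMO_BR_RET,
	DEMO_BR_JCC,
	DEMO_BR_JMP,
};

/* Illustrative arch-specific bit flags; not the kernel's X86_BR_* values. */
#define DEMO_X86_USER	(1 << 0)
#define DEMO_X86_KERNEL	(1 << 1)
#define DEMO_X86_CALL	(1 << 2)
#define DEMO_X86_RET	(1 << 3)
#define DEMO_X86_JCC	(1 << 4)
#define DEMO_X86_JMP	(1 << 5)

#define DEMO_MAP_MAX	4

/* Index i corresponds to bit (i + 2) of the arch-specific mask. */
static const int demo_branch_map[DEMO_MAP_MAX] = {
	DEMO_BR_CALL,	/* DEMO_X86_CALL */
	DEMO_BR_RET,	/* DEMO_X86_RET */
	DEMO_BR_JCC,	/* DEMO_X86_JCC */
	DEMO_BR_JMP,	/* DEMO_X86_JMP */
};

static int demo_common_branch_type(int x86_type)
{
	int i;

	assert(x86_type >= 0);

	x86_type >>= 2;	/* skip the user and kernel privilege bits */

	/* Return the table entry for the lowest set bit, if any. */
	for (i = 0; i < DEMO_MAP_MAX; i++) {
		if (x86_type & (1 << i))
			return demo_branch_map[i];
	}

	return DEMO_BR_UNKNOWN;
}
```

With these stand-in values, demo_common_branch_type(DEMO_X86_USER | DEMO_X86_RET) yields DEMO_BR_RET, and a mask carrying only a privilege bit yields DEMO_BR_UNKNOWN.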
Signed-off-by: Jin Yao
---
arch/x86/events
The option tells the kernel to save the branch type during sampling.
One example:
perf record -g --branch-filter any,save_type
Signed-off-by: Jin Yao
---
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/util/parse-branch-options.c | 1 +
2 files changed, 2 insertions(+)
diff
e from and to addresses.
Signed-off-by: Jin Yao
---
tools/perf/builtin-report.c | 70 +
tools/perf/util/event.h | 3 +-
tools/perf/util/hist.c | 5 +---
tools/perf/util/util.c | 59 ++
tools/perf/u
forward/JCC backward and cross
page checking in user space by from and to addresses, while each
callchain entry only contains one ip (either from or to), so
this patch will append a branch from address to the callchain
entry which just contains the to ip.
Signed-off-by: Jin Yao
---
tools/perf/util
On 4/12/2017 6:58 PM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:01AM +0800, Jin Yao wrote:
SNIP
3. Use 2 bits in perf_branch_entry for a "cross" metric that checks
whether a branch crosses a 4K or 2M area. It's an approximate computation
for checking if the branch crosses a 4K p
On 4/12/2017 10:26 PM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 08:25:34PM +0800, Jin, Yao wrote:
SNIP
# Overhead Command Source Shared Object Source Symbol
Target Symbol Basic Block Cycles
On 4/12/2017 6:58 PM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:01AM +0800, Jin Yao wrote:
SNIP
3. Use 2 bits in perf_branch_entry for a "cross" metric that checks
whether a branch crosses a 4K or 2M area. It's an approximate computation
for checking if the branch crosses a 4K p
On 4/13/2017 10:00 AM, Jin, Yao wrote:
On 4/12/2017 6:58 PM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:01AM +0800, Jin Yao wrote:
SNIP
3. Use 2 bits in perf_branch_entry for a "cross" metric that checks
whether a branch crosses a 4K or 2M area. It's an approximate computation
On 4/19/2017 2:53 AM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:06AM +0800, Jin Yao wrote:
SNIP
static int counts_str_build(char *bf, int bfsize,
u64 branch_count, u64 predicted_count,
u64 abort_count, u64 cycles_count
On 4/19/2017 2:53 AM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:06AM +0800, Jin Yao wrote:
SNIP
+static int branch_type_str(struct branch_type_stat *stat,
+ char *bf, int bfsize)
+{
+ int i, j = 0, printed = 0;
+ u64 total = 0;
+
+ for (i = 0
On 4/19/2017 2:53 AM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:05AM +0800, Jin Yao wrote:
SNIP
+static int hist_iter__branch_callback(struct hist_entry_iter *iter,
+ struct addr_location *al __maybe_unused,
+ bool
On 4/19/2017 2:53 AM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:05AM +0800, Jin Yao wrote:
SNIP
+const char *branch_type_name(int type)
+{
+ const char *branch_names[PERF_BR_MAX] = {
+ "N/A",
+ "JCC",
+ "JMP"
On 4/19/2017 8:53 AM, Jin, Yao wrote:
On 4/19/2017 2:53 AM, Jiri Olsa wrote:
On Wed, Apr 12, 2017 at 06:21:05AM +0800, Jin Yao wrote:
SNIP
+const char *branch_type_name(int type)
+{
+ const char *branch_names[PERF_BR_MAX] = {
+ "N/A",
+ "JCC"
ed:
perf record: Create a new option save_type in --branch-filter
v2:
---
1. Use 4 bits in perf_branch_entry to record branch type.
2. Pull out some common branch types from FAR_BRANCH. Now the branch
types defined in perf_event.h:
Jin Yao (7):
perf/core: Define the common branch type clas
. Remove the PERF_BR_JCC_FWD/PERF_BR_JCC_BWD, they will be
computed later in userspace.
2. Remove the "cross" field in perf_branch_entry. The cross page
computing will be done later in userspace.
Signed-off-by: Jin Yao
---
include/uapi/linux/perf_event.h
intel_pmu_lbr_read_64
Signed-off-by: Jin Yao
---
arch/x86/events/intel/lbr.c | 53 -
1 file changed, 52 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index f924629..f10a7ed 100644
--- a/arch/x86
The option tells the kernel to save the branch type during sampling.
One example:
perf record -g --branch-filter any,save_type
Change log
--
v5: Not changed.
Signed-off-by: Jin Yao
---
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/util/parse-branch-options.c | 1 +
2
eries.
Signed-off-by: Jin Yao
---
tools/perf/util/callchain.c | 106 +++-
1 file changed, 45 insertions(+), 61 deletions(-)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 2e5eff5..8cae8a6 100644
--- a/tools/perf/util/callchain.c
in v5 patch series.
Signed-off-by: Jin Yao
---
tools/perf/util/Build | 1 +
tools/perf/util/branch.c | 63
tools/perf/util/branch.h | 23 ++
tools/perf/util/event.h | 3 ++-
4 files changed, 89 insertions(+), 1 deletion(-)
create
rsion, the major changes are:
Add the computing of JCC forward/JCC backward and cross page checking
by using the from and to addresses.
Signed-off-by: Jin Yao
---
tools/perf/builtin-report.c | 69 +
tools/perf/util/hist.c | 5 +---
2 files c
compute the JCC forward/JCC backward and cross
page checking in user space by from and to addresses, while each
callchain entry only contains one ip (either from or to), so
this patch will append a branch from address to the callchain
entry which just contains the to ip.
Signed-off-by: Jin Yao
On 4/19/2017 10:15 PM, Jiri Olsa wrote:
On Wed, Apr 19, 2017 at 11:48:14PM +0800, Jin Yao wrote:
SNIP
+static int branch_type_str(struct branch_type_stat *stat,
+ char *bf, int bfsize)
+{
+ int i, j = 0, printed = 0;
+ u64 total = 0;
+
+ for (i = 0
h
types defined in perf_event.h:
Jin Yao (7):
perf/core: Define the common branch type classification
perf/x86/intel: Record branch type
perf record: Create a new option save_type in --branch-filter
perf report: Refactor the branch info printing code
perf util: Create branch.c/.h
changes are:
1. Remove the PERF_BR_JCC_FWD/PERF_BR_JCC_BWD, they will be
computed later in userspace.
2. Remove the "cross" field in perf_branch_entry. The cross page
computing will be done later in userspace.
Signed-off-by: Jin Yao
---
include/uapi/linux/perf_event.h
intel_pmu_lbr_read_32 and
intel_pmu_lbr_read_64
Signed-off-by: Jin Yao
---
arch/x86/events/intel/lbr.c | 53 -
1 file changed, 52 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index f924629..f10a7ed
The option tells the kernel to save the branch type during sampling.
One example:
perf record -g --branch-filter any,save_type
Change log
--
v6: Not changed.
v5: Not changed.
Signed-off-by: Jin Yao
---
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/util/parse-branch
into {} brackets in
counts_str_build()
2. Keep the original display order, that is:
predicted, abort, cycles, iterations
v5: It's a new patch in v5 patch series.
Signed-off-by: Jin Yao
---
tools/perf/util/callchain.c | 106
1
a new patch in v5 patch series.
Signed-off-by: Jin Yao
---
tools/perf/util/Build | 1 +
tools/perf/util/branch.c | 168 +++
tools/perf/util/branch.h | 25 +++
tools/perf/util/event.h | 3 +-
4 files changed, 196 insertions(+), 1 deletion(-)
c
ode checking in
hist_iter__branch_callback().
v4: Comparing to previous version, the major changes are:
Add the computing of JCC forward/JCC backward and cross page checking
by using the from and to addresses.
Signed-off-by: Jin Yao
---
tools/perf/builtin-report.c | 25 +
entry which just contains the to ip.
Signed-off-by: Jin Yao
---
tools/perf/util/callchain.c | 38 +-
tools/perf/util/callchain.h | 5 -
tools/perf/util/machine.c | 26 +-
3 files changed, 50 insertions(+), 19 deletions(-)
diff --
On 4/20/2017 5:36 PM, Jiri Olsa wrote:
On Thu, Apr 20, 2017 at 08:07:48PM +0800, Jin Yao wrote:
v6:
Update according to the review comments from
Jiri Olsa. Major modifications are:
1. Move that multiline conditional code inside {} brackets.
2. Move branch_type_stat_display
On 4/23/2017 9:55 PM, Jiri Olsa wrote:
On Thu, Apr 20, 2017 at 08:07:50PM +0800, Jin Yao wrote:
SNIP
+#define X86_BR_TYPE_MAP_MAX 16
+
+static int
+common_branch_type(int type)
+{
+ int i, mask;
+ const int branch_map[X86_BR_TYPE_MAP_MAX] = {
+ PERF_BR_CALL
On 4/24/2017 8:47 AM, Jin, Yao wrote:
On 4/23/2017 9:55 PM, Jiri Olsa wrote:
On Thu, Apr 20, 2017 at 08:07:50PM +0800, Jin Yao wrote:
SNIP
+#define X86_BR_TYPE_MAP_MAX 16
+
+static int
+common_branch_type(int type)
+{
+ int i, mask;
+ const int branch_map[X86_BR_TYPE_MAP_MAX
On 5/9/2017 4:26 PM, Jiri Olsa wrote:
On Mon, Apr 24, 2017 at 08:47:14AM +0800, Jin, Yao wrote:
On 4/23/2017 9:55 PM, Jiri Olsa wrote:
On Thu, Apr 20, 2017 at 08:07:50PM +0800, Jin Yao wrote:
SNIP
+#define X86_BR_TYPE_MAP_MAX 16
+
+static int
+common_branch_type(int type
On 5/9/2017 8:39 PM, Jiri Olsa wrote:
On Tue, May 09, 2017 at 07:57:11PM +0800, Jin, Yao wrote:
SNIP
+
+ type >>= 2; /* skip X86_BR_USER and X86_BR_KERNEL */
+ mask = ~(~0 << 1);
is that a fancy way to get 1 into the mask? what do I miss?
you did not comment on thi
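
On the mask question quoted above: the expression ~(~0 << 1) does evaluate to 1, and the general form ~(~0 << n) yields a mask with the low n bits set. The sketch below is only an illustration of that idiom, not code from the patch; low_bits_mask() is a hypothetical helper. It uses an unsigned operand because left-shifting a negative signed value is not defined by ISO C, and it assumes unsigned int is at least 32 bits wide.

```c
#include <assert.h>

/* Mask with the low n bits set, for 0 <= n < 32. */
static unsigned int low_bits_mask(unsigned int n)
{
	assert(n < 32);

	/* ~0u is all ones; shifting left clears the low n bits; ~ inverts. */
	return ~(~0u << n);
}
```

With n = 1 this reduces to the constant 1, so writing the literal directly would be equivalent in that case.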
Hi maintainers,
Is this patch series (v6) OK for merging?
Thanks
Jin Yao
On 4/20/2017 5:36 PM, Jiri Olsa wrote:
On Thu, Apr 20, 2017 at 08:07:48PM +0800, Jin Yao wrote:
v6:
Update according to the review comments from
Jiri Olsa. Major modifications are:
1. Move that multiline
Hi maintainers,
Is this patch series OK, or is there anything I should update?
Thanks
Jin Yao
On 6/2/2017 4:02 PM, Jin, Yao wrote:
Hi maintainers,
Is this patch series (v6) OK for merging?
Thanks
Jin Yao
On 4/20/2017 5:36 PM, Jiri Olsa wrote:
On Thu, Apr 20, 2017 at 08:07:48PM +0800, Jin Yao
Hi Arnaldo,
Could this series be merged? It has been more than 2 months since
Jiri Olsa gave the ack.
Thanks
Jin Yao
On 6/26/2017 2:24 PM, Jin, Yao wrote:
Hi maintainers,
Is this patch series OK, or is there anything I should update?
Thanks
Jin Yao
On 6/2/2017 4:02 PM, Jin, Yao
On 7/10/2017 2:05 PM, Michael Ellerman wrote:
Hi Jin Yao,
Sorry I haven't commented until now, but it got lost in the flood of
patches.
Never mind, it's no problem. :)
Just a few nit-picks below ...
Jin Yao writes:
It is often useful to know the branch types while analyzing b
prepare the new patch.
Thanks
Jin Yao
On 7/10/2017 6:32 PM, Michael Ellerman wrote:
"Jin, Yao" writes:
On 7/10/2017 2:05 PM, Michael Ellerman wrote:
Jin Yao writes:
It is often useful to know the branch types while analyzing branch
data. For example, a call is very different
*/
+ PERF_BR_IND_CALL = 5, /* indirect call */
+ PERF_BR_RET = 6, /* return */
I have decided to define only these types in this patch set. Other, more
arch-specific branch types can be added in the future.
Is this OK?
Thanks
Jin Yao
On 7/10/2017 9:10 PM, Segher Boessenkool
_BR_IND = 3, /* indirect */
PERF_BR_CALL = 4, /* call */
PERF_BR_IND_CALL = 5, /* indirect call */
PERF_BR_RET = 6, /* return */
Thanks
Jin Yao
On 7/11/2017 10:28 AM, Michael Ellerman wrote:
"Jin, Yao" writes:
On 7/10/2017 9:46 PM, Peter Zijlstra wrote:
On Mon, Jul 10, 2017 at 08:10:50AM -0500, Segher Boessenkool wrote:
PERF_BR_INT is triggered by the "int" instruction.
PERF_BR_IRQ is triggered by interrupts
61 matches