On Sun, Jul 07, 2024 at 08:14:18PM +0530, Athira Rajeev wrote: > Currently data_type_cmp() only compares size and type name. > But in cases where the type name of two data type entries > is same, but var_name is different, the comparison can't distinguish > two different types. > > Consider there is a "long unsigned int" with var_name as "X" and there > is global variable "long unsigned int". Currently since > data_type_cmp uses only type_name for comparison ( "long unsigned int"), > it won't distinguish these as separate entries. Update the
I'm still not sure if it's ok. It intentionally merges different instances of the same type together as it's a data 'type' profile. > functions "data_type_cmp" as well as "sort__typeoff_sort" to > compare variable names after type name if it exists. > > Also updated "hist_entry__typeoff_snprintf" to print var_name if > it is set. With the changes, > > 11.42% long unsigned int long unsigned int +0 (current_stack_pointer) > 4.68% struct paca_struct struct paca_struct +2312 (__current) > 4.57% struct paca_struct struct paca_struct +2354 (irq_soft_mask) > 2.69% struct paca_struct struct paca_struct +2808 (canary) > 2.68% struct paca_struct struct paca_struct +8 (paca_index) > 2.24% struct paca_struct struct paca_struct +48 (data_offset) > 1.43% long unsigned int long unsigned int +0 (no field) It seems like an output of `perf report -s type,typeoff`. But I'm curious how it'd work with -s type only? I guess it'd have two separate entries for 'long unsigned int'. Ideally we can have a single entry with two different fields. For example, `perf report -s type,typeoff -H`: 12.85% long unsigned int 11.42% long unsigned int +0 (current_stack_pointer) 1.43% long unsigned int +0 (no field) ... > > Signed-off-by: Athira Rajeev <atraj...@linux.vnet.ibm.com> > --- > tools/perf/util/annotate-data.c | 24 ++++++++++++++++++++++-- > tools/perf/util/sort.c | 23 +++++++++++++++++++++-- > 2 files changed, 43 insertions(+), 4 deletions(-) > > diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c > index 8d05f3dbddf6..759c6a22e719 100644 > --- a/tools/perf/util/annotate-data.c > +++ b/tools/perf/util/annotate-data.c > @@ -167,7 +167,7 @@ static void exit_type_state(struct type_state *state) > } > > /* > - * Compare type name and size to maintain them in a tree. > + * Compare type name, var_name and size to maintain them in a tree. > * I'm not sure if DWARF would have information of a single type in many > * different places (compilation units). If not, it could compare the > * offset of the type entry in the .debug_info section. > @@ -176,12 +176,32 @@ static int data_type_cmp(const void *_key, const struct > rb_node *node) > { > const struct annotated_data_type *key = _key; > struct annotated_data_type *type; > + int64_t ret = 0; > > type = rb_entry(node, struct annotated_data_type, node); > > if (key->self.size != type->self.size) > return key->self.size - type->self.size; > - return strcmp(key->self.type_name, type->self.type_name); > + > + ret = strcmp(key->self.type_name, type->self.type_name); > + if (ret) { > + return ret; > + } No need for the parentheses. > + > + /* > + * Compare var_name if it exists for key and type. > + * If both nodes doesn't have var_name, but one of > + * them has, return non-zero. This is to indicate nodes > + * are not the same if one has var_name, but other doesn't. > + */ > + if (key->self.var_name && type->self.var_name) { > + ret = strcmp(key->self.var_name, type->self.var_name); > + if (ret) > + return ret; > + } else if (key->self.var_name || type->self.var_name) > + return 1; I think you need to compare the order properly like in cmp_null() in util/sort.c. Please see below. > + > + return ret; > } > > static bool data_type_less(struct rb_node *node_a, const struct rb_node > *node_b) > diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c > index cd39ea972193..c6d885060ee7 100644 > --- a/tools/perf/util/sort.c > +++ b/tools/perf/util/sort.c > @@ -2267,9 +2267,25 @@ sort__typeoff_sort(struct hist_entry *left, struct > hist_entry *right) > right_type = right->mem_type; > } > > + /* > + * Compare type_name first. Next, ompare var_name if it exists > + * for left and right hist_entry. If both entries doesn't have > + * var_name, but one of them has, return non-zero. This is to > + * indicate entries are not the same if one has var_name, but the > + * other doesn't. > + * If type_name and var_name is same, use mem_type_off field. > + */ > ret = strcmp(left_type->self.type_name, right_type->self.type_name); > if (ret) > return ret; > + > + if (left_type->self.var_name && right_type->self.var_name) { > + ret = strcmp(left_type->self.var_name, > right_type->self.var_name); > + if (ret) > + return ret; > + } else if (right_type->self.var_name || left_type->self.var_name) > + return 1; } else if (!left_type->self.var_name != !right_type->self.var_name) return cmp_null(left_type->self.var_name, right_type->self.var_name); Thanks, Namhyung > + > return left->mem_type_off - right->mem_type_off; > } > > @@ -2305,9 +2321,12 @@ static int hist_entry__typeoff_snprintf(struct > hist_entry *he, char *bf, > char buf[4096]; > > buf[0] = '\0'; > - if (list_empty(&he_type->self.children)) > + if (list_empty(&he_type->self.children)) { > snprintf(buf, sizeof(buf), "no field"); > - else > + if (he_type->self.var_name) > + strcpy(buf, he_type->self.var_name); > + > + } else > fill_member_name(buf, sizeof(buf), &he_type->self, > he->mem_type_off, true); > buf[4095] = '\0'; > -- > 2.43.0 >