On Thursday 16 June 2016 02:41 AM, Yury Norov wrote: > On Wed, Jun 15, 2016 at 10:51:27PM +0300, Yury Norov wrote: >> Hi Madhavan, >> >> On Wed, Jun 15, 2016 at 05:12:53PM +0530, Madhavan Srinivasan wrote: >>> When decoding the perf_regs mask in regs_dump__printf(), >>> we loop through the mask using find_first_bit and find_next_bit functions. >>> And mask is of type "u64". But "u64" is send as a "unsigned long *" to >>> lib functions along with sizeof(). >>> >>> While the exisitng code works fine in most of the case, when using a 32bit >>> perf >>> on a 64bit kernel (Big Endian), we end reading the wrong word. In >>> find_first_bit(), >>> one word at a time (based on BITS_PER_LONG) is loaded and >>> checked for any bit set. In 32bit BE userspace, >>> BITS_PER_LONG turns out to be 32, and for a mask value of >>> "0x00000000000000ff", find_first_bit will return 32, instead of 0. >>> Reason for this is that, value in the word0 is all zeros and value >>> in word1 is 0xff. Ideally, second word in the mask should be loaded >>> and searched. Patch swaps the word to look incase of 32bit BE. >> I think this is not a problem of find_bit() at all. You have wrong >> typecast as the source of problem (tools/perf/util/session.c"): >> >> 940 static void regs_dump__printf(u64 mask, u64 *regs) >> 941 { >> 942 unsigned rid, i = 0; >> 943 >> 944 for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) >> { >> ^^^^ Here ^^^^ >> 945 u64 val = regs[i++]; >> 946 >> 947 printf(".... %-5s 0x%" PRIx64 "\n", >> 948 perf_reg_name(rid), val); >> 949 } >> 950 } >> >> But for some reason you change correct find_bit()... >> >> Though proper fix is like this for me: >> >> static void regs_dump__printf(u64 mask, u64 *regs) >> { >> unsigned rid, i = 0; >> unsigned long _mask[sizeof(mask)/sizeof(unsigned long)]; >> >> _mask[0] = mask & ULONG_MAX; >> if (sizeof(mask) > sizeof(unsigned long)) >> _mask[1] = mask >> BITS_PER_LONG; >> >> for_each_set_bit(rid, _mask, sizeof(mask) * BITS_PER_BYTE) { >> u64 val = regs[i++]; >> >> printf(".... %-5s 0x%" PRIx64 "\n", >> perf_reg_name(rid), val); >> } >> } >> >> Maybe there already is some macro doing the conversion for you... > yes it is, cpu_to_le64() is what you want
no wait, on second look, cpu_to_le64() is not right. Because we will end up swapping within 32bit. But what you suggested looks to be fine. I can repost this with one minor tweak, right shift with 32 instead of BITS_PER_LONG (since I see compiler errors in 64bit). Maddy > >> Yury. >> >>> Cc: Arnaldo Carvalho de Melo <a...@kernel.org> >>> Cc: Adrian Hunter <adrian.hun...@intel.com> >>> Cc: Borislav Petkov <b...@suse.de> >>> Cc: David Ahern <dsah...@gmail.com> >>> Cc: George Spelvin <li...@horizon.com> >>> Cc: Jiri Olsa <jo...@redhat.com> >>> Cc: Namhyung Kim <namhy...@kernel.org> >>> Cc: Rasmus Villemoes <li...@rasmusvillemoes.dk> >>> Cc: Wang Nan <wangn...@huawei.com> >>> Cc: Yury Norov <yury.no...@gmail.com> >>> Cc: Michael Ellerman <m...@ellerman.id.au> >>> Signed-off-by: Madhavan Srinivasan <ma...@linux.vnet.ibm.com> >>> --- >>> tools/lib/find_bit.c | 17 +++++++++++++++++ >>> 1 file changed, 17 insertions(+) >>> >>> diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c >>> index 9122a9e80046..996b3e04324f 100644 >>> --- a/tools/lib/find_bit.c >>> +++ b/tools/lib/find_bit.c >>> @@ -37,7 +37,12 @@ static unsigned long _find_next_bit(const unsigned long >>> *addr, >>> if (!nbits || start >= nbits) >>> return nbits; >>> >>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) >>> + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] >>> + ^ invert; >>> +#else >>> tmp = addr[start / BITS_PER_LONG] ^ invert; >>> +#endif >>> >>> /* Handle 1st word. */ >>> tmp &= BITMAP_FIRST_WORD_MASK(start); >>> @@ -48,7 +53,12 @@ static unsigned long _find_next_bit(const unsigned long >>> *addr, >>> if (start >= nbits) >>> return nbits; >>> >>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) >>> + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / >>> BITS_PER_LONG))] >>> + ^ invert; >>> +#else >>> tmp = addr[start / BITS_PER_LONG] ^ invert; >>> +#endif >>> } >>> >>> return min(start + __ffs(tmp), nbits); >>> @@ -75,8 +85,15 @@ unsigned long find_first_bit(const unsigned long *addr, >>> unsigned long size) >>> unsigned long idx; >>> >>> for (idx = 0; idx * BITS_PER_LONG < size; idx++) { >>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) >>> + if (addr[(((size-1)/BITS_PER_LONG) - idx)]) >>> + return min(idx * BITS_PER_LONG + >>> + __ffs(addr[(((size-1)/BITS_PER_LONG) - idx)]), >>> + size); >>> +#else >>> if (addr[idx]) >>> return min(idx * BITS_PER_LONG + __ffs(addr[idx]), >>> size); >>> +#endif >>> } >>> >>> return size; >>> -- >>> 1.9.1 _______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev