On Sun, 2013-09-29 at 01:57 -0500, Wang Dongsheng-B40534 wrote:
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Saturday, September 28, 2013 5:33 AM
> > To: Wang Dongsheng-B40534
> > Cc: Wood Scott-B07421; Bhushan Bharat-R65777; linuxppc-
> > d...@lists.ozlabs.org
> > Subject: Re: [PATCH v4 4/4] powerpc/85xx: add sysfs for pw20 state and
> > altivec idle
> >
> > On Thu, 2013-09-26 at 22:34 -0500, Wang Dongsheng-B40534 wrote:
> > > cycle = div_u64(ns * tb_ticks_per_usec, 1000);
> > > That looks good. But why do we need:
> > > if (ns >= 10000)
> > > 	cycle = ((ns + 500) / 1000) * tb_ticks_per_usec;
> > > ?
> > >
> > > I think "cycle = div_u64(ns * tb_ticks_per_usec, 1000)" is good
> > > enough. :)
> >
> > It's to deal with overflow if a very large value of ns is provided
> > (and/or tb_ticks_per_usec is larger than we expect).
> >
> :), as I thought, it's to deal with overflow. But your version cannot
> deal with it.
> Because ns is u64, if ns > 0xffffffff_fffffe0b, ns + 500 will overflow.
> And if tb_ticks_per_usec > 1000 and ns > (0xffffffff_ffffffff / 10),
> cycle will also overflow.
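(For context, piecing together the quoted fragments above, the conversion
under discussion is roughly the following sketch, reconstructed from the
quotes rather than taken verbatim from the patch:

	if (ns >= 10000)
		cycle = ((ns + 500) / 1000) * tb_ticks_per_usec;
	else
		cycle = div_u64(ns * tb_ticks_per_usec, 1000);

i.e. for large timeouts, round ns to microseconds first and then scale by
ticks-per-microsecond, so the multiply operates on a value 1000x smaller.)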
Sigh... It significantly increases the value of ns at which you'll get
overflow problems. :-)

I was more concerned with large-but-somewhat-reasonable values of ns than
with trying to handle every possible input. Even without that test,
getting overflow is stretching the bounds of reasonableness (e.g. a 1 GHz
timebase would require a timeout of over 7 months to overflow), but it was
low-hanging fruit. And the worst case is that we go to pw20 sooner than
the user wanted, so let's not go too overboard. I doubt you would see
ns > 0xffffffff_fffffe0b except if someone is trying to disable it by
setting 0xffffffff_ffffffff, even though a separate API is provided to
cleanly disable it.

> I think we need to do this:
>
> #define U64_LOW_MASK 0xffffffffULL
> #define U64_MASK 0xffffffffffffffffULL
>
> u32 tmp_rem;
> u64 ns_u_rem, ns_u, ns_l, ns_l_carry;
> u64 cycle;
>
> ns_u = ns >> 32;
> ns_l = ns & U64_LOW_MASK;
>
> ns_l *= tb_ticks_per_usec;
> ns_l_carry = ns_l >> 32;
> ns_u *= tb_ticks_per_usec;
> ns_u += ns_l_carry;
>
> ns_u = div_u64_rem(ns_u, 1000, &tmp_rem);
> ns_u_rem = tmp_rem;
> ns_l = (ns_l & U64_LOW_MASK) | (ns_u_rem << 32);
> ns_l = div_u64(ns_l, 1000);
>
> if (ns_u >> 32)
> 	cycle = U64_MASK;
> else
> 	cycle = (ns_u << 32) | (ns_l & U64_LOW_MASK);
>
> I have already tested this code, and it works well. :)

Ugh. I don't think we need to get this complicated (and I'd rather not
spend the time verifying the correctness of this). If for some reason we
did need something like this in some other context (I don't want to see
it just for pw20), I'd be more inclined to see general 128-bit
mult/divide support.

-Scott
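P.S. For the record: on a 64-bit target where the compiler provides
unsigned __int128, a saturating version is only a few lines. This is just
a sketch, not something I'm proposing for the kernel as-is (the helper
name is made up, and 32-bit builds have no __int128, so the kernel would
need a proper generic helper):

#include <stdint.h>

/*
 * Compute (ns * tb_ticks_per_usec) / 1000 with a 128-bit intermediate,
 * saturating to UINT64_MAX instead of wrapping on overflow.
 */
static uint64_t ns_to_tb_ticks(uint64_t ns, uint32_t tb_ticks_per_usec)
{
	unsigned __int128 cycle;

	cycle = (unsigned __int128)ns * tb_ticks_per_usec / 1000;

	/* the product is at most 96 bits, so only the high word can spill */
	return (cycle >> 64) ? UINT64_MAX : (uint64_t)cycle;
}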