On Sat, 2007-10-27 at 15:27 +0200, Christian "Eddie" Dost wrote:
> The sparc64 popc works in O(lg(n))
No, it has a fixed cost, whatever the operand is.
It has another advantage: it does not need any intermediate variable,
which is great when running on a CISC host in the QEMU execution
environment.
The sparc64 popc works in O(lg(n)); the "optimized" code below works in
O(n). It could be better to generalize the sparc64 code, like this:
static always_inline int ctpop32 (uint32_t val)
{
uint32_t i;
i = (val & 0x) + ((val >> 1) & 0x);
i = (i & 0x333
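
The constants in the snippet above did not survive in the archive. For
reference, here is a minimal sketch of the usual parallel bit count,
assuming the standard alternating masks (0x55555555, 0x33333333, and so
on). The name ctpop32_parallel is hypothetical, and this is not
necessarily the exact code from the message:

```c
#include <stdint.h>

/* Parallel population count for a 32-bit word (illustrative sketch).
 * Each step adds neighbouring bit fields, doubling the field width,
 * so a 32-bit word always takes five steps: log2(32). */
static inline int ctpop32_parallel(uint32_t val)
{
    val = (val & 0x55555555u) + ((val >> 1)  & 0x55555555u); /* 2-bit sums  */
    val = (val & 0x33333333u) + ((val >> 2)  & 0x33333333u); /* 4-bit sums  */
    val = (val & 0x0f0f0f0fu) + ((val >> 4)  & 0x0f0f0f0fu); /* 8-bit sums  */
    val = (val & 0x00ff00ffu) + ((val >> 8)  & 0x00ff00ffu); /* 16-bit sums */
    val = (val & 0x0000ffffu) + ((val >> 16) & 0x0000ffffu); /* 32-bit sum  */
    return (int)val;
}
```

The number of steps depends only on the word width, never on the
operand, which matches both descriptions in this thread: logarithmic in
the number of bits, and a fixed cost for any given input.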
On Sat, 2007-10-27 at 16:01 +0300, Blue Swirl wrote:
> On 10/27/07, J. Mayer <[EMAIL PROTECTED]> wrote:
> > I also have optimized versions of bit population count which could also
> > be shared:
> > static always_inline int ctpop32 (uint32_t val)
> > {
> > int i;
> >
> > for (i = 0; val != 0
J. Mayer wrote:
>
> On Sat, 2007-10-27 at 12:19 +0100, Thiemo Seufer wrote:
> > J. Mayer wrote:
> > > The latest patches in clo make gcc 3.4.6 fail to build the mips64
> > > targets on my amd64 host (looks like a register allocation clash in the
> > > optimizer code).
> >
> > Your version is li
On 10/27/07, J. Mayer <[EMAIL PROTECTED]> wrote:
> I also have optimized versions of bit population count which could also
> be shared:
> static always_inline int ctpop32 (uint32_t val)
> {
> int i;
>
> for (i = 0; val != 0; i++)
> val = val & (val - 1);
>
> return i;
> }
>
> If
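
A loop that counts set bits one at a time is normally written with AND
rather than XOR: val & (val - 1) clears the lowest set bit, so the loop
runs once per set bit and terminates, whereas val ^ (val - 1) never
reaches zero for a nonzero input. A minimal sketch follows; the name
ctpop32_loop is hypothetical:

```c
#include <stdint.h>

/* Bit count by repeatedly clearing the lowest set bit (illustrative
 * sketch). The loop body runs once per set bit, so the cost depends
 * on the operand: 0 iterations for 0, up to 32 for 0xffffffff. */
static inline int ctpop32_loop(uint32_t val)
{
    int i;

    for (i = 0; val != 0; i++)
        val &= val - 1;   /* clear the lowest set bit */

    return i;
}
```

This form is cheap for sparse operands, but its worst case is linear in
the word width, which is the O(n) behaviour mentioned elsewhere in the
thread.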
On Sat, 2007-10-27 at 12:19 +0100, Thiemo Seufer wrote:
> J. Mayer wrote:
> > The latest patches in clo make gcc 3.4.6 fail to build the mips64
> > targets on my amd64 host (looks like a register allocation clash in the
> > optimizer code).
>
> Your version is likely faster as well.
>
> > Furt
J. Mayer wrote:
> The latest patches in clo make gcc 3.4.6 fail to build the mips64
> targets on my amd64 host (looks like a register allocation clash in the
> optimizer code).
Your version is likely faster as well.
> Furthermore, the clz micro-op for Mips seems very suspect to me,
> according
The latest patches in clo make gcc 3.4.6 fail to build the mips64
targets on my amd64 host (looks like a register allocation clash in the
optimizer code).
Furthermore, the clz micro-op for MIPS seems very suspect to me,
given the changes made in the clo implementation.
I did change the clz
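
For readers unfamiliar with the two operations: clz counts leading zero
bits and clo counts leading one bits, so clo can be derived from clz by
inverting the operand. The sketch below is a portable illustration only,
not the QEMU micro-op code. The names clz32 and clo32 are hypothetical,
and the result of 32 for a zero operand is an assumption based on the
MIPS32 CLZ definition:

```c
#include <stdint.h>

/* Count leading zeros by binary search (illustrative sketch).
 * Returns 32 when val is zero. */
static inline int clz32(uint32_t val)
{
    int cnt = 0;

    if (val == 0)
        return 32;
    if (!(val & 0xffff0000u)) { cnt += 16; val <<= 16; }
    if (!(val & 0xff000000u)) { cnt += 8;  val <<= 8;  }
    if (!(val & 0xf0000000u)) { cnt += 4;  val <<= 4;  }
    if (!(val & 0xc0000000u)) { cnt += 2;  val <<= 2;  }
    if (!(val & 0x80000000u)) { cnt += 1; }

    return cnt;
}

/* Count leading ones: the leading ones of val are the leading
 * zeros of its bitwise complement. */
static inline int clo32(uint32_t val)
{
    return clz32(~val);
}
```

Defining one operation in terms of the other keeps the two from
drifting apart when only one of them is changed.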