> It is because mask 0xffffffff is optimized to 0xfffffffc by keeping track
> of non-zero bits in registers and the above code doesn't take that
> into account.

Then I'd suggest modifying that code so that it does rather than
essentially duplicating it.  But I'd recommend running some
performance tests to verify that you're not pessimizing things when
you do that: this stuff can be very tricky and you want to make sure
that you're not converting something like (and X 3) into a bit
extraction unnecessarily.

Reply via email to