Hi,
On Sat, Oct 10, 2015 at 12:55 PM, Henrik Gramner wrote:
> On Sat, Oct 10, 2015 at 4:36 AM, Ronald S. Bultje wrote:
> > @@ -674,6 +715,24 @@ cglobal vp9_idct_idct_8x8_add_10, 4, 6 + ARCH_X86_64, 10, \
> [...]
> > +shl skipd, 1
> > +lea blockq, [blockq+ski
On Sat, Oct 10, 2015 at 4:36 AM, Ronald S. Bultje wrote:
> @@ -674,6 +715,24 @@ cglobal vp9_idct_idct_8x8_add_10, 4, 6 + ARCH_X86_64, 10, \
[...]
> +shl skipd, 1
> +lea blockq, [blockq+skipq*(mmsize>>1)]
add skipd, skipd
Nit: mmsize/2 is more readable than mmsi
These aren't quite as helpful as the ones in 8bpp, since over there,
we can use pmulhrsw, but here the coefficients have too many bits to
be able to take advantage of pmulhrsw. However, we can still skip
cols for which all coefs are 0, and instead just zero the input data
for the row itx. This help