> From: Maxime Taisant
>
> > From: Ivan Kalvachev
> >
> > On 8/8/17, maxime taisant wrote:
> > > From: Maxime Taisant
> > >
> > > +movups m2, [lineq+2*j0q-24]
> > > +movups m5, [lineq+2*j0q-8]
> > > +
> From: Clément Bœsch
>
> On Tue, Aug 08, 2017 at 09:09:44AM +0000, maxime taisant wrote:
> > From: Maxime Taisant
> >
> > Hi,
> >
> > Here are some SSE optimisations for the dwt function used to decode
> > JPEG2000.
> > I tested this code by u
> From: Ivan Kalvachev
>
> On 8/8/17, maxime taisant wrote:
> > From: Maxime Taisant
> >
> > Hi,
> >
> > Here are some SSE optimisations for the dwt function used to decode
> > JPEG2000.
> > I tested this code by using the time command while readin
From: Maxime Taisant
Hi,
Here are some SSE optimisations for the dwt function used to decode JPEG2000.
I tested this code by using the time command while reading a JPEG2000 encoded
video with ffmpeg and, on average, I observed a 4.05% general improvement, and
a 12.67% improvement on the dwt
From: Maxime Taisant
Hi,
I am currently working on SSE optimisations for the dwt functions used to
decode JPEG2000.
For the moment, I have only managed to produce an SSE-optimized version of the
sr_1d97_float function (with relatively good results).
I would like to have some comments on my