Re: [FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-08-11 Thread maxime taisant
> From: Maxime Taisant > > > From: Ivan Kalvachev > > > > On 8/8/17, maxime taisant wrote: > > > From: Maxime Taisant > > > > > > +movups m2, [lineq+2*j0q-24] > > > +movups m5, [lineq+2*j0q-8] > > > +

Re: [FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-08-10 Thread maxime taisant
> From: Clément Bœsch > > On Tue, Aug 08, 2017 at 09:09:44AM +0000, maxime taisant wrote: > > From: Maxime Taisant > > > > Hi, > > > > Here are some SSE optimisations for the dwt function used to decode JPEG2000. > > I tested this code by u

Re: [FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-08-10 Thread maxime taisant
> From: Ivan Kalvachev > > On 8/8/17, maxime taisant wrote: > > From: Maxime Taisant > > > > Hi, > > > > Here are some SSE optimisations for the dwt function used to decode JPEG2000. > > I tested this code by using the time command while readin

[FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-08-08 Thread maxime taisant
From: Maxime Taisant Hi, Here are some SSE optimisations for the dwt function used to decode JPEG2000. I tested this code by using the time command while reading a JPEG2000-encoded video with ffmpeg and, on average, I observed a 4.05% overall improvement and a 12.67% improvement on the dwt

[FFmpeg-devel] [PATCH][RFC] JPEG2000: SSE optimisation for DWT decoding

2017-07-20 Thread maxime taisant
From: Maxime Taisant Hi, I am currently working on SSE optimisations for the dwt functions used to decode JPEG2000. So far, I have only managed to produce an SSE-optimized version of the sr_1d97_float function (with relatively good results). I would like to have some comments on my