Re: [FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-10-06 Thread Michael Niedermayer
On Fri, Oct 06, 2017 at 05:30:57PM +0200, Nicolas Bertrand wrote: > From: Maxime Taisant > > --- > libavcodec/jpeg2000dwt.c | 45 +- > libavcodec/jpeg2000dwt.h |5 + > libavcodec/x86/jpeg2000dsp.asm| 1339 > + > libavcodec/x86/jpeg

Re: [FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-10-06 Thread Carl Eugen Hoyos
2017-10-06 17:30 GMT+02:00 Nicolas Bertrand : > From: Maxime Taisant > > --- > libavcodec/jpeg2000dwt.c | 45 +- > libavcodec/jpeg2000dwt.h |5 + > libavcodec/x86/jpeg2000dsp.asm| 1339 > + > libavcodec/x86/jpeg2000dsp_init.c | 119

[FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-10-06 Thread Nicolas Bertrand
From: Maxime Taisant --- libavcodec/jpeg2000dwt.c | 45 +- libavcodec/jpeg2000dwt.h |5 + libavcodec/x86/jpeg2000dsp.asm| 1339 + libavcodec/x86/jpeg2000dsp_init.c | 119 tests/checkasm/jpeg2000dsp.c |1 + 5 files cha

Re: [FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-08-11 Thread Clément Bœsch
On Fri, Aug 11, 2017 at 06:32:37PM +0300, Ivan Kalvachev wrote: > On 8/10/17, maxime taisant wrote: > >> From: Ivan Kalvachev > >> On 8/8/17, maxime taisant wrote: > >> > From: Maxime Taisant > >> > > >> > Hi, > >> > > >> > Here is some SSE optimisations for the dwt function used to decode > >

Re: [FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-08-11 Thread Ivan Kalvachev
On 8/10/17, maxime taisant wrote: >> From: Ivan Kalvachev >> On 8/8/17, maxime taisant wrote: >> > From: Maxime Taisant >> > >> > Hi, >> > >> > Here is some SSE optimisations for the dwt function used to decode >> > JPEG2000. >> > I tested this code by using the time command while reading a JP

Re: [FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-08-11 Thread maxime taisant
> From: Maxime Taisant > > > From: Ivan Kalvachev > > > > On 8/8/17, maxime taisant wrote: > > > From: Maxime Taisant > > > > > > +movups m2, [lineq+2*j0q-24] > > > +movups m5, [lineq+2*j0q-8] > > > +shufps m2, m5, 0xDD > > > +addps m2, m1 > > > +mulps m2, m3 > > > +su

Re: [FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-08-10 Thread maxime taisant
> From: Clément Bœsch > > On Tue, Aug 08, 2017 at 09:09:44AM +, maxime taisant wrote: > > From: Maxime Taisant > > > > Hi, > > > > Here is some SSE optimisations for the dwt function used to decode > JPEG2000. > > I tested this code by using the time command while reading a > JPEG2000 enco

Re: [FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-08-10 Thread maxime taisant
> From: Ivan Kalvachev > > On 8/8/17, maxime taisant wrote: > > From: Maxime Taisant > > > > Hi, > > > > Here is some SSE optimisations for the dwt function used to decode > JPEG2000. > > I tested this code by using the time command while reading a > JPEG2000 > > encoded video with ffmpeg and,

Re: [FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-08-09 Thread Michael Bradshaw
On Tue, Aug 8, 2017 at 2:09 AM, maxime taisant wrote: > > [...] > +void (*dwt_decode)(DWTContext *s, void *t); Why the global variable? It seems unnecessary, and as Clément pointed out, is unsafe and should not be used in the FFmpeg code base (at least not without a very good justification and s
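
A minimal sketch of the alternative this review comment points toward, assuming the function pointer can live in the per-decoder context rather than in a global. Every name here (`DemoDWTContext`, `demo_dwt_init`, `demo_dwt_decode`, the two `decode_*` routines, the `have_simd` flag) is hypothetical and is not FFmpeg's actual API; the "SIMD" routine is a plain C stand-in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a DWT context; NOT FFmpeg's real DWTContext. */
typedef struct DemoDWTContext {
    int levels;
    /* Per-context function pointer: each decoder instance carries its own
     * selection, so there is no shared mutable global state. */
    void (*decode)(struct DemoDWTContext *s, float *data, size_t n);
} DemoDWTContext;

/* Placeholder "reference" routine: halves every sample. */
static void decode_c(DemoDWTContext *s, float *data, size_t n)
{
    (void)s;
    for (size_t i = 0; i < n; i++)
        data[i] *= 0.5f;
}

/* Placeholder for an optimised routine (plain C here, unrolled by two);
 * it must produce the same output as decode_c. */
static void decode_simd_stub(DemoDWTContext *s, float *data, size_t n)
{
    size_t i = 0;
    (void)s;
    for (; i + 1 < n; i += 2) {
        data[i]     *= 0.5f;
        data[i + 1] *= 0.5f;
    }
    if (i < n)
        data[i] *= 0.5f;
}

/* Select the implementation once, at init time, and store it in the context.
 * have_simd is an assumed caller-supplied flag, not a real CPU-detection call. */
static void demo_dwt_init(DemoDWTContext *s, int levels, int have_simd)
{
    s->levels = levels;
    s->decode = have_simd ? decode_simd_stub : decode_c;
}

/* Dispatch through the context instead of through a global variable. */
static void demo_dwt_decode(DemoDWTContext *s, float *data, size_t n)
{
    assert(s->decode != NULL);
    s->decode(s, data, n);
}
```

With this layout, two contexts initialised with different flags can coexist in one process, possibly on different threads, without overwriting each other's choice of routine, which is the hazard a file-scope `dwt_decode` pointer would introduce.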

Re: [FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-08-08 Thread Clément Bœsch
On Tue, Aug 08, 2017 at 09:09:44AM +, maxime taisant wrote: > From: Maxime Taisant > > Hi, > > Here is some SSE optimisations for the dwt function used to decode JPEG2000. > I tested this code by using the time command while reading a JPEG2000 encoded > video with ffmpeg and, on average, I

Re: [FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-08-08 Thread Ivan Kalvachev
On 8/8/17, maxime taisant wrote: > From: Maxime Taisant > > Hi, > > Here is some SSE optimisations for the dwt function used to decode JPEG2000. > I tested this code by using the time command while reading a JPEG2000 > encoded video with ffmpeg and, on average, I observed a 4.05% general > improv

[FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

2017-08-08 Thread maxime taisant
From: Maxime Taisant Hi, Here is some SSE optimisations for the dwt function used to decode JPEG2000. I tested this code by using the time command while reading a JPEG2000 encoded video with ffmpeg and, on average, I observed a 4.05% general improvement, and a 12.67% improvement on the dwt dec