> From: Clément Bœsch <u...@pkh.me>
>
> On Tue, Aug 08, 2017 at 09:09:44AM +0000, maxime taisant wrote:
> > From: Maxime Taisant <maximetais...@hotmail.fr>
> >
> > Hi,
> >
> > Here is some SSE optimisations for the dwt function used to decode
> > JPEG2000.
> > I tested this code by using the time command while reading a JPEG2000
> > encoded video with ffmpeg and, on average, I observed a 4.05% general
> > improvement, and a 12.67% improvement on the dwt decoding part alone.
> > In the nasm code, you can notice that the SR1DFLOAT macro appear
> > twice. One version is called in the nasm code by the HORSD macro and
> > the other is called in the C code of the dwt function, I couldn't
> > figure out a way to make only one macro.
> > I also couldn't figure out a good way to optimize the VER_SD part, so
> > that is why I left it unchanged, with just a SSE-optimized version of
> > the SR_1D_FLOAT function.
> >
> > Regards.
> >
> > ---
> >  libavcodec/jpeg2000dwt.c          |  21 +-
> >  libavcodec/jpeg2000dwt.h          |   6 +
> >  libavcodec/x86/jpeg2000dsp.asm    | 794 ++++++++++++++++++++++++++++++++++++++
> >  libavcodec/x86/jpeg2000dsp_init.c |  55 +++
> >  4 files changed, 863 insertions(+), 13 deletions(-)
> >
> > diff --git a/libavcodec/jpeg2000dwt.c b/libavcodec/jpeg2000dwt.c
> > index 55dd5e89b5..69c935980d 100644
> > --- a/libavcodec/jpeg2000dwt.c
> > +++ b/libavcodec/jpeg2000dwt.c
> > @@ -558,16 +558,19 @@ int ff_jpeg2000_dwt_init(DWTContext *s, int border[2][2],
> >      }
> >      switch (type) {
> >      case FF_DWT97:
> > +        dwt_decode = dwt_decode97_float;
> >          s->f_linebuf = av_malloc_array((maxlen + 12), sizeof(*s->f_linebuf));
> >          if (!s->f_linebuf)
> >              return AVERROR(ENOMEM);
> >          break;
> >      case FF_DWT97_INT:
> > +        dwt_decode = dwt_decode97_int;
> >          s->i_linebuf = av_malloc_array((maxlen + 12), sizeof(*s->i_linebuf));
> >          if (!s->i_linebuf)
> >              return AVERROR(ENOMEM);
> >          break;
> >      case FF_DWT53:
> > +        dwt_decode = dwt_decode53;
> >          s->i_linebuf = av_malloc_array((maxlen + 6), sizeof(*s->i_linebuf));
> >          if (!s->i_linebuf)
> >              return AVERROR(ENOMEM);
>
> Using globals is not acceptable, you need to fix that.
>
Yeah, I can't even remember why I did that... I will fix it. Thank you.
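
For reference, a rough sketch of the direction the fix could take: keep the
selected decode routine in the context instead of in a file-scope variable, so
that two decoders using different transforms cannot overwrite each other's
choice. Everything below is a simplified stand-in, not the actual FFmpeg code:
SketchDWTContext, the `decode` and `last_called` fields, and all sketch_*
functions are hypothetical names made up for illustration.

```c
#include <stddef.h>

/* Hypothetical transform identifiers, standing in for FF_DWT97 etc. */
enum SketchDWTType { SKETCH_DWT97, SKETCH_DWT97_INT, SKETCH_DWT53 };

typedef struct SketchDWTContext SketchDWTContext;

/* Hypothetical, heavily reduced stand-in for DWTContext. */
struct SketchDWTContext {
    int type;
    /* Per-context decode routine; this replaces the global dwt_decode. */
    void (*decode)(SketchDWTContext *s, void *data);
    /* Illustration only: records which routine ran last. */
    int last_called;
};

/* Dummy decode routines; real ones would run the inverse transform. */
static void sketch_decode97_float(SketchDWTContext *s, void *data)
{
    (void)data;
    s->last_called = SKETCH_DWT97;
}

static void sketch_decode97_int(SketchDWTContext *s, void *data)
{
    (void)data;
    s->last_called = SKETCH_DWT97_INT;
}

static void sketch_decode53(SketchDWTContext *s, void *data)
{
    (void)data;
    s->last_called = SKETCH_DWT53;
}

/* Selects the routine once, at init time. Returns 0 on success, -1 otherwise. */
static int sketch_dwt_init(SketchDWTContext *s, int type)
{
    s->type        = type;
    s->last_called = -1;

    switch (type) {
    case SKETCH_DWT97:
        s->decode = sketch_decode97_float;
        break;
    case SKETCH_DWT97_INT:
        s->decode = sketch_decode97_int;
        break;
    case SKETCH_DWT53:
        s->decode = sketch_decode53;
        break;
    default:
        s->decode = NULL;
        return -1;
    }
    return 0;
}

/* Dispatches through the context, so each instance uses its own routine. */
static int sketch_dwt_decode(SketchDWTContext *s, void *data)
{
    if (!s->decode)
        return -1;
    s->decode(s, data);
    return 0;
}
```

With this layout, one context initialised for the 9/7 float transform and
another initialised for 5/3 each keep their own pointer, whatever order they
are set up in. An x86 init function could then override `decode` per context
when SSE is available; that part is left out of the sketch.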