Re: [FFmpeg-devel] [PATCH] avcodec/imc: cast float to int prior to comparing with int variable
On 6/26/17, Paul B Mahol wrote:
> On 6/26/17, Michael Niedermayer wrote:
>> On Mon, Jun 26, 2017 at 12:44:48PM +0200, Paul B Mahol wrote:
>>> From: Kostya Shishkov
>>>
>>> Fixes #3886.
>>>
>>> Signed-off-by: Paul B Mahol
>>> ---
>>>  libavcodec/imc.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> breaks fate-imc
>>
>> [...]
>>
>> --
>> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>>
>> If you think the mosad wants you dead since a long time then you are
>> either wrong or dead since a long time.
>
> Yes I know. Are you willing to add new file based on new output?

Hello?

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] Add wayland support for VAAPI
On 27/06/17 07:38, David Fort wrote:
> Wayland environment became quite popular with gnome 3. This patch adds the
> ability to initialize the VAAPI accelerator from a wayland display.

Is there some specific use-case which needs this? The X11 support mainly exists because of old systems (*cough* Intel Media SDK *cough*) which don't support render nodes, and therefore have to connect to a DRM master device - if X11 is running then it has to use the DRI2 authentication ritual to do that.

Anything recent should be connecting via the render node, which allows it to be used independently of any other system and has clear control over what hardware you are using on systems with multiple possibilities.

Thanks,

- Mark
Re: [FFmpeg-devel] [PATCH 1/3] avcodec/proresenc: switch default prores encoder to prores_ks
On 6/27/17, Michael Niedermayer wrote:
> On Tue, Jun 27, 2017 at 12:20:05AM +0200, Paul B Mahol wrote:
>> On 6/27/17, Michael Niedermayer wrote:
>> > On Mon, Jun 26, 2017 at 11:55:35PM +0200, Paul B Mahol wrote:
>> >> Rationale:
>> >> prores_ks have more features and is faster for qscale > 0
>> >> and gives better quality output.
>> >>
>> >> Signed-off-by: Paul B Mahol
>> >> ---
>> >>  libavcodec/Makefile             |  2 +-
>> >>  libavcodec/proresenc_anatoliy.c | 14 --
>> >>  libavcodec/proresenc_kostya.c   | 24
>> >>  tests/fate/vcodec.mak           |  1 +
>> >>  tests/ref/vsynth/vsynth1-prores |  8
>> >>  tests/ref/vsynth/vsynth2-prores |  8
>> >>  tests/ref/vsynth/vsynth3-prores |  8
>> >>  7 files changed, 38 insertions(+), 27 deletions(-)
>> >>
>> >> diff --git a/libavcodec/Makefile b/libavcodec/Makefile
>> >> index f0cba88..16dce40 100644
>> >> --- a/libavcodec/Makefile
>> >> +++ b/libavcodec/Makefile
>> >> @@ -477,7 +477,7 @@ OBJS-$(CONFIG_PPM_DECODER) += pnmdec.o pnm.o
>> >>  OBJS-$(CONFIG_PPM_ENCODER) += pnmenc.o
>> >>  OBJS-$(CONFIG_PRORES_DECODER) += proresdec2.o proresdsp.o proresdata.o
>> >>  OBJS-$(CONFIG_PRORES_LGPL_DECODER) += proresdec_lgpl.o proresdsp.o proresdata.o
>> >> -OBJS-$(CONFIG_PRORES_ENCODER) += proresenc_anatoliy.o
>> >> +OBJS-$(CONFIG_PRORES_ENCODER) += proresenc_kostya.o proresdata.o
>> >>  OBJS-$(CONFIG_PRORES_AW_ENCODER) += proresenc_anatoliy.o
>> >>  OBJS-$(CONFIG_PRORES_KS_ENCODER) += proresenc_kostya.o proresdata.o
>> >>  OBJS-$(CONFIG_PSD_DECODER) += psd.o
>> >> diff --git a/libavcodec/proresenc_anatoliy.c b/libavcodec/proresenc_anatoliy.c
>> >> index 0516066..7ff6ff7 100644
>> >> --- a/libavcodec/proresenc_anatoliy.c
>> >> +++ b/libavcodec/proresenc_anatoliy.c
>> >> @@ -614,17 +614,3 @@ AVCodec ff_prores_aw_encoder = {
>> >>      .capabilities = AV_CODEC_CAP_FRAME_THREADS | AV_CODEC_CAP_INTRA_ONLY,
>> >>      .profiles = profiles
>> >>  };
>> >> -
>> >> -AVCodec ff_prores_encoder = {
>> >> -    .name = "prores",
>> >> -    .long_name = NULL_IF_CONFIG_SMALL("Apple ProRes"),
>> >> -    .type = AVMEDIA_TYPE_VIDEO,
>> >> -    .id = AV_CODEC_ID_PRORES,
>> >> -    .priv_data_size = sizeof(ProresContext),
>> >> -    .init = prores_encode_init,
>> >> -    .close = prores_encode_close,
>> >> -    .encode2 = prores_encode_frame,
>> >> -    .pix_fmts = (const enum AVPixelFormat[]){AV_PIX_FMT_YUV422P10, AV_PIX_FMT_NONE},
>> >> -    .capabilities = AV_CODEC_CAP_FRAME_THREADS | AV_CODEC_CAP_INTRA_ONLY,
>> >> -    .profiles = profiles
>> >> -};
>> >> diff --git a/libavcodec/proresenc_kostya.c b/libavcodec/proresenc_kostya.c
>> >> index 09bb611..ad979c2 100644
>> >> --- a/libavcodec/proresenc_kostya.c
>> >> +++ b/libavcodec/proresenc_kostya.c
>> >> @@ -1341,6 +1341,13 @@ static const AVClass proresenc_class = {
>> >>      .version = LIBAVUTIL_VERSION_INT,
>> >>  };
>> >>
>> >> +static const AVClass prores_class = {
>> >> +    .class_name = "ProRes",
>> >> +    .item_name = av_default_item_name,
>> >> +    .option = options,
>> >> +    .version = LIBAVUTIL_VERSION_INT,
>> >> +};
>> >> +
>> >>  AVCodec ff_prores_ks_encoder = {
>> >>      .name = "prores_ks",
>> >>      .long_name = NULL_IF_CONFIG_SMALL("Apple ProRes (iCodec Pro)"),
>> >> @@ -1357,3 +1364,20 @@ AVCodec ff_prores_ks_encoder = {
>> >>      },
>> >>      .priv_class = &proresenc_class,
>> >>  };
>> >> +
>> >> +AVCodec ff_prores_encoder = {
>> >> +    .name = "prores",
>> >> +    .long_name = NULL_IF_CONFIG_SMALL("Apple ProRes (iCodec Pro)"),
>> >> +    .type = AVMEDIA_TYPE_VIDEO,
>> >> +    .id = AV_CODEC_ID_PRORES,
>> >> +    .priv_data_size = sizeof(ProresContext),
>> >> +    .init = encode_init,
>> >> +    .close = encode_close,
>> >> +    .encode2 = encode_frame,
>> >> +    .capabilities = AV_CODEC_CAP_SLICE_THREADS | AV_CODEC_CAP_FRAME_THREADS | AV_CODEC_CAP_INTRA_ONLY,
>> >> +    .pix_fmts = (const enum AVPixelFormat[]) {
>> >> +        AV_PIX_FMT_YUV422P10, AV_PIX_FMT_YUV444P10,
>> >> +        AV_PIX_FMT_YUVA444P10, AV_PIX_FMT_NONE
>> >> +    },
>> >> +    .priv_class = &prores_class,
>> >> +};
>> >> diff --git a/tests/fate/vcodec.mak b/tests/fate/vcodec.mak
>> >> index 8c24510..3e92f3c 100644
>> >> --- a/tests/fate/vcodec.mak
>> >> +++ b/tests/fate/vcodec.mak
>> >> @@ -329,6 +329,7 @@ fate-vsynth%-mpng: CODEC = png
>> >>  FATE_VCODEC-$(call ENCDEC, MSVIDEO1, AVI) += msvideo1
>> >>
>> >>  FATE_VCODEC-$(call ENCDEC, PRORES, MOV) += prores prores_ks
>> >> +fate-vsynth%-prores: ENCOPTS = -qscale:v 1
[FFmpeg-devel] [PATCH 1/3] avcodec/s302m: fix AVOption flags
---
 libavcodec/s302m.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/s302m.c b/libavcodec/s302m.c
index a68ac79f2c..4350d97f0a 100644
--- a/libavcodec/s302m.c
+++ b/libavcodec/s302m.c
@@ -201,7 +201,7 @@ static int s302m_decode_frame(AVCodecContext *avctx, void *data,
     return avpkt->size;
 }

-#define FLAGS AV_OPT_FLAG_VIDEO_PARAM|AV_OPT_FLAG_DECODING_PARAM
+#define FLAGS AV_OPT_FLAG_AUDIO_PARAM|AV_OPT_FLAG_DECODING_PARAM
 static const AVOption s302m_options[] = {
     {"non_pcm_mode", "Chooses what to do with NON-PCM", offsetof(S302Context, non_pcm_mode), AV_OPT_TYPE_INT, {.i64 = 3}, 0, 3, FLAGS, "non_pcm_mode"},
     {"copy", "Pass NON-PCM through unchanged" , 0, AV_OPT_TYPE_CONST, {.i64 = 0}, 0, 3, FLAGS, "non_pcm_mode"},
--
2.11.0
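Why the one-word change matters: option flags are a bitmask, and callers can use them to filter which options they look at. A minimal sketch of that idea follows. The flag constants, the `struct option` type and `find_option()` are hypothetical stand-ins for illustration, not FFmpeg's real definitions or lookup code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits for illustration; not FFmpeg's actual values. */
enum {
    OPT_FLAG_DECODING = 1 << 0,
    OPT_FLAG_AUDIO    = 1 << 1,
    OPT_FLAG_VIDEO    = 1 << 2
};

struct option {
    const char *name;
    int flags;
};

/* Return the option called `name` only if it carries every bit in `required`. */
static const struct option *find_option(const struct option *opts, size_t n,
                                        const char *name, int required)
{
    for (size_t i = 0; i < n; i++)
        if (!strcmp(opts[i].name, name) &&
            (opts[i].flags & required) == required)
            return &opts[i];
    return NULL;
}

/* The same option tagged as before and after the patch. */
static const struct option before_fix[] = {
    { "non_pcm_mode", OPT_FLAG_VIDEO | OPT_FLAG_DECODING },
};
static const struct option after_fix[] = {
    { "non_pcm_mode", OPT_FLAG_AUDIO | OPT_FLAG_DECODING },
};
```

With these assumed definitions, a search restricted to audio decoding options misses the entry in `before_fix` and finds the one in `after_fix`.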
[FFmpeg-devel] [PATCH 3/3] avformat: add SMPTE 337M demuxer
---
 Changelog                |   1 +
 doc/general.texi         |   1 +
 libavformat/Makefile     |   1 +
 libavformat/allformats.c |   1 +
 libavformat/s337m.c      | 205 +++
 libavformat/version.h    |   2 +-
 6 files changed, 210 insertions(+), 1 deletion(-)
 create mode 100644 libavformat/s337m.c

diff --git a/Changelog b/Changelog
index 4f46edaddb..c872137792 100644
--- a/Changelog
+++ b/Changelog
@@ -24,6 +24,7 @@ version :
 - roberts video filter
 - The x86 assembler default switched from yasm to nasm, pass
   --x86asmexe=yasm to configure to restore the old behavior.
+- Dolby E decoder and SMPTE 337M demuxer

 version 3.3:
 - CrystalHD decoder moved to new decode API
diff --git a/doc/general.texi b/doc/general.texi
index d95ef31fde..036c8c25d4 100644
--- a/doc/general.texi
+++ b/doc/general.texi
@@ -520,6 +520,7 @@ library:
     @tab Multimedia format used by many games.
 @item SMJPEG @tab X @tab X
     @tab Used in certain Loki game ports.
+@item SMPTE 337M encapsulation @tab @tab X
 @item Smush @tab @tab X
     @tab Multimedia format used in some LucasArts games.
 @item Sony OpenMG (OMA) @tab X @tab X
diff --git a/libavformat/Makefile b/libavformat/Makefile
index 80aeed22c0..b0ef82cdd4 100644
--- a/libavformat/Makefile
+++ b/libavformat/Makefile
@@ -432,6 +432,7 @@ OBJS-$(CONFIG_RTSP_DEMUXER) += rtsp.o rtspdec.o httpauth.o \
                                urldecode.o
 OBJS-$(CONFIG_RTSP_MUXER) += rtsp.o rtspenc.o httpauth.o \
                              urldecode.o
+OBJS-$(CONFIG_S337M_DEMUXER) += s337m.o spdif.o
 OBJS-$(CONFIG_SAMI_DEMUXER) += samidec.o subtitles.o
 OBJS-$(CONFIG_SAP_DEMUXER) += sapdec.o
 OBJS-$(CONFIG_SAP_MUXER) += sapenc.o
diff --git a/libavformat/allformats.c b/libavformat/allformats.c
index a0e2fb8c85..1ebc14231c 100644
--- a/libavformat/allformats.c
+++ b/libavformat/allformats.c
@@ -267,6 +267,7 @@ static void register_all(void)
     REGISTER_MUXDEMUX(RTP, rtp);
     REGISTER_MUXER (RTP_MPEGTS, rtp_mpegts);
     REGISTER_MUXDEMUX(RTSP, rtsp);
+    REGISTER_DEMUXER (S337M, s337m);
     REGISTER_DEMUXER (SAMI, sami);
     REGISTER_MUXDEMUX(SAP, sap);
     REGISTER_DEMUXER (SBG, sbg);
diff --git a/libavformat/s337m.c b/libavformat/s337m.c
new file mode 100644
index 00..3175a9fb87
--- /dev/null
+++ b/libavformat/s337m.c
@@ -0,0 +1,205 @@
+/*
+ * Copyright (C) 2017 foo86
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/intreadwrite.h"
+#include "avformat.h"
+#include "spdif.h"
+
+#define MARKER_16LE 0x72F81F4E
+#define MARKER_20LE 0x20876FF0E154
+#define MARKER_24LE 0x72F8961F4EA5
+
+#define IS_16LE_MARKER(state) ((state & 0x) == MARKER_16LE)
+#define IS_20LE_MARKER(state) ((state & 0xF0F0) == MARKER_20LE)
+#define IS_24LE_MARKER(state) ((state & 0x) == MARKER_24LE)
+#define IS_LE_MARKER(state) (IS_16LE_MARKER(state) || IS_20LE_MARKER(state) || IS_24LE_MARKER(state))
+
+static int s337m_get_offset_and_codec(AVFormatContext *s,
+                                      uint64_t state,
+                                      int data_type, int data_size,
+                                      int *offset, int *codec)
+{
+    int word_bits;
+
+    if (IS_16LE_MARKER(state)) {
+        word_bits = 16;
+    } else if (IS_20LE_MARKER(state)) {
+        data_type >>= 8;
+        data_size >>= 4;
+        word_bits = 20;
+    } else {
+        data_type >>= 8;
+        word_bits = 24;
+    }
+
+    if ((data_type & 0x1F) != 0x1C) {
+        if (s)
+            avpriv_report_missing_feature(s, "Data type %#x in SMPTE 337M", data_type & 0x1F);
+        return AVERROR_PATCHWELCOME;
+    }
+
+    if (codec)
+        *codec = AV_CODEC_ID_DOLBY_E;
+
+    switch (data_size / word_bits) {
+    case 3648:
+        *offset = 1920;
+        break;
+    case 3644:
+        *offset = 2002;
+        break;
+    case 3640:
+        *offset = 2000;
+        break;
+    case 3040:
+        *offset = 1601;
+
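To illustrate how a marker such as MARKER_16LE can be detected, here is a minimal sketch of a byte-wise scan with a rolling 64-bit state. It assumes the demuxer shifts input bytes into the state one at a time; the 32-bit mask `MASK_16LE` and the function `find_16le_marker()` are assumptions made for this example, since the mask constants in the macros quoted above are not legible in this archive.

```c
#include <stddef.h>
#include <stdint.h>

#define MARKER_16LE 0x72F81F4EULL   /* value taken from the patch above */

/* Assumed mask: compare only the low 32 bits of the rolling state. */
#define MASK_16LE   0xFFFFFFFFULL

/*
 * Scan `buf` for the 16-bit little-endian sync marker.
 * Returns the offset of the first byte after the marker, or -1 if the
 * marker does not occur in the buffer.
 */
static long find_16le_marker(const uint8_t *buf, size_t size)
{
    uint64_t state = 0;

    for (size_t i = 0; i < size; i++) {
        /* Shift the next byte into the state, oldest byte on top. */
        state = (state << 8) | buf[i];
        if ((state & MASK_16LE) == MARKER_16LE)
            return (long)(i + 1);
    }
    return -1;
}
```

The same loop could test the 20-bit and 24-bit markers on the same state, each with its own mask, which is presumably why the patch keeps the state in a uint64_t.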
[FFmpeg-devel] [PATCH 2/3] avcodec: add Dolby E decoder
---
 configure               |    1 +
 doc/general.texi        |    1 +
 libavcodec/Makefile     |    1 +
 libavcodec/allcodecs.c  |    1 +
 libavcodec/avcodec.h    |    1 +
 libavcodec/codec_desc.c |    7 +
 libavcodec/dolby_e.c    |  716 +
 libavcodec/dolby_e.h    | 1153 +++
 libavcodec/version.h    |    4 +-
 9 files changed, 1883 insertions(+), 2 deletions(-)
 create mode 100644 libavcodec/dolby_e.c
 create mode 100644 libavcodec/dolby_e.h

diff --git a/configure b/configure
index 6ca919be4a..4a97be23df 100755
--- a/configure
+++ b/configure
@@ -2401,6 +2401,7 @@ dds_decoder_select="texturedsp"
 dirac_decoder_select="dirac_parse dwt golomb videodsp mpegvideoenc"
 dnxhd_decoder_select="blockdsp idctdsp"
 dnxhd_encoder_select="aandcttables blockdsp fdctdsp idctdsp mpegvideoenc pixblockdsp"
+dolby_e_decoder_select="mdct"
 dvvideo_decoder_select="dvprofile idctdsp"
 dvvideo_encoder_select="dvprofile fdctdsp me_cmp pixblockdsp"
 dxa_decoder_select="zlib"
diff --git a/doc/general.texi b/doc/general.texi
index 8f582d586f..d95ef31fde 100644
--- a/doc/general.texi
+++ b/doc/general.texi
@@ -1001,6 +1001,7 @@ following image formats are supported:
     @tab All versions except 5.1 are supported.
 @item DCA (DTS Coherent Acoustics) @tab X @tab X
     @tab supported extensions: XCh, XXCH, X96, XBR, XLL, LBR (partially)
+@item Dolby E @tab @tab X
 @item DPCM id RoQ @tab X @tab X
     @tab Used in Quake III, Jedi Knight 2 and other computer games.
 @item DPCM Interplay @tab @tab X
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index f0cba8843d..e12878de0d 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -252,6 +252,7 @@ OBJS-$(CONFIG_DIRAC_DECODER) += diracdec.o dirac.o diracdsp.o diractab
 OBJS-$(CONFIG_DFA_DECODER) += dfa.o
 OBJS-$(CONFIG_DNXHD_DECODER) += dnxhddec.o dnxhddata.o
 OBJS-$(CONFIG_DNXHD_ENCODER) += dnxhdenc.o dnxhddata.o
+OBJS-$(CONFIG_DOLBY_E_DECODER) += dolby_e.o
 OBJS-$(CONFIG_DPX_DECODER) += dpx.o
 OBJS-$(CONFIG_DPX_ENCODER) += dpxenc.o
 OBJS-$(CONFIG_DSD_LSBF_DECODER) += dsddec.o dsd.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index 54a9e8c42e..4101d340dd 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -417,6 +417,7 @@ static void register_all(void)
     REGISTER_DECODER(BMV_AUDIO, bmv_audio);
     REGISTER_DECODER(COOK, cook);
     REGISTER_ENCDEC (DCA, dca);
+    REGISTER_DECODER(DOLBY_E, dolby_e);
     REGISTER_DECODER(DSD_LSBF, dsd_lsbf);
     REGISTER_DECODER(DSD_MSBF, dsd_msbf);
     REGISTER_DECODER(DSD_LSBF_PLANAR, dsd_lsbf_planar);
diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index b697afa0ae..ca1dd54eb0 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -622,6 +622,7 @@ enum AVCodecID {
     AV_CODEC_ID_PAF_AUDIO,
     AV_CODEC_ID_ON2AVC,
     AV_CODEC_ID_DSS_SP,
+    AV_CODEC_ID_DOLBY_E,

     AV_CODEC_ID_FFWAVESYNTH = 0x15800,
     AV_CODEC_ID_SONIC,
diff --git a/libavcodec/codec_desc.c b/libavcodec/codec_desc.c
index cf1246e431..26c03e9df6 100644
--- a/libavcodec/codec_desc.c
+++ b/libavcodec/codec_desc.c
@@ -2671,6 +2671,13 @@ static const AVCodecDescriptor codec_descriptors[] = {
         .props = AV_CODEC_PROP_LOSSY,
     },
     {
+        .id = AV_CODEC_ID_DOLBY_E,
+        .type = AVMEDIA_TYPE_AUDIO,
+        .name = "dolby_e",
+        .long_name = NULL_IF_CONFIG_SMALL("Dolby E"),
+        .props = AV_CODEC_PROP_LOSSY,
+    },
+    {
         .id = AV_CODEC_ID_G729,
         .type = AVMEDIA_TYPE_AUDIO,
         .name = "g729",
diff --git a/libavcodec/dolby_e.c b/libavcodec/dolby_e.c
new file mode 100644
index 00..ba1fe1c5d0
--- /dev/null
+++ b/libavcodec/dolby_e.c
@@ -0,0 +1,716 @@
+/*
+ * Copyright (C) 2017 foo86
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/mem.h"
+#include "libavutil/float_dsp.h"
+#include "internal.h"
+#include "get_bits.h"
+#include "put_bits.h"
+#include "fft.h"
+#include "dolby_e.h"
+
+static float mantissa_tab1[17][4];
+stat
Re: [FFmpeg-devel] [PATCH 1/3] avcodec/s302m: fix AVOption flags
On 6/27/17, foo86 wrote:
> ---
>  libavcodec/s302m.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

lgtm
Re: [FFmpeg-devel] [PATCH] avcodec/ffv1enc: Allow less than 2 rows of slices for low vertical resolution
On Mon, Jun 26, 2017 at 05:33:02PM +0200, Paul B Mahol wrote:
> On 6/26/17, Michael Niedermayer wrote:
> > Fixes: Ticket5548
> >
> > Signed-off-by: Michael Niedermayer
> > ---
> >  libavcodec/ffv1enc.c | 4
> >  1 file changed, 4 insertions(+)
>
> LGTM

applied

> Could this calculation be simplified? (Even with different results)
> Just few operations without if/else?

tried to simplify it

thx

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Modern terrorism, a quick summary: Need oil, start war with country that
has oil, kill hundred thousand in war. Let country fall into chaos,
be surprised about rise of fundamentalists. Drop more bombs, kill more
people, be surprised about them taking revenge and drop even more bombs
and strip your own citizens of their rights and freedoms. to be continued
Re: [FFmpeg-devel] [PATCH] Add wayland support for VAAPI
On 27/06/2017 at 09:54, Mark Thompson wrote:
> On 27/06/17 07:38, David Fort wrote:
>> Wayland environment became quite popular with gnome 3. This patch adds the
>> ability to initialize the VAAPI accelerator from a wayland display.
>
> Is there some specific use-case which needs this? The X11 support mainly
> exists because of old systems (*cough* Intel Media SDK *cough*) which don't
> support render nodes, and therefore have to connect to a DRM master device -
> if X11 is running then it has to use the DRI2 authentication ritual to do
> that.
>
> Anything recent should be connecting via the render node, which allows it to
> be used independently of any other system and has clear control over what
> hardware you are using on systems with multiple possibilities.
>

Hi,

my final goal is to have an h264 video rendered directly in a surface (using ffmpeg as a library). If render-node surfaces can be displayed on the screen then I'm fine with that and will forget about this patch.

I did some tests and had the impression that rendering through a DRI render node was slower than when the VADisplay was grabbed from wayland (F25 with gnome3). But it may be just an impression or a test bias.

Best regards.

--
David FORT
website: http://www.hardening-consulting.com/
[FFmpeg-devel] [PATCH] avcodec/wavpack: Fix invalid shift
Fixes: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
Fixes: 2377/clusterfuzz-testcase-minimized-6108505935183872

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer
---
 libavcodec/wavpack.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavcodec/wavpack.c b/libavcodec/wavpack.c
index bc4b030425..a117e8aa81 100644
--- a/libavcodec/wavpack.c
+++ b/libavcodec/wavpack.c
@@ -846,9 +846,9 @@ static int wavpack_decode_block(AVCodecContext *avctx, int block_no,
                 continue;
             }
             bytestream2_get_buffer(&gb, val, 4);
-            if (val[0] > 31) {
+            if (val[0] > 30) {
                 av_log(avctx, AV_LOG_ERROR,
-                       "Invalid INT32INFO, extra_bits = %d (> 32)\n", val[0]);
+                       "Invalid INT32INFO, extra_bits = %d (> 30)\n", val[0]);
                 continue;
             } else if (val[0]) {
                 s->extra_bits = val[0];
--
2.13.0
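For background on the sanitizer message: in C, shifting the int constant 1 left by 31 places overflows a 32-bit signed int, which is undefined behavior. The patch avoids it by rejecting the offending input value. The sketch below shows the range check next to the other common remedy, shifting an unsigned operand. The helper names `bit32()` and `extra_bits_in_range()` are made up for this example, and the code does not claim to reproduce the expression in wavpack.c that actually performs the shift.

```c
#include <stdint.h>

/*
 * Return a value with only bit `n` set, for 0 <= n <= 31.
 * The shift is done on an unsigned 32-bit operand, so n == 31 is well
 * defined; `1 << 31` on a signed 32-bit int is what the sanitizer flagged.
 */
static uint32_t bit32(unsigned n)
{
    return UINT32_C(1) << n;
}

/* Same range check as the patched decoder: 0..30 accepted, larger rejected. */
static int extra_bits_in_range(unsigned extra_bits)
{
    return extra_bits <= 30;
}
```

Rejecting the value up front, as the patch does, has the advantage that every later use of extra_bits stays within a range where a plain int shift is safe.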
Re: [FFmpeg-devel] [PATCH] avcodec/imc: cast float to int prior to comparing with int variable
On Tue, Jun 27, 2017 at 09:45:38AM +0200, Paul B Mahol wrote:
> On 6/26/17, Paul B Mahol wrote:
> > On 6/26/17, Michael Niedermayer wrote:
> >> On Mon, Jun 26, 2017 at 12:44:48PM +0200, Paul B Mahol wrote:
> >>> From: Kostya Shishkov
> >>>
> >>> Fixes #3886.
> >>>
> >>> Signed-off-by: Paul B Mahol
> >>> ---
> >>>  libavcodec/imc.c | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> breaks fate-imc
> >>
> >> [...]
> >>
> >> --
> >> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> >>
> >> If you think the mosad wants you dead since a long time then you are
> >> either wrong or dead since a long time.
> >
> > Yes I know. Are you willing to add new file based on new output?
>
> Hello?

you lack patience

uploaded to imc/imc-201706.pcm
but it still needs 24h to propagate to all fate clients

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Republics decline into democracies and democracies degenerate into
despotisms. -- Aristotle
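The diff itself is not quoted in this thread, so the following is only a generic sketch of why the cast named in the subject line can change results (and with them the FATE reference output). When an int is compared with a float, C converts the int to float first; casting the float to int instead truncates the fraction and compares integers. Both function names are made up for the example.

```c
/*
 * Compare the way `i == f` does: the usual arithmetic conversions turn
 * the int into a float before the comparison.
 */
static int equal_as_float(int i, float f)
{
    float fi = (float)i;
    return fi == f;
}

/*
 * Truncate the float toward zero first, then compare as integers.
 * Assumes the value of `f` fits in an int.
 */
static int equal_as_int(int i, float f)
{
    return i == (int)f;
}
```

For example, with i = 3 and f = 3.7f the float comparison is false while the integer comparison is true, so code paths guarded by such a test can flip after the patch.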
Re: [FFmpeg-devel] [PATCH 2/3] avcodec: add Dolby E decoder
On 6/27/2017 10:59 AM, foo86 wrote:
> +    init_tables();

This should be run under ff_thread_once(), and the thread-safe init flag should be added to the internal caps.

- Derek
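The pattern Derek asks for is one-time initialization of shared static tables, so that several decoder instances opened concurrently do not race while filling them. A minimal sketch of that pattern using POSIX pthread_once() is shown below. It is not the Dolby E decoder's code: the table contents, the `init_calls` counter and `decoder_init()` are placeholders invented for illustration.

```c
#include <pthread.h>

static float table[16];                     /* placeholder static table */
static int init_calls;                      /* counts how often init ran */
static pthread_once_t init_once = PTHREAD_ONCE_INIT;

/* Hypothetical stand-in for the decoder's init_tables(). */
static void init_tables(void)
{
    for (int i = 0; i < 16; i++)
        table[i] = i * 0.5f;
    init_calls++;
}

/* Called from every decoder instance's init; the tables are built once. */
static int decoder_init(void)
{
    return pthread_once(&init_once, init_tables);
}
```

However many times decoder_init() is called, and from however many threads, init_tables() runs exactly once and all callers see the finished tables afterwards.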
Re: [FFmpeg-devel] [PATCH] Add FITS Decoder
On 6/20/17, Paras Chadha wrote:
> Above changes done. Also fixed an issue with BLANK keyword.
>
> Signed-off-by: Paras Chadha
> ---
>  Changelog               |   1 +
>  doc/general.texi        |   2 +
>  libavcodec/Makefile     |   1 +
>  libavcodec/allcodecs.c  |   1 +
>  libavcodec/avcodec.h    |   1 +
>  libavcodec/codec_desc.c |   8 +
>  libavcodec/fitsdec.c    | 517
>  libavcodec/version.h    |   2 +-
>  libavformat/img2.c      |   1 +
>  9 files changed, 533 insertions(+), 1 deletion(-)
>  create mode 100644 libavcodec/fitsdec.c
>

What about all the metadata available in FITS files? Please export it as frame metadata too.
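As a rough illustration of what exporting the header could involve: a FITS header is a sequence of 80-character cards, with the keyword in columns 1-8 and, for cards that carry a value, "= " in columns 9-10 followed by the value and an optional "/" comment. The sketch below splits one card into a key and a raw value string. `parse_fits_card()` is a hypothetical helper, not part of the submitted decoder, and it is deliberately simplified: quoted string values containing '/' and continuation cards are not handled.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split one FITS header card into keyword and raw value text.
 * Returns 1 if the card carries a value, 0 otherwise (e.g. END, COMMENT).
 */
static int parse_fits_card(const char *card, char key[9], char value[71])
{
    size_t len = strlen(card);
    size_t n;
    const char *v, *end;

    if (len < 10 || card[8] != '=' || card[9] != ' ')
        return 0;                   /* no value indicator in columns 9-10 */

    /* Keyword: columns 1-8, trailing blanks stripped. */
    n = 8;
    while (n > 0 && card[n - 1] == ' ')
        n--;
    memcpy(key, card, n);
    key[n] = '\0';

    /* Value: from column 11 up to an optional '/' comment, blanks trimmed. */
    v = card + 10;
    end = strchr(v, '/');
    if (!end)
        end = card + len;
    while (v < end && *v == ' ')
        v++;
    while (end > v && end[-1] == ' ')
        end--;
    n = (size_t)(end - v);
    if (n > 70)
        n = 70;
    memcpy(value, v, n);
    value[n] = '\0';
    return 1;
}
```

In a decoder, each key/value pair obtained this way could presumably be attached to the output frame's metadata dictionary; how the FITS decoder should do that is for the patch author to decide.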
Re: [FFmpeg-devel] Multiprogram mode for mpeg TS
On Mon, Jun 26, 2017 at 12:51:08PM +0300, ffm...@a.legko.ru wrote:
>
> On Mon, 26 Jun 2017, Hendrik Leppkes wrote:
>
> >Individual changes should be submitted as separate patches. Having
> >several unrelated changes in one patch makes it impossible to apply or
> >even properly review it.
> all changes are related:
>
> multiprog cannot exist when some of prog goes away (pthread ends in
> ffmpeg.c) (this touches stream looping too).
>
> options MUST be set in ffmpeg_ops.c to make it possible to name and
> separate each prog in TS. thus ops.c is corrected.
>
> mpegtsenc.c - all changes above used here.
>
> applying patch to the latest git causes no trouble, as I can see..

libavformat has an API and ABI, and ffmpeg and others are using it. If you change libavformat, you need to update the version of libavformat and ensure the changes do not break ABI/API.

Changes to an application using libavformat do not belong in the same patch that changes the lib. There are many applications using libavformat, both in our git and in many other git repositories from other projects.

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Re: [FFmpeg-devel] [PATCH 10/11] avcodec/x86: add an 8-bit simple IDCT function based on the x86-64 high depth functions
On Mon, Jun 26, 2017 at 02:20:03PM +0200, James Darnley wrote:
> On 2017-06-25 21:27, Michael Niedermayer wrote:
> > On Sat, Jun 24, 2017 at 06:30:26PM -0400, Ronald S. Bultje wrote:
> >> Hi,
> >>
> >> On Sat, Jun 24, 2017 at 3:27 PM, Michael Niedermayer wrote:
> >>
> >>> This patch changes the default IDCT on x86(64), which is intended IIUC
> >>> It also changes the IDCT when simplemmx is set
> >>>
> >>> but after this patch, simplemmx on x86-32 does not produce the same
> >>> result as simplemmx on x86-64.
> >>>
> >>> I am not sure, but
> >>> maybe the changed code should enable on FF_IDCT_SIMPLE instead of
> >>> FF_IDCT_SIMPLEMMX ?
> >>> what's your opinion on this ?
> >>> the next patch would add FF_IDCT_SIMPLE but it also leaves
> >>> FF_IDCT_SIMPLEMMX
> >>
> >>
> >> That's a good point, I also considered that question (not so much the
> >> 32bit vs. 64bit, but the mmx vs. sse2). The question is basically what
> >> simplemmx means. Is it the exact binary result of the mmx function? Or is
> >> it a way of saying "almost simple, but with some rounding diffs because
> >> mmx"?
> >>
> >> If the second, then simple is a superset of simplemmx. If the first, then
> >> we should remove simplemmx from the list of "supported" idcts for the
> >> sse2/avx functions. I have no preference (I assumed it meant the first),
> >> but if you'd prefer to use the second meaning, then that's an easy
> >> modification to make and it won't practically have any impact for most use
> >> cases I think...
> >
> > I didn't think about meaning, rather more about practice.
> > if someone reports any issue using "simplemmx" and bitexact and
> > that fails to be reproduced it could be confusing.
> > This is especially plausible when the bug is not idct rounding but
> > a bug in a later stage just triggered by specific output from the idct
> >
> > also potential future fate tests of simplemmx or other simd idcts
> > require there to be a way to select a specific idct output
> >
> > no strong opinion on this ...
>
> I admit I haven't considered whether I should be using this with
> simplemmx. I could change the code so that the new code isn't used for it.
>
> If simplemmx is supposed to be its own algorithm available to anyone who
> might wish to use it then I think that an error should occur when MMX is
> not available.

yes, that would make sense

>
> Since the current behaviour is to have simple as the catch-all fallback
> I will leave the code as is. auto, simpleauto, simplemmx, and simple
> will now all use the new code.
>
> We can discuss these points all you want but I intend to push the
> remaining 3 patches Soon(TM). I have still not tried Gramner's
> suggestion so you have some time to object and block.

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Those who are best at talking, realize last or never when they are wrong.
[FFmpeg-devel] [PATCH 1/2] x86/vf_blend: add sse and ssse3 extremity functions
Signed-off-by: James Almer
---
 libavfilter/x86/vf_blend.asm    | 25 +++++++++++++++++++++++++
 libavfilter/x86/vf_blend_init.c |  4 ++++
 tests/checkasm/vf_blend.c       |  1 +
 3 files changed, 30 insertions(+)

diff --git a/libavfilter/x86/vf_blend.asm b/libavfilter/x86/vf_blend.asm
index 33b1ad1496..25f6f5affc 100644
--- a/libavfilter/x86/vf_blend.asm
+++ b/libavfilter/x86/vf_blend.asm
@@ -286,6 +286,31 @@ BLEND_INIT difference, 3
     jl .loop
 BLEND_END
 
+BLEND_INIT extremity, 8
+    pxor       m2, m2
+    mova       m4, [pw_255]
+.nextrow:
+    mov        xq, widthq
+
+    .loop:
+        movu            m0, [topq + xq]
+        movu            m1, [bottomq + xq]
+        punpckhbw       m5, m0, m2
+        punpcklbw       m0, m2
+        punpckhbw       m6, m1, m2
+        punpcklbw       m1, m2
+        psubw           m3, m4, m0
+        psubw           m7, m4, m5
+        psubw           m3, m1
+        psubw           m7, m6
+        ABS1            m3, m1
+        ABS1            m7, m6
+        packuswb        m3, m7
+        mova    [dstq + xq], m3
+        add             xq, mmsize
+    jl .loop
+BLEND_END
+
 BLEND_INIT negation, 5
     pxor       m2, m2
     mova       m4, [pw_255]
diff --git a/libavfilter/x86/vf_blend_init.c b/libavfilter/x86/vf_blend_init.c
index 96fe3d8baa..71f9b0a685 100644
--- a/libavfilter/x86/vf_blend_init.c
+++ b/libavfilter/x86/vf_blend_init.c
@@ -47,6 +47,8 @@ BLEND_FUNC(subtract, sse2)
 BLEND_FUNC(xor, sse2)
 BLEND_FUNC(difference, sse2)
 BLEND_FUNC(difference, ssse3)
+BLEND_FUNC(extremity, sse2)
+BLEND_FUNC(extremity, ssse3)
 BLEND_FUNC(negation, sse2)
 BLEND_FUNC(negation, ssse3)
 
@@ -72,12 +74,14 @@ av_cold void ff_blend_init_x86(FilterParams *param, int is_16bit)
         case BLEND_SUBTRACT: param->blend = ff_blend_subtract_sse2; break;
         case BLEND_XOR: param->blend = ff_blend_xor_sse2; break;
         case BLEND_DIFFERENCE: param->blend = ff_blend_difference_sse2; break;
+        case BLEND_EXTREMITY: param->blend = ff_blend_extremity_sse2; break;
         case BLEND_NEGATION: param->blend = ff_blend_negation_sse2; break;
         }
     }
     if (EXTERNAL_SSSE3(cpu_flags) && param->opacity == 1 && !is_16bit) {
         switch (param->mode) {
         case BLEND_DIFFERENCE: param->blend = ff_blend_difference_ssse3; break;
+        case BLEND_EXTREMITY: param->blend = ff_blend_extremity_ssse3; break;
         case BLEND_NEGATION: param->blend = ff_blend_negation_ssse3; break;
         }
     }
diff --git a/tests/checkasm/vf_blend.c b/tests/checkasm/vf_blend.c
index aa568c0de0..4e018ac69e 100644
--- a/tests/checkasm/vf_blend.c
+++ b/tests/checkasm/vf_blend.c
@@ -117,6 +117,7 @@ void checkasm_check_blend(void)
     check_and_report(subtract, BLEND_SUBTRACT)
     check_and_report(xor, BLEND_XOR)
     check_and_report(difference, BLEND_DIFFERENCE)
+    check_and_report(extremity, BLEND_EXTREMITY)
     check_and_report(negation, BLEND_NEGATION)
 
     report("8bit");
--
2.13.0
[FFmpeg-devel] [PATCH 2/2] x86/vf_blend: optimize difference and negation functions
Process more pixels per loop.

Signed-off-by: James Almer
---
 libavfilter/x86/vf_blend.asm | 40 ++++++++++++++++++++++++----------------
 1 file changed, 24 insertions(+), 16 deletions(-)

diff --git a/libavfilter/x86/vf_blend.asm b/libavfilter/x86/vf_blend.asm
index 25f6f5affc..d5e512e6e0 100644
--- a/libavfilter/x86/vf_blend.asm
+++ b/libavfilter/x86/vf_blend.asm
@@ -268,21 +268,25 @@ BLEND_INIT phoenix, 4
 BLEND_END
 
 %macro BLEND_ABS 0
-BLEND_INIT difference, 3
+BLEND_INIT difference, 5
     pxor       m2, m2
 .nextrow:
     mov        xq, widthq
 
     .loop:
-        movh            m0, [topq + xq]
-        movh            m1, [bottomq + xq]
+        movu            m0, [topq + xq]
+        movu            m1, [bottomq + xq]
+        punpckhbw       m3, m0, m2
         punpcklbw       m0, m2
+        punpckhbw       m4, m1, m2
         punpcklbw       m1, m2
         psubw           m0, m1
+        psubw           m3, m4
         ABS1            m0, m1
-        packuswb        m0, m0
-        movh    [dstq + xq], m0
-        add             xq, mmsize / 2
+        ABS1            m3, m4
+        packuswb        m0, m3
+        mova    [dstq + xq], m0
+        add             xq, mmsize
     jl .loop
 BLEND_END
 
@@ -311,26 +315,30 @@ BLEND_INIT extremity, 8
     jl .loop
 BLEND_END
 
-BLEND_INIT negation, 5
+BLEND_INIT negation, 8
     pxor       m2, m2
     mova       m4, [pw_255]
 .nextrow:
     mov        xq, widthq
 
     .loop:
-        movh            m0, [topq + xq]
-        movh            m1, [bottomq + xq]
+        movu            m0, [topq + xq]
+        movu            m1, [bottomq + xq]
+        punpckhbw       m5, m0, m2
         punpcklbw       m0, m2
+        punpckhbw       m6, m1, m2
         punpcklbw       m1, m2
-        mova            m3, m4
-        psubw           m3, m0
+        psubw           m3, m4, m0
+        psubw           m7, m4, m5
         psubw           m3, m1
+        psubw           m7, m6
         ABS1            m3, m1
-        mova            m0, m4
-        psubw           m0, m3
-        packuswb        m0, m0
-        movh    [dstq + xq], m0
-        add             xq, mmsize / 2
+        ABS1            m7, m1
+        psubw           m0, m4, m3
+        psubw           m1, m4, m7
+        packuswb        m0, m1
+        mova    [dstq + xq], m0
+        add             xq, mmsize
     jl .loop
 BLEND_END
 %endmacro
--
2.13.0
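For readers following the assembly, the per-pixel arithmetic of the three 8-bit modes touched by this series can be sketched in scalar C. This is only a reading of the SIMD code above (unpack to 16 bits, subtract, absolute value, pack back), not the filter's actual C fallback; the ref_* names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar references, derived from reading the asm above. */

/* difference: |top - bottom| */
static uint8_t ref_difference(uint8_t top, uint8_t bottom)
{
    return (uint8_t)abs(top - bottom);
}

/* extremity: |255 - top - bottom| (always within 0..255 for 8-bit input) */
static uint8_t ref_extremity(uint8_t top, uint8_t bottom)
{
    return (uint8_t)abs(255 - top - bottom);
}

/* negation: 255 - |255 - top - bottom| */
static uint8_t ref_negation(uint8_t top, uint8_t bottom)
{
    return (uint8_t)(255 - abs(255 - top - bottom));
}

/* Apply one of the helpers above to a row of `width` pixels. */
static void ref_blend_row(uint8_t *dst, const uint8_t *top,
                          const uint8_t *bottom, size_t width,
                          uint8_t (*op)(uint8_t, uint8_t))
{
    assert(dst && top && bottom && op);
    for (size_t x = 0; x < width; x++)
        dst[x] = op(top[x], bottom[x]);
}
```

The SIMD versions compute the same values for a full register of pixels per iteration, which is why the patched loops advance xq by mmsize instead of mmsize / 2.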
Re: [FFmpeg-devel] [PATCH 1/2] x86/vf_blend: add sse and ssse3 extremity functions
On 6/27/17, James Almer wrote:
> Signed-off-by: James Almer
> ---
>  libavfilter/x86/vf_blend.asm    | 25 +++++++++++++++++++++++++
>  libavfilter/x86/vf_blend_init.c |  4 ++++
>  tests/checkasm/vf_blend.c       |  1 +
>  3 files changed, 30 insertions(+)
>

LGTM, I have a couple more blend modes to add which might not be SIMDable.
Re: [FFmpeg-devel] [PATCH 2/2] x86/vf_blend: optimize difference and negation functions
On 6/27/17, James Almer wrote:
> Process more pixels per loop.
>
> Signed-off-by: James Almer
> ---
>  libavfilter/x86/vf_blend.asm | 40 ++++++++++++++++++++++++----------------
>  1 file changed, 24 insertions(+), 16 deletions(-)
>

LGTM
Re: [FFmpeg-devel] [PATCH] x86inc: don't use read-only data sections on COFF targets
On 6/27/2017 3:54 AM, Clément Bœsch wrote:
> On Mon, Jun 26, 2017 at 12:32:15AM -0300, James Almer wrote:
>> Yasm:
>> src/libavfilter/x86/af_volume.asm:24: warning: Standard COFF does not support read-only data sections
>> src/libavfilter/x86/af_volume.asm:24: warning: Unrecognized qualifier `align'
>>
>> Nasm:
>> src/libavfilter/x86/af_volume.asm:24: error: standard COFF does not support section alignment specification
>> src/libavutil/x86/x86inc.asm:92: ... from macro `SECTION_RODATA' defined here
>>
>> Signed-off-by: James Almer
>> ---
>> Untested.
>>
>>  libavutil/x86/x86inc.asm | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
>> index fa826e6d85..c4ec29bd9d 100644
>> --- a/libavutil/x86/x86inc.asm
>> +++ b/libavutil/x86/x86inc.asm
>> @@ -88,6 +88,8 @@
>>  %macro SECTION_RODATA 0-1 16
>>      %ifidn __OUTPUT_FORMAT__,aout
>>          section .text
>> +    %elifidn __OUTPUT_FORMAT__,coff
>> +        section .text
>>      %else
>>          SECTION .rodata align=%1
>>      %endif
>
>
> I can confirm it fixes the compilation on DJGPP FATE instance.

Pushed, thanks.

>
> Side note: I just noticed the object dependencies are broken with nasm
> (typically, x86inc.asm is not present in libavfilter/x86/af_volume.d).

Huh, you're right. It works as expected if you use -M -MF instead of -MD,
but that defeats the point of using the latter, which was to create the
dep file alongside the assembly in a single nasm call.
Re: [FFmpeg-devel] [PATCH] Add wayland support for VAAPI
On 27/06/17 13:13, Hardening wrote:
> On 27/06/2017 at 09:54, Mark Thompson wrote:
>> On 27/06/17 07:38, David Fort wrote:
>>> Wayland environment became quite popular with gnome 3. This patch adds the
>>> ability to
>>> initialize the VAAPI accelerator from a wayland display.
>>
>> Is there some specific use-case which needs this? The X11 support mainly
>> exists because of old systems (*cough* Intel Media SDK *cough*) which don't
>> support render nodes, and therefore have to connect to a DRM master device -
>> if X11 is running then it has to use the DRI2 authentication ritual to do
>> that.
>>
>> Anything recent should be connecting via the render node, which allows it to
>> be used independently of any other system and has clear control over what
>> hardware you are using on systems with multiple possibilities.
>>
>
> Hi,
>
> my final goal is to have a h264 video rendered directly in a surface
> (using ffmpeg as a library).

Note that the hardware decoders are not in general able to render to a
surface which can go directly to scanout. You'll always want some
intermediate here - typically this is done with OpenGL (see video players,
e.g. mpv), though you can also do it inside VAAPI by using the VPP mechanism.

With ffmpeg + Wayland only, it should be doable by using the decoder along
with the scale_vaapi filter (really doing format conversion, though as the
name suggests it can scale as well) - allow the decoder to allocate its
frames internally, but supply your own VAAPI surfaces as the output of the
filter, made from the Wayland surfaces using DRM PRIME sharing.

> If render nodes surfaces can be displayed
> on the screen then I'm fine with it and forget about that patch.

Yes - surfaces are not distinguished by the context they come from, only by
the actual underlying hardware device they exist on and what format they are.

> I did some tests and I had the impression that when the rendering was
> done with a DRI render node, it was slower than when the VADisplay was
> grabbed from wayland (F25 with gnome3). But it may be just an impression
> or a test bias.

That should be identical. The mechanism used by Wayland (and X11 DRI2) is
to ask the server which DRM device node it should open, receiving a string
path and a cookie in return. It opens the node and authenticates with the
cookie if necessary (i.e. if anything more than rendering is required, which
it isn't for codec stuff). (X11 DRI3 is different - that actually gives you
a file descriptor to the DRM device directly.)

I'm not really against the patch, but it adds a new Wayland dependency to
libavutil which is probably undesirable in general.

Thanks,

- Mark
Re: [FFmpeg-devel] [PATCH 1/2] x86/vf_blend: add sse and ssse3 extremity functions
On 6/27/2017 12:50 PM, Paul B Mahol wrote:
> On 6/27/17, James Almer wrote:
>> Signed-off-by: James Almer
>> ---
>>  libavfilter/x86/vf_blend.asm    | 25 +++++++++++++++++++++++++
>>  libavfilter/x86/vf_blend_init.c |  4 ++++
>>  tests/checkasm/vf_blend.c       |  1 +
>>  3 files changed, 30 insertions(+)
>>
>
> LGTM, I have a couple more blend modes to add which might not be
> SIMDable.

Pushed.
Re: [FFmpeg-devel] [PATCH 2/2] x86/vf_blend: optimize difference and negation functions
On 6/27/2017 12:50 PM, Paul B Mahol wrote:
> On 6/27/17, James Almer wrote:
>> Process more pixels per loop.
>>
>> Signed-off-by: James Almer
>> ---
>>  libavfilter/x86/vf_blend.asm | 40 ++++++++++++++++++++++++----------------
>>  1 file changed, 24 insertions(+), 16 deletions(-)
>>
>
> LGTM

Pushed, thanks.
[FFmpeg-devel] [PATCH 1/5] avcodec/utvideodec: Move bitstream end check out of inner loop
This is not needed when the buffer is large enough for the worst case of a line

2% faster vlc reading

Signed-off-by: Michael Niedermayer
---
 libavcodec/utvideodec.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/libavcodec/utvideodec.c b/libavcodec/utvideodec.c
index 44841aaa65..815b71cfb6 100644
--- a/libavcodec/utvideodec.c
+++ b/libavcodec/utvideodec.c
@@ -196,11 +196,6 @@ static int decode_plane10(UtvideoContext *c, int plane_no,
         prev = 0x200;
         for (j = sstart; j < send; j++) {
             for (i = 0; i < width * step; i += step) {
-                if (get_bits_left(&gb) <= 0) {
-                    av_log(c->avctx, AV_LOG_ERROR,
-                           "Slice decoding ran out of bits\n");
-                    goto fail;
-                }
                 pix = get_vlc2(&gb, vlc.table, vlc.bits, 3);
                 if (pix < 0) {
                     av_log(c->avctx, AV_LOG_ERROR, "Decoding error\n");
@@ -214,6 +209,11 @@ static int decode_plane10(UtvideoContext *c, int plane_no,
                 dest[i] = pix;
             }
             dest += stride;
+            if (get_bits_left(&gb) < 0) {
+                av_log(c->avctx, AV_LOG_ERROR,
+                        "Slice decoding ran out of bits\n");
+                goto fail;
+            }
         }
         if (get_bits_left(&gb) > 32)
             av_log(c->avctx, AV_LOG_WARNING,
@@ -302,11 +302,6 @@ static int decode_plane(UtvideoContext *c, int plane_no,
         prev = 0x80;
         for (j = sstart; j < send; j++) {
             for (i = 0; i < width * step; i += step) {
-                if (get_bits_left(&gb) <= 0) {
-                    av_log(c->avctx, AV_LOG_ERROR,
-                           "Slice decoding ran out of bits\n");
-                    goto fail;
-                }
                 pix = get_vlc2(&gb, vlc.table, vlc.bits, 3);
                 if (pix < 0) {
                     av_log(c->avctx, AV_LOG_ERROR, "Decoding error\n");
@@ -318,6 +313,11 @@ static int decode_plane(UtvideoContext *c, int plane_no,
                 }
                 dest[i] = pix;
             }
+            if (get_bits_left(&gb) < 0) {
+                av_log(c->avctx, AV_LOG_ERROR,
+                        "Slice decoding ran out of bits\n");
+                goto fail;
+            }
             dest += stride;
         }
         if (get_bits_left(&gb) > 32)
@@ -610,6 +610,8 @@ static int decode_frame(AVCodecContext *avctx, void *data, int *got_frame,
 
     c->frame_pred = (c->frame_info >> 8) & 3;
 
+    max_slice_size += 4*avctx->width;
+
     av_fast_malloc(&c->slice_bits, &c->slice_bits_size,
                    max_slice_size + AV_INPUT_BUFFER_PADDING_SIZE);
 
--
2.13.0
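The idea behind this patch can be illustrated with a toy bit reader: if the buffer carries enough zeroed padding to absorb one whole row of reads, the end-of-data check can run once per row instead of once per symbol. The sketch below is an illustration under that assumption only. ToyBits, toy_get_bits and toy_decode_rows are hypothetical names, fixed-width symbols stand in for VLC codes, and none of this is FFmpeg's get_bits API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy reader; not FFmpeg's GetBitContext. */
typedef struct ToyBits {
    const uint8_t *buf;  /* data followed by readable padding */
    size_t pos;          /* read position, in bits */
    size_t size;         /* number of valid bits */
} ToyBits;

/* Unchecked MSB-first read of n bits; relies on the caller's padding. */
static unsigned toy_get_bits(ToyBits *b, int n)
{
    unsigned v = 0;
    assert(n >= 1 && n <= 8);
    while (n--) {
        unsigned bit = (b->buf[b->pos >> 3] >> (7 - (b->pos & 7))) & 1u;
        v = (v << 1) | bit;
        b->pos++;
    }
    return v;
}

/*
 * Decode `height` rows of `width` fixed-width symbols into dst.
 * Precondition (assumed, mirroring the enlarged slice buffer in the patch):
 * buf has at least one full row of readable padding after size_bytes,
 * so a row may overrun the valid data without reading out of bounds.
 * Returns 0 on success, -1 if a row consumed more bits than were valid.
 */
static int toy_decode_rows(const uint8_t *buf, size_t size_bytes,
                           int width, int height, int bits_per_sym,
                           uint8_t *dst)
{
    ToyBits b = { buf, 0, size_bytes * 8 };

    for (int j = 0; j < height; j++) {
        for (int i = 0; i < width; i++)   /* no check in the inner loop */
            dst[j * width + i] = (uint8_t)toy_get_bits(&b, bits_per_sym);
        if (b.pos > b.size)               /* one check per row */
            return -1;
    }
    return 0;
}
```

The trade-off is that a truncated stream is detected up to one row late, which is harmless as long as the padding really covers the worst-case row, as the commit message states.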
[FFmpeg-devel] [PATCH 5/5] avcodec/utvideodec: Factor multiply out of inner loop
0.5% faster loop

Signed-off-by: Michael Niedermayer
---
 libavcodec/utvideodec.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/libavcodec/utvideodec.c b/libavcodec/utvideodec.c
index 788f4475b9..a20e28320c 100644
--- a/libavcodec/utvideodec.c
+++ b/libavcodec/utvideodec.c
@@ -196,7 +196,8 @@ static int decode_plane10(UtvideoContext *c, int plane_no,
 
         prev = 0x200;
         for (j = sstart; j < send; j++) {
-            for (i = 0; i < width * step; i += step) {
+            int ws = width * step;
+            for (i = 0; i < ws; i += step) {
                 pix = get_vlc2(&gb, vlc.table, VLC_BITS, 3);
                 if (pix < 0) {
                     av_log(c->avctx, AV_LOG_ERROR, "Decoding error\n");
@@ -300,7 +301,8 @@ static int decode_plane(UtvideoContext *c, int plane_no,
 
         prev = 0x80;
         for (j = sstart; j < send; j++) {
-            for (i = 0; i < width * step; i += step) {
+            int ws = width * step;
+            for (i = 0; i < ws; i += step) {
                 pix = get_vlc2(&gb, vlc.table, VLC_BITS, 3);
                 if (pix < 0) {
                     av_log(c->avctx, AV_LOG_ERROR, "Decoding error\n");
--
2.13.0
[FFmpeg-devel] [PATCH 3/5] avcodec/utvideodec: enable unchecked bitreader
inner reader loop becomes 16% faster

Signed-off-by: Michael Niedermayer
---
 libavcodec/utvideodec.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libavcodec/utvideodec.c b/libavcodec/utvideodec.c
index 411df47730..1418cde543 100644
--- a/libavcodec/utvideodec.c
+++ b/libavcodec/utvideodec.c
@@ -27,6 +27,8 @@
 #include <inttypes.h>
 #include <stdlib.h>
 
+#define UNCHECKED_BITSTREAM_READER 1
+
 #include "libavutil/intreadwrite.h"
 #include "avcodec.h"
 #include "bswapdsp.h"
--
2.13.0
[FFmpeg-devel] [PATCH 4/5] avcodec/utvideodec: bswap directly without memcpy
Signed-off-by: Michael Niedermayer
---
 libavcodec/utvideodec.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/libavcodec/utvideodec.c b/libavcodec/utvideodec.c
index 1418cde543..788f4475b9 100644
--- a/libavcodec/utvideodec.c
+++ b/libavcodec/utvideodec.c
@@ -188,11 +188,9 @@ static int decode_plane10(UtvideoContext *c, int plane_no,
             goto fail;
         }
 
-        memcpy(c->slice_bits, src + slice_data_start + c->slices * 4,
-               slice_size);
         memset(c->slice_bits + slice_size, 0, AV_INPUT_BUFFER_PADDING_SIZE);
         c->bdsp.bswap_buf((uint32_t *) c->slice_bits,
-                          (uint32_t *) c->slice_bits,
+                          src + slice_data_start + c->slices * 4,
                           (slice_data_end - slice_data_start + 3) >> 2);
         init_get_bits(&gb, c->slice_bits, slice_size * 8);
 
@@ -294,11 +292,9 @@ static int decode_plane(UtvideoContext *c, int plane_no,
             goto fail;
         }
 
-        memcpy(c->slice_bits, src + slice_data_start + c->slices * 4,
-               slice_size);
         memset(c->slice_bits + slice_size, 0, AV_INPUT_BUFFER_PADDING_SIZE);
         c->bdsp.bswap_buf((uint32_t *) c->slice_bits,
-                          (uint32_t *) c->slice_bits,
+                          src + slice_data_start + c->slices * 4,
                           (slice_data_end - slice_data_start + 3) >> 2);
         init_get_bits(&gb, c->slice_bits, slice_size * 8);
 
--
2.13.0
[FFmpeg-devel] [PATCH 2/5] avcodec/utvideodec: hardcode vlc bits
2.5% faster vlc decoding

Signed-off-by: Michael Niedermayer
---
 libavcodec/utvideodec.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/libavcodec/utvideodec.c b/libavcodec/utvideodec.c
index 815b71cfb6..411df47730 100644
--- a/libavcodec/utvideodec.c
+++ b/libavcodec/utvideodec.c
@@ -73,8 +73,8 @@ static int build_huff10(const uint8_t *src, VLC *vlc, int *fsym)
         syms[i] = he[i].sym;
         code += 0x80000000u >> (he[i].len - 1);
     }
-
-    return ff_init_vlc_sparse(vlc, FFMIN(he[last].len, 11), last + 1,
+#define VLC_BITS 11
+    return ff_init_vlc_sparse(vlc, VLC_BITS, last + 1,
                               bits, sizeof(*bits), sizeof(*bits),
                               codes, sizeof(*codes), sizeof(*codes),
                               syms, sizeof(*syms), sizeof(*syms), 0);
@@ -117,7 +117,8 @@ static int build_huff(const uint8_t *src, VLC *vlc, int *fsym)
         code += 0x80000000u >> (he[i].len - 1);
     }
 
-    return ff_init_vlc_sparse(vlc, FFMIN(he[last].len, 11), last + 1,
+#define VLC_BITS 11
+    return ff_init_vlc_sparse(vlc, VLC_BITS, last + 1,
                               bits, sizeof(*bits), sizeof(*bits),
                               codes, sizeof(*codes), sizeof(*codes),
                               syms, sizeof(*syms), sizeof(*syms), 0);
@@ -196,7 +197,7 @@ static int decode_plane10(UtvideoContext *c, int plane_no,
         prev = 0x200;
         for (j = sstart; j < send; j++) {
             for (i = 0; i < width * step; i += step) {
-                pix = get_vlc2(&gb, vlc.table, vlc.bits, 3);
+                pix = get_vlc2(&gb, vlc.table, VLC_BITS, 3);
                 if (pix < 0) {
                     av_log(c->avctx, AV_LOG_ERROR, "Decoding error\n");
                     goto fail;
@@ -302,7 +303,7 @@ static int decode_plane(UtvideoContext *c, int plane_no,
         prev = 0x80;
         for (j = sstart; j < send; j++) {
             for (i = 0; i < width * step; i += step) {
-                pix = get_vlc2(&gb, vlc.table, vlc.bits, 3);
+                pix = get_vlc2(&gb, vlc.table, VLC_BITS, 3);
                 if (pix < 0) {
                     av_log(c->avctx, AV_LOG_ERROR, "Decoding error\n");
                     goto fail;
--
2.13.0
Re: [FFmpeg-devel] [PATCH 1/5] avcodec/utvideodec: Move bitstream end check out of inner loop
On Tue, Jun 27, 2017 at 09:47:31PM +0200, Michael Niedermayer wrote:

"Summary email is empty, skipping it"

Somehow the summary mail for the thread was lost ...

It basically said that this is a bunch of trivial optimizations
surrounding the vlc reader loop.

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Freedom in capitalist society always remains about the same as it was in
ancient Greek republics: Freedom for slave owners. -- Vladimir Lenin
Re: [FFmpeg-devel] [PATCH] avcodec/vp9: add 64-bit ipred_dr_32x32_16 avx2 implementation
Hi, On Sun, Jun 25, 2017 at 10:42 AM, Ilia Valiakhmetov wrote: > vp9_diag_downright_32x32_12bpp_c: 429.7 > vp9_diag_downright_32x32_12bpp_sse2: 158.9 > vp9_diag_downright_32x32_12bpp_ssse3: 144.6 > vp9_diag_downright_32x32_12bpp_avx: 141.0 > vp9_diag_downright_32x32_12bpp_avx2: 73.8 > > Almost 50% faster than avx implementation > --- > libavcodec/x86/vp9dsp_init_16bpp.c| 6 +- > libavcodec/x86/vp9intrapred_16bpp.asm | 103 > +- > 2 files changed, 106 insertions(+), 3 deletions(-) > > diff --git a/libavcodec/x86/vp9dsp_init_16bpp.c > b/libavcodec/x86/vp9dsp_init_16bpp.c > index 8d1aa13..54216f0 100644 > --- a/libavcodec/x86/vp9dsp_init_16bpp.c > +++ b/libavcodec/x86/vp9dsp_init_16bpp.c > @@ -52,8 +52,9 @@ decl_ipred_fns(dc, 16, mmxext, sse2); > decl_ipred_fns(dc_top, 16, mmxext, sse2); > decl_ipred_fns(dc_left, 16, mmxext, sse2); > decl_ipred_fn(dl, 16, 16, avx2); > -decl_ipred_fn(dr, 16, 16, avx2); > decl_ipred_fn(dl, 32, 16, avx2); > +decl_ipred_fn(dr, 16, 16, avx2); > +decl_ipred_fn(dr, 32, 16, avx2); > > #define decl_ipred_dir_funcs(type) \ > decl_ipred_fns(type, 16, sse2, sse2); \ > @@ -137,8 +138,9 @@ av_cold void ff_vp9dsp_init_16bpp_x86(VP9DSPContext > *dsp) > init_fpel_func(1, 1, 64, avg, _16, avx2); > init_fpel_func(0, 1, 128, avg, _16, avx2); > init_ipred_func(dl, DIAG_DOWN_LEFT, 16, 16, avx2); > -init_ipred_func(dr, DIAG_DOWN_RIGHT, 16, 16, avx2); > init_ipred_func(dl, DIAG_DOWN_LEFT, 32, 16, avx2); > +init_ipred_func(dr, DIAG_DOWN_RIGHT, 16, 16, avx2); > +init_ipred_func(dr, DIAG_DOWN_RIGHT, 32, 16, avx2); > } > > #endif /* HAVE_YASM */ > diff --git a/libavcodec/x86/vp9intrapred_16bpp.asm > b/libavcodec/x86/vp9intrapred_16bpp.asm > index 6d4400b..32b6982 100644 > --- a/libavcodec/x86/vp9intrapred_16bpp.asm > +++ b/libavcodec/x86/vp9intrapred_16bpp.asm > @@ -1221,8 +1221,109 @@ cglobal vp9_ipred_dr_16x16_16, 4, 5, 6, dst, > stride, l, a > mova [dstq+strideq*0], m4 ; 0 > mova [dst3q+strideq*4], m5 ; 7 > RET > -%endif > > +%if ARCH_X86_64 > +cglobal 
vp9_ipred_dr_32x32_16, 4, 7, 10, dst, stride, l, a > +movam0, [lq+mmsize*0+0]; l[0-15] > +movam1, [lq+mmsize*1+0]; l[16-31] > +movum2, [aq+mmsize*0-2]; *abcdefghijklmno > +movam3, [aq+mmsize*0+0]; abcdefghijklmnop > +movam4, [aq+mmsize*1+0]; qrstuvwxyz012345 > +vperm2i128 m5, m0, m1, q0201 ; lmnopqrstuvwxyz0 > +vpalignrm6, m5, m0, 2 ; mnopqrstuvwxyz01 > +vpalignrm7, m5, m0, 4 ; nopqrstuvwxyz012 > +LOWPASS 0, 6, 7 ; L[0-15] > +vperm2i128 m7, m1, m2, q0201 ; stuvwxyz*abcdefg > +vpalignrm5, m7, m1, 2 ; lmnopqrstuvwxyz* > +vpalignrm6, m7, m1, 4 ; mnopqrstuvwxyz*a > +LOWPASS 1, 5, 6 ; L[16-31]# > +vperm2i128 m5, m3, m4, q0201 ; ijklmnopqrstuvwx > +vpalignrm6, m5, m3, 2 ; bcdefghijklmnopq > +LOWPASS 2, 3, 6 ; A[0-15] > +movum3, [aq+mmsize*1-2]; pqrstuvwxyz01234 > +vperm2i128 m6, m4, m4, q2001 ; yz012345 > +vpalignrm7, m6, m4, 2 ; rstuvwxyz012345. > +LOWPASS 3, 4, 7 ; A[16-31]. > +vperm2i128 m4, m1, m2, q0201 ; TUVWXYZ#ABCDEFGH > +vperm2i128 m5, m0, m1, q0201 ; L[7-15]L[16-23] > +vperm2i128 m8, m2, m3, q0201 ; IJKLMNOPQRSTUVWX > +DEFINE_ARGS dst8, stride, stride3, stride7, stride5, dst24, cnt > +lea stride3q, [strideq*3] > +lea stride5q, [stride3q+strideq*2] > +lea stride7q, [strideq*4+stride3q] > +lea dst24q, [dst8q+stride3q*8] > +lea dst8q, [dst8q+strideq*8] > +mov cntd, 2 > + > +.loop: > +mova [dst24q+stride7q+0 ], m0 ; 31 23 15 7 > +mova [dst24q+stride7q+32], m1 > +mova[dst8q+stride7q+0], m1 > +mova [dst8q+stride7q+32], m2 > +vpalignrm6, m4, m1, 2 > +vpalignrm7, m5, m0, 2 > +vpalignrm9, m8, m2, 2 > +mova [dst24q+stride3q*2+0], m7 ; 30 22 14 6 > +mova [dst24q+stride3q*2+32], m6 > +mova [dst8q+stride3q*2+0], m6 > +mova [dst8q+stride3q*2+32], m9 > +vpalignrm6, m4, m1, 4 > +vpalignrm7, m5, m0, 4 > +vpalign
Re: [FFmpeg-devel] [PATCH 1/5] avcodec/utvideodec: Move bitstream end check out of inner loop
On 6/27/17, Michael Niedermayer wrote:
> On Tue, Jun 27, 2017 at 09:47:31PM +0200, Michael Niedermayer wrote:
>
> "Summary email is empty, skipping it"
>
> somehow the summary mail for the thread was lost ...
>
> it basically said thats a bunch of trivial optimizations surrounding
> the vlc reader loop.

libavcodec/utvideodec.c: In function 'decode_plane10':
libavcodec/utvideodec.c:193:27: warning: passing argument 2 of
'c->bdsp.bswap_buf' from incompatible pointer type
[-Wincompatible-pointer-types]
                           src + slice_data_start + c->slices * 4,
                           ^~~
libavcodec/utvideodec.c:193:27: note: expected 'const uint32_t *
{aka const unsigned int *}' but argument is of type 'const uint8_t
* {aka const unsigned char *}'
libavcodec/utvideodec.c: In function 'decode_plane':
libavcodec/utvideodec.c:298:27: warning: passing argument 2 of
'c->bdsp.bswap_buf' from incompatible pointer type
[-Wincompatible-pointer-types]
                           src + slice_data_start + c->slices * 4,
                           ^~~
libavcodec/utvideodec.c:298:27: note: expected 'const uint32_t *
{aka const unsigned int *}' but argument is of type 'const uint8_t
* {aka const unsigned char *}'
Re: [FFmpeg-devel] [PATCH 1/5] avcodec/utvideodec: Move bitstream end check out of inner loop
On 6/27/17, Paul B Mahol wrote:
> On 6/27/17, Michael Niedermayer wrote:
>> On Tue, Jun 27, 2017 at 09:47:31PM +0200, Michael Niedermayer wrote:
>>
>> "Summary email is empty, skipping it"
>>
>> somehow the summary mail for the thread was lost ...
>>
>> it basically said thats a bunch of trivial optimizations surrounding
>> the vlc reader loop.
>
> libavcodec/utvideodec.c: In function 'decode_plane10':
> libavcodec/utvideodec.c:193:27: warning: passing argument 2 of
> 'c->bdsp.bswap_buf' from incompatible pointer type
> [-Wincompatible-pointer-types]
>                            src + slice_data_start + c->slices * 4,
>                            ^~~
> libavcodec/utvideodec.c:193:27: note: expected 'const uint32_t *
> {aka const unsigned int *}' but argument is of type 'const uint8_t
> * {aka const unsigned char *}'
> libavcodec/utvideodec.c: In function 'decode_plane':
> libavcodec/utvideodec.c:298:27: warning: passing argument 2 of
> 'c->bdsp.bswap_buf' from incompatible pointer type
> [-Wincompatible-pointer-types]
>                            src + slice_data_start + c->slices * 4,
>                            ^~~
> libavcodec/utvideodec.c:298:27: note: expected 'const uint32_t *
> {aka const unsigned int *}' but argument is of type 'const uint8_t
> * {aka const unsigned char *}'
>

Except this, patchset LGTM.
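The warning above arises because a byte pointer is passed where a pointer to 32-bit words is expected. One way to see what the call is meant to do, without any pointer-type mismatch, is a byte-swapping copy that loads each word from the byte source with memcpy. This is only a sketch under stated assumptions: toy_bswap32 and toy_bswap_copy are hypothetical names, and this is not the bswapdsp implementation nor necessarily how the warning was resolved in the tree (an explicit cast at the call site is another option).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of one 32-bit word. */
static uint32_t toy_bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/*
 * Hypothetical helper: copy `words` 32-bit words from a byte buffer into
 * dst, swapping the bytes of each word. memcpy is used for the load so the
 * source needs neither 4-byte alignment nor a uint32_t pointer type.
 */
static void toy_bswap_copy(uint32_t *dst, const uint8_t *src, int words)
{
    assert(dst && src && words >= 0);
    for (int i = 0; i < words; i++) {
        uint32_t v;
        memcpy(&v, src + 4 * (size_t)i, sizeof(v));
        dst[i] = toy_bswap32(v);
    }
}
```

Swapping straight from the packet data into the slice buffer is what lets patch 4/5 drop the intermediate memcpy of the whole slice.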
Re: [FFmpeg-devel] [PATCH 3/5] avcodec/utvideodec: enable unchecked bitreader
On Tue, 27 Jun 2017 at 20:48 Michael Niedermayer wrote:
> inner reader loop becomes 16% faster
>
> Signed-off-by: Michael Niedermayer
> ---
>  libavcodec/utvideodec.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/libavcodec/utvideodec.c b/libavcodec/utvideodec.c
> index 411df47730..1418cde543 100644
> --- a/libavcodec/utvideodec.c
> +++ b/libavcodec/utvideodec.c
> @@ -27,6 +27,8 @@
>  #include <inttypes.h>
>  #include <stdlib.h>
>
> +#define UNCHECKED_BITSTREAM_READER 1
> +
>  #include "libavutil/intreadwrite.h"
>  #include "avcodec.h"
>  #include "bswapdsp.h"
> --
> 2.13.0
>

Asking for trouble unless fuzzed well.

Kieran
Re: [FFmpeg-devel] [PATCH 3/5] avcodec/utvideodec: enable unchecked bitreader
On 6/27/17, Kieran Kunhya wrote:
> On Tue, 27 Jun 2017 at 20:48 Michael Niedermayer
> wrote:
>
>> inner reader loop becomes 16% faster
>>
>> Signed-off-by: Michael Niedermayer
>> ---
>>  libavcodec/utvideodec.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/libavcodec/utvideodec.c b/libavcodec/utvideodec.c
>> index 411df47730..1418cde543 100644
>> --- a/libavcodec/utvideodec.c
>> +++ b/libavcodec/utvideodec.c
>> @@ -27,6 +27,8 @@
>>  #include <inttypes.h>
>>  #include <stdlib.h>
>>
>> +#define UNCHECKED_BITSTREAM_READER 1
>> +
>>  #include "libavutil/intreadwrite.h"
>>  #include "avcodec.h"
>>  #include "bswapdsp.h"
>> --
>> 2.13.0
>>
>
> Asking for trouble unless fuzzed well.

Not really, it allocates enough extra bytes.
Re: [FFmpeg-devel] [PATCH 1/2] x86/vf_blend: add sse and ssse3 extremity functions
On 6/27/17, James Almer wrote:
> Signed-off-by: James Almer
> ---
>  libavfilter/x86/vf_blend.asm    | 25 +++++++++++++++++++++++++
>  libavfilter/x86/vf_blend_init.c |  4 ++++
>  tests/checkasm/vf_blend.c       |  1 +
>  3 files changed, 30 insertions(+)
>
> diff --git a/libavfilter/x86/vf_blend.asm b/libavfilter/x86/vf_blend.asm
> index 33b1ad1496..25f6f5affc 100644
> --- a/libavfilter/x86/vf_blend.asm
> +++ b/libavfilter/x86/vf_blend.asm
> @@ -286,6 +286,31 @@ BLEND_INIT difference, 3
>      jl .loop
>  BLEND_END
>
> +BLEND_INIT extremity, 8
> +    pxor       m2, m2
> +    mova       m4, [pw_255]
> +.nextrow:
> +    mov        xq, widthq
> +
> +    .loop:
> +        movu            m0, [topq + xq]
> +        movu            m1, [bottomq + xq]
> +        punpckhbw       m5, m0, m2
> +        punpcklbw       m0, m2
> +        punpckhbw       m6, m1, m2
> +        punpcklbw       m1, m2
> +        psubw           m3, m4, m0
> +        psubw           m7, m4, m5
> +        psubw           m3, m1
> +        psubw           m7, m6
> +        ABS1            m3, m1
> +        ABS1            m7, m6

Minor nitpick.

There is an ABS2 macro that takes 4 parameters and does two interleaved
ABS1 operations, which is (hopefully) faster on sse2.
It should generate exactly the same code on ssse3.
Re: [FFmpeg-devel] [PATCH 1/2] x86/vf_blend: add sse and ssse3 extremity functions
On 6/27/2017 8:19 PM, Ivan Kalvachev wrote:
> On 6/27/17, James Almer wrote:
>> Signed-off-by: James Almer
>> ---
>>  libavfilter/x86/vf_blend.asm    | 25 +++++++++++++++++++++++++
>>  libavfilter/x86/vf_blend_init.c |  4 ++++
>>  tests/checkasm/vf_blend.c       |  1 +
>>  3 files changed, 30 insertions(+)
>>
>> diff --git a/libavfilter/x86/vf_blend.asm b/libavfilter/x86/vf_blend.asm
>> index 33b1ad1496..25f6f5affc 100644
>> --- a/libavfilter/x86/vf_blend.asm
>> +++ b/libavfilter/x86/vf_blend.asm
>> @@ -286,6 +286,31 @@ BLEND_INIT difference, 3
>>      jl .loop
>>  BLEND_END
>>
>> +BLEND_INIT extremity, 8
>> +    pxor       m2, m2
>> +    mova       m4, [pw_255]
>> +.nextrow:
>> +    mov        xq, widthq
>> +
>> +    .loop:
>> +        movu            m0, [topq + xq]
>> +        movu            m1, [bottomq + xq]
>> +        punpckhbw       m5, m0, m2
>> +        punpcklbw       m0, m2
>> +        punpckhbw       m6, m1, m2
>> +        punpcklbw       m1, m2
>> +        psubw           m3, m4, m0
>> +        psubw           m7, m4, m5
>> +        psubw           m3, m1
>> +        psubw           m7, m6
>> +        ABS1            m3, m1
>> +        ABS1            m7, m6
>
> Minor nitpick.
>
> There exists ABS2 that takes 4 parameters and that does
> two interleaved ABS1 , that are (hopefully) faster on sse2.
> It should generate exactly the same code on ssse3.

Ah nice, pushed a change to use them. Thanks.
Re: [FFmpeg-devel] [PATCH 5/5] avcodec/utvideodec: Factor multiply out of inner loop
2017-06-28 3:47 GMT+08:00 Michael Niedermayer :
> 0.5% faster loop
>
> Signed-off-by: Michael Niedermayer
> ---
>  libavcodec/utvideodec.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/libavcodec/utvideodec.c b/libavcodec/utvideodec.c
> index 788f4475b9..a20e28320c 100644
> --- a/libavcodec/utvideodec.c
> +++ b/libavcodec/utvideodec.c
> @@ -196,7 +196,8 @@ static int decode_plane10(UtvideoContext *c, int plane_no,
>
>          prev = 0x200;
>          for (j = sstart; j < send; j++) {
> -            for (i = 0; i < width * step; i += step) {
> +            int ws = width * step;
> +            for (i = 0; i < ws; i += step) {
>                  pix = get_vlc2(&gb, vlc.table, VLC_BITS, 3);
>                  if (pix < 0) {
>                      av_log(c->avctx, AV_LOG_ERROR, "Decoding error\n");
> @@ -300,7 +301,8 @@ static int decode_plane(UtvideoContext *c, int plane_no,
>
>          prev = 0x80;
>          for (j = sstart; j < send; j++) {
> -            for (i = 0; i < width * step; i += step) {
> +            int ws = width * step;
> +            for (i = 0; i < ws; i += step) {
>                  pix = get_vlc2(&gb, vlc.table, VLC_BITS, 3);
>                  if (pix < 0) {
>                      av_log(c->avctx, AV_LOG_ERROR, "Decoding error\n");
> --
> 2.13.0
>

LGTM. Moving the computation before the loop is better than doing it
inside the loop.