Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
On Mon, Nov 14, 2016 at 12:18:12AM +0100, Nicolas George wrote: > Le quartidi 24 brumaire, an CCXXV, Hendrik Leppkes a écrit : > > I disagree with that, this is generic utility code, not a tight > > algorithm loop, where absolute efficiency is not necessarily required, > > and disregarding code simplicity amd consistency will just harm in the > > long-run. > > We can discuss that. Remember that they are fields in a structure, not > code: if these fields are accessed in a tight loop, your argument > vanishes. > You can dereference once before entering the loop -- Clément B. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH]lavu/pixfmt: Add GRAY10
2016-11-14 3:32 GMT+01:00 Michael Niedermayer : > On Mon, Nov 14, 2016 at 01:59:06AM +0100, Carl Eugen Hoyos wrote: >> Hi! >> >> Attached patches are a requirement to fix two tickets. >> >> Please review, Carl Eugen > > patches LGTM Patches applied. Thank you, Carl Eugen ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH 8/9] aarch64: vp9: Add NEON itxfm routines
This work is sponsored by, and copyright, Google. These are ported from the ARM version; thanks to the larger amount of registers available, we can do the 16x16 and 32x32 transforms in slices 8 pixels wide instead of 4. This gives a speedup of around 1.4x compared to the 32 bit version. The fact that aarch64 doesn't have the same d/q register aliasing makes some of the macros quite a bit simpler as well. Examples of runtimes vs the 32 bit version, on a Cortex A53: ARM AArch64 vp9_inv_adst_adst_4x4_add_neon: 90.0 87.7 vp9_inv_adst_adst_8x8_add_neon: 400.0354.7 vp9_inv_adst_adst_16x16_add_neon: 2526.5 1827.2 vp9_inv_dct_dct_4x4_add_neon: 74.0 72.7 vp9_inv_dct_dct_8x8_add_neon:271.0256.7 vp9_inv_dct_dct_16x16_add_neon: 1960.7 1372.7 vp9_inv_dct_dct_32x32_add_neon:11988.9 8088.3 vp9_inv_wht_wht_4x4_add_neon: 63.0 57.7 The speedup vs C code (2-4x) is smaller than in the 32 bit case, mostly because the C code ends up significantly faster (around 1.6x faster, with GCC 5.4) when built for aarch64. Examples of runtimes vs C on a Cortex A57 (for a slightly older version of the patch): A57 gcc-5.3 neon vp9_inv_adst_adst_4x4_add_neon: 152.2 60.0 vp9_inv_adst_adst_8x8_add_neon: 948.2 288.0 vp9_inv_adst_adst_16x16_add_neon:4830.4 1380.5 vp9_inv_dct_dct_4x4_add_neon: 153.0 58.6 vp9_inv_dct_dct_8x8_add_neon: 789.2 180.2 vp9_inv_dct_dct_16x16_add_neon: 3639.6 917.1 vp9_inv_dct_dct_32x32_add_neon: 20462.1 4985.0 vp9_inv_wht_wht_4x4_add_neon: 91.0 49.8 The asm is around factor 3-4 faster than C on the cortex-a57 and the asm is around 30-50% faster on the a57 compared to the a53. This is an adapted cherry-pick from libav commit 3c9546dfafcdfe8e7860aff9ebbf609318220f29. --- libavcodec/aarch64/Makefile |3 +- libavcodec/aarch64/vp9dsp_init_aarch64.c | 54 +- libavcodec/aarch64/vp9itxfm_neon.S | 1116 ++ 3 files changed, 1171 insertions(+), 2 deletions(-) create mode 100644 libavcodec/aarch64/vp9itxfm_neon.S diff --git a/libavcodec/aarch64/Makefile b/libavcodec/aarch64/Makefile index e7db95e..e8a7f7a 100644 --- a/libavcodec/aarch64/Makefile +++ b/libavcodec/aarch64/Makefile @@ -42,4 +42,5 @@ NEON-OBJS-$(CONFIG_MPEGAUDIODSP)+= aarch64/mpegaudiodsp_neon.o # decoders/encoders NEON-OBJS-$(CONFIG_DCA_DECODER) += aarch64/synth_filter_neon.o NEON-OBJS-$(CONFIG_VORBIS_DECODER) += aarch64/vorbisdsp_neon.o -NEON-OBJS-$(CONFIG_VP9_DECODER) += aarch64/vp9mc_neon.o +NEON-OBJS-$(CONFIG_VP9_DECODER) += aarch64/vp9itxfm_neon.o \ + aarch64/vp9mc_neon.o diff --git a/libavcodec/aarch64/vp9dsp_init_aarch64.c b/libavcodec/aarch64/vp9dsp_init_aarch64.c index 4adf363..2848608 100644 --- a/libavcodec/aarch64/vp9dsp_init_aarch64.c +++ b/libavcodec/aarch64/vp9dsp_init_aarch64.c @@ -96,7 +96,7 @@ define_8tap_2d_funcs(16) define_8tap_2d_funcs(8) define_8tap_2d_funcs(4) -av_cold void ff_vp9dsp_init_aarch64(VP9DSPContext *dsp, int bpp) +static av_cold void vp9dsp_mc_init_aarch64(VP9DSPContext *dsp, int bpp) { int cpu_flags = av_get_cpu_flags(); @@ -154,3 +154,55 @@ av_cold void ff_vp9dsp_init_aarch64(VP9DSPContext *dsp, int bpp) init_mc_funcs_dirs(4, 4); } } + +#define define_itxfm(type_a, type_b, sz) \ +void ff_vp9_##type_a##_##type_b##_##sz##x##sz##_add_neon(uint8_t *_dst,\ + ptrdiff_t stride, \ + int16_t *_block, int eob) + +#define define_itxfm_funcs(sz) \ +define_itxfm(idct, idct, sz); \ +define_itxfm(iadst, idct, sz); \ +define_itxfm(idct, iadst, sz); \ +define_itxfm(iadst, iadst, sz) + +define_itxfm_funcs(4); +define_itxfm_funcs(8); +define_itxfm_funcs(16); +define_itxfm(idct, idct, 32); +define_itxfm(iwht, iwht, 4); + + +static av_cold void vp9dsp_itxfm_init_aarch64(VP9DSPContext *dsp, int bpp) +{ +int cpu_flags = av_get_cpu_flags(); + +if (bpp != 8) +return; + +if (have_neon(cpu_flags)) { +#define init_itxfm(tx, sz) \ +dsp->itxfm_add[tx][DCT_DCT] = ff_vp9_idct_idct_##sz##_add_neon; \ +dsp->itxfm_add[tx][DCT_ADST] = ff_vp9_iadst_idct_##sz##_add_neon; \ +dsp->itxfm_add[tx][ADST_DCT] = ff_vp9_idct_iadst_##sz##_add_neon; \ +dsp->itxfm_add[tx][ADST_ADST] = ff_vp9_iadst_iadst_##sz##_add_neon + +#define init_idct(tx, nm) \ +dsp->itxfm_add[tx][DCT_DCT] = \ +dsp->itxfm_add[tx][ADST_DCT] = \ +dsp->itxfm_add[tx][DCT_ADST] = \ +dsp->itxfm_add[tx][ADST_ADST] = ff_vp9_##nm##_add_neon + +init_itxfm(TX_4X4, 4x4); +init_itxfm(TX_8X8, 8x8); +init_itxfm(TX_16X16, 16x16); +
[FFmpeg-devel] [PATCH 7/9] aarch64: vp9: Add NEON optimizations of VP9 MC functions
This work is sponsored by, and copyright, Google. These are ported from the ARM version; it is essentially a 1:1 port with no extra added features, but with some hand tuning (especially for the plain copy/avg functions). The ARM version isn't very register starved to begin with, so there's not much to be gained from having more spare registers here - we only avoid having to clobber callee-saved registers. Examples of runtimes vs the 32 bit version, on a Cortex A53: ARM AArch64 vp9_avg4_neon: 27.2 23.7 vp9_avg8_neon: 56.5 54.7 vp9_avg16_neon:169.9 167.4 vp9_avg32_neon:585.8 585.2 vp9_avg64_neon: 2460.32294.7 vp9_avg_8tap_smooth_4h_neon: 132.7 125.2 vp9_avg_8tap_smooth_4hv_neon: 478.8 442.0 vp9_avg_8tap_smooth_4v_neon: 126.0 93.7 vp9_avg_8tap_smooth_8h_neon: 241.7 234.2 vp9_avg_8tap_smooth_8hv_neon: 690.9 646.5 vp9_avg_8tap_smooth_8v_neon: 245.0 205.5 vp9_avg_8tap_smooth_64h_neon:11273.2 11280.1 vp9_avg_8tap_smooth_64hv_neon: 22980.6 22184.1 vp9_avg_8tap_smooth_64v_neon:11549.7 10781.1 vp9_put4_neon: 18.0 17.2 vp9_put8_neon: 40.2 37.7 vp9_put16_neon: 97.4 99.5 vp9_put32_neon/armv8: 346.0 307.4 vp9_put64_neon/armv8: 1319.01107.5 vp9_put_8tap_smooth_4h_neon: 126.7 118.2 vp9_put_8tap_smooth_4hv_neon: 465.7 434.0 vp9_put_8tap_smooth_4v_neon: 113.0 86.5 vp9_put_8tap_smooth_8h_neon: 229.7 221.6 vp9_put_8tap_smooth_8hv_neon: 658.9 621.3 vp9_put_8tap_smooth_8v_neon: 215.0 187.5 vp9_put_8tap_smooth_64h_neon:10636.7 10627.8 vp9_put_8tap_smooth_64hv_neon: 21076.8 21026.9 vp9_put_8tap_smooth_64v_neon: 9635.09632.4 These are generally about as fast as the corresponding ARM routines on the same CPU (at least on the A53), in most cases marginally faster. The speedup vs C code is pretty much the same as for the 32 bit case; on the A53 it's around 6-13x for ther larger 8tap filters. The exact speedup varies a little, since the C versions generally don't end up exactly as slow/fast as on 32 bit. This is an adapted cherry-pick from libav commit 383d96aa2229f644d9bd77b821ed3a309da5e9fc. --- libavcodec/aarch64/Makefile | 2 + libavcodec/aarch64/vp9dsp_init_aarch64.c | 156 +++ libavcodec/aarch64/vp9mc_neon.S | 676 +++ libavcodec/vp9.c | 8 +- libavcodec/vp9dsp.c | 1 + libavcodec/vp9dsp.h | 1 + 6 files changed, 840 insertions(+), 4 deletions(-) create mode 100644 libavcodec/aarch64/vp9dsp_init_aarch64.c create mode 100644 libavcodec/aarch64/vp9mc_neon.S diff --git a/libavcodec/aarch64/Makefile b/libavcodec/aarch64/Makefile index c3df887..e7db95e 100644 --- a/libavcodec/aarch64/Makefile +++ b/libavcodec/aarch64/Makefile @@ -16,6 +16,7 @@ OBJS-$(CONFIG_DCA_DECODER) += aarch64/synth_filter_init.o OBJS-$(CONFIG_RV40_DECODER) += aarch64/rv40dsp_init_aarch64.o OBJS-$(CONFIG_VC1DSP) += aarch64/vc1dsp_init_aarch64.o OBJS-$(CONFIG_VORBIS_DECODER) += aarch64/vorbisdsp_init.o +OBJS-$(CONFIG_VP9_DECODER) += aarch64/vp9dsp_init_aarch64.o # ARMv8 optimizations @@ -41,3 +42,4 @@ NEON-OBJS-$(CONFIG_MPEGAUDIODSP)+= aarch64/mpegaudiodsp_neon.o # decoders/encoders NEON-OBJS-$(CONFIG_DCA_DECODER) += aarch64/synth_filter_neon.o NEON-OBJS-$(CONFIG_VORBIS_DECODER) += aarch64/vorbisdsp_neon.o +NEON-OBJS-$(CONFIG_VP9_DECODER) += aarch64/vp9mc_neon.o diff --git a/libavcodec/aarch64/vp9dsp_init_aarch64.c b/libavcodec/aarch64/vp9dsp_init_aarch64.c new file mode 100644 index 000..4adf363 --- /dev/null +++ b/libavcodec/aarch64/vp9dsp_init_aarch64.c @@ -0,0 +1,156 @@ +/* + * Copyright (c) 2016 Google Inc. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "libavutil/attributes.h" +#include "libavutil/aarch64/cpu.h" +#include "libavcodec/vp9dsp.h" + +#define
[FFmpeg-devel] [PATCH 6/9] aarch64: Add an offset parameter to the movrel macro
With apple tools, the linker fails with errors like these, if the offset is negative: ld: in section __TEXT,__text reloc 8: symbol index out of range for architecture arm64 This is cherry-picked from libav commit c44a8a3eabcd6acd2ba79f32ec8a432e6ebe552c. --- libavutil/aarch64/asm.S | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/libavutil/aarch64/asm.S b/libavutil/aarch64/asm.S index ff34e7a..523b8c5 100644 --- a/libavutil/aarch64/asm.S +++ b/libavutil/aarch64/asm.S @@ -72,15 +72,21 @@ ELF .size \name, . - \name \name: .endm -.macro movrel rd, val +.macro movrel rd, val, offset=0 #if CONFIG_PIC && defined(__APPLE__) +.if \offset < 0 adrp\rd, \val@PAGE add \rd, \rd, \val@PAGEOFF +sub \rd, \rd, -(\offset) +.else +adrp\rd, \val+(\offset)@PAGE +add \rd, \rd, \val+(\offset)@PAGEOFF +.endif #elif CONFIG_PIC -adrp\rd, \val -add \rd, \rd, :lo12:\val +adrp\rd, \val+\offset +add \rd, \rd, :lo12:\val+\offset #else -ldr \rd, =\val +ldr \rd, =\val+\offset #endif .endm -- 2.7.4 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH 5/9] arm: vp9: Add NEON loop filters
This work is sponsored by, and copyright, Google. The implementation tries to have smart handling of cases where no pixels need the full filtering for the 8/16 width filters, skipping both calculation and writeback of the unmodified pixels in those cases. The actual effect of this is hard to test with checkasm though, since it tests the full filtering, and the benefit depends on how many filtered blocks use the shortcut. Examples of relative speedup compared to the C version, from checkasm: Cortex A7 A8 A9A53 vp9_loop_filter_h_4_8_neon: 2.72 2.68 1.78 3.15 vp9_loop_filter_h_8_8_neon: 2.36 2.38 1.70 2.91 vp9_loop_filter_h_16_8_neon: 1.80 1.89 1.45 2.01 vp9_loop_filter_h_16_16_neon:2.81 2.78 2.18 3.16 vp9_loop_filter_mix2_h_44_16_neon: 2.65 2.67 1.93 3.05 vp9_loop_filter_mix2_h_48_16_neon: 2.46 2.38 1.81 2.85 vp9_loop_filter_mix2_h_84_16_neon: 2.50 2.41 1.73 2.85 vp9_loop_filter_mix2_h_88_16_neon: 2.77 2.66 1.96 3.23 vp9_loop_filter_mix2_v_44_16_neon: 4.28 4.46 3.22 5.70 vp9_loop_filter_mix2_v_48_16_neon: 3.92 4.00 3.03 5.19 vp9_loop_filter_mix2_v_84_16_neon: 3.97 4.31 2.98 5.33 vp9_loop_filter_mix2_v_88_16_neon: 3.91 4.19 3.06 5.18 vp9_loop_filter_v_4_8_neon: 4.53 4.47 3.31 6.05 vp9_loop_filter_v_8_8_neon: 3.58 3.99 2.92 5.17 vp9_loop_filter_v_16_8_neon: 3.40 3.50 2.81 4.68 vp9_loop_filter_v_16_16_neon:4.66 4.41 3.74 6.02 The speedup vs C code is around 2-6x. The numbers are quite inconclusive though, since the checkasm test runs multiple filterings on top of each other, so later rounds might end up with different codepaths (different decisions on which filter to apply, based on input pixel differences). Disabling the early-exit in the asm doesn't give a fair comparison either though, since the C code only does the necessary calcuations for each row. Based on START_TIMER/STOP_TIMER wrapping around a few individual functions, the speedup vs C code is around 4-9x. This is pretty similar in runtime to the corresponding routines in libvpx. (This is comparing vpx_lpf_vertical_16_neon, vpx_lpf_horizontal_edge_8_neon and vpx_lpf_horizontal_edge_16_neon to vp9_loop_filter_h_16_8_neon, vp9_loop_filter_v_16_8_neon and vp9_loop_filter_v_16_16_neon - note that the naming of horizonal and vertical is flipped between the libraries.) In order to have stable, comparable numbers, the early exits in both asm versions were disabled, forcing the full filtering codepath. Cortex A7 A8 A9 A53 vp9_loop_filter_h_16_8_neon: 597.2 472.0 482.4 415.0 libvpx vpx_lpf_vertical_16_neon: 626.0 464.5 470.7 445.0 vp9_loop_filter_v_16_8_neon: 500.2 422.5 429.7 295.0 libvpx vpx_lpf_horizontal_edge_8_neon: 586.5 414.5 415.6 383.2 vp9_loop_filter_v_16_16_neon:905.0 784.7 791.5 546.0 libvpx vpx_lpf_horizontal_edge_16_neon: 1060.2 751.7 743.5 685.2 Our version is consistently faster on on A7 and A53, marginally slower on A8, and sometimes faster, sometimes slower on A9 (marginally slower in all three tests in this particular test run). This is an adapted cherry-pick from libav commit dd299a2d6d4d1af9528ed35a8131c35946be5973. --- libavcodec/arm/Makefile | 1 + libavcodec/arm/vp9dsp_init_arm.c | 60 +++ libavcodec/arm/vp9lpf_neon.S | 770 +++ 3 files changed, 831 insertions(+) create mode 100644 libavcodec/arm/vp9lpf_neon.S diff --git a/libavcodec/arm/Makefile b/libavcodec/arm/Makefile index 8602e28..7f18daa 100644 --- a/libavcodec/arm/Makefile +++ b/libavcodec/arm/Makefile @@ -141,4 +141,5 @@ NEON-OBJS-$(CONFIG_RV40_DECODER) += arm/rv34dsp_neon.o\ NEON-OBJS-$(CONFIG_VORBIS_DECODER) += arm/vorbisdsp_neon.o NEON-OBJS-$(CONFIG_VP6_DECODER)+= arm/vp6dsp_neon.o NEON-OBJS-$(CONFIG_VP9_DECODER)+= arm/vp9itxfm_neon.o \ + arm/vp9lpf_neon.o \ arm/vp9mc_neon.o diff --git a/libavcodec/arm/vp9dsp_init_arm.c b/libavcodec/arm/vp9dsp_init_arm.c index 1d4eabf..05e50d7 100644 --- a/libavcodec/arm/vp9dsp_init_arm.c +++ b/libavcodec/arm/vp9dsp_init_arm.c @@ -188,8 +188,68 @@ static av_cold void vp9dsp_itxfm_init_arm(VP9DSPContext *dsp, int bpp) } } +#define define_loop_filter(dir, wd, size) \ +void ff_vp9_loop_filter_##dir##_##wd##_##size##_neon(uint8_t *dst, ptrdiff_t stride, int E, int I, int H) + +#define define_loop_filters(wd, size) \ +define_loop_filter(h, wd, size); \ +define_loop_filter(v, wd, size) + +define_loop_filters(4, 8); +define_loop_filters(8, 8); +define_loop_filters(16, 8); +define_loop_filters(16, 16); + +#define lf_mix_fn(dir, wd1, wd2, stridea)
[FFmpeg-devel] [PATCH 3/9] arm: vp9: Add NEON optimizations of VP9 MC functions
This work is sponsored by, and copyright, Google. The filter coefficients are signed values, where the product of the multiplication with one individual filter coefficient doesn't overflow a 16 bit signed value (the largest filter coefficient is 127). But when the products are accumulated, the resulting sum can overflow the 16 bit signed range. Instead of accumulating in 32 bit, we accumulate the largest product (either index 3 or 4) last with a saturated addition. (The VP8 MC asm does something similar, but slightly simpler, by accumulating each half of the filter separately. In the VP9 MC filters, each half of the filter can also overflow though, so the largest component has to be handled individually.) Examples of relative speedup compared to the C version, from checkasm: Cortex A7 A8 A9A53 vp9_avg4_neon: 1.71 1.15 1.42 1.49 vp9_avg8_neon: 2.51 3.63 3.14 2.58 vp9_avg16_neon: 2.95 6.76 3.01 2.84 vp9_avg32_neon: 3.29 6.64 2.85 3.00 vp9_avg64_neon: 3.47 6.67 3.14 2.80 vp9_avg_8tap_smooth_4h_neon: 3.22 4.73 2.76 4.67 vp9_avg_8tap_smooth_4hv_neon:3.67 4.76 3.28 4.71 vp9_avg_8tap_smooth_4v_neon: 5.52 7.60 4.60 6.31 vp9_avg_8tap_smooth_8h_neon: 6.22 9.04 5.12 9.32 vp9_avg_8tap_smooth_8hv_neon:6.38 8.21 5.72 8.17 vp9_avg_8tap_smooth_8v_neon: 9.22 12.66 8.15 11.10 vp9_avg_8tap_smooth_64h_neon:7.02 10.23 5.54 11.58 vp9_avg_8tap_smooth_64hv_neon: 6.76 9.46 5.93 9.40 vp9_avg_8tap_smooth_64v_neon: 10.76 14.13 9.46 13.37 vp9_put4_neon: 1.11 1.47 1.00 1.21 vp9_put8_neon: 1.23 2.17 1.94 1.48 vp9_put16_neon: 1.63 4.02 1.73 1.97 vp9_put32_neon: 1.56 4.92 2.00 1.96 vp9_put64_neon: 2.10 5.28 2.03 2.35 vp9_put_8tap_smooth_4h_neon: 3.11 4.35 2.63 4.35 vp9_put_8tap_smooth_4hv_neon:3.67 4.69 3.25 4.71 vp9_put_8tap_smooth_4v_neon: 5.45 7.27 4.49 6.52 vp9_put_8tap_smooth_8h_neon: 5.97 8.18 4.81 8.56 vp9_put_8tap_smooth_8hv_neon:6.39 7.90 5.64 8.15 vp9_put_8tap_smooth_8v_neon: 9.03 11.84 8.07 11.51 vp9_put_8tap_smooth_64h_neon:6.78 9.48 4.88 10.89 vp9_put_8tap_smooth_64hv_neon: 6.99 8.87 5.94 9.56 vp9_put_8tap_smooth_64v_neon: 10.69 13.30 9.43 14.34 For the larger 8tap filters, the speedup vs C code is around 5-14x. This is significantly faster than libvpx's implementation of the same functions, at least when comparing the put_8tap_smooth_64 functions (compared to vpx_convolve8_horiz_neon and vpx_convolve8_vert_neon from libvpx). Absolute runtimes from checkasm: Cortex A7A8A9 A53 vp9_put_8tap_smooth_64h_neon:20150.3 14489.4 19733.6 10863.7 libvpx vpx_convolve8_horiz_neon: 52623.3 19736.4 21907.7 25027.7 vp9_put_8tap_smooth_64v_neon:14455.0 12303.9 13746.49628.9 libvpx vpx_convolve8_vert_neon: 42090.0 17706.2 17659.9 16941.2 Thus, on the A9, the horizontal filter is only marginally faster than libvpx, while our version is significantly faster on the other cores, and the vertical filter is significantly faster on all cores. The difference is especially large on the A7. The libvpx implementation does the accumulation in 32 bit, which probably explains most of the differences. This is an adapted cherry-pick from libav commits ffbd1d2b0002576ef0d976a41ff959c635373fdc, 392caa65df3efa8b2d48a80f08a6af4892c61c08, 557c1675cf0e803b2fee43b4c8b58433842c84d0 and 11623217e3c9b859daee544e31acdd0821b61039. --- libavcodec/arm/Makefile | 2 + libavcodec/arm/vp9dsp_init_arm.c | 143 libavcodec/arm/vp9mc_neon.S | 709 +++ libavcodec/vp9.c | 20 +- libavcodec/vp9dsp.c | 1 + libavcodec/vp9dsp.h | 1 + 6 files changed, 872 insertions(+), 4 deletions(-) create mode 100644 libavcodec/arm/vp9dsp_init_arm.c create mode 100644 libavcodec/arm/vp9mc_neon.S diff --git a/libavcodec/arm/Makefile b/libavcodec/arm/Makefile index a4ceca7..82b740b 100644 --- a/libavcodec/arm/Makefile +++ b/libavcodec/arm/Makefile @@ -44,6 +44,7 @@ OBJS-$(CONFIG_MLP_DECODER) += arm/mlpdsp_init_arm.o OBJS-$(CONFIG_RV40_DECODER)+= arm/rv40dsp_init_arm.o OBJS-$(CONFIG_VORBIS_DECODER) += arm/vorbisdsp_init_arm.o OBJS-$(CONFIG_VP6_DECODER) += arm/vp6dsp_init_arm.o +OBJS-$(CONFIG_VP9_DECODER) += arm/vp9dsp_init_arm.o # ARMv5 optimizations @@ -139,3 +140,4 @@ NEON-OBJS-$(CONFIG_RV40_DECODER) += arm/rv34dsp_neon.o\ arm/rv40dsp_neon.o NEON-OBJS-$(CONFIG_VORBIS_DECODER) += arm/vorbisdsp_neon.o NEON-OBJS-$(CONFIG_VP6_DECODER)
[FFmpeg-devel] [PATCH 4/9] arm: vp9: Add NEON itxfm routines
This work is sponsored by, and copyright, Google. For the transforms up to 8x8, we can fit all the data (including temporaries) in registers and just do a straightforward transform of all the data. For 16x16, we do a transform of 4x16 pixels in 4 slices, using a temporary buffer. For 32x32, we transform 4x32 pixels at a time, in two steps of 4x16 pixels each. Examples of relative speedup compared to the C version, from checkasm: Cortex A7 A8 A9A53 vp9_inv_adst_adst_4x4_add_neon: 3.39 5.83 4.17 4.01 vp9_inv_adst_adst_8x8_add_neon: 3.79 4.86 4.23 3.98 vp9_inv_adst_adst_16x16_add_neon: 3.33 4.36 4.11 4.16 vp9_inv_dct_dct_4x4_add_neon: 4.06 6.16 4.59 4.46 vp9_inv_dct_dct_8x8_add_neon: 4.61 6.01 4.98 4.86 vp9_inv_dct_dct_16x16_add_neon: 3.35 3.44 3.36 3.79 vp9_inv_dct_dct_32x32_add_neon: 3.89 3.50 3.79 4.42 vp9_inv_wht_wht_4x4_add_neon: 3.22 5.13 3.53 3.77 Thus, the speedup vs C code is around 3-6x. This is mostly marginally faster than the corresponding routines in libvpx on most cores, tested with their 32x32 idct (compared to vpx_idct32x32_1024_add_neon). These numbers are slightly in libvpx's favour since their version doesn't clear the input buffer like ours do (although the effect of that on the total runtime probably is negligible.) Cortex A7 A8 A9 A53 vp9_inv_dct_dct_32x32_add_neon:18436.8 16874.1 14235.1 11988.9 libvpx vpx_idct32x32_1024_add_neon 20789.0 13344.3 15049.9 13030.5 Only on the Cortex A8, the libvpx function is faster. On the other cores, ours is slightly faster even though ours has got source block clearing integrated. This is an adapted cherry-pick from libav commits a67ae67083151f2f9595a1f2d17b601da19b939e and 52d196fb30fb6628921b5f1b31e7bd11eb7e1d9a. --- libavcodec/arm/Makefile |3 +- libavcodec/arm/vp9dsp_init_arm.c | 54 +- libavcodec/arm/vp9itxfm_neon.S | 1149 ++ 3 files changed, 1204 insertions(+), 2 deletions(-) create mode 100644 libavcodec/arm/vp9itxfm_neon.S diff --git a/libavcodec/arm/Makefile b/libavcodec/arm/Makefile index 82b740b..8602e28 100644 --- a/libavcodec/arm/Makefile +++ b/libavcodec/arm/Makefile @@ -140,4 +140,5 @@ NEON-OBJS-$(CONFIG_RV40_DECODER) += arm/rv34dsp_neon.o\ arm/rv40dsp_neon.o NEON-OBJS-$(CONFIG_VORBIS_DECODER) += arm/vorbisdsp_neon.o NEON-OBJS-$(CONFIG_VP6_DECODER)+= arm/vp6dsp_neon.o -NEON-OBJS-$(CONFIG_VP9_DECODER)+= arm/vp9mc_neon.o +NEON-OBJS-$(CONFIG_VP9_DECODER)+= arm/vp9itxfm_neon.o \ + arm/vp9mc_neon.o diff --git a/libavcodec/arm/vp9dsp_init_arm.c b/libavcodec/arm/vp9dsp_init_arm.c index bd0ac37..1d4eabf 100644 --- a/libavcodec/arm/vp9dsp_init_arm.c +++ b/libavcodec/arm/vp9dsp_init_arm.c @@ -94,7 +94,7 @@ define_8tap_2d_funcs(8) define_8tap_2d_funcs(4) -av_cold void ff_vp9dsp_init_arm(VP9DSPContext *dsp, int bpp) +static av_cold void vp9dsp_mc_init_arm(VP9DSPContext *dsp, int bpp) { int cpu_flags = av_get_cpu_flags(); @@ -141,3 +141,55 @@ av_cold void ff_vp9dsp_init_arm(VP9DSPContext *dsp, int bpp) init_mc_funcs_dirs(4, 4); } } + +#define define_itxfm(type_a, type_b, sz) \ +void ff_vp9_##type_a##_##type_b##_##sz##x##sz##_add_neon(uint8_t *_dst,\ + ptrdiff_t stride, \ + int16_t *_block, int eob) + +#define define_itxfm_funcs(sz) \ +define_itxfm(idct, idct, sz); \ +define_itxfm(iadst, idct, sz); \ +define_itxfm(idct, iadst, sz); \ +define_itxfm(iadst, iadst, sz) + +define_itxfm_funcs(4); +define_itxfm_funcs(8); +define_itxfm_funcs(16); +define_itxfm(idct, idct, 32); +define_itxfm(iwht, iwht, 4); + + +static av_cold void vp9dsp_itxfm_init_arm(VP9DSPContext *dsp, int bpp) +{ +int cpu_flags = av_get_cpu_flags(); + +if (bpp != 8) +return; + +if (have_neon(cpu_flags)) { +#define init_itxfm(tx, sz) \ +dsp->itxfm_add[tx][DCT_DCT] = ff_vp9_idct_idct_##sz##_add_neon; \ +dsp->itxfm_add[tx][DCT_ADST] = ff_vp9_iadst_idct_##sz##_add_neon; \ +dsp->itxfm_add[tx][ADST_DCT] = ff_vp9_idct_iadst_##sz##_add_neon; \ +dsp->itxfm_add[tx][ADST_ADST] = ff_vp9_iadst_iadst_##sz##_add_neon + +#define init_idct(tx, nm) \ +dsp->itxfm_add[tx][DCT_DCT] = \ +dsp->itxfm_add[tx][ADST_DCT] = \ +dsp->itxfm_add[tx][DCT_ADST] = \ +dsp->itxfm_add[tx][ADST_ADST] = ff_vp9_##nm##_add_neon + +init_itxfm(TX_4X4, 4x4); +init_itxfm(TX_8X8, 8x8); +init_itxfm(TX_16X16, 16x16); +init_idct(TX_32X32, idct_idct_32x32); +init_idct(4, iwht_iwht_4x4); +} +} +
[FFmpeg-devel] [PATCH 9/9] aarch64: vp9: Implement NEON loop filters
This work is sponsored by, and copyright, Google. These are ported from the ARM version; thanks to the larger amount of registers available, we can do the loop filters with 16 pixels at a time. The implementation is fully templated, with a single macro which can generate versions for both 8 and 16 pixels wide, for both 4, 8 and 16 pixels loop filters (and the 4/8 mixed versions as well). For the 8 pixel wide versions, it is pretty close in speed (the v_4_8 and v_8_8 filters are the best examples of this; the h_4_8 and h_8_8 filters seem to get some gain in the load/transpose/store part). For the 16 pixels wide ones, we get a speedup of around 1.2-1.4x compared to the 32 bit version. Examples of runtimes vs the 32 bit version, on a Cortex A53: ARM AArch64 vp9_loop_filter_h_4_8_neon: 144.0 127.2 vp9_loop_filter_h_8_8_neon: 207.0 182.5 vp9_loop_filter_h_16_8_neon: 415.0 328.7 vp9_loop_filter_h_16_16_neon:672.0 558.6 vp9_loop_filter_mix2_h_44_16_neon: 302.0 203.5 vp9_loop_filter_mix2_h_48_16_neon: 365.0 305.2 vp9_loop_filter_mix2_h_84_16_neon: 365.0 305.2 vp9_loop_filter_mix2_h_88_16_neon: 376.0 305.2 vp9_loop_filter_mix2_v_44_16_neon: 193.2 128.2 vp9_loop_filter_mix2_v_48_16_neon: 246.7 218.4 vp9_loop_filter_mix2_v_84_16_neon: 248.0 218.5 vp9_loop_filter_mix2_v_88_16_neon: 302.0 218.2 vp9_loop_filter_v_4_8_neon: 89.088.7 vp9_loop_filter_v_8_8_neon: 141.0 137.7 vp9_loop_filter_v_16_8_neon: 295.0 272.7 vp9_loop_filter_v_16_16_neon:546.0 453.7 The speedup vs C code in checkasm tests is around 2-7x, which is pretty much the same as for the 32 bit version. Even if these functions are faster than their 32 bit equivalent, the C version that we compare to also became around 1.3-1.7x faster than the C version in 32 bit. Based on START_TIMER/STOP_TIMER wrapping around a few individual functions, the speedup vs C code is around 4-5x. Examples of runtimes vs C on a Cortex A57 (for a slightly older version of the patch): A57 gcc-5.3 neon loop_filter_h_4_8_neon:256.6 93.4 loop_filter_h_8_8_neon:307.3 139.1 loop_filter_h_16_8_neon: 340.1 254.1 loop_filter_h_16_16_neon: 827.0 407.9 loop_filter_mix2_h_44_16_neon: 524.5 155.4 loop_filter_mix2_h_48_16_neon: 644.5 173.3 loop_filter_mix2_h_84_16_neon: 630.5 222.0 loop_filter_mix2_h_88_16_neon: 697.3 222.0 loop_filter_mix2_v_44_16_neon: 598.5 100.6 loop_filter_mix2_v_48_16_neon: 651.5 127.0 loop_filter_mix2_v_84_16_neon: 591.5 167.1 loop_filter_mix2_v_88_16_neon: 855.1 166.7 loop_filter_v_4_8_neon:271.7 65.3 loop_filter_v_8_8_neon:312.5 106.9 loop_filter_v_16_8_neon: 473.3 206.5 loop_filter_v_16_16_neon: 976.1 327.8 The speed-up compared to the C functions is 2.5 to 6 and the cortex-a57 is again 30-50% faster than the cortex-a53. This is an adapted cherry-pick from libav commits 9d2afd1eb8c5cc0633062430e66326dbf98c99e0 and 31756abe29eb039a11c59a42cb12e0cc2aef3b97. --- libavcodec/aarch64/Makefile |1 + libavcodec/aarch64/vp9dsp_init_aarch64.c | 48 ++ libavcodec/aarch64/vp9lpf_neon.S | 1355 ++ 3 files changed, 1404 insertions(+) create mode 100644 libavcodec/aarch64/vp9lpf_neon.S diff --git a/libavcodec/aarch64/Makefile b/libavcodec/aarch64/Makefile index e8a7f7a..b7bb898 100644 --- a/libavcodec/aarch64/Makefile +++ b/libavcodec/aarch64/Makefile @@ -43,4 +43,5 @@ NEON-OBJS-$(CONFIG_MPEGAUDIODSP)+= aarch64/mpegaudiodsp_neon.o NEON-OBJS-$(CONFIG_DCA_DECODER) += aarch64/synth_filter_neon.o NEON-OBJS-$(CONFIG_VORBIS_DECODER) += aarch64/vorbisdsp_neon.o NEON-OBJS-$(CONFIG_VP9_DECODER) += aarch64/vp9itxfm_neon.o \ + aarch64/vp9lpf_neon.o \ aarch64/vp9mc_neon.o diff --git a/libavcodec/aarch64/vp9dsp_init_aarch64.c b/libavcodec/aarch64/vp9dsp_init_aarch64.c index 2848608..7e34375 100644 --- a/libavcodec/aarch64/vp9dsp_init_aarch64.c +++ b/libavcodec/aarch64/vp9dsp_init_aarch64.c @@ -201,8 +201,56 @@ static av_cold void vp9dsp_itxfm_init_aarch64(VP9DSPContext *dsp, int bpp) } } +#define define_loop_filter(dir, wd, len) \ +void ff_vp9_loop_filter_##dir##_##wd##_##len##_neon(uint8_t *dst, ptrdiff_t stride, int E, int I, int H) + +#define define_loop_filters(wd, len) \ +define_loop_filter(h, wd, len); \ +define_loop_filter(v, wd, len) + +define_loop_filters(4, 8); +define_loop_filters(8, 8); +define_loop_filters(16, 8); + +define_loop_filters(16, 16); + +define_loop_filters(44, 16); +define_loop_filters(48, 16); +define_loop_filters(84, 16); +define_loop_filters(88, 16); + +static av_cold void vp9dsp_loopfilter_init_aarch64(VP9DSPContext *dsp, int bpp) +{ +int cpu_flags = av_get_cpu_flags(); + +if (bpp != 8) +ret
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
Le quartidi 24 brumaire, an CCXXV, Clement Boesch a écrit : > You can dereference once before entering the loop That requires one extra variable, and therefore leaves one less register for the loop itself. And of course, all these efficiency problems correspond directly to readability issues. Now that I think of it (I spent almost a year on this thing, and that part was at the very beginning), I remember that my reasons for choosing this solution were first readability, and only second efficiency. Regards, -- Nicolas George ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH 2/9] arm: Clear the gp register alias at the end of functions
We reset .Lpic_gp to zero at the start of each function, which means that the logic within movrelx for clearing gp when necessary will be missed. This fixes using movrelx in different functions with a different helper register. This is cherry-picked from libav commit 824e8c284054f323f854892d1b4739239ed1fdc7. --- libavutil/arm/asm.S | 3 +++ 1 file changed, 3 insertions(+) diff --git a/libavutil/arm/asm.S b/libavutil/arm/asm.S index e9b0bca..b0a6e50 100644 --- a/libavutil/arm/asm.S +++ b/libavutil/arm/asm.S @@ -77,6 +77,9 @@ ELF .section .note.GNU-stack,"",%progbits @ Mark stack as non-executable put_pic %(.Lpic_idx - 1) .noaltmacro .endif + .if .Lpic_gp +.unreq gp + .endif ELF .size \name, . - \name FUNC.endfunc .purgem endfunc -- 2.7.4 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH 1/9] vp9dsp: Deduplicate the subpel filters
Make them aligned, to allow efficient access to them from simd. This is an adapted cherry-pick from libav commit a4cfcddcb0f76e837d5abc06840c2b26c0e8aefc. --- libavcodec/vp9dsp.c | 56 +++ libavcodec/vp9dsp.h | 3 +++ libavcodec/vp9dsp_template.c | 63 +++- 3 files changed, 63 insertions(+), 59 deletions(-) diff --git a/libavcodec/vp9dsp.c b/libavcodec/vp9dsp.c index 54e77e2..6dd49c8 100644 --- a/libavcodec/vp9dsp.c +++ b/libavcodec/vp9dsp.c @@ -25,6 +25,62 @@ #include "libavutil/common.h" #include "vp9dsp.h" +const DECLARE_ALIGNED(16, int16_t, ff_vp9_subpel_filters)[3][16][8] = { +[FILTER_8TAP_REGULAR] = { +{ 0, 0, 0, 128, 0, 0, 0, 0 }, +{ 0, 1, -5, 126, 8, -3, 1, 0 }, +{ -1, 3, -10, 122, 18, -6, 2, 0 }, +{ -1, 4, -13, 118, 27, -9, 3, -1 }, +{ -1, 4, -16, 112, 37, -11, 4, -1 }, +{ -1, 5, -18, 105, 48, -14, 4, -1 }, +{ -1, 5, -19, 97, 58, -16, 5, -1 }, +{ -1, 6, -19, 88, 68, -18, 5, -1 }, +{ -1, 6, -19, 78, 78, -19, 6, -1 }, +{ -1, 5, -18, 68, 88, -19, 6, -1 }, +{ -1, 5, -16, 58, 97, -19, 5, -1 }, +{ -1, 4, -14, 48, 105, -18, 5, -1 }, +{ -1, 4, -11, 37, 112, -16, 4, -1 }, +{ -1, 3, -9, 27, 118, -13, 4, -1 }, +{ 0, 2, -6, 18, 122, -10, 3, -1 }, +{ 0, 1, -3, 8, 126, -5, 1, 0 }, +}, [FILTER_8TAP_SHARP] = { +{ 0, 0, 0, 128, 0, 0, 0, 0 }, +{ -1, 3, -7, 127, 8, -3, 1, 0 }, +{ -2, 5, -13, 125, 17, -6, 3, -1 }, +{ -3, 7, -17, 121, 27, -10, 5, -2 }, +{ -4, 9, -20, 115, 37, -13, 6, -2 }, +{ -4, 10, -23, 108, 48, -16, 8, -3 }, +{ -4, 10, -24, 100, 59, -19, 9, -3 }, +{ -4, 11, -24, 90, 70, -21, 10, -4 }, +{ -4, 11, -23, 80, 80, -23, 11, -4 }, +{ -4, 10, -21, 70, 90, -24, 11, -4 }, +{ -3, 9, -19, 59, 100, -24, 10, -4 }, +{ -3, 8, -16, 48, 108, -23, 10, -4 }, +{ -2, 6, -13, 37, 115, -20, 9, -4 }, +{ -2, 5, -10, 27, 121, -17, 7, -3 }, +{ -1, 3, -6, 17, 125, -13, 5, -2 }, +{ 0, 1, -3, 8, 127, -7, 3, -1 }, +}, [FILTER_8TAP_SMOOTH] = { +{ 0, 0, 0, 128, 0, 0, 0, 0 }, +{ -3, -1, 32, 64, 38, 1, -3, 0 }, +{ -2, -2, 29, 63, 41, 2, -3, 0 }, +{ -2, -2, 26, 63, 43, 4, -4, 0 }, +{ -2, -3, 24, 62, 46, 5, -4, 0 }, +{ -2, -3, 21, 60, 49, 7, -4, 0 }, +{ -1, -4, 18, 59, 51, 9, -4, 0 }, +{ -1, -4, 16, 57, 53, 12, -4, -1 }, +{ -1, -4, 14, 55, 55, 14, -4, -1 }, +{ -1, -4, 12, 53, 57, 16, -4, -1 }, +{ 0, -4, 9, 51, 59, 18, -4, -1 }, +{ 0, -4, 7, 49, 60, 21, -3, -2 }, +{ 0, -4, 5, 46, 62, 24, -3, -2 }, +{ 0, -4, 4, 43, 63, 26, -2, -2 }, +{ 0, -3, 2, 41, 63, 29, -2, -2 }, +{ 0, -3, 1, 38, 64, 32, -1, -3 }, +} +}; + + av_cold void ff_vp9dsp_init(VP9DSPContext *dsp, int bpp, int bitexact) { if (bpp == 8) { diff --git a/libavcodec/vp9dsp.h b/libavcodec/vp9dsp.h index 733f5bf..cb43f5e 100644 --- a/libavcodec/vp9dsp.h +++ b/libavcodec/vp9dsp.h @@ -120,6 +120,9 @@ typedef struct VP9DSPContext { vp9_scaled_mc_func smc[5][4][2]; } VP9DSPContext; + +extern const int16_t ff_vp9_subpel_filters[3][16][8]; + void ff_vp9dsp_init(VP9DSPContext *dsp, int bpp, int bitexact); void ff_vp9dsp_init_8(VP9DSPContext *dsp); diff --git a/libavcodec/vp9dsp_template.c b/libavcodec/vp9dsp_template.c index 4d810fe..bb54561 100644 --- a/libavcodec/vp9dsp_template.c +++ b/libavcodec/vp9dsp_template.c @@ -1991,61 +1991,6 @@ copy_avg_fn(4) #endif /* BIT_DEPTH != 12 */ -static const int16_t vp9_subpel_filters[3][16][8] = { -[FILTER_8TAP_REGULAR] = { -{ 0, 0, 0, 128, 0, 0, 0, 0 }, -{ 0, 1, -5, 126, 8, -3, 1, 0 }, -{ -1, 3, -10, 122, 18, -6, 2, 0 }, -{ -1, 4, -13, 118, 27, -9, 3, -1 }, -{ -1, 4, -16, 112, 37, -11, 4, -1 }, -{ -1, 5, -18, 105, 48, -14, 4, -1 }, -{ -1, 5, -19, 97, 58, -16, 5, -1 }, -{ -1, 6, -19, 88, 68, -18, 5, -1 }, -{ -1, 6, -19, 78, 78, -19, 6, -1 }, -{ -1, 5, -18, 68, 88, -19, 6, -1 }, -{ -1, 5, -16, 58, 97, -19, 5, -1 }, -{ -1, 4, -14, 48, 105, -18, 5, -1 }, -{ -1, 4, -11, 37, 112, -16, 4, -1 }, -{ -1, 3, -9, 27, 118, -13, 4, -1 }, -{ 0, 2, -6, 18, 122, -10, 3, -1 }, -{ 0, 1, -3, 8, 126, -5, 1, 0 }, -}, [FILTER_8TAP_SHARP] = { -{ 0, 0, 0, 128, 0, 0, 0, 0 }, -{ -1, 3, -7, 127, 8, -3, 1, 0 }, -{ -2, 5, -13, 125, 17, -6, 3, -1 }, -{ -3, 7, -17, 121, 27, -10,
Re: [FFmpeg-devel] [FFmpeg-cvslog] Merge commit '7a745f014f528d1001394ae4d2f4ed1a20bf7fa2'
On Sun, Nov 13, 2016 at 10:32:13PM +0100, Hendrik Leppkes wrote: > ffmpeg | branch: master | Hendrik Leppkes | Sun Nov 13 > 22:29:04 2016 +0100| [bd0db4a32d85d027da4d4dc00490c20048090312] | committer: > Hendrik Leppkes > > Merge commit '7a745f014f528d1001394ae4d2f4ed1a20bf7fa2' > > * commit '7a745f014f528d1001394ae4d2f4ed1a20bf7fa2': > options_table: Add aliases for color properties > > Merged-by: Hendrik Leppkes > > > http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=bd0db4a32d85d027da4d4dc00490c20048090312 > --- > > libavcodec/options_table.h | 32 > libavcodec/version.h | 2 +- > 2 files changed, 25 insertions(+), 9 deletions(-) Please dont forget to add documentation for every option you add. IIRC Libav has code to automatically generate documentation from AVOptions. Several FFmpeg developers objected against enabling that code. IIRC because hand written documentation is of much higher quality ... You could of course also try to ping the people previously objecting to auto generated docs, maybe they would be interrested in helping. If there is interrest i can also attempt to write a fate test to detect newly added options without docs? [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Complexity theory is the science of finding the exact solution to an approximation. Benchmarking OTOH is finding an approximation of the exact signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
Le tridi 23 brumaire, an CCXXV, James Almer a écrit : > Why do you always only accept or consider replies and arguments from other > people when they exclusively fit the narrative you expect them to be? You're > literally telling me what i said is worth nothing because it wasn't, or you > assumed it wasn't, exactly what you wanted. Your goal is to convince me, is it not? Well, I am telling you what kind of argument is, I expect, necessary to do so. Is it not better that I tell you instead of just arguing and letting you figure it by yourself? It is not specific to me: if you want to convince anyone that A is better than B, you have to address the reasons they have to think that B is better than A. Otherwise, you just show them that you do not understand the issue as well as them. The only thing about me is that I am more conscious of these phenomenons than others, having spent so much of my youth bikeshedding trivial matters on NNTP forums. > I very much rather stick with a clean, longstanding and proven to work method > that doesn't fill the public headers with ifdeffery, doesn't require extra > care > to avoid including a new special kind of volatile header that should only be > used in a specific way, and that doesn't make future merges more a pain in the > ass than they already are. It is the second time you invoke the "longstanding and proven to work" argument. Do you have any reason to think that my proposal does not work? If not, withdraw this argument, because it only amounts to FUD. For the merges, it has been a long time since merges on the lavfi framework were mostly impossible. The fork has the scheduling in lavfi completely broken. > One indirection is not going to make a difference, even if we catalog it as a > considerable disadvantage for the sake of this discussion. It's not enough of > an argument in favor or replacing it with your design when you weight it with > the above. > > And "Aesthetics comes second" many times ends up in spaghetti and nigh > unreadable and maintainable code, so lets try to not go that route if > possible. > Ask Ronald about his libswr simd refactor work if you want anecdotal evidence, > or refer to avutil/common.h As I said to Clément a few minutes ago, I remembered that my reasons for choosing this solution were mostly readability. The extra efficiency is just a bonus. You seem to consider consistency in the code to be a goal by itself. It is not, it is only a means to an end, the end being simplicity, readability, maintenability. The version with the private structure and an indirection leads to more complex, less readable code. I say we throw consistency away, we go for the simpler and more readable code, as it happens it is more efficient too. And when everybody is convinced that it is satisfactory, we bring back consistency the other way around, porting other parts of the code to that new, better pattern. I can explain why the indirection gives more complex code, but not right now because the time I have is elapsed. Regards, -- Nicolas George ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] MPEGTS UDP Stream Overflow & Underflow Case Fixes
Hello, This patch is to close Ticker 4155: https://trac.ffmpeg.org/ticket/4155 Actualy, patch is made by "rycius". I just updated the diff file to date and separated into two parts with making a small change a) Overflow fix exit fix b) Underflow timebase fix Kind Regards, Ali KIZIL 0001-Fix-UDP-MPEGTS-Overflow-Exit-Case.patch Description: Binary data 0002-UDP-MPEGTS-Fixing-timestamps-while-filling-queue.patch Description: Binary data ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
On Sun, Nov 13, 2016 at 11:54:23PM +0100, Nicolas George wrote: > Le tridi 23 brumaire, an CCXXV, James Almer a écrit : > > AVSomethingInternal *internal; > > > No technical advantages i can think of compared to your method, but it's > > thoroughly tested, clean, and above all ubiquitous. See for example > > AVFilterInternal and AVCodecInternal. > > It has technical DISadvantages, and therefore needs a good reason to > choose. I would really appreciate that objections acknowledge and > discuss them, however minimally. In other words, unless your message > contains a sense that looks like that: > > Compared to yours, my solution has the drawback ofhere>, but I still think it is better because . > > ... I consider that the advice was given without having thought about > all the implications and therefore cannot be really valuable. > The pros for an internal fields are, for me: - consistency across the codebase - definitions more readable (if you're advocating the fact that a multiline define is as readable I'll stop reading right now) - common C pattern, accessible to other developers - tools will likely like it more (doxygen, looking up a field within your IDE/editor, GDB, or any other tool really) - we can still embed various ifdefery in the structure (for FF_API ifdefery, or simply defining constants above a field) So please, I don't want FFmpeg to be again the laughing stock of readability on the Internet just because a structure dereferencing was considered too much overhead. Regards, -- Clément B. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] MPEGTS UDP Stream Overflow & Underflow Case Fixes
2016-11-14 12:21 GMT+01:00 Ali KIZIL : > This patch is to close Ticker 4155: https://trac.ffmpeg.org/ticket/4155 > > Actualy, patch is made by "rycius". Then why does the patch claim to be written by you? Carl Eugen ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] MPEGTS UDP Stream Overflow & Underflow Case Fixes
2016-11-14 14:31 GMT+03:00 Carl Eugen Hoyos : > 2016-11-14 12:21 GMT+01:00 Ali KIZIL : > > > This patch is to close Ticker 4155: https://trac.ffmpeg.org/ticket/4155 > > > > Actualy, patch is made by "rycius". > > Then why does the patch claim to be written by you? > > Carl Eugen > ___ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel > I dont claim anything and I also said it is not written by me. I just got diff file of "rycius" and made 2 git format patch files. In his diff, there were 2 sections doing the same fixes in a slightly different way. I split his diff file for overflow and underflow fixes. I didnt touch to overflow fix at all. I modified his underflow fix as it was not working. It was blocking the UDP MpegTS packet output. So, I re-placed his timebase suggestions in the while loop. That's it. If I did something wrong or missing, I am sorry. There is no bad will. Just, I shared his trac patch in Devel ML to be applied. Regards, ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
Le quartidi 24 brumaire, an CCXXV, Clement Boesch a écrit : > The pros for an internal fields are, for me: > > - consistency across the codebase > - definitions more readable (if you're advocating the fact that a > multiline define is as readable I'll stop reading right now) > - common C pattern, accessible to other developers > - tools will likely like it more (doxygen, looking up a field within your > IDE/editor, GDB, or any other tool really) > - we can still embed various ifdefery in the structure (for FF_API > ifdefery, or simply defining constants above a field) I do not disagree with all that. But you are speaking of the readability of the structure definition, which is rather minor. Your forget another much more important point : with the indirection, the CODE that USES the structure becomes much less readable. Making the structure definition a little less readable to make the actual code a lot simpler, I say it is a very interesting compromise. Code needs readability more than definitions. I can propose another approach to avoid the indirection but address a few of your points : struct AVFilterLink { int public1; int public2; ... #ifdef FF_PRIVATE_FIELDS int private1; int private2; ... #endif }; In particular, I think it will work perfectly fine with Doxygen, IDEs, etc. > So please, I don't want FFmpeg to be again the laughing stock of > readability on the Internet Please keep the emotional arguments out of it. Regards, -- Nicolas George ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] MPEGTS UDP Stream Overflow & Underflow Case Fixes
2016-11-14 12:41 GMT+01:00 Ali KIZIL : > I dont claim anything and I also said it is not written by me. In your mail you wrote 'patch is made by "rycius"'. The git-formatted patch you attached has the following in its author field: "smallishzulu ". This is a contradiction. I don't know who wrote the patch (I did not look at it except for the header). No patch can be committed where the sender claims that the actual author is different from the author in the commit field of the patch. Carl Eugen ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
On Mon, Nov 14, 2016 at 01:06:30PM +0100, Nicolas George wrote: > Le quartidi 24 brumaire, an CCXXV, Clement Boesch a écrit : > > The pros for an internal fields are, for me: > > > > - consistency across the codebase > > - definitions more readable (if you're advocating the fact that a > > multiline define is as readable I'll stop reading right now) > > - common C pattern, accessible to other developers > > - tools will likely like it more (doxygen, looking up a field within your > > IDE/editor, GDB, or any other tool really) > > - we can still embed various ifdefery in the structure (for FF_API > > ifdefery, or simply defining constants above a field) > > I do not disagree with all that. But you are speaking of the readability > of the structure definition, which is rather minor. > > Your forget another much more important point : with the indirection, > the CODE that USES the structure becomes much less readable. > I don't see how. You add a pointer at the beginning of your function pointing on that internal structure and use that. > Making the structure definition a little less readable to make the > actual code a lot simpler, I say it is a very interesting compromise. > Code needs readability more than definitions. > > I can propose another approach to avoid the indirection but address a > few of your points : > > struct AVFilterLink { > int public1; > int public2; > ... > #ifdef FF_PRIVATE_FIELDS > int private1; > int private2; > ... > #endif > }; > > In particular, I think it will work perfectly fine with Doxygen, IDEs, > etc. > That might not prevent tools from wrongly detecting ABI breakage from headers. The best way to have private fields is to make them private for real, so I would still prefer an opaque pointer, "leaking" zero information. > > So please, I don't want FFmpeg to be again the laughing stock of > > readability on the Internet > > Please keep the emotional arguments out of it. > It's not an emotional argument, it's a political one. "look at the shit FFmpeg developers still do" does not help the project from getting contributors/contributions. -- Clément B. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] MPEGTS UDP Stream Overflow & Underflow Case Fixes
2016-11-14 15:15 GMT+03:00 Carl Eugen Hoyos : > 2016-11-14 12:41 GMT+01:00 Ali KIZIL : > > > I dont claim anything and I also said it is not written by me. > > In your mail you wrote 'patch is made by "rycius"'. > The git-formatted patch you attached has the following > in its author field: "smallishzulu ". > This is a contradiction. > > I don't know who wrote the patch (I did not look at it > except for the header). No patch can be committed > where the sender claims that the actual author is > different from the author in the commit field of the patch. > > Carl Eugen > ___ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel > I see. The originator is rycius. I saw his diff file and nick on https://trac.ffmpeg.org/ticket/4155 I have no clue on his real name or email. I can resend the patchs by updating them with his nickname if required. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] MPEGTS UDP Stream Overflow & Underflow Case Fixes
2016-11-14 13:24 GMT+01:00 Ali KIZIL : > 2016-11-14 15:15 GMT+03:00 Carl Eugen Hoyos : > >> 2016-11-14 12:41 GMT+01:00 Ali KIZIL : >> >> > I dont claim anything and I also said it is not written by me. >> >> In your mail you wrote 'patch is made by "rycius"'. >> The git-formatted patch you attached has the following >> in its author field: "smallishzulu ". >> This is a contradiction. >> >> I don't know who wrote the patch (I did not look at it >> except for the header). No patch can be committed >> where the sender claims that the actual author is >> different from the author in the commit field of the patch. >> > The originator is rycius. I saw his diff file and nick on > https://trac.ffmpeg.org/ticket/4155 > I have no clue on his real name or email. Use rycius as his name (it is the name he chose), his email is rycius at ryci dot us Please remove the mail footer when replying, Carl Eugen ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/9] arm: Clear the gp register alias at the end of functions
Hi, assume this is my reply on all patches in the range 2-9: On Mon, Nov 14, 2016 at 5:32 AM, Martin Storsjö wrote: > We reset .Lpic_gp to zero at the start of each function, which means > that the logic within movrelx for clearing gp when necessary will > be missed. > > This fixes using movrelx in different functions with a different > helper register. > > This is cherry-picked from libav commit > 824e8c284054f323f854892d1b4739239ed1fdc7. > --- > libavutil/arm/asm.S | 3 +++ > 1 file changed, 3 insertions(+) I have no idea about any of this stuff, so I can't give a proper technical review. I've followed review by actually knowledgeable people on these patches on libav-devel and I think they have been sufficiently reviewed, so I'll push this whole set tomorrow unless people object. Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/9] vp9dsp: Deduplicate the subpel filters
Hi, On Mon, Nov 14, 2016 at 5:32 AM, Martin Storsjö wrote: > Make them aligned, to allow efficient access to them from simd. > > This is an adapted cherry-pick from libav commit > a4cfcddcb0f76e837d5abc06840c2b26c0e8aefc. > --- > libavcodec/vp9dsp.c | 56 +++ > libavcodec/vp9dsp.h | 3 +++ > libavcodec/vp9dsp_template.c | 63 +++--- > -- > 3 files changed, 63 insertions(+), 59 deletions(-) OK. Do I need to queue them up? I thought they would be merged automagically from Libav... Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/9] vp9dsp: Deduplicate the subpel filters
On Mon, 14 Nov 2016, Ronald S. Bultje wrote: Hi, On Mon, Nov 14, 2016 at 5:32 AM, Martin Storsjö wrote: Make them aligned, to allow efficient access to them from simd. This is an adapted cherry-pick from libav commit a4cfcddcb0f76e837d5abc06840c2b26c0e8aefc. --- libavcodec/vp9dsp.c | 56 +++ libavcodec/vp9dsp.h | 3 +++ libavcodec/vp9dsp_template.c | 63 +++--- -- 3 files changed, 63 insertions(+), 59 deletions(-) OK. Do I need to queue them up? Yes, that'd be appreciated. I thought they would be merged automagically from Libav... In principle, but the merging is quite far behind at the moment. I've included the commit hashes of all included commits to make it clear which commits can be no-oped in future merges at least. Also for the record, it has been tested on linux, iOS and with the MSVC toolchain (in wine). // Martin ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] MPEGTS UDP Stream Overflow & Underflow Case Fixes
2016-11-14 15:52 GMT+03:00 Carl Eugen Hoyos : > 2016-11-14 13:24 GMT+01:00 Ali KIZIL : > > 2016-11-14 15:15 GMT+03:00 Carl Eugen Hoyos : > > > >> 2016-11-14 12:41 GMT+01:00 Ali KIZIL : > >> > >> > I dont claim anything and I also said it is not written by me. > >> > >> In your mail you wrote 'patch is made by "rycius"'. > >> The git-formatted patch you attached has the following > >> in its author field: "smallishzulu ". > >> This is a contradiction. > >> > >> I don't know who wrote the patch (I did not look at it > >> except for the header). No patch can be committed > >> where the sender claims that the actual author is > >> different from the author in the commit field of the patch. > >> > > The originator is rycius. I saw his diff file and nick on > > https://trac.ffmpeg.org/ticket/4155 > > I have no clue on his real name or email. > > Use rycius as his name (it is the name he chose), his email > is rycius at ryci dot us > > Please remove the mail footer when replying, Carl Eugen > Please find the attached updated patch files. 0001-Fix-UDP-MPEGTS-Overflow-Exit-Case.patch Description: Binary data 0002-UDP-MPEGTS-Fixing-timestamps-while-filling-queue.patch Description: Binary data ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] avcodec/libx264: fix forced_idr logic
On 11/11/2016 4:01 PM, Timo Rothenpieler wrote: > -1 and 0 both mean false now, and I left in the option to pass -1 to > stay compatible with possible 3rd parties who pass it. Well that's certainly non-obvious... > So changing to default to 0 doesn't really matter, and I decided to keep > the patch minimal. > Changing to default to 0 would be fine as well. This patch LGTM if you push a 2nd patch with it that changes the default to 0. - Derek ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] lavf/mov.c: Set skip_samples according to first edit list, when -ignore_editlist is set.
On 11/12/2016 6:02 AM, Sasi Inguva wrote: > +/* Adjust skip_samples correctly when ignore_editlist is set, to support > gapless playback */ > +if (mov->ignore_editlist && sc->num_non_empty_elst <= 1 && > +st->codecpar->codec_id == AV_CODEC_ID_AAC && > sc->first_elist_start_time > 0) { > +sc->start_pad = sc->first_elist_start_time; > +} Doesn't this make the name "ignore_editlist" rather untrue? Also, pinging wm4. - Derek ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH]lavf/mov: Fix an out-of-bound-read in mov_read_mac_string().
Hi! I believe attached patch fixes an out-of-bound-read in mov_read_mac_string() if pFrom 6c69f755e1c4f22d1efb36777631c98a4d20ffef Mon Sep 17 00:00:00 2001 From: Carl Eugen Hoyos Date: Mon, 14 Nov 2016 14:52:58 +0100 Subject: [PATCH] lavf/mov: Fix an out-of-bound-read in mov_read_mac_string(). --- libavformat/mov.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libavformat/mov.c b/libavformat/mov.c index 8d6cc12..21556be 100644 --- a/libavformat/mov.c +++ b/libavformat/mov.c @@ -160,7 +160,7 @@ static int mov_read_mac_string(MOVContext *c, AVIOContext *pb, int len, uint8_t t, c = avio_r8(pb); if (c < 0x80 && p < end) *p++ = c; -else if (p < end) +else if (c >= 0x80 && p < end) PUT_UTF8(mac_to_unicode[c-0x80], t, if (p < end) *p++ = t;); } *p = 0; -- 1.7.10.4 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/9] vp9dsp: Deduplicate the subpel filters
On Mon, Nov 14, 2016 at 12:32:19PM +0200, Martin Storsjö wrote: > Make them aligned, to allow efficient access to them from simd. > > This is an adapted cherry-pick from libav commit > a4cfcddcb0f76e837d5abc06840c2b26c0e8aefc. > --- > libavcodec/vp9dsp.c | 56 +++ > libavcodec/vp9dsp.h | 3 +++ > libavcodec/vp9dsp_template.c | 63 > +++- > 3 files changed, 63 insertions(+), 59 deletions(-) patchset tested with fate on arm-qemu, all fate tests pass [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB No human being will ever know the Truth, for even if they happen to say it by chance, they would not even known they had done so. -- Xenophanes signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
Le quartidi 24 brumaire, an CCXXV, Clement Boesch a écrit : > I don't see how. You add a pointer at the beginning of your function > pointing on that internal structure and use that. That makes extra code. Remember we are talking about lavfi: we frequently have inlink1, inlink2, outlink. If we must have inlink1priv, inlink2priv, outlinkpriv on top of that, that is starting to make a lot of namespace pollution. Also, good luck keeping the variables names in sync. Someone will rename link to inlink someday, but not the corresponding internal variable, and the code will become an unreadable mess because variable that mean the same thing do not have a similar name. Plus, even with a single link, whether we access it using the indirection explicitly "link->internal->x" or with a dedicated variable "linkpriv->x", we still have to remember, for every damn bloody field, whether it is public or private. For codec/filter private context, there is a rather simple rule: if it is specific to the filter, it is private, if it is common to all filters it is in the common context. But it is still pretty annoying to have two variables and to must decide every time which one to use. And I waste a non-negligible amount of time getting it wrong somehow (for example after a copy-paste) and fixing it. For private/public fields, the rule is much more ambiguous, a lot of fields are visible but could be internal, or are in the public structure below the "/*private*/" line. If we were to move a field from visible to internal, or the other way around, we would have to fix all the code that uses it. Sorry, this is bad design, I refuse to have that in my code. I want to write link->fifo the same way I write link->src or link->time_base. You can reject my proposals, sure, but I can reject any proposal that do not verify this property. > That might not prevent tools from wrongly detecting ABI breakage from > headers. Can you explain? > The best way to have private fields is to make them private for > real, so I would still prefer an opaque pointer, "leaking" zero > information. And making the corresponding code unreadable, as I just explained. Rejected. > It's not an emotional argument, it's a political one. "look at the shit > FFmpeg developers still do" does not help the project from getting > contributors/contributions. Or maybe it help the project not getting bad contributors. Regards, -- Nicolas George signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
On Mon, Nov 14, 2016 at 04:52:01PM +0100, Nicolas George wrote: > Le quartidi 24 brumaire, an CCXXV, Clement Boesch a écrit : > > I don't see how. You add a pointer at the beginning of your function > > pointing on that internal structure and use that. > > That makes extra code. Remember we are talking about lavfi: we > frequently have inlink1, inlink2, outlink. If we must have inlink1priv, > inlink2priv, outlinkpriv on top of that, that is starting to make a lot > of namespace pollution. > > Also, good luck keeping the variables names in sync. Someone will rename > link to inlink someday, but not the corresponding internal variable, and > the code will become an unreadable mess because variable that mean the > same thing do not have a similar name. > > Plus, even with a single link, whether we access it using the > indirection explicitly "link->internal->x" or with a dedicated variable > "linkpriv->x", we still have to remember, for every damn bloody field, > whether it is public or private. > > For codec/filter private context, there is a rather simple rule: if it > is specific to the filter, it is private, if it is common to all filters > it is in the common context. But it is still pretty annoying to have two > variables and to must decide every time which one to use. And I waste a > non-negligible amount of time getting it wrong somehow (for example > after a copy-paste) and fixing it. > > For private/public fields, the rule is much more ambiguous, a lot of > fields are visible but could be internal, or are in the public structure > below the "/*private*/" line. If we were to move a field from visible to > internal, or the other way around, we would have to fix all the code > that uses it. > > Sorry, this is bad design, I refuse to have that in my code. I want to > write link->fifo the same way I write link->src or link->time_base. You > can reject my proposals, sure, but I can reject any proposal that do not > verify this property. > You could also make a local copy of the fields you need; it is -in my opinion- not that much noise to add one or a few more lines of declaration. > > That might not prevent tools from wrongly detecting ABI breakage from > > headers. > > Can you explain? > Some tool parsing the public header for analyzing structure layout changes in public structure to detect ABI and API breakage would not be able to make a difference in which fields are public or private. BTW, it also means you're showing all the internals and their doc to the users, which is a lot of noise for them. > > It's not an emotional argument, it's a political one. "look at the shit > > FFmpeg developers still do" does not help the project from getting > > contributors/contributions. > > Or maybe it help the project not getting bad contributors. > Then I guess I'm a bad contributor. Your private ifdefery suggestion is probably the least worst, but I still prefer the common paradigm for previously said reasons that IMO still stand. Anyway, I have your points and you have mine, and my care security fuse has blown, so I'm out of the debate. Regards, -- Clément B. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
Le quartidi 24 brumaire, an CCXXV, Clement Boesch a écrit : > You could also make a local copy of the fields you need; it is -in my > opinion- not that much noise to add one or a few more lines of > declaration. When the fields are used read-only, probably. Still more code for no good reason. When the fields are used read-write, it would require also storing them at the end. And of course it does not work in more complex cases, such as when one of the field is accessed concurrently by threads. > Some tool parsing the public header for analyzing structure layout changes > in public structure to detect ABI and API breakage would not be able to > make a difference in which fields are public or private. Ok. Obviously you are speaking about third-party tools. These tools must rely on the public headers. And, of course, they have to use the preprocessor correctly, otherwise all matter of libraries, not just ffmpeg, will not work. > BTW, it also means you're showing all the internals and their doc to the > users, which is a lot of noise for them. I agree, it is not ideal. I can see a few ways of mitigating that, not very elegant. But I think even that way it is still acceptable. > Then I guess I'm a bad contributor. That is not what I meant, and obviously you were not disgusted to the point of leaving. > Your private ifdefery suggestion is probably the least worst, but I still > prefer the common paradigm for previously said reasons that IMO still > stand. What about: struct AVFilterLink { ... #ifdef FF_INTERNAL_API #inclde "avfilterlink_internal.h" #endif That avoids your objection about showing the internal fields to the users, while avoiding multiline macro definitions and Doxygen trouble. As a side note, I notice that I have quite a few "defined(__KERNEL__)" lying around in /usr/include/. Regards, -- Nicolas George signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH] ffmpeg: simplify init_output_stream_streamcopy()
Skip the intermediary codecpar struct and instead copy everything to the destination codecpar struct directly. Signed-off-by: James Almer --- ffmpeg.c | 17 +++-- ffmpeg.h | 1 - ffmpeg_opt.c | 6 -- 3 files changed, 7 insertions(+), 17 deletions(-) diff --git a/ffmpeg.c b/ffmpeg.c index a6e9476..8a6978d 100644 --- a/ffmpeg.c +++ b/ffmpeg.c @@ -536,7 +536,6 @@ static void ffmpeg_cleanup(int ret) av_dict_free(&ost->sws_dict); avcodec_free_context(&ost->enc_ctx); -avcodec_parameters_free(&ost->ref_par); while (ost->muxing_queue && av_fifo_size(ost->muxing_queue)) { AVPacket pkt; @@ -2878,31 +2877,29 @@ static int init_output_stream_streamcopy(OutputStream *ost) OutputFile *of = output_files[ost->file_index]; InputStream *ist = get_input_stream(ost); AVCodecParameters *par_dst = ost->st->codecpar; -AVCodecParameters *par_src = ost->ref_par; +AVCodecParameters *par_src = ist->st->codecpar; AVRational sar; int i, ret; uint32_t codec_tag = par_dst->codec_tag; av_assert0(ist && !ost->filter); -avcodec_parameters_to_context(ost->enc_ctx, ist->st->codecpar); +avcodec_parameters_to_context(ost->enc_ctx, par_src); ret = av_opt_set_dict(ost->enc_ctx, &ost->encoder_opts); if (ret < 0) { av_log(NULL, AV_LOG_FATAL, "Error setting up codec context options.\n"); return ret; } -avcodec_parameters_from_context(par_src, ost->enc_ctx); if (!codec_tag) { -unsigned int codec_tag_tmp; if (!of->ctx->oformat->codec_tag || -av_codec_get_id (of->ctx->oformat->codec_tag, par_src->codec_tag) == par_src->codec_id || -!av_codec_get_tag2(of->ctx->oformat->codec_tag, par_src->codec_id, &codec_tag_tmp)) -codec_tag = par_src->codec_tag; +av_codec_get_id (of->ctx->oformat->codec_tag, ost->enc_ctx->codec_tag) == ost->enc_ctx->codec_id || +av_codec_get_tag(of->ctx->oformat->codec_tag, ost->enc_ctx->codec_id) <= 0) +codec_tag = ost->enc_ctx->codec_tag; } -ret = avcodec_parameters_copy(par_dst, par_src); +ret = avcodec_parameters_from_context(par_dst, ost->enc_ctx); if (ret < 0) return ret; @@ -2973,7 +2970,7 @@ static int init_output_stream_streamcopy(OutputStream *ost) else if (ist->st->sample_aspect_ratio.num) sar = ist->st->sample_aspect_ratio; else -sar = par_src->sample_aspect_ratio; +sar = par_dst->sample_aspect_ratio; ost->st->sample_aspect_ratio = par_dst->sample_aspect_ratio = sar; ost->st->avg_frame_rate = ist->st->avg_frame_rate; ost->st->r_frame_rate = ist->st->r_frame_rate; diff --git a/ffmpeg.h b/ffmpeg.h index ebe5bf0..1256b1f 100644 --- a/ffmpeg.h +++ b/ffmpeg.h @@ -450,7 +450,6 @@ typedef struct OutputStream { AVBSFContext**bsf_ctx; AVCodecContext *enc_ctx; -AVCodecParameters *ref_par; /* associated input codec parameters with encoders options applied */ AVCodec *enc; int64_t max_frames; AVFrame *filtered_frame; diff --git a/ffmpeg_opt.c b/ffmpeg_opt.c index ddda3f2..6b802ca 100644 --- a/ffmpeg_opt.c +++ b/ffmpeg_opt.c @@ -1266,12 +1266,6 @@ static OutputStream *new_output_stream(OptionsContext *o, AVFormatContext *oc, e } ost->enc_ctx->codec_type = type; -ost->ref_par = avcodec_parameters_alloc(); -if (!ost->ref_par) { -av_log(NULL, AV_LOG_ERROR, "Error allocating the encoding parameters.\n"); -exit_program(1); -} - if (ost->enc) { AVIOContext *s = NULL; char *buf = NULL, *arg = NULL, *preset = NULL; -- 2.10.1 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [Patch] hwaccel transocode broken
On Sun, 13 Nov 2016 23:50:03 +0100 Carl Eugen Hoyos wrote: > 2016-11-11 12:15 GMT+01:00 wm4 : > > On Fri, 11 Nov 2016 10:19:08 +0100 > > Carl Eugen Hoyos wrote: > > > >> 2016-11-11 9:03 GMT+01:00 Hendrik Leppkes : > >> > On Fri, Nov 11, 2016 at 6:34 AM, Yogender Gupta > >> > wrote: > > >> >> The included patch should fix. > >> >> > >> > > >> > ffmpeg.c is in the middle of some changes being merged, and once that > >> > is done ffmpeg_cuvid.c will need to be adapted to a new system. Until > >> > then, it would make everything easier if the code actively being > >> > modified by these merges is not modified by hackish fixes. > >> > >> Iirc, the documentation requires to only commit patches that do not > >> (knowingly) break functionality and imo, this implies that you should > >> quickly fix such bugs once they are reported. Or do I miss something? > > > > These are merges. Unless you volunteer to do the merges yourself (and > > properly please) you'll have to live with this. > > Not sure I understand: > Do you mean merges are exempt from any rules and may randomly > break features or introduce bugs? Yes. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/2] lavfi: add FFFrameQueue API.
On Sun, 13 Nov 2016 11:10:20 +0100 Nicolas George wrote: > Signed-off-by: Nicolas George > --- > libavfilter/Makefile | 1 + > libavfilter/framequeue.c | 123 + > libavfilter/framequeue.h | 173 > +++ > 3 files changed, 297 insertions(+) > create mode 100644 libavfilter/framequeue.c > create mode 100644 libavfilter/framequeue.h > > > Both preliminary patches pushed, and actual changes rebased. > > It really needs a review. Maybe not the global logic (part of the job next > will be to ensure that I am no longer a single point of failure for lavfi > maintenance) but at least details of the code and cosmetic points like > private_fields.h. > > > diff --git a/libavfilter/Makefile b/libavfilter/Makefile > index cdddb1b..35bf11b 100644 > --- a/libavfilter/Makefile > +++ b/libavfilter/Makefile > @@ -18,6 +18,7 @@ OBJS = allfilters.o > \ > fifo.o \ > formats.o\ > framepool.o \ > + framequeue.o \ > graphdump.o \ > graphparser.o\ > opencl_allkernels.o \ > diff --git a/libavfilter/framequeue.c b/libavfilter/framequeue.c > new file mode 100644 > index 000..debeab2 > --- /dev/null > +++ b/libavfilter/framequeue.c > @@ -0,0 +1,123 @@ > +/* > + * Generic frame queue > + * Copyright (c) 2016 Nicolas George > + * > + * This file is part of FFmpeg. > + * > + * FFmpeg is free software; you can redistribute it and/or > + * modify it under the terms of the GNU Lesser General Public License > + * as published by the Free Software Foundation; either > + * version 2.1 of the License, or (at your option) any later version. > + * > + * FFmpeg is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU Lesser General Public License for more details. > + * > + * You should have received a copy of the GNU Lesser General Public License > + * along with FFmpeg; if not, write to the Free Software Foundation, Inc., > + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA > + */ > + > +#include "libavutil/avassert.h" > +#include "framequeue.h" > + > +static inline FFFrameBucket *bucket(FFFrameQueue *fq, size_t idx) > +{ > +return &fq->queue[(fq->tail + idx) & (fq->allocated - 1)]; > +} > + > +void ff_framequeue_global_init(FFFrameQueueGlobal *fqg) > +{ > +} > + > +static void check_consistency(FFFrameQueue *fq) > +{ > +#if ASSERT_LEVEL >= 2 > +uint64_t nb_samples = 0; > +size_t i; > + > +av_assert0(fq->queued == fq->total_frames_head - fq->total_frames_tail); > +for (i = 0; i < fq->queued; i++) > +nb_samples += bucket(fq, i)->frame->nb_samples; > +av_assert0(nb_samples == fq->total_samples_head - > fq->total_samples_tail); > +#endif > +} > + > +void ff_framequeue_init(FFFrameQueue *fq, FFFrameQueueGlobal *fqg) > +{ > +fq->queue = &fq->first_bucket; > +fq->allocated = 1; > +} > + > +void ff_framequeue_free(FFFrameQueue *fq) > +{ > +while (fq->queued) { > +AVFrame *frame = ff_framequeue_take(fq); > +av_frame_free(&frame); > +} > +if (fq->queue != &fq->first_bucket) > +av_freep(&fq->queue); > +} > + > +int ff_framequeue_add(FFFrameQueue *fq, AVFrame *frame) > +{ > +FFFrameBucket *b; > + > +check_consistency(fq); > +if (fq->queued == fq->allocated) { > +if (fq->allocated == 1) { > +size_t na = 8; > +FFFrameBucket *nq = av_realloc_array(NULL, na, sizeof(*nq)); > +if (!nq) > +return AVERROR(ENOMEM); > +nq[0] = fq->queue[0]; > +fq->queue = nq; > +fq->allocated = na; > +} else { > +size_t na = fq->allocated << 1; > +FFFrameBucket *nq = av_realloc_array(fq->queue, na, sizeof(*nq)); > +if (!nq) > +return AVERROR(ENOMEM); > +if (fq->tail + fq->queued > fq->allocated) > +memmove(nq + fq->allocated, nq, > +(fq->tail + fq->queued - fq->allocated) * > sizeof(*nq)); > +fq->queue = nq; > +fq->allocated = na; > +} > +} > +b = bucket(fq, fq->queued); > +b->frame = frame; > +fq->queued++; > +fq->total_frames_head++; > +fq->total_samples_head += frame->nb_samples; > +check_consistency(fq); > +return 0; > +} > + > +AVFrame *ff_framequeue_take(FFFrameQueue *fq) > +{ > +FFFrameBucket *b; > + > +
Re: [FFmpeg-devel] [PATCH 1/2] lavfi: add FFFrameQueue API.
Le quartidi 24 brumaire, an CCXXV, wm4 a écrit : > Nice, previous review was completely ignored. Indeed. You told me yourself to do that when I consider your remarks rude. signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [FFmpeg-cvslog] Merge commit '7a745f014f528d1001394ae4d2f4ed1a20bf7fa2'
On 11/14/2016 8:01 AM, Michael Niedermayer wrote: > On Sun, Nov 13, 2016 at 10:32:13PM +0100, Hendrik Leppkes wrote: >> ffmpeg | branch: master | Hendrik Leppkes | Sun Nov 13 >> 22:29:04 2016 +0100| [bd0db4a32d85d027da4d4dc00490c20048090312] | committer: >> Hendrik Leppkes >> >> Merge commit '7a745f014f528d1001394ae4d2f4ed1a20bf7fa2' >> >> * commit '7a745f014f528d1001394ae4d2f4ed1a20bf7fa2': >> options_table: Add aliases for color properties >> >> Merged-by: Hendrik Leppkes >> >>> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=bd0db4a32d85d027da4d4dc00490c20048090312 >> --- >> >> libavcodec/options_table.h | 32 >> libavcodec/version.h | 2 +- >> 2 files changed, 25 insertions(+), 9 deletions(-) > > Please dont forget to add documentation for every option you add. Done. > IIRC Libav has code to automatically generate documentation from > AVOptions. Several FFmpeg developers objected against enabling that > code. IIRC because hand written documentation is of much higher quality Automatically generated documentation for options like this sounds very useful. I don't know why it was rejected. There's not much more "quality" to be had over manually written lines for AVOptions. > ... > You could of course also try to ping the people previously objecting to > auto generated docs, maybe they would be interrested in helping. > > If there is interrest i can also attempt to write a fate test to > detect newly added options without docs? > > [...] > > > > ___ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel > ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
On Mon, Nov 14, 2016 at 05:42:51PM +0100, Nicolas George wrote: [...] > What about: > > struct AVFilterLink { > ... > #ifdef FF_INTERNAL_API > #inclde "avfilterlink_internal.h" > #endif > I'm OK to make an exception for filter links that way if it's really important for you, but I don't think that's a good way of dealing with the problem in the rest of the codebase. -- Clément B. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [libav-devel] [PATCH] libopusdec: fix out-of-bounds read
On 14.11.2016 00:01, Luca Barbato wrote: > On 13/11/2016 19:23, Andreas Cadhalpun wrote: >> avc->channels can be 0. > > 0 and less than zero shouldn't be an error? Such values should be rejected, wherever they are set. However, ensuring that is a larger change I'm currently working on. Meanwhile, this patch is a trivial fix for the potential security problem that can easily be backported. Best regards, Andreas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
On 14.11.2016 17:42, Nicolas George wrote: > What about: > > struct AVFilterLink { > ... > #ifdef FF_INTERNAL_API > #inclde "avfilterlink_internal.h" > #endif I suspect that having different sizes for the same struct in different parts of the code base will upset some static analyzers. (I'm thinking of cbmc.) And rightly so, I'd say, because this can easily cause big trouble. Consider the (not so) theoretical case of API users simply (and wrongly) using the structs on the stack instead of dynamically allocating them. With the internal pointer the internal struct gets allocated in the respective init function and things mostly work fine, while with your proposal this would cause out-of-bounds reads/writes, which likely means an arbitrary code execution vulnerability. Thus I strongly advise against using this approach. On 14.11.2016 16:52, Nicolas George wrote: > Sorry, this is bad design, I refuse to have that in my code. I want to > write link->fifo the same way I write link->src or link->time_base. You > can reject my proposals, sure, but I can reject any proposal that do not > verify this property. If you really care so much about this, you can make the struct completely private and add getter/setter functions for those elements that need public access. Best regards, Andreas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] doc/filters : add metadata info for blackframe filter
2016-10-23 17:18 GMT+02:00 Martin Vignali : > > > 2016-10-23 15:41 GMT+02:00 Moritz Barsnick : > >> On Sat, Oct 22, 2016 at 22:06:50 +0200, Martin Vignali wrote: >> > +This filter export frame metadata @code{lavfi.blackframe.pblack}. >> ^s >> > +The value represent the percentage of pixels in the picture that >> ^s >> >> Fix in new patch in attach > > Martin > > Ping Martin ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
Le quartidi 24 brumaire, an CCXXV, Andreas Cadhalpun a écrit : > I suspect that having different sizes for the same struct in different > parts of the code base will upset some static analyzers. Still, it is completely legal wrt the C standard. > Consider the (not so) theoretical case of API users simply (and wrongly) > using the structs on the stack instead of dynamically allocating them. > > With the internal pointer the internal struct gets allocated in the > respective init function and things mostly work fine, while with > your proposal this would cause out-of-bounds reads/writes, which > likely means an arbitrary code execution vulnerability. I think it will likely result in their application not working at all very soon, and therefore them realizing their mistake. This is exactly how I noticed that filfmts allocates the links itself. I would rather have found a way of making the structure unallocatable, but apparently the C standard refuses undefined structs even when it and the fields after it are not accessed. > If you really care so much about this, you can make the struct completely > private and add getter/setter functions for those elements that need > public access. I would be ok with that. But that is a big incompatible API change. I could argue that lavfi's API is currently unusable anyway. Regards, -- Nicolas George signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
Le quartidi 24 brumaire, an CCXXV, Clement Boesch a écrit : > I'm OK to make an exception for filter links that way if it's really > important for you, but I don't think that's a good way of dealing with the > problem in the rest of the codebase. Why? The arguments I gave for AVFilterLink apply to a lot of similar structures. It would make the actual code in a lot of places simpler. Regards, -- Nicolas George signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [libav-devel] [PATCH] libopusdec: fix out-of-bounds read
On 14.11.2016 20:54, Anton Khirnov wrote: > Quoting Andreas Cadhalpun (2016-11-14 20:30:10) >> On 14.11.2016 00:01, Luca Barbato wrote: >>> On 13/11/2016 19:23, Andreas Cadhalpun wrote: avc->channels can be 0. >>> >>> 0 and less than zero shouldn't be an error? >> >> Such values should be rejected, wherever they are set. >> However, ensuring that is a larger change I'm currently >> working on. >> Meanwhile, this patch is a trivial fix for the potential >> security problem that can easily be backported. > > channels being zero is perfectly valid, it means the caller does not > know the channel count and expects the decoder to read it from the > bitstream. In general code this is correct, however if e.g. the matroska demuxer reads an audio stream which claims to have 0 channels, it should be rejected as broken. > This should fail for codecs that do not store this > information in the bitstream, but work fine otherwise. > > In the case of opus, the channel count is always known -- when the > extradata is present, the channel count is stored there. Otherwise the > stream is simple and can be decoded either as mono or stereo, as we > want. > > The patch does not seem to be doing the right thing -- I think it will > simply fail on the opus_multistream_decoder_create() call. Correct. > What it should do instead is just default to stereo. OK, patch doing that is attached. > Even better, you could replace the whole extradata parsing block with > a call to ff_opus_parse_extradata(), though that would require some > refactoring. The fix should be easily backportable, which excludes refactoring. Best regards, Andreas >From d33ded293d15e8ceab666bea834d436f3a225bcc Mon Sep 17 00:00:00 2001 From: Andreas Cadhalpun Date: Mon, 14 Nov 2016 21:41:45 +0100 Subject: [PATCH] libopusdec: default to stereo for invalid number of channels This fixes an out-of-bounds read if avc->channels is 0. Signed-off-by: Andreas Cadhalpun --- libavcodec/libopusdec.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/libavcodec/libopusdec.c b/libavcodec/libopusdec.c index acc62f1..61f68ed 100644 --- a/libavcodec/libopusdec.c +++ b/libavcodec/libopusdec.c @@ -47,6 +47,12 @@ static av_cold int libopus_decode_init(AVCodecContext *avc) int ret, channel_map = 0, gain_db = 0, nb_streams, nb_coupled; uint8_t mapping_arr[8] = { 0, 1 }, *mapping; +if (avc->channels <= 0) { +av_log(avc, AV_LOG_WARNING, + "Invalid number of channels %d, defaulting to stereo\n", avc->channels); +avc->channels = 2; +} + avc->sample_rate= 48000; avc->sample_fmt = avc->request_sample_fmt == AV_SAMPLE_FMT_FLT ? AV_SAMPLE_FMT_FLT : AV_SAMPLE_FMT_S16; -- 2.10.2 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
On Mon, Nov 14, 2016 at 9:39 PM, Nicolas George wrote: > Le quartidi 24 brumaire, an CCXXV, Clement Boesch a écrit : >> I'm OK to make an exception for filter links that way if it's really >> important for you, but I don't think that's a good way of dealing with the >> problem in the rest of the codebase. > > Why? The arguments I gave for AVFilterLink apply to a lot of similar > structures. It would make the actual code in a lot of places simpler. > If anything this discussion has shown that there are quite different priorities for many different developers. A key priority for many of us here seems to be to have a clean separation in public and private fields and keeping the public API clean and respectable (no weird ifdeffery in public headers, for example). We've had such a messy API for such a long time, so if we define any new solutions for the future, it should be one we can all be happy with. You seem to prefer a bit more internal simplicity over external interfaces, which is fine, but you should not expect everyone to share that preference. - Hendrik ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] doc/filters : add metadata info for blackframe filter
On Sun, 23 Oct 2016 17:18:09 +0200, Martin Vignali wrote: > From b8d7150d4e94fdaa0ed1fa2ac215b03c775f37c6 Mon Sep 17 00:00:00 2001 > From: Martin Vignali > Date: Sun, 23 Oct 2016 17:16:12 +0200 > Subject: [PATCH] doc/filters : add metadata information for blackframe > > --- > doc/filters.texi | 4 > 1 file changed, 4 insertions(+) LGTM and pushed after making a few very minor changes. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] lavfi: make filter_frame non-recursive.
On 14.11.2016 21:38, Nicolas George wrote: > Le quartidi 24 brumaire, an CCXXV, Andreas Cadhalpun a écrit : >> Consider the (not so) theoretical case of API users simply (and wrongly) >> using the structs on the stack instead of dynamically allocating them. >> >> With the internal pointer the internal struct gets allocated in the >> respective init function and things mostly work fine, while with >> your proposal this would cause out-of-bounds reads/writes, which >> likely means an arbitrary code execution vulnerability. > > I think it will likely result in their application not working at all > very soon, and therefore them realizing their mistake. This is exactly > how I noticed that filfmts allocates the links itself. It might work at first by accident and then some change in the internal API could cause problems. If the API usage is wrong, it should fail at build-time and not introduce grave security vulnerabilities at run-time. >> If you really care so much about this, you can make the struct completely >> private and add getter/setter functions for those elements that need >> public access. > > I would be ok with that. But that is a big incompatible API change. I > could argue that lavfi's API is currently unusable anyway. API changes in libavfilter are less of a problem, since not that many programs use it. For example, the codecpar change has orders of magnitude more impact. However, ideally any incompatible API change should be coordinated with Libav. Best regards, Andreas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] vp9: add avx2 iadst16 implementations.
On 11/8/2016 1:22 PM, Ronald S. Bultje wrote: > Also a small cosmetic change to the avx2 idct16 version to make it > explicit that one of the arguments to the write-out macros is unused > for >=avx2 (it uses pmovzxbw instead of punpcklbw). A braindead test (ffmpeg -i 4kHDRsample.webm -benchmark -f null -) on an i5 Haswell went from frame= 2000 fps= 73 q=-0.0 Lsize=N/A time=00:00:33.36 bitrate=N/A speed=1.21x bench: utime=92.250s To frame= 2000 fps= 77 q=-0.0 Lsize=N/A time=00:00:33.36 bitrate=N/A speed=1.28x bench: utime=86.891s In comparison, a 1080p version of the same video now reaches ~360fps. FATE passes, so LGTM (After the x86_32 fix). ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] lavc/vc1dec: add multi-slice decoding support for hwaccel.
On 14/11/16 00:57, Jun Zhao wrote: > On 2016/11/11 16:29, Hendrik Leppkes wrote: >> On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao wrote: >>> >> >> Do you have a sample file for this case? AFAIK all vc1 files I ever >> saw worked with the DXVA2 hwaccel before, just want to make sure they >> are not getting broken. >> >> - Hendrik > > We used the file fate-suite/vc1/SA10091.vc1, you can get the files with > the command: > rsync -aL rsync://fate-suite.ffmpeg.org:/fate-suite/ fate-suite. Can you describe your test setup(s) a bit more? I had a go at testing this with VAAPI. With the i965 driver on Skylake GT2, fate-vc1_sa10091 fails cleanly without the patch, but gives a GPU hang with it. With the mesa driver on Polaris 11, fate-vc1_sa10091 passes without the patch, but fails with it. I haven't really looked at VC-1 decode much before, so I'm not sure which of these tests should pass. Still, the GPU hang is certainly bad (and possibly the fault of the driver, but it would be helpful to be sure of that if we want to apply this sort of change). More detailed results below. Thanks, - Mark Without patch, i965 + Skylake GT2: $ make HWACCEL='vaapi -vaapi_device /dev/dri/renderD128 -hwaccel_output_format yuv420p' fate-vc1 TESTvc1_sa00040 TESTvc1_sa00050 TESTvc1_sa10091 --- /home/mrt/video/ffmpeg/vaapi/tests/ref/fate/vc1_sa10091 2016-05-09 17:55:19.599803161 +0100 +++ tests/data/fate/vc1_sa10091 2016-11-14 21:18:10.550918661 + [...] Test vc1_sa10091 failed. Look at tests/data/fate/vc1_sa10091.err for details. /home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target 'fate-vc1_sa10091' failed make: *** [fate-vc1_sa10091] Error 1 Without patch, mesa + Polaris 11: $ LIBVA_DRIVER_NAME=radeonsi make HWACCEL='vaapi -vaapi_device /dev/dri/renderD129 -hwaccel_output_format yuv420p' fate-vc1 TESTvc1_sa00040 TESTvc1_sa00050 TESTvc1_sa10091 TESTvc1_sa10143 --- /home/mrt/video/ffmpeg/vaapi/tests/ref/fate/vc1_sa10143 2016-05-09 17:55:19.599803161 +0100 +++ tests/data/fate/vc1_sa10143 2016-11-14 21:17:03.712198274 + [...] Test vc1_sa10143 failed. Look at tests/data/fate/vc1_sa10143.err for details. /home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target 'fate-vc1_sa10143' failed make: *** [fate-vc1_sa10143] Error 1 With patch, i965 + Skylake GT2: $ make HWACCEL='vaapi -vaapi_device /dev/dri/renderD128 -hwaccel_output_format yuv420p' fate-vc1 TESTvc1_sa00040 TESTvc1_sa00050 TESTvc1_sa10091 ^C^C/home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target 'fate-vc1_sa10091' failed make: *** [fate-vc1_sa10091] Interrupt dmesg output: [1300652.142872] [drm] stuck on bsd ring [1300652.143138] [drm] GPU HANG: ecode 9:2:0xcbfcffe7, in ffmpeg [31688], reason: Engine(s) hung, action: reset [1300652.144959] drm/i915: Resetting chip after gpu hang [1300654.130765] [drm] RC6 on [1300660.106392] [drm] stuck on bsd ring [1300660.106927] [drm] GPU HANG: ecode 9:2:0xa8dfbffd, reason: Engine(s) hung, action: reset [1300660.107078] [drm:i915_set_reset_status [i915]] *ERROR* gpu hanging too fast, banning! [1300660.109931] drm/i915: Resetting chip after gpu hang [1300661.142622] [drm] RC6 on With patch, mesa + Polaris 11: $ LIBVA_DRIVER_NAME=radeonsi make HWACCEL='vaapi -vaapi_device /dev/dri/renderD129 -hwaccel_output_format yuv420p' fate-vc1 TESTvc1_sa00040 TESTvc1_sa00050 TESTvc1_sa10091 --- /home/mrt/video/ffmpeg/vaapi/tests/ref/fate/vc1_sa10091 2016-05-09 17:55:19.599803161 +0100 +++ tests/data/fate/vc1_sa10091 2016-11-14 21:20:24.324357577 + [...] Test vc1_sa10091 failed. Look at tests/data/fate/vc1_sa10091.err for details. /home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target 'fate-vc1_sa10091' failed make: *** [fate-vc1_sa10091] Error 1 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [libav-devel] [PATCH] libopusdec: fix out-of-bounds read
2016-11-14 21:55 GMT+01:00 Andreas Cadhalpun : >> channels being zero is perfectly valid, it means the caller does not >> know the channel count and expects the decoder to read it from the >> bitstream. > > In general code this is correct, however if e.g. the matroska demuxer > reads an audio stream which claims to have 0 channels, it should > be rejected as broken. I don't know the exact "broken" case you are referring to but generally, FFmpeg should not reject files because a field in their header is set incorrectly, especially if such "broken" files were played in the past. Carl Eugen ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] softfloat: handle -INT_MAX correctly
On 14.11.2016 00:30, Michael Niedermayer wrote: > On Sun, Nov 13, 2016 at 08:55:01PM +0100, Andreas Cadhalpun wrote: >> This is similar to commit 9ac61e73d0843ec4b83f4e3d47eded73234e406e. >> >> Signed-off-by: Andreas Cadhalpun >> --- >> libavutil/softfloat.h | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) > > LGTM Pushed. Best regards, Andreas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH] avformat/hls: Added subtitle support
This patch is a fix for ticket #2833. Each subtitle segment is its own WebVTT file and has to demuxed individually. The timing of the subtitles are not perfect but it is the best I could do and it also does not take into account the X-TIMESTAMP-MAP header in the WebVTT files which is needed to conform to the specification (https://tools.ietf.org/html/draft-pantos-http-live-streaming-20#section-3.5). Signed-off-by: Franklin Phillips --- libavformat/hls.c | 178 -- 1 file changed, 159 insertions(+), 19 deletions(-) diff --git a/libavformat/hls.c b/libavformat/hls.c index 3ae3c7c..bf13be4 100644 --- a/libavformat/hls.c +++ b/libavformat/hls.c @@ -153,6 +153,8 @@ struct playlist { * playlist, if any. */ int n_init_sections; struct segment **init_sections; + +int is_subtitle; /* Indicates if the playlist is for subtitles */ }; /* @@ -312,6 +314,8 @@ static struct playlist *new_playlist(HLSContext *c, const char *url, pls->is_id3_timestamped = -1; pls->id3_mpegts_timestamp = AV_NOPTS_VALUE; +pls->is_subtitle = 0; + dynarray_add(&c->playlists, &c->n_playlists, pls); return pls; } @@ -482,11 +486,6 @@ static struct rendition *new_rendition(HLSContext *c, struct rendition_info *inf if (type == AVMEDIA_TYPE_SUBTITLE && !info->uri[0]) return NULL; -/* TODO: handle subtitles (each segment has to parsed separately) */ -if (c->strict_std_compliance > FF_COMPLIANCE_EXPERIMENTAL) -if (type == AVMEDIA_TYPE_SUBTITLE) -return NULL; - rend = av_mallocz(sizeof(struct rendition)); if (!rend) return NULL; @@ -501,9 +500,14 @@ static struct rendition *new_rendition(HLSContext *c, struct rendition_info *inf /* add the playlist if this is an external rendition */ if (info->uri[0]) { rend->playlist = new_playlist(c, info->uri, url_base); -if (rend->playlist) +if (rend->playlist) { dynarray_add(&rend->playlist->renditions, &rend->playlist->n_renditions, rend); +if (type == AVMEDIA_TYPE_SUBTITLE) { +rend->playlist->is_subtitle = 1; +rend->playlist->is_id3_timestamped = 0; +} +} } if (info->assoc_language[0]) { @@ -1349,6 +1353,136 @@ reload: goto restart; } +static int nested_io_open(AVFormatContext *s, AVIOContext **pb, const char *url, + int flags, AVDictionary **opts) +{ +av_log(s, AV_LOG_ERROR, + "A HLS playlist item '%s' referred to an external file '%s'. " + "Opening this file was forbidden for security reasons\n", + s->filename, url); +return AVERROR(EPERM); +} + +static int read_data_simple(void *opaque, uint8_t *buf, int buf_size) +{ +struct playlist *v = opaque; +HLSContext *c = v->parent->priv_data; +struct segment *seg; + +if (v->cur_seq_no >= v->start_seq_no + v->n_segments) { +return AVERROR_EOF; +} else { +seg = current_segment(v); +} + +if (!v->input) { +open_input(c, v, seg); +} + +return read_from_url(v, seg, buf, buf_size, READ_NORMAL); +} + +static int read_packet_subtitle(struct playlist *v, AVPacket *pkt) +{ +HLSContext *c = v->parent->priv_data; +int ret, i; + +restart: +if (!v->needed) +return AVERROR_EOF; + +if (!v->input) { +int64_t reload_interval; + +/* Check that the playlist is still needed before opening a new + * segment. */ +if (v->ctx && v->ctx->nb_streams) { +v->needed = 0; +for (i = 0; i < v->n_main_streams; i++) { +if (v->main_streams[i]->discard < AVDISCARD_ALL) { +v->needed = 1; +break; +} +} +} +if (!v->needed) { +av_log(v->parent, AV_LOG_INFO, "No longer receiving playlist %d\n", +v->index); +return AVERROR_EOF; +} + +/* If this is a live stream and the reload interval has elapsed since + * the last playlist reload, reload the playlists now. */ +reload_interval = default_reload_interval(v); + +if (!v->finished && +av_gettime_relative() - v->last_load_time >= reload_interval) { +if ((ret = parse_playlist(c, v->url, v, NULL)) < 0) { +av_log(v->parent, AV_LOG_WARNING, "Failed to reload playlist %d\n", + v->index); +return ret; +} +/* If we need to reload the playlist again below (if + * there's still no more segments), switch to a reload + * interval of half the target duration. */ +reload_interval = v->target_duration / 2; +} +if (v->cur_seq_no < v->start_seq_no) { +av_log(NULL, AV_LOG_WARNING, + "skipping %d
Re: [FFmpeg-devel] [libav-devel] [PATCH] libopusdec: fix out-of-bounds read
On Mon, Nov 14, 2016 at 9:55 PM, Andreas Cadhalpun wrote: > On 14.11.2016 20:54, Anton Khirnov wrote: >> Quoting Andreas Cadhalpun (2016-11-14 20:30:10) >>> On 14.11.2016 00:01, Luca Barbato wrote: On 13/11/2016 19:23, Andreas Cadhalpun wrote: > avc->channels can be 0. 0 and less than zero shouldn't be an error? >>> >>> Such values should be rejected, wherever they are set. >>> However, ensuring that is a larger change I'm currently >>> working on. >>> Meanwhile, this patch is a trivial fix for the potential >>> security problem that can easily be backported. >> >> channels being zero is perfectly valid, it means the caller does not >> know the channel count and expects the decoder to read it from the >> bitstream. > > In general code this is correct, however if e.g. the matroska demuxer > reads an audio stream which claims to have 0 channels, it should > be rejected as broken. > Well, not necessarily. Just because the container info is wrong or missing does not mean the stream is undecodable - not all containers have such levels of info after all, or sometimes none (see mpegts). Compressed codecs are often designed to be independent of container info. - Hendrik ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/3] lavu: Add AVSphericalMapping type and frame side data
On Sun, Nov 13, 2016 at 11:57 AM, Michael Niedermayer wrote: > On Sun, Nov 13, 2016 at 10:18:18AM +0100, Michael Niedermayer wrote: >> On Sat, Nov 12, 2016 at 10:05:22PM -0500, Vittorio Giovara wrote: >> [...] >> > So I really do not see a use case for using int64 here. >> >> then you can use int32, less than int32 is too little if for example >> you wnat the precission to be sufficient to know where a tele lens >> points with pixel precission and have a bit precission headroom Hi Michael, thanks for keeping me in CC. I understand the problem, but this is a solution for a issue that is non-existent. There is no application or usecase for "pixel perfect" precision, the rendering of a spherical video will distort the video surface in order to map the flat surface to a sphere, and it is impossible to predict where this operation will move pixels to. I believe this is why this specification describes the initial orientation as rotation degrees rather than pixel offsets. As I said, I believe the use case of a rotation calls in the use of double, exactly as it happens for other rotation APIs in the codebase. Secondly I am against exporting the data as fixed point since that would be burden to the end user and would tie this API to the google specification too much. Finally I don't see what is wrong about these fields being platform-specific, the whole structure is bound to be platform specific, and the API mandates that a structure is used all the time. -- Vittorio ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] lavc/vc1dec: add multi-slice decoding support for hwaccel.
On Mon, Nov 14, 2016 at 10:31 PM, Mark Thompson wrote: > On 14/11/16 00:57, Jun Zhao wrote: >> On 2016/11/11 16:29, Hendrik Leppkes wrote: >>> On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao wrote: >>> >>> Do you have a sample file for this case? AFAIK all vc1 files I ever >>> saw worked with the DXVA2 hwaccel before, just want to make sure they >>> are not getting broken. >>> >>> - Hendrik >> >> We used the file fate-suite/vc1/SA10091.vc1, you can get the files with >> the command: >> rsync -aL rsync://fate-suite.ffmpeg.org:/fate-suite/ fate-suite. > > Can you describe your test setup(s) a bit more? > > I had a go at testing this with VAAPI. > > With the i965 driver on Skylake GT2, fate-vc1_sa10091 fails cleanly without > the patch, but gives a GPU hang with it. > > With the mesa driver on Polaris 11, fate-vc1_sa10091 passes without the > patch, but fails with it. > > I haven't really looked at VC-1 decode much before, so I'm not sure which of > these tests should pass. Still, the GPU hang is certainly bad (and possibly > the fault of the driver, but it would be helpful to be sure of that if we > want to apply this sort of change). > > More detailed results below. For the record, this makes DXVA2 decoding break more as well. Decoding is not correct before this patch on that sample (only the first slice looks ok), but after its entirely broken, with error messages to boot. I have some experience with vc1 hwaccel, at least with dxva2, and if I find some time I might look into what might be needed to make it work, but this patch seems to have issues. - Hendrik ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] lavc/vc1dec: add multi-slice decoding support for hwaccel.
On Mon, Nov 14, 2016 at 11:15 PM, Hendrik Leppkes wrote: > On Mon, Nov 14, 2016 at 10:31 PM, Mark Thompson wrote: >> On 14/11/16 00:57, Jun Zhao wrote: >>> On 2016/11/11 16:29, Hendrik Leppkes wrote: On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao wrote: > Do you have a sample file for this case? AFAIK all vc1 files I ever saw worked with the DXVA2 hwaccel before, just want to make sure they are not getting broken. - Hendrik >>> >>> We used the file fate-suite/vc1/SA10091.vc1, you can get the files with >>> the command: >>> rsync -aL rsync://fate-suite.ffmpeg.org:/fate-suite/ fate-suite. >> >> Can you describe your test setup(s) a bit more? >> >> I had a go at testing this with VAAPI. >> >> With the i965 driver on Skylake GT2, fate-vc1_sa10091 fails cleanly without >> the patch, but gives a GPU hang with it. >> >> With the mesa driver on Polaris 11, fate-vc1_sa10091 passes without the >> patch, but fails with it. >> >> I haven't really looked at VC-1 decode much before, so I'm not sure which of >> these tests should pass. Still, the GPU hang is certainly bad (and possibly >> the fault of the driver, but it would be helpful to be sure of that if we >> want to apply this sort of change). >> >> More detailed results below. > > For the record, this makes DXVA2 decoding break more as well. Decoding > is not correct before this patch on that sample (only the first slice > looks ok), but after its entirely broken, with error messages to boot. > > I have some experience with vc1 hwaccel, at least with dxva2, and if I > find some time I might look into what might be needed to make it work, > but this patch seems to have issues. > After a quick look - one key problem is that hwaccels typically want raw/escaped buffers, but the buffers in the slice GetBitContext are unescaped, so that won't work. - Hendrik ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH]lavf/mov: Fix an out-of-bound-read in mov_read_mac_string().
On 14.11.2016 14:56, Carl Eugen Hoyos wrote: > I believe attached patch fixes an out-of-bound-read in mov_read_mac_string() > if p > Please comment, Carl Eugen This patch is not necessary, the issue was fixed with commit 437f5daf0. If (p < end) is false, the 'else if (p < end)' branch will not be entered. > --- a/libavformat/mov.c > +++ b/libavformat/mov.c > @@ -160,7 +160,7 @@ static int mov_read_mac_string(MOVContext *c, AVIOContext > *pb, int len, > uint8_t t, c = avio_r8(pb); However, reusing the variable name of the MOVContext as uint8_t looks strange. > if (c < 0x80 && p < end) > *p++ = c; > -else if (p < end) > +else if (c >= 0x80 && p < end) > PUT_UTF8(mac_to_unicode[c-0x80], t, if (p < end) *p++ = t;); > } > *p = 0; > -- 1.7.10.4 Best regards, Andreas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] lavc/vc1dec: add multi-slice decoding support for hwaccel.
On 14/11/16 21:31, Mark Thompson wrote: > On 14/11/16 00:57, Jun Zhao wrote: >> On 2016/11/11 16:29, Hendrik Leppkes wrote: >>> On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao wrote: >>> >>> Do you have a sample file for this case? AFAIK all vc1 files I ever >>> saw worked with the DXVA2 hwaccel before, just want to make sure they >>> are not getting broken. >>> >>> - Hendrik >> >> We used the file fate-suite/vc1/SA10091.vc1, you can get the files with >> the command: >> rsync -aL rsync://fate-suite.ffmpeg.org:/fate-suite/ fate-suite. > > Can you describe your test setup(s) a bit more? > > I had a go at testing this with VAAPI. > > With the i965 driver on Skylake GT2, fate-vc1_sa10091 fails cleanly without > the patch, but gives a GPU hang with it. > > With the mesa driver on Polaris 11, fate-vc1_sa10091 passes without the > patch, but fails with it. > > I haven't really looked at VC-1 decode much before, so I'm not sure which of > these tests should pass. Still, the GPU hang is certainly bad (and possibly > the fault of the driver, but it would be helpful to be sure of that if we > want to apply this sort of change). > > More detailed results below. > > Thanks, > > - Mark > > > > Without patch, i965 + Skylake GT2: > > $ make HWACCEL='vaapi -vaapi_device /dev/dri/renderD128 > -hwaccel_output_format yuv420p' fate-vc1 > TESTvc1_sa00040 > TESTvc1_sa00050 > TESTvc1_sa10091 > --- /home/mrt/video/ffmpeg/vaapi/tests/ref/fate/vc1_sa10091 2016-05-09 > 17:55:19.599803161 +0100 > +++ tests/data/fate/vc1_sa10091 2016-11-14 21:18:10.550918661 + > [...] > Test vc1_sa10091 failed. Look at tests/data/fate/vc1_sa10091.err for details. > /home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target > 'fate-vc1_sa10091' failed > make: *** [fate-vc1_sa10091] Error 1 > > > Without patch, mesa + Polaris 11: > > $ LIBVA_DRIVER_NAME=radeonsi make HWACCEL='vaapi -vaapi_device > /dev/dri/renderD129 -hwaccel_output_format yuv420p' fate-vc1 > TESTvc1_sa00040 > TESTvc1_sa00050 > TESTvc1_sa10091 > TESTvc1_sa10143 > --- /home/mrt/video/ffmpeg/vaapi/tests/ref/fate/vc1_sa10143 2016-05-09 > 17:55:19.599803161 +0100 > +++ tests/data/fate/vc1_sa10143 2016-11-14 21:17:03.712198274 + > [...] > Test vc1_sa10143 failed. Look at tests/data/fate/vc1_sa10143.err for details. > /home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target > 'fate-vc1_sa10143' failed > make: *** [fate-vc1_sa10143] Error 1 > > > With patch, i965 + Skylake GT2: > > $ make HWACCEL='vaapi -vaapi_device /dev/dri/renderD128 > -hwaccel_output_format yuv420p' fate-vc1 > TESTvc1_sa00040 > TESTvc1_sa00050 > TESTvc1_sa10091 > ^C^C/home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target > 'fate-vc1_sa10091' failed > make: *** [fate-vc1_sa10091] Interrupt > > dmesg output: > [1300652.142872] [drm] stuck on bsd ring > [1300652.143138] [drm] GPU HANG: ecode 9:2:0xcbfcffe7, in ffmpeg [31688], > reason: Engine(s) hung, action: reset > [1300652.144959] drm/i915: Resetting chip after gpu hang > [1300654.130765] [drm] RC6 on > [1300660.106392] [drm] stuck on bsd ring > [1300660.106927] [drm] GPU HANG: ecode 9:2:0xa8dfbffd, reason: Engine(s) > hung, action: reset > [1300660.107078] [drm:i915_set_reset_status [i915]] *ERROR* gpu hanging too > fast, banning! > [1300660.109931] drm/i915: Resetting chip after gpu hang > [1300661.142622] [drm] RC6 on > > > With patch, mesa + Polaris 11: > > $ LIBVA_DRIVER_NAME=radeonsi make HWACCEL='vaapi -vaapi_device > /dev/dri/renderD129 -hwaccel_output_format yuv420p' fate-vc1 > TESTvc1_sa00040 > TESTvc1_sa00050 > TESTvc1_sa10091 > --- /home/mrt/video/ffmpeg/vaapi/tests/ref/fate/vc1_sa10091 2016-05-09 > 17:55:19.599803161 +0100 > +++ tests/data/fate/vc1_sa10091 2016-11-14 21:20:24.324357577 + > [...] > Test vc1_sa10091 failed. Look at tests/data/fate/vc1_sa10091.err for details. > /home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target > 'fate-vc1_sa10091' failed > make: *** [fate-vc1_sa10091] Error 1 > VDPAU has the same results as VAAPI. Without patch, mesa + Polaris 11: $ DISPLAY=:0 make HWACCEL='vdpau' fate-vc1 TESTvc1_sa00040 TESTvc1_sa00050 TESTvc1_sa10091 TESTvc1_sa10143 --- /home/mrt/video/ffmpeg/vaapi/tests/ref/fate/vc1_sa10143 2016-05-09 17:55:19.599803161 +0100 +++ tests/data/fate/vc1_sa10143 2016-11-14 22:27:41.903250981 + [...] Test vc1_sa10143 failed. Look at tests/data/fate/vc1_sa10143.err for details. /home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target 'fate-vc1_sa10143' failed make: *** [fate-vc1_sa10143] Error 1 With patch, mesa + Polaris 11: $ DISPLAY=:0 make HWACCEL='vdpau' fate-vc1 TESTvc1_sa00040 TESTvc1_sa00050 TESTvc1_sa10091 --- /home/mrt/video/ffmpeg/vaapi/tests/ref/fate/vc1_sa10091 2016-05-09 17:55:19.599803161 +0100 +++ tests/data/fate/vc1_sa10091 2016-11-14 22:26:17.108873397 + [...] Test vc1_sa10091 failed. Loo
Re: [FFmpeg-devel] [libav-devel] [PATCH] libopusdec: fix out-of-bounds read
On 14.11.2016 22:59, Carl Eugen Hoyos wrote: > 2016-11-14 21:55 GMT+01:00 Andreas Cadhalpun > : > >>> channels being zero is perfectly valid, it means the caller does not >>> know the channel count and expects the decoder to read it from the >>> bitstream. >> >> In general code this is correct, however if e.g. the matroska demuxer >> reads an audio stream which claims to have 0 channels, it should >> be rejected as broken. > > I don't know the exact "broken" case you are referring to but > generally, FFmpeg should not reject files because a field in > their header is set incorrectly, especially if such "broken" > files were played in the past. Well, if the field should contain the number of channels but doesn't, the sample is not correct. Anyway, zero channels is borderline, but what about a negative number of channels? Best regards, Andreas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] lavc/vc1dec: add multi-slice decoding support for hwaccel.
On 11/14/2016 7:15 PM, Hendrik Leppkes wrote: > On Mon, Nov 14, 2016 at 10:31 PM, Mark Thompson wrote: >> On 14/11/16 00:57, Jun Zhao wrote: >>> On 2016/11/11 16:29, Hendrik Leppkes wrote: On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao wrote: > Do you have a sample file for this case? AFAIK all vc1 files I ever saw worked with the DXVA2 hwaccel before, just want to make sure they are not getting broken. - Hendrik >>> >>> We used the file fate-suite/vc1/SA10091.vc1, you can get the files with >>> the command: >>> rsync -aL rsync://fate-suite.ffmpeg.org:/fate-suite/ fate-suite. >> >> Can you describe your test setup(s) a bit more? >> >> I had a go at testing this with VAAPI. >> >> With the i965 driver on Skylake GT2, fate-vc1_sa10091 fails cleanly without >> the patch, but gives a GPU hang with it. >> >> With the mesa driver on Polaris 11, fate-vc1_sa10091 passes without the >> patch, but fails with it. >> >> I haven't really looked at VC-1 decode much before, so I'm not sure which of >> these tests should pass. Still, the GPU hang is certainly bad (and possibly >> the fault of the driver, but it would be helpful to be sure of that if we >> want to apply this sort of change). >> >> More detailed results below. > > For the record, this makes DXVA2 decoding break more as well. Decoding > is not correct before this patch on that sample (only the first slice > looks ok), but after its entirely broken, with error messages to boot. > > I have some experience with vc1 hwaccel, at least with dxva2, and if I > find some time I might look into what might be needed to make it work, > but this patch seems to have issues. > > - Hendrik On a first gen GCN GPU, vc1_sa10091 passes but vc1_sa10143 fails using DXVA2 and a recent driver. Did not test with this patch applied. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH] mov: extract stsd vendor field in metadata. Also updates fate ref for rgb24-mkv and fate-mov-zombie as the output will contain the new metadata field.
--- libavformat/mov.c | 23 +-- tests/ref/fate/mov-zombie | 2 +- tests/ref/fate/rgb24-mkv | 4 ++-- 3 files changed, 24 insertions(+), 5 deletions(-) diff --git a/libavformat/mov.c b/libavformat/mov.c index 9ec7d03..6176415 100644 --- a/libavformat/mov.c +++ b/libavformat/mov.c @@ -1835,6 +1835,8 @@ static void mov_parse_stsd_video(MOVContext *c, AVIOContext *pb, uint8_t codec_name[32]; int64_t stsd_start; unsigned int len; +int video_vendor_id = 0; +char video_vendor_id_buffer[5]; /* The first 16 bytes of the video sample description are already * read in ff_mov_read_stsd_entries() */ @@ -1842,7 +1844,15 @@ static void mov_parse_stsd_video(MOVContext *c, AVIOContext *pb, avio_rb16(pb); /* version */ avio_rb16(pb); /* revision level */ -avio_rb32(pb); /* vendor */ + +/* set video_vendor_id */ +video_vendor_id = avio_rl32(pb); /* vendor */ +if (video_vendor_id != 0) { + memset(video_vendor_id_buffer, 0, 5); + memcpy(video_vendor_id_buffer, (char*)&video_vendor_id, 4); + av_dict_set(&st->metadata, "vendor_id", video_vendor_id_buffer, 0); +} + avio_rb32(pb); /* temporal quality */ avio_rb32(pb); /* spatial quality */ @@ -1891,9 +1901,18 @@ static void mov_parse_stsd_audio(MOVContext *c, AVIOContext *pb, int bits_per_sample, flags; uint16_t version = avio_rb16(pb); AVDictionaryEntry *compatible_brands = av_dict_get(c->fc->metadata, "compatible_brands", NULL, AV_DICT_MATCH_CASE); +int audio_vendor_id = 0; +char audio_vendor_id_buffer[5]; avio_rb16(pb); /* revision level */ -avio_rb32(pb); /* vendor */ + +/* set audio_vendor_id */ +audio_vendor_id = avio_rl32(pb); /* vendor */ +if (audio_vendor_id != 0) { + memset(audio_vendor_id_buffer, 0, 5); + memcpy(audio_vendor_id_buffer, (char*)&audio_vendor_id, 4); + av_dict_set(&st->metadata, "vendor_id", audio_vendor_id_buffer, 0); +} st->codecpar->channels = avio_rb16(pb); /* channel count */ st->codecpar->bits_per_coded_sample = avio_rb16(pb); /* sample size */ diff --git a/tests/ref/fate/mov-zombie b/tests/ref/fate/mov-zombie index 42e3a6f..d032618 100644 --- a/tests/ref/fate/mov-zombie +++ b/tests/ref/fate/mov-zombie @@ -129,5 +129,5 @@ packet|codec_type=video|stream_index=0|pts=188623|pts_time=2.095811|dts=188622|d frame|media_type=video|stream_index=0|key_frame=0|pkt_pts=188623|pkt_pts_time=2.095811|pkt_dts=188622|pkt_dts_time=2.095800|best_effort_timestamp=188623|best_effort_timestamp_time=2.095811|pkt_duration=3003|pkt_duration_time=0.033367|pkt_pos=100846|pkt_size=974|width=160|height=240|pix_fmt=yuv420p|sample_aspect_ratio=2:1|pict_type=B|coded_picture_number=64|display_picture_number=0|interlaced_frame=0|top_field_first=0|repeat_pict=0 packet|codec_type=video|stream_index=0|pts=197632|pts_time=2.195911|dts=191625|dts_time=2.129167|duration=3003|duration_time=0.033367|convergence_duration=N/A|convergence_duration_time=N/A|size=580|pos=101820|flags=__ frame|media_type=video|stream_index=0|key_frame=0|pkt_pts=191626|pkt_pts_time=2.129178|pkt_dts=N/A|pkt_dts_time=N/A|best_effort_timestamp=191626|best_effort_timestamp_time=2.129178|pkt_duration=3003|pkt_duration_time=0.033367|pkt_pos=99180|pkt_size=1666|width=160|height=240|pix_fmt=yuv420p|sample_aspect_ratio=2:1|pict_type=P|coded_picture_number=63|display_picture_number=0|interlaced_frame=0|top_field_first=0|repeat_pict=0 -stream|index=0|codec_name=h264|profile=77|codec_type=video|codec_time_base=212521/12744000|codec_tag_string=avc1|codec_tag=0x31637661|width=160|height=240|coded_width=160|coded_height=240|has_b_frames=0|sample_aspect_ratio=2:1|display_aspect_ratio=4:3|pix_fmt=yuv420p|level=12|color_range=tv|color_space=smpte170m|color_transfer=bt709|color_primaries=smpte170m|chroma_location=topleft|field_order=unknown|timecode=N/A|refs=2|is_avc=true|nal_length_size=4|id=N/A|r_frame_rate=3/1001|avg_frame_rate=6372000/212521|time_base=1/9|start_pts=0|start_time=0.00|duration_ts=2125200|duration=23.61|bit_rate=333874|max_bit_rate=N/A|bits_per_raw_sample=8|nb_frames=708|nb_read_frames=65|nb_read_packets=66|disposition:default=1|disposition:dub=0|disposition:original=0|disposition:comment=0|disposition:lyrics=0|disposition:karaoke=0|disposition:forced=0|disposition:hearing_impaired=0|disposition:visual_impaired=0|disposition:clean_effects=0|disposition:attached_pic=0|disposition:timed_thumbnails=0|tag:rotate=0|tag:creation_time=2008-05-12T20:59:27.00Z|tag:language=eng|tag:handler_name=Apple Alias Data Handler|tag:encoder=H.264 +stream|index=0|codec_name=h264|profile=77|codec_type=video|codec_time_base=212521/12744000|codec_tag_string=avc1|codec_tag=0x31637661|width=160|height=240|coded_width=160|coded_height=240|has_b_frames=0|sample_aspect_ratio=2:1|display_aspect_ratio=4:3|pix_fmt=yuv420p|level=12|color_range=tv|color_space=smpte170m|color_transfer=bt709|color_primari
[FFmpeg-devel] [PATCH] mlz: limit next_code to data buffer size
This fixes a heap-buffer-overflow detected by AddressSanitizer. Signed-off-by: Andreas Cadhalpun --- libavcodec/mlz.c | 8 1 file changed, 8 insertions(+) diff --git a/libavcodec/mlz.c b/libavcodec/mlz.c index a2d1b89..ebce796 100644 --- a/libavcodec/mlz.c +++ b/libavcodec/mlz.c @@ -166,6 +166,10 @@ int ff_mlz_decompression(MLZ* mlz, GetBitContext* gb, int size, unsigned char *b } output_chars += ret; set_new_entry_dict(dict, mlz->next_code, last_string_code, char_code); +if (mlz->next_code >= TABLE_SIZE - 1) { +av_log(mlz->context, AV_LOG_ERROR, "Too many MLZ codes\n"); +return output_chars; +} mlz->next_code++; } else { int ret = decode_string(mlz, &buff[output_chars], string_code, &char_code, size - output_chars); @@ -177,6 +181,10 @@ int ff_mlz_decompression(MLZ* mlz, GetBitContext* gb, int size, unsigned char *b if (output_chars <= size && !mlz->freeze_flag) { if (last_string_code != -1) { set_new_entry_dict(dict, mlz->next_code, last_string_code, char_code); +if (mlz->next_code >= TABLE_SIZE - 1) { +av_log(mlz->context, AV_LOG_ERROR, "Too many MLZ codes\n"); +return output_chars; +} mlz->next_code++; } } else { -- 2.10.2 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH] lavc: report frame field order in avctx
--- libavcodec/utils.c | 13 + tests/api/api-codec-param-test.c | 3 +++ tests/fate/matroska.mak | 2 +- tests/ref/fate/api-mjpeg-codec-param | 2 +- tests/ref/fate/api-png-codec-param | 2 +- tests/ref/fate/mov-zombie| 2 +- 6 files changed, 20 insertions(+), 4 deletions(-) diff --git a/libavcodec/utils.c b/libavcodec/utils.c index d6dca18..b9af880 100644 --- a/libavcodec/utils.c +++ b/libavcodec/utils.c @@ -2296,6 +2296,12 @@ fail: guess_correct_pts(avctx, picture->pts, picture->pkt_dts)); + +if (avctx->field_order == AV_FIELD_UNKNOWN) { +avctx->field_order = picture->interlaced_frame + ? (picture->top_field_first ? AV_FIELD_TT : AV_FIELD_BB) + : AV_FIELD_PROGRESSIVE; +} } else av_frame_unref(picture); } else @@ -2895,6 +2901,13 @@ int attribute_align_arg avcodec_receive_frame(AVCodecContext *avctx, AVFrame *fr av_frame_set_best_effort_timestamp(frame, guess_correct_pts(avctx, frame->pts, frame->pkt_dts)); } + +if (avctx->field_order == AV_FIELD_UNKNOWN && +avctx->codec_type == AVMEDIA_TYPE_VIDEO) { +avctx->field_order = frame->interlaced_frame + ? (frame->top_field_first ? AV_FIELD_TT : AV_FIELD_BB) + : AV_FIELD_PROGRESSIVE; +} } return ret; } diff --git a/tests/api/api-codec-param-test.c b/tests/api/api-codec-param-test.c index 377a5e9..16ba8c6 100644 --- a/tests/api/api-codec-param-test.c +++ b/tests/api/api-codec-param-test.c @@ -211,6 +211,9 @@ static int check_video_streams(const AVFormatContext *fmt_ctx1, const AVFormatCo if (!strcmp(opt->name, "frame_number")) continue; +if (!strcmp(opt->name, "field_order")) +continue; + av_assert0(av_opt_get(codec_ctx1, opt->name, 0, &str1) >= 0); av_assert0(av_opt_get(codec_ctx2, opt->name, 0, &str2) >= 0); if (strcmp(str1, str2)) { diff --git a/tests/fate/matroska.mak b/tests/fate/matroska.mak index 35ed41f..16397b5 100644 --- a/tests/fate/matroska.mak +++ b/tests/fate/matroska.mak @@ -4,6 +4,6 @@ FATE_MATROSKA-$(call DEMMUX, MATROSKA, MATROSKA) += fate-matroska-remux fate-matroska-remux: CMD = md5 -i $(TARGET_SAMPLES)/vp9-test-vectors/vp90-2-2pass-akiyo.webm -color_trc 4 -c:v copy -fflags +bitexact -strict -2 -f matroska fate-matroska-remux: CMP = oneline -fate-matroska-remux: REF = 9b8398b42804ba12c39d2f47299a0996 +fate-matroska-remux: REF = cb6cc1cb581e31c98dd77173c7b4d221 FATE_SAMPLES_AVCONV += $(FATE_MATROSKA-yes) diff --git a/tests/ref/fate/api-mjpeg-codec-param b/tests/ref/fate/api-mjpeg-codec-param index c67d1b1..c7e8da2 100644 --- a/tests/ref/fate/api-mjpeg-codec-param +++ b/tests/ref/fate/api-mjpeg-codec-param @@ -307,7 +307,7 @@ stream=0, decode=1 refcounted_frames=false side_data_only_packets=true skip_alpha=false -field_order=0 +field_order=1 dump_separator= codec_whitelist= pixel_format=yuvj422p diff --git a/tests/ref/fate/api-png-codec-param b/tests/ref/fate/api-png-codec-param index bd53441..cba634e 100644 --- a/tests/ref/fate/api-png-codec-param +++ b/tests/ref/fate/api-png-codec-param @@ -307,7 +307,7 @@ stream=0, decode=1 refcounted_frames=false side_data_only_packets=true skip_alpha=false -field_order=0 +field_order=1 dump_separator= codec_whitelist= pixel_format=rgba diff --git a/tests/ref/fate/mov-zombie b/tests/ref/fate/mov-zombie index 42e3a6f..c9cf7db 100644 --- a/tests/ref/fate/mov-zombie +++ b/tests/ref/fate/mov-zombie @@ -129,5 +129,5 @@ packet|codec_type=video|stream_index=0|pts=188623|pts_time=2.095811|dts=188622|d frame|media_type=video|stream_index=0|key_frame=0|pkt_pts=188623|pkt_pts_time=2.095811|pkt_dts=188622|pkt_dts_time=2.095800|best_effort_timestamp=188623|best_effort_timestamp_time=2.095811|pkt_duration=3003|pkt_duration_time=0.033367|pkt_pos=100846|pkt_size=974|width=160|height=240|pix_fmt=yuv420p|sample_aspect_ratio=2:1|pict_type=B|coded_picture_number=64|display_picture_number=0|interlaced_frame=0|top_field_first=0|repeat_pict=0 packet|codec_type=video|stream_index=0|pts=197632|pts_time=2.195911|dts=191625|dts_time=2.129167|duration=3003|duration_time=0.033367|convergence_duration=N/A|convergence_duration_time=N/A|size=580|pos=101820|flags=__ frame|media_type=video|stream_index=0|key_frame=0|pkt_pts=191626|pkt_pts_time=2.129178|pkt_dts=N/A|pkt_dts_time=N/A|best_effort_timestamp=191626|best_effort_timestamp_time=2.129178|pkt_duration=3003|pkt_duration_tim
[FFmpeg-devel] [PATCH] ffplay: add support for negative RGBA linesize
This fixes the crash reported in ticket #5947. Signed-off-by: Marton Balint --- ffplay.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/ffplay.c b/ffplay.c index 79dc768..12502f2 100644 --- a/ffplay.c +++ b/ffplay.c @@ -165,6 +165,7 @@ typedef struct Frame { int format; AVRational sar; int uploaded; +int flip_v; } Frame; typedef struct FrameQueue { @@ -861,12 +862,20 @@ static int upload_texture(SDL_Texture *tex, AVFrame *frame, struct SwsContext ** int ret = 0; switch (frame->format) { case AV_PIX_FMT_YUV420P: +if (frame->linesize[0] < 0 || frame->linesize[1] < 0 || frame->linesize[2] < 0) { +av_log(NULL, AV_LOG_ERROR, "Negative linesize is not supported for YUV.\n"); +return -1; +} ret = SDL_UpdateYUVTexture(tex, NULL, frame->data[0], frame->linesize[0], frame->data[1], frame->linesize[1], frame->data[2], frame->linesize[2]); break; case AV_PIX_FMT_BGRA: -ret = SDL_UpdateTexture(tex, NULL, frame->data[0], frame->linesize[0]); +if (frame->linesize[0] < 0) { +ret = SDL_UpdateTexture(tex, NULL, frame->data[0] + frame->linesize[0] * (frame->height - 1), -frame->linesize[0]); +} else { +ret = SDL_UpdateTexture(tex, NULL, frame->data[0], frame->linesize[0]); +} break; default: /* This should only happen if we are not using avfilter... */ @@ -949,9 +958,10 @@ static void video_image_display(VideoState *is) if (upload_texture(vp->bmp, vp->frame, &is->img_convert_ctx) < 0) return; vp->uploaded = 1; +vp->flip_v = vp->frame->linesize[0] < 0; } -SDL_RenderCopy(renderer, vp->bmp, NULL, &rect); +SDL_RenderCopyEx(renderer, vp->bmp, NULL, &rect, 0, NULL, vp->flip_v ? SDL_FLIP_VERTICAL : 0); if (sp) { #if USE_ONEPASS_SUBTITLE_RENDER SDL_RenderCopy(renderer, is->sub_texture, NULL, &rect); -- 2.6.6 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH]lavf/mov: Fix an out-of-bound-read in mov_read_mac_string().
2016-11-14 23:26 GMT+01:00 Andreas Cadhalpun : > On 14.11.2016 14:56, Carl Eugen Hoyos wrote: >> I believe attached patch fixes an out-of-bound-read in mov_read_mac_string() >> if p> >> Please comment, Carl Eugen > > This patch is not necessary, the issue was fixed with commit 437f5daf0. > If (p < end) is false, the 'else if (p < end)' branch will not be entered. Sorry for the noise! >> --- a/libavformat/mov.c >> +++ b/libavformat/mov.c >> @@ -160,7 +160,7 @@ static int mov_read_mac_string(MOVContext *c, >> AVIOContext *pb, int len, >> uint8_t t, c = avio_r8(pb); > > However, reusing the variable name of the MOVContext as uint8_t looks strange. Maybe that's what irritated me;-) Thank you, Carl Eugen ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH]lavc/ffv1enc: Support pix_fmt GRAY10
Hi! Attached patch allows GRAY10 encoding, decoding works fine with current FFmpeg. Please comment, Carl Eugen From 5fb552828dc775045ad437065a429a54b4f1e353 Mon Sep 17 00:00:00 2001 From: Carl Eugen Hoyos Date: Tue, 15 Nov 2016 00:52:30 +0100 Subject: [PATCH] lavc/ffv1enc: Support pix_fmt GRAY10. --- libavcodec/ffv1enc.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/libavcodec/ffv1enc.c b/libavcodec/ffv1enc.c index 68d311d..c56623b 100644 --- a/libavcodec/ffv1enc.c +++ b/libavcodec/ffv1enc.c @@ -573,6 +573,7 @@ FF_ENABLE_DEPRECATION_WARNINGS case AV_PIX_FMT_YUVA420P9: if (!avctx->bits_per_raw_sample) s->bits_per_raw_sample = 9; +case AV_PIX_FMT_GRAY10: case AV_PIX_FMT_YUV444P10: case AV_PIX_FMT_YUV420P10: case AV_PIX_FMT_YUV422P10: @@ -1298,6 +1299,7 @@ AVCodec ff_ffv1_encoder = { AV_PIX_FMT_GRAY16,AV_PIX_FMT_GRAY8, AV_PIX_FMT_GBRP9, AV_PIX_FMT_GBRP10, AV_PIX_FMT_GBRP12,AV_PIX_FMT_GBRP14, AV_PIX_FMT_YA8, +AV_PIX_FMT_GRAY10, AV_PIX_FMT_GBRP16, AV_PIX_FMT_RGB48, AV_PIX_FMT_NONE -- 1.7.10.4 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] lavc/vc1dec: add multi-slice decoding support for hwaccel.
On Mon, Nov 14, 2016 at 11:47 PM, James Almer wrote: > On 11/14/2016 7:15 PM, Hendrik Leppkes wrote: >> On Mon, Nov 14, 2016 at 10:31 PM, Mark Thompson wrote: >>> On 14/11/16 00:57, Jun Zhao wrote: On 2016/11/11 16:29, Hendrik Leppkes wrote: > On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao wrote: >> > > Do you have a sample file for this case? AFAIK all vc1 files I ever > saw worked with the DXVA2 hwaccel before, just want to make sure they > are not getting broken. > > - Hendrik We used the file fate-suite/vc1/SA10091.vc1, you can get the files with the command: rsync -aL rsync://fate-suite.ffmpeg.org:/fate-suite/ fate-suite. >>> >>> Can you describe your test setup(s) a bit more? >>> >>> I had a go at testing this with VAAPI. >>> >>> With the i965 driver on Skylake GT2, fate-vc1_sa10091 fails cleanly without >>> the patch, but gives a GPU hang with it. >>> >>> With the mesa driver on Polaris 11, fate-vc1_sa10091 passes without the >>> patch, but fails with it. >>> >>> I haven't really looked at VC-1 decode much before, so I'm not sure which >>> of these tests should pass. Still, the GPU hang is certainly bad (and >>> possibly the fault of the driver, but it would be helpful to be sure of >>> that if we want to apply this sort of change). >>> >>> More detailed results below. >> >> For the record, this makes DXVA2 decoding break more as well. Decoding >> is not correct before this patch on that sample (only the first slice >> looks ok), but after its entirely broken, with error messages to boot. >> >> I have some experience with vc1 hwaccel, at least with dxva2, and if I >> find some time I might look into what might be needed to make it work, >> but this patch seems to have issues. >> >> - Hendrik > > On a first gen GCN GPU, vc1_sa10091 passes but vc1_sa10143 fails using > DXVA2 and a recent driver. > > Did not test with this patch applied. > I made 10091 work on my NVIDIA with DXVA2 by re-writing hwaccel slice support from scratch. 10143 still fails. I'll post patches once I got those all figured out. - Hendrik ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] lavc/vc1dec: add multi-slice decoding support for hwaccel.
2016-11-14 23:47 GMT+01:00 James Almer : > but vc1_sa10143 fails using DXVA2 and a recent driver. I suspect it actually passes with DXVA2: FFmpeg is not bit-exact for vc1. Carl Eugen Correct output: 0, 0, 0,1, 518400, 0x34fa7f55 0, 1, 1,1, 518400, 0x60466bc1 0, 2, 2,1, 518400, 0xe68dff1e 0, 3, 3,1, 518400, 0x790ac06a 0, 4, 4,1, 518400, 0xb3b26b27 0, 5, 5,1, 518400, 0x8840096c 0, 6, 6,1, 518400, 0xf75c3d61 0, 7, 7,1, 518400, 0xca071781 0, 8, 8,1, 518400, 0xa8e6edf9 0, 9, 9,1, 518400, 0xabb61984 0, 10, 10,1, 518400, 0x0b31dedd 0, 11, 11,1, 518400, 0xf44378ef 0, 12, 12,1, 518400, 0xf7268996 0, 13, 13,1, 518400, 0x8c5b1ff4 0, 14, 14,1, 518400, 0xda356fd2 0, 15, 15,1, 518400, 0x0e091c57 0, 16, 16,1, 518400, 0x17645e68 0, 17, 17,1, 518400, 0xf47a71ef 0, 18, 18,1, 518400, 0x6c440498 0, 19, 19,1, 518400, 0xd705bd32 0, 20, 20,1, 518400, 0x0800edd0 0, 21, 21,1, 518400, 0x902be119 0, 22, 22,1, 518400, 0x0f7d7bc4 0, 23, 23,1, 518400, 0x9f4dc421 0, 24, 24,1, 518400, 0x3b8c8d5a 0, 25, 25,1, 518400, 0xbcdfb2b9 0, 26, 26,1, 518400, 0xa02a46c3 0, 27, 27,1, 518400, 0x8ecde915 0, 28, 28,1, 518400, 0x20576bfd 0, 29, 29,1, 518400, 0xac40bc36 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [libav-devel] [PATCH] libopusdec: fix out-of-bounds read
2016-11-14 23:40 GMT+01:00 Andreas Cadhalpun : > On 14.11.2016 22:59, Carl Eugen Hoyos wrote: >> 2016-11-14 21:55 GMT+01:00 Andreas Cadhalpun >> : >> channels being zero is perfectly valid, it means the caller does not know the channel count and expects the decoder to read it from the bitstream. >>> >>> In general code this is correct, however if e.g. the matroska demuxer >>> reads an audio stream which claims to have 0 channels, it should >>> be rejected as broken. >> >> I don't know the exact "broken" case you are referring to but >> generally, FFmpeg should not reject files because a field in >> their header is set incorrectly, especially if such "broken" >> files were played in the past. > > Well, if the field should contain the number of channels but doesn't, > the sample is not correct. Assuming the opus bitstream contains the number of channels, the value in the demuxer should be ignored by default. Carl Eugen ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] lavc/vc1dec: add multi-slice decoding support for hwaccel.
On Tue, Nov 15, 2016 at 1:19 AM, Carl Eugen Hoyos wrote: > 2016-11-14 23:47 GMT+01:00 James Almer : > >> but vc1_sa10143 fails using DXVA2 and a recent driver. > > I suspect it actually passes with DXVA2: FFmpeg is not > bit-exact for vc1. Looks like you are right, thats the hashes I get as well. In any case, I have a working WIP patch that fixes sa10091 and sa20021 with DXVA2, which were broken before. I'll clean it up tomorrow and send it for testing. Unfortunately I don't have a sample for field mode with slices, so that remains un-implemented. If someone comes across such a thing, that would be nice to have. - Hendrik ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/2] Add experimental muxing support for FLAC in ISO BMFF (MP4).
Ping? It's been a couple of weeks since the last comment and patch update. Can I help move this forward somehow? Firefox will be shipping experimental support for playback of FLAC in MP4 based on the current spec in Firefox 51. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/3] lavu: Add AVSphericalMapping type and frame side data
On Mon, Nov 14, 2016 at 04:55:36PM -0500, Vittorio Giovara wrote: > On Sun, Nov 13, 2016 at 11:57 AM, Michael Niedermayer > wrote: > > On Sun, Nov 13, 2016 at 10:18:18AM +0100, Michael Niedermayer wrote: > >> On Sat, Nov 12, 2016 at 10:05:22PM -0500, Vittorio Giovara wrote: > >> [...] > >> > So I really do not see a use case for using int64 here. > >> > >> then you can use int32, less than int32 is too little if for example > >> you wnat the precission to be sufficient to know where a tele lens > >> points with pixel precission and have a bit precission headroom > > Hi Michael, thanks for keeping me in CC. > I understand the problem, but this is a solution for a issue that is > non-existent. The problem of difficut to test code is real and existing in general. Encoders&muxers using floats/doubles can not be tested as completely as Encoders&muxers not using floats unless they by chance give binary identical results on all platforms. Muxers&encoders would use/write the rotation side data Similarly ffprobe would at some point print this data, the text printout of doubles has a good chance to not match exactly between platforms. > There is no application or usecase for "pixel perfect" > precision, the rendering of a spherical video will distort the video > surface in order to map the flat surface to a sphere, and it is > impossible to predict where this operation will move pixels to. I > believe this is why this specification describes the initial > orientation as rotation degrees rather than pixel offsets. I referred to pixels only to show that 32bit floats are insufficiently precisse in some cases. [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB It is dangerous to be right in matters on which the established authorities are wrong. -- Voltaire signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH]lavc/ffv1enc: Support pix_fmt GRAY10
On Tue, Nov 15, 2016 at 12:54:11AM +0100, Carl Eugen Hoyos wrote: > Hi! > > Attached patch allows GRAY10 encoding, decoding works fine with current > FFmpeg. > > Please comment, Carl Eugen > ffv1enc.c |2 ++ > 1 file changed, 2 insertions(+) > b71c71317d488d0f7e8cd4b820c247c977443080 > 0001-lavc-ffv1enc-Support-pix_fmt-GRAY10.patch > From 5fb552828dc775045ad437065a429a54b4f1e353 Mon Sep 17 00:00:00 2001 > From: Carl Eugen Hoyos > Date: Tue, 15 Nov 2016 00:52:30 +0100 > Subject: [PATCH] lavc/ffv1enc: Support pix_fmt GRAY10. > LGTM can you add a matching fate test too ? thx [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB No snowflake in an avalanche ever feels responsible. -- Voltaire signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/2] Add experimental muxing support for FLAC in ISO BMFF (MP4).
On 11/14/2016 10:38 PM, Matthew Gregan wrote: > Ping? It's been a couple of weeks since the last comment and patch update. > Can I help move this forward somehow? > > Firefox will be shipping experimental support for playback of FLAC in MP4 > based on the current spec in Firefox 51. CCing Matthieu Bouron since he's listed as movenc maintainer. Otherwise I'll push it later this week as they seem ok and the code is under a strict experimental check anyway. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/2] libschroedingerdec: don't produce empty frames
On Sun, Nov 13, 2016 at 11:24:45PM +0100, Andreas Cadhalpun wrote: > They are not valid and can cause problems/crashes for API users. > > Signed-off-by: Andreas Cadhalpun > --- > libavcodec/libschroedingerdec.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) should be ok [...] thx -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Awnsering whenever a program halts or runs forever is On a turing machine, in general impossible (turings halting problem). On any real computer, always possible as a real computer has a finite number of states N, and will either halt in less than N cycles or never halt. signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] libschroedingerdec: fix leaking of framewithpts
On Sun, Nov 13, 2016 at 11:25:32PM +0100, Andreas Cadhalpun wrote: > Signed-off-by: Andreas Cadhalpun > --- > libavcodec/libschroedingerdec.c | 26 +- > 1 file changed, 17 insertions(+), 9 deletions(-) > > diff --git a/libavcodec/libschroedingerdec.c b/libavcodec/libschroedingerdec.c > index 1e392b3..83c790c 100644 > --- a/libavcodec/libschroedingerdec.c > +++ b/libavcodec/libschroedingerdec.c > @@ -218,6 +218,7 @@ static int libschroedinger_decode_frame(AVCodecContext > *avctx, > int outer = 1; > SchroParseUnitContext parse_ctx; > LibSchroFrameContext *framewithpts = NULL; > +int ret; > > *got_frame = 0; > > @@ -236,7 +237,8 @@ static int libschroedinger_decode_frame(AVCodecContext > *avctx, > enc_buf->tag = schro_tag_new(av_malloc(sizeof(int64_t)), > av_free); > if (!enc_buf->tag->value) { > av_log(avctx, AV_LOG_ERROR, "Unable to allocate SchroTag\n"); > -return AVERROR(ENOMEM); > +ret = AVERROR(ENOMEM); > +goto end; > } > AV_WN(64, enc_buf->tag->value, pts); > /* Push buffer into decoder. */ > @@ -267,8 +269,10 @@ static int libschroedinger_decode_frame(AVCodecContext > *avctx, > /* Decoder needs a frame - create one and push it in. */ > frame = ff_create_schro_frame(avctx, >p_schro_params->frame_format); > -if (!frame) > -return AVERROR(ENOMEM); > +if (!frame) { > +ret = AVERROR(ENOMEM); > +goto end; > +} > schro_decoder_add_output_picture(decoder, frame); > break; > this looks a bit strange framewithpts is set to newly allocated memory below which is injected into the que and IIUC that can occur multiple times the free at the end for one of multiple such que entries feels wrong [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB It is what and why we do it that matters, not just one of them. signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel