Re: [FFmpeg-devel] [PATCH] fate/oggvorbis: Fix tests after fixing AV_PKT_DATA_SKIP_SAMPLES

2021-07-12 Thread Lynne
10 Jul 2021, 02:12 by sunguangy...@gmail.com:

> After fixing AV_PKT_DATA_SKIP_SAMPLES for reading Vorbis packets from Ogg,
> fewer samples are actually decoded. Three FATE tests now fail:
>
> fate-vorbis-20:
> The samples in 6.ogg are not frame-aligned. The reference file 6.pcm was
> generated by ffmpeg before the fix, so the decoded PCM no longer matches it.
> Ideally 6.pcm should be regenerated, but it is probably not worth including
> another, slightly smaller copy of the same file; SIZE_TOLERANCE is added for
> this test case instead.
>
> fate-webm-dash-chapters:
> The original vorbis_chapter_extension_demo.ogg is transmuxed to dash-webm.
> The ref file webm-dash-chapters needs to be updated.
>
> fate-vorbis-encode:
> This exposes another bug in the vorbis encoder: initial_padding is not
> set correctly. That is fixed in the previous patch.
>
> Signed-off-by: Guangyu Sun 
> ---
>  tests/fate/vorbis.mak | 1 +
>  tests/ref/fate/webm-dash-chapters | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
>

Pushed the patchset, thanks


Re: [FFmpeg-devel] [PATCH] cafenc: fill in avg. packet size later if unknown

2021-07-12 Thread Lynne
10 Jul 2021, 09:42 by d...@lynne.ee:

> 10 Jul 2021, 03:10 by roman.bera...@prusa3d.cz:
>
>> The frame size of an Opus stream was previously presumed here to be 960
>> samples (20 ms); however, sizes of 120, 240, 480, 1920, and 2880 samples
>> are also allowed. The frame size can also change on a per-packet basis,
>> and the specification even allows multiple frames in a single packet;
>> for the sake of simplicity, let us assume that this does not occur.
>>
>
> Actually 120ms frames are the maximum, so 5760 samples, but that's
> irrelevant to the patch.
>  
>
>> if (pb->seekable & AVIO_SEEKABLE_NORMAL) {
>>  int64_t file_size = avio_tell(pb);
>>  
>>  avio_seek(pb, caf->data, SEEK_SET);
>>  avio_wb64(pb, file_size - caf->data - 8);
>> -avio_seek(pb, file_size, SEEK_SET);
>>  if (!par->block_align) {
>> +int packet_size = samples_per_packet(par->codec_id, par->channels, par->block_align);
>> +if (!packet_size) {
>> +packet_size = st->duration / (caf->packets - 1);
>> +avio_seek(pb, FRAME_SIZE_OFFSET, SEEK_SET);
>> +avio_wb32(pb, packet_size);
>> +}
>> +avio_seek(pb, file_size, SEEK_SET);
>>  ffio_wfourcc(pb, "pakt");
>>  avio_wb64(pb, caf->size_entries_used + 24);
>>  avio_wb64(pb, caf->packets); ///< mNumberPackets
>> -avio_wb64(pb, caf->packets * samples_per_packet(par->codec_id, par->channels, par->block_align)); ///< mNumberValidFrames
>> +avio_wb64(pb, caf->packets * packet_size); ///< mNumberValidFrames
>>  avio_wb32(pb, 0); ///< mPrimingFrames
>>  avio_wb32(pb, 0); ///< mRemainderFrames
>>  avio_write(pb, caf->pkt_sizes, caf->size_entries_used);
>>
>
> This doesn't move the pointer back to the file end if par->block_align is set.
> I think that's fine though, since the function writes the trailer, which
> should mean that nothing more needs to be written.
> Patch LGTM. But please, someone yell at Apple to support Opus in MP4,
> WebM and OGG, as terrible as that is.
>

Patch pushed, thanks
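
As background to the frame-size discussion above: a minimal sketch (not part
of the patch; the helper name is made up) of how the samples per frame at
48 kHz follow from an Opus packet's TOC byte, per RFC 6716 section 3.1:

    #include <stdint.h>

    /* Samples per frame at 48 kHz, derived from the Opus TOC byte. */
    static int opus_samples_per_frame(uint8_t toc)
    {
        int config = toc >> 3;
        if (config >= 16)                /* CELT-only: 2.5/5/10/20 ms */
            return 120 << (config & 3);
        if (config >= 12)                /* Hybrid: 10 or 20 ms */
            return (config & 1) ? 960 : 480;
        if ((config & 3) == 3)           /* SILK-only: 60 ms */
            return 2880;
        return 480 << (config & 3);      /* SILK-only: 10/20/40 ms */
    }

The two low TOC bits additionally encode the frame count per packet (code 3
meaning an arbitrary count in the next byte), which is why the muxer cannot
assume a fixed 960 samples and the patch instead derives an average from
st->duration when no known packet size is available.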


[FFmpeg-devel] [PATCH v2] libavcodec/libx265: add user data unregistered SEI encoding

2021-07-12 Thread Brad Hards
MISB ST 0604 and ST 2101 require user data unregistered SEI messages
(precision timestamps and sensor identifiers) to be included. That
currently isn't supported for libx265. This patch adds support
for user data unregistered SEI messages in accordance with
ISO/IEC 23008-2:2020 Section D.2.7.

The design is based on nvenc, with support finished up at
57de80673cb
---
 libavcodec/libx265.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/libavcodec/libx265.c b/libavcodec/libx265.c
index 90658d3d9e..9395120471 100644
--- a/libavcodec/libx265.c
+++ b/libavcodec/libx265.c
@@ -35,6 +35,7 @@
 #include "encode.h"
 #include "internal.h"
 #include "packet_internal.h"
+#include "sei.h"
 
 typedef struct libx265Context {
 const AVClass *class;
@@ -51,6 +52,9 @@ typedef struct libx265Context {
 char *profile;
 AVDictionary *x265_opts;
 
+void *sei_data;
+int sei_data_size;
+
 /**
  * If the encoder does not support ROI then warn the first time we
  * encounter a frame with ROI side data.
@@ -78,6 +82,7 @@ static av_cold int libx265_encode_close(AVCodecContext *avctx)
 libx265Context *ctx = avctx->priv_data;
 
 ctx->api->param_free(ctx->params);
+av_freep(&ctx->sei_data);
 
 if (ctx->encoder)
 ctx->api->encoder_close(ctx->encoder);
@@ -489,6 +494,8 @@ static int libx265_encode_frame(AVCodecContext *avctx, AVPacket *pkt,
 ctx->api->picture_init(ctx->params, &x265pic);
 
 if (pic) {
+x265_sei *sei = &x265pic.userSEI;
+sei->numPayloads = 0;
 for (i = 0; i < 3; i++) {
x265pic.planes[i] = pic->data[i];
x265pic.stride[i] = pic->linesize[i];
@@ -516,6 +523,32 @@ static int libx265_encode_frame(AVCodecContext *avctx, AVPacket *pkt,
 
         memcpy(x265pic.userData, &pic->reordered_opaque, sizeof(pic->reordered_opaque));
 }
+
+for (i = 0; i < pic->nb_side_data; i++) {
+AVFrameSideData *side_data = pic->side_data[i];
+void *tmp;
+x265_sei_payload *sei_payload;
+
+if (side_data->type != AV_FRAME_DATA_SEI_UNREGISTERED)
+continue;
+
+tmp = av_fast_realloc(ctx->sei_data,
+  &ctx->sei_data_size,
+  (sei->numPayloads + 1) * sizeof(*sei_payload));
+if (!tmp) {
+av_freep(&x265pic.userData);
+av_freep(&x265pic.quantOffsets);
+return AVERROR(ENOMEM);
+}
+ctx->sei_data = tmp;
+sei->payloads = ctx->sei_data;
+sei_payload = &sei->payloads[sei->numPayloads];
+sei_payload->payload = side_data->data;
+sei_payload->payloadSize = side_data->size;
+/* Equal to libx265 USER_DATA_UNREGISTERED */
+sei_payload->payloadType = SEI_TYPE_USER_DATA_UNREGISTERED;
+sei->numPayloads++;
+}
 }
 
 ret = ctx->api->encoder_encode(ctx->encoder, &nal, &nnal,
-- 
2.27.0
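
For exercising this from application code, a minimal usage sketch (not part
of the patch; the helper name is made up) of attaching an unregistered SEI
payload to a frame before it is sent to the encoder. The side data consists
of a 16-byte UUID followed by the payload bytes:

    #include <stdint.h>
    #include <string.h>
    #include "libavutil/frame.h"

    /* Attach a user data unregistered SEI message as frame side data. */
    static int attach_unregistered_sei(AVFrame *frame, const uint8_t uuid[16],
                                       const uint8_t *data, size_t len)
    {
        AVFrameSideData *sd = av_frame_new_side_data(frame,
                                                     AV_FRAME_DATA_SEI_UNREGISTERED,
                                                     16 + len);
        if (!sd)
            return AVERROR(ENOMEM);
        memcpy(sd->data, uuid, 16);
        memcpy(sd->data + 16, data, len);
        return 0;
    }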



Re: [FFmpeg-devel] Request for review - x265 User Data Unregistered SEI patch

2021-07-12 Thread Brad Hards
On Sunday, 11 July 2021 10:01:47 PM AEST Derek Buitenhuis wrote:
> Can you amend the commit message to contain the reasoning from [1]?
Amended.

> A quick review:
> > +void *sei_data;
> > +int sei_data_size;
> 
> I don't see sei_data freed anywhere at the end of decoding?
Fixed in v2. Included in the _close().
 
> >  if (pic) {
> > 
> > +x265_sei *sei = &(x265pic.userSEI);
> 
> Drop the paren for consistency with the rest of the codebase.
Fixed in v2.
 
> > +tmp = av_fast_realloc(ctx->sei_data,
> > +  &ctx->sei_data_size,
> > +  (sei->numPayloads + 1) *
> > sizeof(x265_sei_payload));
> Convention in FFmpeg is to do sizeof(*var).
Fixed in v2.

> > +if (!tmp) {
> > +av_freep(&x265pic.userData);
> > +av_freep(&x265pic.quantOffsets);
> > +return AVERROR(ENOMEM);
> > +} else {
> 
> This else statement is not needed.
Fixed in v2.
 
> > +sei_payload = &(sei->payloads[sei->numPayloads]);
> 
> Drop the paren.
Fixed in v2.
 
> > +sei_payload->payloadType = USER_DATA_UNREGISTERED;
> 
> I'm surprised x265 has un-namespaced enums... gross.
I took Timo's suggestion in v2, although conceptually I wanted to say "the
x265 value", so there is a comment.

Brad



Re: [FFmpeg-devel] [PATCH v3 0/2] libx264 configure check clean-up

2021-07-12 Thread Jan Ekström
On Sat, Jul 10, 2021 at 8:55 PM James Almer  wrote:
>
> On 7/10/2021 1:26 PM, Jan Ekström wrote:
> > On Wed, Jul 7, 2021, 22:01 Jan Ekström  wrote:
> >
> >> Changes compared to v2:
> >> - Kept the CONFIG_LIBX264RGB_ENCODER define check for ff_libx264rgb_encoder
> >>and the AVClass for libx264rgb.
> >> - Removed the libx264rgb removal from this patch set. I had hoped to get
> >>the initial two fixups reviewed even if people opposed the libx264rgb
> >>removal, so that at least those could go in, but that didn't seem to be
> >>happening. This way I hope people are more likely to focus on that bit
> >>at first.
> >>
> >> The patch set contains two improvements to the libx264rgb configure checks,
> >> as I found out that for all the time I had been building FFmpeg with a
> >> custom
> >> prefix and utilizing pkg-config - it never got enabled due to the configure
> >> check relying on the header being in the default include paths or in
> >> extra-cflags.
> >>
> >> - The first change fixes libx264rgb enablement without having x264.h
> >>in the system default include path, such as with custom prefixes.
> >>
> >> - The second change removes the separate X264_CSP_BGR check as x264.h
> >>has this define unconditionally defined with the required X264_BUILD
> >>118 or newer (it was added a few X264_BUILD versions before).
> >>
> >>This change was checked by bumping the require_cpp_condition
> >>check to X264_BUILD >= 255 and checking with both pkg-config
> >>as well as by not having PKG_CONFIG_PATH defined as well as
> >>making the non-pkg-config check pass with
> >>`--extra-cflags="-I/prefix/include" --extra-ldflags="-L/prefix/lib 
> >> -ldl"`
> >>So the X264_BUILD check should properly fail the enablement in
> >>case X264_BUILD is older than the one requested in the relevant
> >>require_cpp_condition.
> >>
> >> Best regards,
> >> Jan
> >>
> >> Jan Ekström (2):
> >>configure: move x264_csp_bgr check under general libx264 checks
> >>{configure,avcodec/libx264}: remove separate x264_csp_bgr check
> >>
> >>   configure| 3 +--
> >>   libavcodec/libx264.c | 2 --
> >>   2 files changed, 1 insertion(+), 4 deletions(-)
> >>
> >> --
> >> 2.31.1
> >>
> >
> > Ping on this patch set.
> >
> > These should be relatively straightforward changes that enable x264rgb when
> > it is searched through pkg-config, and testable by installing x264 into a
> > specific prefix and not having its headers in the default search path (but
> > setting PKG_CONFIG_PATH accordingly).
> >
> > Jan
>
> Should be ok.
>

Thanks, applied as 25d28f297b755d3cb6a3e036a1e251148d0e4d5c and
f32f56468c6caa03f4ebbf6cf58b2bb7bc775216 .

> And removing libx264rgb altogether should be done at some point if you
> can get rgb output with the normal wrapper.

We purely limit it by the accessible color spaces defined for the
AVCodec, so by removing that limitation you most definitely can get
RGB output from the normal wrapper. I had that patch as part of v1/2
of this set. I will now post a separate patch for it.
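
For context, the limitation mentioned above is simply the pixel-format list
the AVCodec advertises; the ffmpeg CLI and libavfilter insert a format
conversion whenever the input is not in that list. A minimal illustrative
check (helper name made up):

    #include "libavcodec/avcodec.h"

    /* Return 1 if 'fmt' is in the encoder's advertised pix_fmts list. */
    static int codec_supports_pix_fmt(const AVCodec *codec, enum AVPixelFormat fmt)
    {
        for (const enum AVPixelFormat *p = codec->pix_fmts;
             p && *p != AV_PIX_FMT_NONE; p++)
            if (*p == fmt)
                return 1;
        return 0;
    }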

If there are some steps that have to be done for it, I'm fine with
that (like enabling libx264 to handle RGB in a first commit, then
marking libx264rgb for deprecation etc). But the only comments I got
last time were:
- "-c:v libx264rgb" no longer works. (well yes, the idea of the patch
was to remove it)
- the current libx264 default is better supported by non-libavcodec
decoders (with RGB input that default is, at the moment and probably always
was, 4:4:4 YCbCr - and I am not sure that differs in any meaningful way
from 4:4:4 plus RGB color-space metadata?).
The original ticket against whose backdrop libx264rgb was originally split
was entirely about yuv420p being the default for HW decoding support
(https://trac.ffmpeg.org/ticket/658), which is not what the split actually
provides.

Jan


[FFmpeg-devel] [PATCH] cafenc: fill in avg. packet size later if unknown

2021-07-12 Thread Roman Beranek

> This doesn't move the pointer back to the file end if par->block_align is set.
> I think that's fine though, since the function writes the trailer, which 
> should
> mean that nothing more needs to be written.

Is it a convention to leave the pointer positioned at the end of the file?


Re: [FFmpeg-devel] FFMPEG for V4L2 M2M devices ?

2021-07-12 Thread Dave Stevenson
On Sat, 10 Jul 2021 at 00:56, Brad Hards  wrote:
>
> On Saturday, 10 July 2021 8:53:27 AM AEST Andrii wrote:
> > I am working on porting a Kodi player to an NVidia Jetson Nano device. I've
> > been developing a decoder for quite some time now, and realized that the
> > best approach would be to have it inside of ffmpeg, instead of embedding
> > the decoder into Kodi, as it heavily relies on FFMPEG. Just wondering if
> > there is any effort in making FFMPEG support V4L2 M2M devices?
>
> https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c[1]
>
> I guess that would be the basis for further work as required to meet your 
> needs.

Do note that there are 2 V4L2 M2M decoder APIs - the stateful API [1]
and the stateless API [2]. They differ in the amount of bitstream
parsing and buffer management that the driver implements vs expecting
the client to do.

The *_v4l2m2m wrappers within FFMPEG support the stateful API (i.e. the
kernel driver does the bitstream parsing). For Raspberry Pi we use that to
support the (older) H264 implementation, and FFMPEG master does that
very well.

The Pi HEVC decoder uses the V4L2 stateless API. Stateless HEVC
support hasn't been merged to the mainline kernel as yet, so there are
downstream patches to support that.

A quick Google implies that NVidia already has a stateful V4L2 M2M
driver in their vendor kernel. Other than the strange choice of device
node name (/dev/nvhost-nvdec), the details at [3] make it look like a
normal V4L2 M2M decoder that has a good chance of working against
h264_v4l2m2m.

[1] 
https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-decoder.html
[2] 
https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-stateless-decoder.html
[3] https://docs.nvidia.com/jetson/l4t-multimedia/group__V4L2Dec.html

  Dave
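
For anyone who wants to try that quickly, a minimal sketch (not from this
thread) of explicitly selecting the stateful wrapper instead of the default
software decoder; on the command line the equivalent is passing
-c:v h264_v4l2m2m before -i:

    #include "libavcodec/avcodec.h"

    /* Open the stateful V4L2 M2M H.264 decoder explicitly, if available. */
    static AVCodecContext *open_h264_v4l2m2m(void)
    {
        const AVCodec *dec = avcodec_find_decoder_by_name("h264_v4l2m2m");
        AVCodecContext *ctx;

        if (!dec)                        /* wrapper not built or not usable */
            return NULL;
        ctx = avcodec_alloc_context3(dec);
        if (!ctx || avcodec_open2(ctx, dec, NULL) < 0) {
            avcodec_free_context(&ctx);
            return NULL;
        }
        return ctx;                      /* feed with avcodec_send_packet() */
    }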

> Brad
>
> 
> [1] 
> https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c




Re: [FFmpeg-devel] [PATCH 1/2] libavutil/cpu: Adds fast gather detection.

2021-07-12 Thread Alan Kelly
On Fri, Jun 25, 2021 at 1:24 PM Alan Kelly  wrote:

> On Fri, Jun 25, 2021 at 10:40 AM Lynne  wrote:
>
>> Jun 25, 2021, 09:54 by alankelly-at-google@ffmpeg.org:
>>
>> > Broadwell and later and Zen3 and later have fast gather instructions.
>> > ---
>> >  Gather requires between 9 and 12 cycles on Haswell, 5 to 7 on
>> Broadwell,
>> >  and 2 to 5 on Skylake and newer. It is also slow on AMD before Zen 3.
>> >  libavutil/cpu.h |  2 ++
>> >  libavutil/x86/cpu.c | 18 --
>> >  libavutil/x86/cpu.h |  1 +
>> >  3 files changed, 19 insertions(+), 2 deletions(-)
>> >
>>
>> No, we really don't need more FAST/SLOW flags, especially for
>> something like this which is just fixable by _not_using_vgather_.
>> Take a look at libavutil/x86/tx_float.asm, we only use vgather
>> if it's guaranteed to either be faster for what we're gathering or
>> is just as fast "slow". If neither is true, we use manual lookups,
>> which is actually advantageous since for AVX2 we can interleave
>> the lookups that happen in each lane.
>>
>> Even if we disregard this, I've extensively benchmarked vgather
>> on Zen 3, Zen 2, Cascade Lake and Skylake, and there's hardly
>> a great vgather improvement to be found in Zen 3 to justify
>> using a new CPU flag for this.
>>
>
> Thanks for your response. I'm not against finding a cleaner way of
> enabling/disabling the code which will be protected by this flag. However,
> the manual lookups solution proposed will not work in this case, the avx2
> version of hscale will only be faster if fast gathers are available,
> otherwise, the ssse3 version should be used.
>
> I haven't got access to a Zen3 so I can't comment on the performance. I
> have tested on a Zen 2 and it is slow. On Broadwell hscale avx2 is about
> 10% faster than the ssse3 version and on Skylake about 40% faster, Haswell
> has similar performance to Zen2.
>
> Is there a proxy which could be used for detecting Broadwell or Skylake
> and later? AVX512 seems too strict as there are Skylake chips without
> AVX512. Thanks
>

Hi,

I will paste the performance figures from the thread for the other part of
this patch here so that the justification for this flag is clearer:

                                 Skylake   Haswell
hscale_8_to_15_width4_ssse3        761.2       760
hscale_8_to_15_width4_avx2         468.7       957
hscale_8_to_15_width8_ssse3       1170.7      1032
hscale_8_to_15_width8_avx2         865.7      1979
hscale_8_to_15_width12_ssse3      2172.2      2472
hscale_8_to_15_width12_avx2       1245.7      2901
hscale_8_to_15_width16_ssse3      2244.2      2400
hscale_8_to_15_width16_avx2       1647.2      3681

As you can see, it is catastrophic on Haswell and older chips but the gains
on Skylake are impressive.
As I don't have performance figures for Zen 3, I can disable this feature
on all cpus apart from Broadwell and later as you say that there is no
worthwhile improvement on Zen3. Is this OK with you?

Thanks


Re: [FFmpeg-devel] [PATCH 24/24] lavfi/vf_scale: implement slice threading

2021-07-12 Thread Anton Khirnov
Quoting Michael Niedermayer (2021-07-03 18:27:36)
> On Sat, Jul 03, 2021 at 03:27:36PM +0200, Anton Khirnov wrote:
> > Quoting Michael Niedermayer (2021-06-01 11:35:13)
> > > On Mon, May 31, 2021 at 09:55:15AM +0200, Anton Khirnov wrote:
> > > > ---
> > > >  libavfilter/vf_scale.c | 182 +++--
> > > >  1 file changed, 141 insertions(+), 41 deletions(-)
> > > 
> > > breaks: (lower 50% is bright green)
> > > ./ffplay -i mm-short.mpg -an   -vf "tinterlace,scale=720:576:interl=1"
> > 
> > Fixed locally, but I'm wondering why is interlaced scaling not done by
> > default for interlaced videos.
> 
> > IIRC the flags were quite unreliable. If we have reliable knowledge about
> interlacing it certainly should be used automatically

You mean there is a significant amount of progressive content that is
flagged as interlaced?

-- 
Anton Khirnov


Re: [FFmpeg-devel] [PATCH 1/2] libavutil/cpu: Adds fast gather detection.

2021-07-12 Thread Lynne
12 Jul 2021, 11:29 by alankelly-at-google@ffmpeg.org:

> On Fri, Jun 25, 2021 at 1:24 PM Alan Kelly  wrote:
>
>> On Fri, Jun 25, 2021 at 10:40 AM Lynne  wrote:
>>
>>> Jun 25, 2021, 09:54 by alankelly-at-google@ffmpeg.org:
>>>
>>> > Broadwell and later and Zen3 and later have fast gather instructions.
>>> > ---
>>> >  Gather requires between 9 and 12 cycles on Haswell, 5 to 7 on
>>> Broadwell,
>>> >  and 2 to 5 on Skylake and newer. It is also slow on AMD before Zen 3.
>>> >  libavutil/cpu.h |  2 ++
>>> >  libavutil/x86/cpu.c | 18 --
>>> >  libavutil/x86/cpu.h |  1 +
>>> >  3 files changed, 19 insertions(+), 2 deletions(-)
>>> >
>>>
>>> No, we really don't need more FAST/SLOW flags, especially for
>>> something like this which is just fixable by _not_using_vgather_.
>>> Take a look at libavutil/x86/tx_float.asm, we only use vgather
>>> if it's guaranteed to either be faster for what we're gathering or
>>> is just as fast "slow". If neither is true, we use manual lookups,
>>> which is actually advantageous since for AVX2 we can interleave
>>> the lookups that happen in each lane.
>>>
>>> Even if we disregard this, I've extensively benchmarked vgather
>>> on Zen 3, Zen 2, Cascade Lake and Skylake, and there's hardly
>>> a great vgather improvement to be found in Zen 3 to justify
>>> using a new CPU flag for this.
>>>
>>
>> Thanks for your response. I'm not against finding a cleaner way of
>> enabling/disabling the code which will be protected by this flag. However,
>> the manual lookups solution proposed will not work in this case, the avx2
>> version of hscale will only be faster if fast gathers are available,
>> otherwise, the ssse3 version should be used.
>>
>> I haven't got access to a Zen3 so I can't comment on the performance. I
>> have tested on a Zen 2 and it is slow. On Broadwell hscale avx2 is about
>> 10% faster than the ssse3 version and on Skylake about 40% faster, Haswell
>> has similar performance to Zen2.
>>
>> Is there a proxy which could be used for detecting Broadwell or Skylake
>> and later? AVX512 seems too strict as there are Skylake chips without
>> AVX512. Thanks
>>
>
> Hi,
>
> I will paste the performance figures from the thread for the other part of
> this patch here so that the justification for this flag is clearer:
>
>                                  Skylake   Haswell
> hscale_8_to_15_width4_ssse3        761.2       760
> hscale_8_to_15_width4_avx2         468.7       957
> hscale_8_to_15_width8_ssse3       1170.7      1032
> hscale_8_to_15_width8_avx2         865.7      1979
> hscale_8_to_15_width12_ssse3      2172.2      2472
> hscale_8_to_15_width12_avx2       1245.7      2901
> hscale_8_to_15_width16_ssse3      2244.2      2400
> hscale_8_to_15_width16_avx2       1647.2      3681
>
> As you can see, it is catastrophic on Haswell and older chips but the gains
> on Skylake are impressive.
> As I don't have performance figures for Zen 3, I can disable this feature
> on all cpus apart from Broadwell and later as you say that there is no
> worthwhile improvement on Zen3. Is this OK with you?
>

It's not that catastrophic. Since Haswell CPUs generally don't have
large AVX2 gains, could you just exclude Haswell only from
EXTERNAL_AVX2_FAST, and require EXTERNAL_AVX2_FAST
to enable those functions?
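
For illustration, a minimal sketch of the check being suggested here; the
helper name is made up, while EXTERNAL_AVX2_FAST() is the existing helper in
libavutil/x86/cpu.h (AVX2 present and the CPU not flagged as AVX-slow):

    #include "libavutil/cpu.h"
    #include "libavutil/x86/cpu.h"

    /* Gate the new AVX2 hscale code on EXTERNAL_AVX2_FAST() instead of a
     * new fast-gather CPU flag; excluding Haswell then comes down to how
     * that CPU is flagged. */
    static int use_avx2_hscale(void)
    {
        int cpu_flags = av_get_cpu_flags();
        return EXTERNAL_AVX2_FAST(cpu_flags);
    }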


Re: [FFmpeg-devel] [PATCH] mxfdec.c: fixed frame wrapping detection for MXFGCP1FrameWrappedPicture essence container

2021-07-12 Thread Tomas Härdin
Sun 2021-07-11 at 09:47 -0700, p...@sandflow.com wrote:
> From: Pierre-Anthony Lemieux 
> 
> Signed-off-by: Pierre-Anthony Lemieux 
> ---
> 
> Notes:
> For JPEG 2000 essence, the MXF input format module currently uses the
> value of byte 14 of the essence container UL to determine whether the J2K
> essence is clip-wrapped (byte 14 is 0x02) or frame-wrapped (byte 14 is 0x01).
> This approach does not work when the essence container UL is equal to
> MXFGCP1FrameWrappedPicture, in which case the essence is always
> frame-wrapped.
> 
>  libavformat/mxf.h| 3 ++-
>  libavformat/mxfdec.c | 4 
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/libavformat/mxf.h b/libavformat/mxf.h
> index b1b1fedac7..ca510f5a2f 100644
> --- a/libavformat/mxf.h
> +++ b/libavformat/mxf.h
> @@ -75,7 +75,8 @@ typedef enum {
>  NormalWrap = 0,
>  D10D11Wrap,
>  RawAWrap,
> -RawVWrap
> +RawVWrap,
> +AlwaysFrameWrap
>  } MXFWrappingIndicatorType;
>  
>  typedef struct MXFLocalTagPair {
> diff --git a/libavformat/mxfdec.c b/libavformat/mxfdec.c
> index 3bf480a3a6..7024d2ea7d 100644
> --- a/libavformat/mxfdec.c
> +++ b/libavformat/mxfdec.c
> @@ -1413,6 +1413,7 @@ static void *mxf_resolve_strong_ref(MXFContext *mxf, UID *strong_ref, enum MXFMe
>  
>  static const MXFCodecUL mxf_picture_essence_container_uls[] = {
>  // video essence container uls
> +{ { 0x06,0x0e,0x2b,0x34,0x04,0x01,0x01,0x07,0x0d,0x01,0x03,0x01,0x02,0x0c,0x06,0x00 }, 15,   AV_CODEC_ID_JPEG2000, NULL, 16, AlwaysFrameWrap }, /* MXF-GC P1 Frame-Wrapped JPEG 2000 */
>  { { 0x06,0x0e,0x2b,0x34,0x04,0x01,0x01,0x07,0x0d,0x01,0x03,0x01,0x02,0x0c,0x01,0x00 }, 14,   AV_CODEC_ID_JPEG2000, NULL, 14 },
>  { { 0x06,0x0e,0x2b,0x34,0x04,0x01,0x01,0x02,0x0d,0x01,0x03,0x01,0x02,0x10,0x60,0x01 }, 14,   AV_CODEC_ID_H264, NULL, 15 }, /* H.264 */
>  { { 0x06,0x0e,0x2b,0x34,0x04,0x01,0x01,0x02,0x0d,0x01,0x03,0x01,0x02,0x11,0x01,0x00 }, 14,  AV_CODEC_ID_DNXHD, NULL, 14 }, /* VC-3 */
> @@ -1497,6 +1498,9 @@ static MXFWrappingScheme mxf_get_wrapping_kind(UID *essence_container_ul)
>  if (val == 0x02)
>  val = 0x01;
>  break;
> +case AlwaysFrameWrap:
> +val = 0x01;
> +break;

Looks OK. Still passes FATE.

/Tomas



[FFmpeg-devel] [RFC/PATCH v2] swscale slice threading

2021-07-12 Thread Anton Khirnov
Hi,
here is a new iteration of $subj. Compared to the first version,
threading has been moved into sws using lavu slicethread.

There is also a new AVFrame-based API that allows submitting and
receiving partial slices (at least API-wise, the implementation will
still wait for complete input). There is still no way for the caller to
know how much input is required for a given amount of output, but that
may be implemented in a new function in the future, if there is a use
case for it.

The set still needs some polishing, but I am sending it now to see if
the general shape of the API is now acceptable.

Please comment.
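
For reference, a minimal usage sketch of the whole-frame entry point this set
introduces and patch 6/8 switches vf_scale to (sws_scale_frame); it assumes
the destination frame already has its format, dimensions and buffers set up:

    #include "libavutil/frame.h"
    #include "libswscale/swscale.h"

    /* Convert one frame with the new frame-based API from this patch set. */
    static int scale_one_frame(AVFrame *dst, const AVFrame *src)
    {
        struct SwsContext *sws = sws_getContext(src->width, src->height, src->format,
                                                dst->width, dst->height, dst->format,
                                                SWS_BILINEAR, NULL, NULL, NULL);
        int ret;

        if (!sws)
            return AVERROR(EINVAL);
        ret = sws_scale_frame(sws, dst, src);    /* negative AVERROR on failure */
        sws_freeContext(sws);
        return ret;
    }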
-- 
Anton Khirnov



[FFmpeg-devel] [PATCH 2/8] FATE: add a test for sliced scaling

2021-07-12 Thread Anton Khirnov
---
 Makefile  |   2 +
 tests/Makefile|   1 +
 tests/fate/libswscale.mak |  11 +++
 tools/Makefile|   3 +-
 tools/scale_slice_test.c  | 190 ++
 5 files changed, 206 insertions(+), 1 deletion(-)
 create mode 100644 tools/scale_slice_test.c

diff --git a/Makefile b/Makefile
index 1e3da6271b..26c9107237 100644
--- a/Makefile
+++ b/Makefile
@@ -64,6 +64,8 @@ tools/target_io_dem_fuzzer$(EXESUF): 
tools/target_io_dem_fuzzer.o $(FF_DEP_LIBS)
 
 tools/enum_options$(EXESUF): ELIBS = $(FF_EXTRALIBS)
 tools/enum_options$(EXESUF): $(FF_DEP_LIBS)
+tools/scale_slice_test$(EXESUF): $(FF_DEP_LIBS)
+tools/scale_slice_test$(EXESUF): ELIBS = $(FF_EXTRALIBS)
 tools/sofa2wavs$(EXESUF): ELIBS = $(FF_EXTRALIBS)
 tools/uncoded_frame$(EXESUF): $(FF_DEP_LIBS)
 tools/uncoded_frame$(EXESUF): ELIBS = $(FF_EXTRALIBS)
diff --git a/tests/Makefile b/tests/Makefile
index d726484b3a..e42e66d81b 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -221,6 +221,7 @@ $(FATE_FFPROBE) $(FATE_FFMPEG_FFPROBE) 
$(FATE_SAMPLES_FFPROBE) $(FATE_SAMPLES_FF
 
 $(FATE_SAMPLES_FASTSTART): tools/qt-faststart$(EXESUF)
 $(FATE_SAMPLES_DUMP_DATA): tools/venc_data_dump$(EXESUF)
+$(FATE_SAMPLES_SCALE_SLICE): tools/scale_slice_test$(EXESUF)
 
 ifdef SAMPLES
 FATE += $(FATE_EXTERN)
diff --git a/tests/fate/libswscale.mak b/tests/fate/libswscale.mak
index 5ec5f34cc4..599d27b0a5 100644
--- a/tests/fate/libswscale.mak
+++ b/tests/fate/libswscale.mak
@@ -6,6 +6,17 @@ FATE_LIBSWSCALE += fate-sws-floatimg-cmp
 fate-sws-floatimg-cmp: libswscale/tests/floatimg_cmp$(EXESUF)
 fate-sws-floatimg-cmp: CMD = run libswscale/tests/floatimg_cmp$(EXESUF)
 
+SWS_SLICE_TEST-$(call DEMDEC, MATROSKA, VP9) += 
fate-sws-slice-yuv422-12bit-rgb48
+fate-sws-slice-yuv422-12bit-rgb48: CMD = run tools/scale_slice_test$(EXESUF) 
$(TARGET_SAMPLES)/vp9-test-vectors/vp93-2-20-12bit-yuv422.webm 150 100 rgb48
+
+SWS_SLICE_TEST-$(call DEMDEC, IMAGE_BMP_PIPE, BMP) += fate-sws-slice-bgr0-nv12
+fate-sws-slice-bgr0-nv12: CMD = run tools/scale_slice_test$(EXESUF) 
$(TARGET_SAMPLES)/bmp/test32bf.bmp 32 64 nv12
+
+fate-sws-slice: $(SWS_SLICE_TEST-yes)
+$(SWS_SLICE_TEST-yes): tools/scale_slice_test$(EXESUF)
+$(SWS_SLICE_TEST-yes): REF = /dev/null
+FATE_LIBSWSCALE += $(SWS_SLICE_TEST-yes)
+
 FATE_LIBSWSCALE += $(FATE_LIBSWSCALE-yes)
 FATE-$(CONFIG_SWSCALE) += $(FATE_LIBSWSCALE)
 fate-libswscale: $(FATE_LIBSWSCALE)
diff --git a/tools/Makefile b/tools/Makefile
index ec260f254e..f4d1327b9f 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -1,4 +1,4 @@
-TOOLS = enum_options qt-faststart trasher uncoded_frame
+TOOLS = enum_options qt-faststart scale_slice_test trasher uncoded_frame
 TOOLS-$(CONFIG_LIBMYSOFA) += sofa2wavs
 TOOLS-$(CONFIG_ZLIB) += cws2fws
 
@@ -18,6 +18,7 @@ tools/target_io_dem_fuzzer.o: tools/target_dem_fuzzer.c
$(COMPILE_C) -DIO_FLAT=0
 
 tools/venc_data_dump$(EXESUF): tools/decode_simple.o
+tools/scale_slice_test$(EXESUF): tools/decode_simple.o
 
 OUTDIRS += tools
 
diff --git a/tools/scale_slice_test.c b/tools/scale_slice_test.c
new file mode 100644
index 00..d869eaae74
--- /dev/null
+++ b/tools/scale_slice_test.c
@@ -0,0 +1,190 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include 
+#include 
+#include 
+
+#include "decode_simple.h"
+
+#include "libavutil/common.h"
+#include "libavutil/pixdesc.h"
+#include "libavutil/error.h"
+#include "libavutil/lfg.h"
+#include "libavutil/random_seed.h"
+#include "libavutil/video_enc_params.h"
+
+#include "libavformat/avformat.h"
+
+#include "libavcodec/avcodec.h"
+
+#include "libswscale/swscale.h"
+
+typedef struct PrivData {
+unsigned int random_seed;
+AVLFGlfg;
+
+struct SwsContext *scaler;
+
+int v_shift_dst, h_shift_dst;
+int v_shift_src, h_shift_src;
+
+AVFrame *frame_ref;
+AVFrame *frame_dst;
+} PrivData;
+
+static int process_frame(DecodeContext *dc, AVFrame *frame)
+{
+PrivData *pd = dc->opaque;
+int slice_start = 0;
+int ret;
+
+if (!frame)
+return 0;
+
+if (!pd->scaler) {
+pd->scaler = sws_getContext(frame->width, frame->height, frame->format,
+pd->frame_ref->width, 
pd->frame_ref->height,
+   

[FFmpeg-devel] [PATCH 3/8] lavfi/vf_scale: remove the nb_slices option

2021-07-12 Thread Anton Khirnov
It was intended for debugging only and has been superseded by the
standalone tool for testing sliced scaling.
---
 libavfilter/vf_scale.c | 14 --
 1 file changed, 14 deletions(-)

diff --git a/libavfilter/vf_scale.c b/libavfilter/vf_scale.c
index 71d7fa2890..39ab3a4b28 100644
--- a/libavfilter/vf_scale.c
+++ b/libavfilter/vf_scale.c
@@ -149,8 +149,6 @@ typedef struct ScaleContext {
 int force_original_aspect_ratio;
 int force_divisible_by;
 
-int nb_slices;
-
 int eval_mode;  ///< expression evaluation mode
 
 } ScaleContext;
@@ -794,17 +792,6 @@ scale:
 ret = scale_slice(scale, out, in, scale->isws[0], 0, (link->h+1)/2, 2, 
0);
 if (ret >= 0)
 ret = scale_slice(scale, out, in, scale->isws[1], 0,  link->h   
/2, 2, 1);
-} else if (scale->nb_slices) {
-int i, slice_h, slice_start, slice_end = 0;
-const int nb_slices = FFMIN(scale->nb_slices, link->h);
-for (i = 0; i < nb_slices; i++) {
-slice_start = slice_end;
-slice_end   = (link->h * (i+1)) / nb_slices;
-slice_h = slice_end - slice_start;
-ret = scale_slice(scale, out, in, scale->sws, slice_start, 
slice_h, 1, 0);
-if (ret < 0)
-break;
-}
 } else {
 ret = scale_slice(scale, out, in, scale->sws, 0, link->h, 1, 0);
 }
@@ -936,7 +923,6 @@ static const AVOption scale_options[] = {
 { "force_divisible_by", "enforce that the output resolution is divisible 
by a defined integer when force_original_aspect_ratio is used", 
OFFSET(force_divisible_by), AV_OPT_TYPE_INT, { .i64 = 1}, 1, 256, FLAGS },
 { "param0", "Scaler param 0", OFFSET(param[0]),  
AV_OPT_TYPE_DOUBLE, { .dbl = SWS_PARAM_DEFAULT  }, INT_MIN, INT_MAX, FLAGS },
 { "param1", "Scaler param 1", OFFSET(param[1]),  
AV_OPT_TYPE_DOUBLE, { .dbl = SWS_PARAM_DEFAULT  }, INT_MIN, INT_MAX, FLAGS },
-{ "nb_slices", "set the number of slices (debug purpose only)", 
OFFSET(nb_slices), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS },
 { "eval", "specify when to evaluate expressions", OFFSET(eval_mode), 
AV_OPT_TYPE_INT, {.i64 = EVAL_MODE_INIT}, 0, EVAL_MODE_NB-1, FLAGS, "eval" },
  { "init",  "eval expressions once during initialization", 0, 
AV_OPT_TYPE_CONST, {.i64=EVAL_MODE_INIT},  .flags = FLAGS, .unit = "eval" },
  { "frame", "eval expressions during initialization and per-frame", 0, 
AV_OPT_TYPE_CONST, {.i64=EVAL_MODE_FRAME}, .flags = FLAGS, .unit = "eval" },
-- 
2.30.2



[FFmpeg-devel] [PATCH 4/8] lavu/slicethread: return ENOSYS rather than EINVAL in the dummy func

2021-07-12 Thread Anton Khirnov
EINVAL is the wrong error code here, since the arguments passed to the
function are valid. The error is that the function is not implemented in
the build, which corresponds to ENOSYS.
---
 libavutil/slicethread.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavutil/slicethread.c b/libavutil/slicethread.c
index dfbe551ef2..fd2145040d 100644
--- a/libavutil/slicethread.c
+++ b/libavutil/slicethread.c
@@ -239,7 +239,7 @@ int avpriv_slicethread_create(AVSliceThread **pctx, void 
*priv,
   int nb_threads)
 {
 *pctx = NULL;
-return AVERROR(EINVAL);
+return AVERROR(ENOSYS);
 }
 
 void avpriv_slicethread_execute(AVSliceThread *ctx, int nb_jobs, int 
execute_main)
-- 
2.30.2



[FFmpeg-devel] [PATCH 1/8] tools/venc_data_dump: factor out demux/decode code

2021-07-12 Thread Anton Khirnov
It can be shared with other simple demux/decode tools.
---
 tests/ref/fate/source  |   1 +
 tools/Makefile |   2 +
 tools/decode_simple.c  | 157 +
 tools/decode_simple.h  |  53 ++
 tools/venc_data_dump.c | 156 +---
 5 files changed, 248 insertions(+), 121 deletions(-)
 create mode 100644 tools/decode_simple.c
 create mode 100644 tools/decode_simple.h

diff --git a/tests/ref/fate/source b/tests/ref/fate/source
index c64bc05241..69dcdc4f27 100644
--- a/tests/ref/fate/source
+++ b/tests/ref/fate/source
@@ -20,5 +20,6 @@ Headers without standard inclusion guards:
 compat/djgpp/math.h
 compat/float/float.h
 compat/float/limits.h
+tools/decode_simple.h
 Use of av_clip() where av_clip_uintp2() could be used:
 Use of av_clip() where av_clip_intp2() could be used:
diff --git a/tools/Makefile b/tools/Makefile
index 82baa8eadb..ec260f254e 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -17,6 +17,8 @@ tools/target_dem_fuzzer.o: tools/target_dem_fuzzer.c
 tools/target_io_dem_fuzzer.o: tools/target_dem_fuzzer.c
$(COMPILE_C) -DIO_FLAT=0
 
+tools/venc_data_dump$(EXESUF): tools/decode_simple.o
+
 OUTDIRS += tools
 
 clean::
diff --git a/tools/decode_simple.c b/tools/decode_simple.c
new file mode 100644
index 00..b679fd7ce6
--- /dev/null
+++ b/tools/decode_simple.c
@@ -0,0 +1,157 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/* shared code for simple demux/decode tools */
+
+#include 
+#include 
+
+#include "decode_simple.h"
+
+#include "libavformat/avformat.h"
+
+#include "libavcodec/avcodec.h"
+#include "libavcodec/packet.h"
+
+#include "libavutil/dict.h"
+#include "libavutil/error.h"
+#include "libavutil/frame.h"
+
+static int decode_read(DecodeContext *dc, int flush)
+{
+const int ret_done = flush ? AVERROR_EOF : AVERROR(EAGAIN);
+int ret = 0;
+
+while (ret >= 0 &&
+   (dc->max_frames == 0 || dc->decoder->frame_number < 
dc->max_frames)) {
+ret = avcodec_receive_frame(dc->decoder, dc->frame);
+if (ret < 0) {
+if (ret == AVERROR_EOF) {
+int err = dc->process_frame(dc, NULL);
+if (err < 0)
+return err;
+}
+
+return (ret == ret_done) ? 0 : ret;
+}
+
+ret = dc->process_frame(dc, dc->frame);
+av_frame_unref(dc->frame);
+if (ret < 0)
+return ret;
+
+if (dc->max_frames && dc->decoder->frame_number == dc->max_frames)
+return 1;
+}
+
+return (dc->max_frames == 0 || dc->decoder->frame_number < dc->max_frames) 
? 0 : 1;
+}
+
+int ds_run(DecodeContext *dc)
+{
+int ret;
+
+ret = avcodec_open2(dc->decoder, NULL, &dc->decoder_opts);
+if (ret < 0)
+return ret;
+
+while (ret >= 0) {
+ret = av_read_frame(dc->demuxer, dc->pkt);
+if (ret < 0)
+goto flush;
+if (dc->pkt->stream_index != dc->stream->index) {
+av_packet_unref(dc->pkt);
+continue;
+}
+
+ret = avcodec_send_packet(dc->decoder, dc->pkt);
+if (ret < 0) {
+fprintf(stderr, "Error decoding: %d\n", ret);
+return ret;
+}
+av_packet_unref(dc->pkt);
+
+ret = decode_read(dc, 0);
+if (ret < 0) {
+fprintf(stderr, "Error decoding: %d\n", ret);
+return ret;
+} else if (ret > 0)
+return 0;
+}
+
+flush:
+avcodec_send_packet(dc->decoder, NULL);
+ret = decode_read(dc, 1);
+if (ret < 0) {
+fprintf(stderr, "Error flushing: %d\n", ret);
+return ret;
+}
+
+return 0;
+}
+
+void ds_free(DecodeContext *dc)
+{
+av_dict_free(&dc->decoder_opts);
+
+av_frame_free(&dc->frame);
+av_packet_free(&dc->pkt);
+
+avcodec_free_context(&dc->decoder);
+avformat_close_input(&dc->demuxer);
+}
+
+int ds_open(DecodeContext *dc, const char *url, int stream_idx)
+{
+const AVCodec *codec;
+int ret;
+
+memset(dc, 0, sizeof(*dc));
+
+dc->pkt   = av_packet_alloc();
+dc->frame = av_frame_alloc();
+if (!dc->pkt || !dc->frame) {
+ret = AVERROR(ENOMEM);
+goto fail;
+}
+
+ret = avformat_open_

[FFmpeg-devel] [PATCH 7/8] sws: implement slice threading

2021-07-12 Thread Anton Khirnov
---
 libswscale/options.c  |  3 ++
 libswscale/swscale.c  | 56 
 libswscale/swscale_internal.h | 14 ++
 libswscale/utils.c| 82 +++
 4 files changed, 155 insertions(+)

diff --git a/libswscale/options.c b/libswscale/options.c
index 7eb2752543..4b71a23e37 100644
--- a/libswscale/options.c
+++ b/libswscale/options.c
@@ -81,6 +81,9 @@ static const AVOption swscale_options[] = {
 { "uniform_color",   "blend onto a uniform color",0, 
AV_OPT_TYPE_CONST,  { .i64  = SWS_ALPHA_BLEND_UNIFORM},INT_MIN, INT_MAX, 
VE, "alphablend" },
 { "checkerboard","blend onto a checkerboard", 0, 
AV_OPT_TYPE_CONST,  { .i64  = SWS_ALPHA_BLEND_CHECKERBOARD},INT_MIN, INT_MAX,   
  VE, "alphablend" },
 
+{ "threads", "number of threads", OFFSET(nb_threads),  
 AV_OPT_TYPE_INT, {.i64 = 1 }, 0, INT_MAX, VE, "threads" },
+{ "auto",NULL,0,  
AV_OPT_TYPE_CONST, {.i64 = 0 },.flags = VE, "threads" },
+
 { NULL }
 };
 
diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index 8b32ce5a40..ee57684675 100644
--- a/libswscale/swscale.c
+++ b/libswscale/swscale.c
@@ -1115,6 +1115,27 @@ int sws_receive_slice(struct SwsContext *c, unsigned int 
slice_start,
   c->src_ranges.ranges[0].len == c->srcH))
 return AVERROR(EAGAIN);
 
+if (c->slicethread) {
+int nb_jobs = c->slice_ctx[0]->dither == SWS_DITHER_ED ? 1 : 
c->nb_slice_ctx;
+int ret = 0;
+
+c->dst_slice_start  = slice_start;
+c->dst_slice_height = slice_height;
+
+avpriv_slicethread_execute(c->slicethread, nb_jobs, 0);
+
+for (int i = 0; i < c->nb_slice_ctx; i++) {
+if (c->slice_err[i] < 0) {
+ret = c->slice_err[i];
+break;
+}
+}
+
+memset(c->slice_err, 0, c->nb_slice_ctx * sizeof(*c->slice_err));
+
+return ret;
+}
+
 for (int i = 0; i < FF_ARRAY_ELEMS(dst) && c->frame_dst->data[i]; i++) {
 dst[i] = c->frame_dst->data[i] +
  c->frame_dst->linesize[i] * (slice_start >> 
c->chrDstVSubSample);
@@ -1152,6 +1173,41 @@ int attribute_align_arg sws_scale(struct SwsContext *c,
   int srcSliceH, uint8_t *const dst[],
   const int dstStride[])
 {
+if (c->nb_slice_ctx)
+c = c->slice_ctx[0];
+
 return scale_internal(c, srcSlice, srcStride, srcSliceY, srcSliceH,
   dst, dstStride, 0, c->dstH);
 }
+
+void ff_sws_slice_worker(void *priv, int jobnr, int threadnr,
+ int nb_jobs, int nb_threads)
+{
+SwsContext *parent = priv;
+SwsContext  *c = parent->slice_ctx[threadnr];
+
+const int slice_height = FFALIGN(FFMAX((parent->dst_slice_height + nb_jobs 
- 1) / nb_jobs, 1),
+ 1 << c->chrDstVSubSample);
+const int slice_start  = jobnr * slice_height;
+const int slice_end= FFMIN((jobnr + 1) * slice_height, 
parent->dst_slice_height);
+int err = 0;
+
+if (slice_end > slice_start) {
+uint8_t *dst[4] = { NULL };
+
+for (int i = 0; i < FF_ARRAY_ELEMS(dst) && parent->frame_dst->data[i]; 
i++) {
+const int vshift = (i == 1 || i == 2) ? c->chrDstVSubSample : 0;
+const ptrdiff_t offset = parent->frame_dst->linesize[i] *
+((slice_start + parent->dst_slice_start) >> vshift);
+
+dst[i] = parent->frame_dst->data[i] + offset;
+}
+
+err = scale_internal(c, (const uint8_t * const 
*)parent->frame_src->data,
+ parent->frame_src->linesize, 0, c->srcH,
+ dst, parent->frame_dst->linesize,
+ parent->dst_slice_start + slice_start, slice_end 
- slice_start);
+}
+
+parent->slice_err[threadnr] = err;
+}
diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h
index c1098d6026..6ca44b710e 100644
--- a/libswscale/swscale_internal.h
+++ b/libswscale/swscale_internal.h
@@ -33,6 +33,7 @@
 #include "libavutil/mem_internal.h"
 #include "libavutil/pixfmt.h"
 #include "libavutil/pixdesc.h"
+#include "libavutil/slicethread.h"
 #include "libavutil/ppc/util_altivec.h"
 
 #define STR(s) AV_TOSTRING(s) // AV_STRINGIFY is too long
@@ -300,6 +301,15 @@ typedef struct SwsContext {
  */
 const AVClass *av_class;
 
+AVSliceThread  *slicethread;
+struct SwsContext **slice_ctx;
+int*slice_err;
+int  nb_slice_ctx;
+
+// values passed to current sws_receive_slice() call
+unsigned int dst_slice_start;
+unsigned int dst_slice_height;
+
 /**
  * Note that src, dst, srcStride, dstStride will be copied in the
  * sws_scale() wrapper so they can be freely modified here.
@@ -325,6 +335,7 @@ typedef

[FFmpeg-devel] [PATCH 6/8] lavfi/vf_scale: convert to the frame-based sws API

2021-07-12 Thread Anton Khirnov
---
 libavfilter/vf_scale.c | 73 --
 1 file changed, 49 insertions(+), 24 deletions(-)

diff --git a/libavfilter/vf_scale.c b/libavfilter/vf_scale.c
index 39ab3a4b28..cdff3ab7ed 100644
--- a/libavfilter/vf_scale.c
+++ b/libavfilter/vf_scale.c
@@ -620,29 +620,54 @@ static int request_frame_ref(AVFilterLink *outlink)
 return ff_request_frame(outlink->src->inputs[1]);
 }
 
-static int scale_slice(ScaleContext *scale, AVFrame *out_buf, AVFrame 
*cur_pic, struct SwsContext *sws, int y, int h, int mul, int field)
+static void frame_offset(AVFrame *frame, int dir, int is_pal)
 {
-const uint8_t *in[4];
-uint8_t *out[4];
-int in_stride[4],out_stride[4];
-int i;
-
-for (i=0; i<4; i++) {
-int vsub= ((i+1)&2) ? scale->vsub : 0;
-ptrdiff_t  in_offset = ((y>>vsub)+field) * cur_pic->linesize[i];
-ptrdiff_t out_offset =field  * out_buf->linesize[i];
- in_stride[i] = cur_pic->linesize[i] * mul;
-out_stride[i] = out_buf->linesize[i] * mul;
- in[i] = FF_PTR_ADD(cur_pic->data[i],  in_offset);
-out[i] = FF_PTR_ADD(out_buf->data[i], out_offset);
-}
-if (scale->input_is_pal)
- in[1] = cur_pic->data[1];
-if (scale->output_is_pal)
-out[1] = out_buf->data[1];
+for (int i = 0; i < 4 && frame->data[i]; i++) {
+if (i == 1 && is_pal)
+break;
+frame->data[i] += frame->linesize[i] * dir;
+}
+}
+
+static int scale_field(ScaleContext *scale, AVFrame *dst, AVFrame *src,
+   int field)
+{
+int orig_h_src = src->height;
+int orig_h_dst = dst->height;
+int ret;
+
+// offset the data pointers for the bottom field
+if (field) {
+frame_offset(src, 1, scale->input_is_pal);
+frame_offset(dst, 1, scale->output_is_pal);
+}
+
+// take every second line
+for (int i = 0; i < 4; i++) {
+src->linesize[i] *= 2;
+dst->linesize[i] *= 2;
+}
+src->height /= 2;
+dst->height /= 2;
 
-return sws_scale(sws, in, in_stride, y/mul, h,
- out,out_stride);
+ret = sws_scale_frame(scale->isws[field], dst, src);
+if (ret < 0)
+return ret;
+
+// undo the changes we made above
+for (int i = 0; i < 4; i++) {
+src->linesize[i] /= 2;
+dst->linesize[i] /= 2;
+}
+src->height = orig_h_src;
+dst->height = orig_h_dst;
+
+if (field) {
+frame_offset(src, -1, scale->input_is_pal);
+frame_offset(dst, -1, scale->output_is_pal);
+}
+
+return 0;
 }
 
 static int scale_frame(AVFilterLink *link, AVFrame *in, AVFrame **frame_out)
@@ -789,11 +814,11 @@ scale:
   INT_MAX);
 
 if (scale->interlaced>0 || (scale->interlaced<0 && in->interlaced_frame)) {
-ret = scale_slice(scale, out, in, scale->isws[0], 0, (link->h+1)/2, 2, 
0);
+ret = scale_field(scale, out, in, 0);
 if (ret >= 0)
-ret = scale_slice(scale, out, in, scale->isws[1], 0,  link->h   
/2, 2, 1);
+ret = scale_field(scale, out, in, 1);
 } else {
-ret = scale_slice(scale, out, in, scale->sws, 0, link->h, 1, 0);
+ret = sws_scale_frame(scale->sws, out, in);
 }
 
 av_frame_free(&in);
-- 
2.30.2



[FFmpeg-devel] [PATCH 8/8] lavfi/vf_scale: pass the thread count to the scaler

2021-07-12 Thread Anton Khirnov
---
 libavfilter/vf_scale.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libavfilter/vf_scale.c b/libavfilter/vf_scale.c
index cdff3ab7ed..f676f5d82e 100644
--- a/libavfilter/vf_scale.c
+++ b/libavfilter/vf_scale.c
@@ -543,6 +543,7 @@ static int config_props(AVFilterLink *outlink)
 av_opt_set_int(*s, "sws_flags", scale->flags, 0);
 av_opt_set_int(*s, "param0", scale->param[0], 0);
 av_opt_set_int(*s, "param1", scale->param[1], 0);
+av_opt_set_int(*s, "threads", ff_filter_get_nb_threads(ctx), 0);
 if (scale->in_range != AVCOL_RANGE_UNSPECIFIED)
 av_opt_set_int(*s, "src_range",
scale->in_range == AVCOL_RANGE_JPEG, 0);
-- 
2.30.2



[FFmpeg-devel] [PATCH 5/8] sws: add a new scaling API

2021-07-12 Thread Anton Khirnov
---
 libswscale/swscale.c  | 263 ++
 libswscale/swscale.h  |  80 +++
 libswscale/swscale_internal.h |  19 +++
 libswscale/utils.c|  70 +
 4 files changed, 374 insertions(+), 58 deletions(-)

diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index 61dfcb4dff..8b32ce5a40 100644
--- a/libswscale/swscale.c
+++ b/libswscale/swscale.c
@@ -236,13 +236,16 @@ static void lumRangeFromJpeg16_c(int16_t *_dst, int width)
 av_log(c, AV_LOG_DEBUG, __VA_ARGS__)
 
 static int swscale(SwsContext *c, const uint8_t *src[],
-   int srcStride[], int srcSliceY,
-   int srcSliceH, uint8_t *dst[], int dstStride[])
+   int srcStride[], int srcSliceY, int srcSliceH,
+   uint8_t *dst[], int dstStride[],
+   int dstSliceY, int dstSliceH)
 {
+const int scale_dst = dstSliceY > 0 || dstSliceH < c->dstH;
+
 /* load a few things into local vars to make the code more readable?
  * and faster */
 const int dstW   = c->dstW;
-const int dstH   = c->dstH;
+int dstH = c->dstH;
 
 const enum AVPixelFormat dstFormat = c->dstFormat;
 const int flags  = c->flags;
@@ -331,10 +334,15 @@ static int swscale(SwsContext *c, const uint8_t *src[],
 }
 }
 
-/* Note the user might start scaling the picture in the middle so this
- * will not get executed. This is not really intended but works
- * currently, so people might do it. */
-if (srcSliceY == 0) {
+if (scale_dst) {
+dstY = dstSliceY;
+dstH = dstY + dstSliceH;
+lastInLumBuf = -1;
+lastInChrBuf = -1;
+} else if (srcSliceY == 0) {
+/* Note the user might start scaling the picture in the middle so this
+ * will not get executed. This is not really intended but works
+ * currently, so people might do it. */
 dstY = 0;
 lastInLumBuf = -1;
 lastInChrBuf = -1;
@@ -352,8 +360,8 @@ static int swscale(SwsContext *c, const uint8_t *src[],
 srcSliceY, srcSliceH, chrSrcSliceY, chrSrcSliceH, 1);
 
 ff_init_slice_from_src(vout_slice, (uint8_t**)dst, dstStride, c->dstW,
-dstY, dstH, dstY >> c->chrDstVSubSample,
-AV_CEIL_RSHIFT(dstH, c->chrDstVSubSample), 0);
+dstY, dstSliceH, dstY >> c->chrDstVSubSample,
+AV_CEIL_RSHIFT(dstSliceH, c->chrDstVSubSample), scale_dst);
 if (srcSliceY == 0) {
 hout_slice->plane[0].sliceY = lastInLumBuf + 1;
 hout_slice->plane[1].sliceY = lastInChrBuf + 1;
@@ -373,7 +381,7 @@ static int swscale(SwsContext *c, const uint8_t *src[],
 
 // First line needed as input
 const int firstLumSrcY  = FFMAX(1 - vLumFilterSize, 
vLumFilterPos[dstY]);
-const int firstLumSrcY2 = FFMAX(1 - vLumFilterSize, 
vLumFilterPos[FFMIN(dstY | ((1 << c->chrDstVSubSample) - 1), dstH - 1)]);
+const int firstLumSrcY2 = FFMAX(1 - vLumFilterSize, 
vLumFilterPos[FFMIN(dstY | ((1 << c->chrDstVSubSample) - 1), c->dstH - 1)]);
 // First line needed as input
 const int firstChrSrcY  = FFMAX(1 - vChrFilterSize, 
vChrFilterPos[chrDstY]);
 
@@ -477,7 +485,7 @@ static int swscale(SwsContext *c, const uint8_t *src[],
 c->chrDither8 = ff_dither_8x8_128[chrDstY & 7];
 c->lumDither8 = ff_dither_8x8_128[dstY& 7];
 }
-if (dstY >= dstH - 2) {
+if (dstY >= c->dstH - 2) {
 /* hmm looks like we can't use MMX here without overwriting
  * this array's tail */
 ff_sws_init_output_funcs(c, &yuv2plane1, &yuv2planeX, &yuv2nv12cX,
@@ -491,21 +499,22 @@ static int swscale(SwsContext *c, const uint8_t *src[],
 desc[i].process(c, &desc[i], dstY, 1);
 }
 if (isPlanar(dstFormat) && isALPHA(dstFormat) && !needAlpha) {
+int offset = lastDstY - dstSliceY;
 int length = dstW;
 int height = dstY - lastDstY;
 
 if (is16BPS(dstFormat) || isNBPS(dstFormat)) {
 const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat);
-fillPlane16(dst[3], dstStride[3], length, height, lastDstY,
+fillPlane16(dst[3], dstStride[3], length, height, offset,
 1, desc->comp[3].depth,
 isBE(dstFormat));
 } else if (is32BPS(dstFormat)) {
 const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat);
-fillPlane32(dst[3], dstStride[3], length, height, lastDstY,
+fillPlane32(dst[3], dstStride[3], length, height, offset,
 1, desc->comp[3].depth,
 isBE(dstFormat), desc->flags & AV_PIX_FMT_FLAG_FLOAT);
 } else
-fillPlane(dst[3], dstStride[3], length, height, lastDstY, 255);
+fillPlane(dst[3], dstStride[3], length, height, 

Re: [FFmpeg-devel] [PATCH 24/24] lavfi/vf_scale: implement slice threading

2021-07-12 Thread Hendrik Leppkes
On Mon, Jul 12, 2021 at 12:35 PM Anton Khirnov  wrote:
>
> Quoting Michael Niedermayer (2021-07-03 18:27:36)
> > On Sat, Jul 03, 2021 at 03:27:36PM +0200, Anton Khirnov wrote:
> > > Quoting Michael Niedermayer (2021-06-01 11:35:13)
> > > > On Mon, May 31, 2021 at 09:55:15AM +0200, Anton Khirnov wrote:
> > > > > ---
> > > > >  libavfilter/vf_scale.c | 182 
> > > > > +++--
> > > > >  1 file changed, 141 insertions(+), 41 deletions(-)
> > > >
> > > > breaks: (lower 50% is bright green)
> > > > ./ffplay -i mm-short.mpg -an   -vf "tinterlace,scale=720:576:interl=1"
> > >
> > > Fixed locally, but I'm wondering why is interlaced scaling not done by
> > > default for interlaced videos.
> >
> > IIRC the flags were quite unreliable. If we have reliable knowledge about
> > interlacing it certainly should be used automatically
>
> You mean there is a significant amount of progressive content that is
> flagged as interlaced?
>

In my experience it's not that bad. But I deal with everyday content,
not fringe stuff.

- Hendrik


Re: [FFmpeg-devel] [PATCH 24/24] lavfi/vf_scale: implement slice threading

2021-07-12 Thread Michael Niedermayer
On Mon, Jul 12, 2021 at 12:34:55PM +0200, Anton Khirnov wrote:
> Quoting Michael Niedermayer (2021-07-03 18:27:36)
> > On Sat, Jul 03, 2021 at 03:27:36PM +0200, Anton Khirnov wrote:
> > > Quoting Michael Niedermayer (2021-06-01 11:35:13)
> > > > On Mon, May 31, 2021 at 09:55:15AM +0200, Anton Khirnov wrote:
> > > > > ---
> > > > >  libavfilter/vf_scale.c | 182 
> > > > > +++--
> > > > >  1 file changed, 141 insertions(+), 41 deletions(-)
> > > > 
> > > > breaks: (lower 50% is bright green)
> > > > ./ffplay -i mm-short.mpg -an   -vf "tinterlace,scale=720:576:interl=1"
> > > 
> > > Fixed locally, but I'm wondering why is interlaced scaling not done by
> > > default for interlaced videos.
> > 
> > IIRC the flags were quite unreliable. If we have reliable knowledge about
> > interlacing it certainly should be used automatically
> 
> You mean there is a significant amount of progressive content that is
> flagged as interlaced?

Yes, that's from my memory though; this may have changed, of course.
IIRC one source of this is progressive material intended for interlaced
displays.

Ideally, IMHO, we should reliably autodetect whether the material is
progressive, interlaced or telecined, set the flags accordingly, and then
automatically do the optimal thing the flags indicate, without user
intervention.

thx

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If a bugfix only changes things apparently unrelated to the bug with no
further explanation, that is a good sign that the bugfix is wrong.




Re: [FFmpeg-devel] [PATCH 1/2] libavutil/cpu: Adds fast gather detection.

2021-07-12 Thread James Almer

On 7/12/2021 7:46 AM, Lynne wrote:

12 Jul 2021, 11:29 by alankelly-at-google@ffmpeg.org:


On Fri, Jun 25, 2021 at 1:24 PM Alan Kelly  wrote:


On Fri, Jun 25, 2021 at 10:40 AM Lynne  wrote:


Jun 25, 2021, 09:54 by alankelly-at-google@ffmpeg.org:


Broadwell and later and Zen3 and later have fast gather instructions.
---
  Gather requires between 9 and 12 cycles on Haswell, 5 to 7 on

Broadwell,

  and 2 to 5 on Skylake and newer. It is also slow on AMD before Zen 3.
  libavutil/cpu.h |  2 ++
  libavutil/x86/cpu.c | 18 --
  libavutil/x86/cpu.h |  1 +
  3 files changed, 19 insertions(+), 2 deletions(-)



No, we really don't need more FAST/SLOW flags, especially for
something like this which is just fixable by _not_using_vgather_.
Take a look at libavutil/x86/tx_float.asm, we only use vgather
if it's guaranteed to either be faster for what we're gathering or
is just as fast "slow". If neither is true, we use manual lookups,
which is actually advantageous since for AVX2 we can interleave
the lookups that happen in each lane.

Even if we disregard this, I've extensively benchmarked vgather
on Zen 3, Zen 2, Cascade Lake and Skylake, and there's hardly
a great vgather improvement to be found in Zen 3 to justify
using a new CPU flag for this.



Thanks for your response. I'm not against finding a cleaner way of
enabling/disabling the code which will be protected by this flag. However,
the manual lookups solution proposed will not work in this case: the avx2
version of hscale will only be faster if fast gathers are available;
otherwise, the ssse3 version should be used.

I haven't got access to a Zen3 so I can't comment on the performance. I
have tested on a Zen 2 and it is slow. On Broadwell hscale avx2 is about
10% faster than the ssse3 version and on Skylake about 40% faster; Haswell
has similar performance to Zen 2.

Is there a proxy which could be used for detecting Broadwell or Skylake
and later? AVX512 seems too strict as there are Skylake chips without
AVX512. Thanks
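
A minimal sketch of the kind of family/model CPUID check such a flag could
be built on; the helper name and the model cutoff below are illustrative
only and are not what the patch actually uses:

    #include <cpuid.h>

    /* For Intel family 6, the displayed model is model + (ext_model << 4).
     * Broadwell models are roughly 0x3d/0x47/0x4f/0x56; a real
     * implementation would whitelist exact model numbers and check the
     * vendor string (handling AMD Zen 3+ separately) instead of the rough
     * ">=" comparison used here. */
    static int has_fast_gather_guess(void)
    {
        unsigned eax, ebx, ecx, edx;
        if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
            return 0;
        unsigned family = (eax >> 8) & 0xf;
        unsigned model  = (eax >> 4) & 0xf;
        if (family == 6)
            model |= ((eax >> 16) & 0xf) << 4;
        return family == 6 && model >= 0x3d;
    }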



Hi,

I will paste the performance figures from the thread for the other part of
this patch here so that the justification for this flag is clearer:

                              Skylake  Haswell
hscale_8_to_15_width4_ssse3     761.2      760
hscale_8_to_15_width4_avx2      468.7      957
hscale_8_to_15_width8_ssse3    1170.7     1032
hscale_8_to_15_width8_avx2      865.7     1979
hscale_8_to_15_width12_ssse3   2172.2     2472
hscale_8_to_15_width12_avx2    1245.7     2901
hscale_8_to_15_width16_ssse3   2244.2     2400
hscale_8_to_15_width16_avx2    1647.2     3681

As you can see, it is catastrophic on Haswell and older chips but the gains
on Skylake are impressive.
As I don't have performance figures for Zen 3, I can disable this feature
on all cpus apart from Broadwell and later as you say that there is no
worthwhile improvement on Zen3. Is this OK with you?



It's not that catastrophic. Since Haswell CPUs generally don't have
large AVX2 gains, could you just exclude Haswell only from
EXTERNAL_AVX2_FAST, and require EXTERNAL_AVX2_FAST
to enable those functions?


And disable all non gather AVX2 asm functions on Haswell? No. And it's a 
lie that Haswell doesn't have large gains with AVX2.





Re: [FFmpeg-devel] [PATCH 2/2] avformat/movenc: add support for TTML muxing

2021-07-12 Thread Martin Storsjö

On Tue, 22 Jun 2021, Jan Ekström wrote:


From: Jan Ekström 

Includes basic support for both the ISMV ('dfxp') and MP4 ('stpp')
methods. This initial version also foregoes fragmentation support
as this eases the initial review.


Hmm, I'm not sure I understand here; this seems to add at least some code
in mov_flush_fragment, so there's still some initial support for fragmentation
present - can you elaborate?



Signed-off-by: Jan Ekström 
---
libavformat/Makefile  |   2 +-
libavformat/isom.h|   3 +
libavformat/movenc.c  | 180 +++-
libavformat/movenc.h  |   6 +
libavformat/movenc_ttml.c | 243 ++
libavformat/movenc_ttml.h |  31 +
6 files changed, 462 insertions(+), 3 deletions(-)
create mode 100644 libavformat/movenc_ttml.c
create mode 100644 libavformat/movenc_ttml.h

diff --git a/libavformat/Makefile b/libavformat/Makefile
index c9ef564523..931ad4ac45 100644
--- a/libavformat/Makefile
+++ b/libavformat/Makefile
@@ -337,7 +337,7 @@ OBJS-$(CONFIG_MOV_DEMUXER)   += mov.o 
mov_chan.o mov_esds.o \
qtpalette.o replaygain.o
OBJS-$(CONFIG_MOV_MUXER) += movenc.o av1.o avc.o hevc.o vpcc.o \
movenchint.o mov_chan.o rtp.o \
-movenccenc.o rawutils.o
+movenccenc.o movenc_ttml.o 
rawutils.o
OBJS-$(CONFIG_MP2_MUXER) += rawenc.o
OBJS-$(CONFIG_MP3_DEMUXER)   += mp3dec.o replaygain.o
OBJS-$(CONFIG_MP3_MUXER) += mp3enc.o rawenc.o id3v2enc.o
diff --git a/libavformat/isom.h b/libavformat/isom.h
index ac1b3f3d56..34a58c79b7 100644
--- a/libavformat/isom.h
+++ b/libavformat/isom.h
@@ -387,4 +387,7 @@ static inline enum AVCodecID ff_mov_get_lpcm_codec_id(int 
bps, int flags)
return ff_get_pcm_codec_id(bps, flags & 1, flags & 2, flags & 4 ? -1 : 0);
}

+#define MOV_ISMV_TTML_TAG MKTAG('d', 'f', 'x', 'p')
+#define MOV_MP4_TTML_TAG  MKTAG('s', 't', 'p', 'p')
+
#endif /* AVFORMAT_ISOM_H */
diff --git a/libavformat/movenc.c b/libavformat/movenc.c
index 04f3e94158..d4efb6217f 100644
--- a/libavformat/movenc.c
+++ b/libavformat/movenc.c
@@ -56,6 +56,8 @@
#include "hevc.h"
#include "rtpenc.h"
#include "mov_chan.h"
+#include "movenc_ttml.h"
+#include "ttmlenc.h"
#include "vpcc.h"

static const AVOption options[] = {
@@ -120,6 +122,7 @@ static const AVClass flavor ## _muxer_class = {\
};

static int get_moov_size(AVFormatContext *s);
+static int mov_write_single_packet(AVFormatContext *s, AVPacket *pkt);

static int utf8len(const uint8_t *b)
{
@@ -1788,7 +1791,29 @@ static int mov_write_subtitle_tag(AVIOContext *pb, 
MOVTrack *track)

if (track->par->codec_id == AV_CODEC_ID_DVD_SUBTITLE)
mov_write_esds_tag(pb, track);
-else if (track->par->extradata_size)
+else if (track->par->codec_id == AV_CODEC_ID_TTML) {
+switch (track->par->codec_tag) {
+case MOV_ISMV_TTML_TAG:
+// ye olde ISMV dfxp requires no extradata.


Nit: I'd prefer a more formal/serious wording in the comment than "ye 
olde" :P



+break;
+case MOV_MP4_TTML_TAG:
+// As specified in 14496-30, XMLSubtitleSampleEntry
+// Namespace
+avio_put_str(pb, "http://www.w3.org/ns/ttml";);
+// Empty schema_location
+avio_w8(pb, 0);
+// Empty auxiliary_mime_types
+avio_w8(pb, 0);
+break;
+default:
+av_log(NULL, AV_LOG_ERROR,
+   "Unknown codec tag '%s' utilized for TTML stream with "
+   "index %d (track id %d)!\n",
+   av_fourcc2str(track->par->codec_tag), track->st->index,
+   track->track_id);
+return AVERROR(EINVAL);
+}
+} else if (track->par->extradata_size)
avio_write(pb, track->par->extradata, track->par->extradata_size);

if (track->mode == MODE_MP4 &&
@@ -5254,6 +5279,71 @@ static int 
mov_flush_fragment_interleaving(AVFormatContext *s, MOVTrack *track)
return 0;
}

+static int mov_write_squashed_packet(AVFormatContext *s, MOVTrack *track)
+{
+AVPacket *squashed_packet = ((MOVMuxContext *)s->priv_data)->pkt;


Nit: Maybe spell out the intermediate MOVMuxContext pointer to a separate 
variable for clarity, even if it's used only once.
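
For instance, the spelled-out form could look roughly like:

    MOVMuxContext *mov = s->priv_data;
    AVPacket *squashed_packet = mov->pkt;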



+int ret = AVERROR_BUG;
+
+switch (track->st->codecpar->codec_id) {
+case AV_CODEC_ID_TTML:
+{
+int we_had_packets = !!track->squashed_packet_queue;


Nit: We don't really need the strict 0/1 value of we_had_packets here, so 
we don't need the double negation. And maybe drop the "we_" prefix?



+
+if ((ret = ff_mov_generate_squashed_ttml_packet(s, track, 
squashed_packet)) < 0) {
+goto finish_squash;
+}
+
+// We have generated a padding packet (no a

Re: [FFmpeg-devel] [PATCH 1/1] libavformat/rtsp.c: Reply to GET_PARAMETER requests

2021-07-12 Thread Martin Storsjö

On Sat, 26 Jun 2021, Martin Storsjö wrote:


On Fri, 25 Jun 2021, Hayden Myers wrote:


Some encoders send GET_PARAMETER requests as a keep-alive mechanism.
If the client doesn't reply with an OK message, the encoder will close
the session.  This was encountered with the impath i5110 encoder, when
the RTSP Keep-Alive checkbox is enabled under streaming settings.
Alternatively one may set the X-No-Keepalive: 1 header, but this is more
of a workaround.  It's better practice to respond to an encoder's
keep-alive request, than disable the mechanism which may be manufacturer
specific.

Signed-off-by: Hayden Myers 
---
 libavformat/rtsp.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libavformat/rtsp.c b/libavformat/rtsp.c
index 9f509a229f..dc660368f0 100644
--- a/libavformat/rtsp.c
+++ b/libavformat/rtsp.c
@@ -1259,7 +1259,9 @@ start:
         char base64buf[AV_BASE64_SIZE(sizeof(buf))];
         const char* ptr = buf;

-        if (!strcmp(reply->reason, "OPTIONS")) {
+        if (!strcmp(reply->reason, "OPTIONS") ||
+            !strcmp(reply->reason, "GET_PARAMETER")) {
+
             snprintf(buf, sizeof(buf), "RTSP/1.0 200 OK\r\n");
             if (reply->seq)
                 av_strlcatf(buf, sizeof(buf), "CSeq: %d\r\n", reply->seq);
--


LGTM, this sounds and seems reasonable to me (although untested in practice).


Pushed this patch now. Do note that the patch was badly mangled (the extra 
PNG attachments I think?) which made git unable to automatically apply it 
from the mail message, so I had to retype the patch manually.


// Martin


Re: [FFmpeg-devel] [PATCH 1/2] libavutil/cpu: Adds fast gather detection.

2021-07-12 Thread Lynne
12 Jul 2021, 13:53 by jamr...@gmail.com:

> On 7/12/2021 7:46 AM, Lynne wrote:
>
>> 12 Jul 2021, 11:29 by alankelly-at-google@ffmpeg.org:
>>
>>> On Fri, Jun 25, 2021 at 1:24 PM Alan Kelly  wrote:
>>>
 On Fri, Jun 25, 2021 at 10:40 AM Lynne  wrote:

> Jun 25, 2021, 09:54 by alankelly-at-google@ffmpeg.org:
>
>> Broadwell and later and Zen3 and later have fast gather instructions.
>> ---
>>  Gather requires between 9 and 12 cycles on Haswell, 5 to 7 on
>>
> Broadwell,
>
>> and 2 to 5 on Skylake and newer. It is also slow on AMD before Zen 3.
>>  libavutil/cpu.h |  2 ++
>>  libavutil/x86/cpu.c | 18 --
>>  libavutil/x86/cpu.h |  1 +
>>  3 files changed, 19 insertions(+), 2 deletions(-)
>>
>
> No, we really don't need more FAST/SLOW flags, especially for
> something like this which is just fixable by _not_using_vgather_.
> Take a look at libavutil/x86/tx_float.asm, we only use vgather
> if it's guaranteed to either be faster for what we're gathering or
> is just as fast "slow". If neither is true, we use manual lookups,
> which is actually advantageous since for AVX2 we can interleave
> the lookups that happen in each lane.
>
> Even if we disregard this, I've extensively benchmarked vgather
> on Zen 3, Zen 2, Cascade Lake and Skylake, and there's hardly
> a great vgather improvement to be found in Zen 3 to justify
> using a new CPU flag for this.
>

 Thanks for your response. I'm not against finding a cleaner way of
 enabling/disabling the code which will be protected by this flag. However,
 the manual lookups solution proposed will not work in this case, the avx2
 version of hscale will only be faster if fast gathers are available,
 otherwise, the ssse3 version should be used.

 I haven't got access to a Zen3 so I can't comment on the performance. I
 have tested on a Zen 2 and it is slow. On Broadwell hscale avx2 is about
 10% faster than the ssse3 version and on Skylake about 40% faster, Haswell
 has similar performance to Zen2.

 Is there a proxy which could be used for detecting Broadwell or Skylake
 and later? AVX512 seems too strict as there are Skylake chips without
 AVX512. Thanks

>>>
>>> Hi,
>>>
>>> I will paste the performance figures from the thread for the other part of
>>> this patch here so that the justification for this flag is clearer:
>>>
>>>                               Skylake  Haswell
>>> hscale_8_to_15_width4_ssse3     761.2      760
>>> hscale_8_to_15_width4_avx2      468.7      957
>>> hscale_8_to_15_width8_ssse3    1170.7     1032
>>> hscale_8_to_15_width8_avx2      865.7     1979
>>> hscale_8_to_15_width12_ssse3   2172.2     2472
>>> hscale_8_to_15_width12_avx2    1245.7     2901
>>> hscale_8_to_15_width16_ssse3   2244.2     2400
>>> hscale_8_to_15_width16_avx2    1647.2     3681
>>>
>>> As you can see, it is catastrophic on Haswell and older chips but the gains
>>> on Skylake are impressive.
>>> As I don't have performance figures for Zen 3, I can disable this feature
>>> on all cpus apart from Broadwell and later as you say that there is no
>>> worthwhile improvement on Zen3. Is this OK with you?
>>>
>>
>> It's not that catastrophic. Since Haswell CPUs generally don't have
>> large AVX2 gains, could you just exclude Haswell only from
>> EXTERNAL_AVX2_FAST, and require EXTERNAL_AVX2_FAST
>> to enable those functions?
>>
>
> And disable all non gather AVX2 asm functions on Haswell? No. And it's a lie 
> that Haswell doesn't have large gains with AVX2.
>

It won't disable ALL of the AVX2, but it'll affect a few random components, the
most prominent of which is some (not all) hevc assembly.
But I think I'd rather just not do anything at all. Performance of vgather even
on Haswell is still above 2x the C version, and we barely have any vgathers in
our code. And Haswell use is in decline too.
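
As an illustration of the trade-off being discussed, a minimal intrinsics
sketch contrasting a hardware gather with eight independent manual lookups;
the helpers are hypothetical, and FFmpeg's own implementations are hand-written
assembly rather than intrinsics:

    #include <immintrin.h>

    /* Hardware gather: a single vpgatherdd, slow on Haswell and on AMD
     * before Zen 3. */
    static inline __m256i gather_hw(const int *table, __m256i idx)
    {
        return _mm256_i32gather_epi32(table, idx, 4);   /* scale = 4 bytes */
    }

    /* Manual lookups: eight independent scalar loads, which can be
     * interleaved with other work (e.g. across the two AVX2 lanes). */
    static inline __m256i gather_manual(const int *table, const int *idx)
    {
        return _mm256_setr_epi32(table[idx[0]], table[idx[1]],
                                 table[idx[2]], table[idx[3]],
                                 table[idx[4]], table[idx[5]],
                                 table[idx[6]], table[idx[7]]);
    }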


Re: [FFmpeg-devel] FFMPEG for V4L2 M2M devices ?

2021-07-12 Thread Andrii
>
> A quick Google implies that NVidia already has a stateful V4L2 M2M
> driver in their vendor kernel. Other than the strange choice of device
> node name (/dev/nvhost-nvdec), the details at [3] make it look like a
> normal V4L2 M2M decoder that has a good chance of working against
> h264_v4l2m2m.


Not only does it have a strange node name, it also uses two nodes: one for
decoding, another for converting. The capture plane of the decoder stores
frames in V4L2_PIX_FMT_NV12M format.
The converter is able to convert it to a different format [1].

Could you point me at the documentation for the Pi V4L2 spec?

[1] https://docs.nvidia.com/jetson/l4t-multimedia/group__V4L2Conv.html

Andrii

On Mon, Jul 12, 2021 at 6:02 AM Dave Stevenson <
dave.steven...@raspberrypi.com> wrote:

> On Sat, 10 Jul 2021 at 00:56, Brad Hards  wrote:
> >
> > On Saturday, 10 July 2021 8:53:27 AM AEST Andrii wrote:
> > > I am working on porting a Kodi player to an NVidia Jetson Nano device.
> I've
> > > been developing a decoder for quite some time now, and realized that
> the
> > > best approach would be to have it inside of ffmpeg, instead of
> embedding
> > > the decoder into Kodi as it heavily relies on FFMPEG. Just wondering if
> > > there is any effort in making FFMPEG support M2M V4L devices?
> >
> >
> https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c[1]
> >
> > I guess that would be the basis for further work as required to meet
> your needs.
>
> Do note that there are 2 V4L2 M2M decoder APIs - the stateful API[1] ,
> and the stateless API [2]. They differ in the amount of bitstream
> parsing and buffer management that the driver implements vs expecting
> the client to do.
>
> The *_v4l2m2m drivers within FFMPEG support the stateful API (ie the
> kernel driver has bitstream parsing). For Raspberry Pi we use that to
> support the (older) H264 implementation, and FFMPEG master does that
> very well.
>
> The Pi HEVC decoder uses the V4L2 stateless API. Stateless HEVC
> support hasn't been merged to the mainline kernel as yet, so there are
> downstream patches to support that.
>
> A quick Google implies that NVidia already has a stateful V4L2 M2M
> driver in their vendor kernel. Other than the strange choice of device
> node name (/dev/nvhost-nvdec), the details at [3] make it look like a
> normal V4L2 M2M decoder that has a good chance of working against
> h264_v4l2m2m.
>
> [1]
> https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-decoder.html
> [2]
> https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-stateless-decoder.html
> [3] https://docs.nvidia.com/jetson/l4t-multimedia/group__V4L2Dec.html
>
>   Dave
>
> > Brad
> >
> > 
> > [1]
> https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c
>


Re: [FFmpeg-devel] FFMPEG for V4L2 M2M devices ?

2021-07-12 Thread Dave Stevenson
On Mon, 12 Jul 2021 at 14:51, Andrii  wrote:
>>
>> A quick Google implies that NVidia already has a stateful V4L2 M2M
>> driver in their vendor kernel. Other than the strange choice of device
>> node name (/dev/nvhost-nvdec), the details at [3] make it look like a
>> normal V4L2 M2M decoder that has a good chance of working against
>> h264_v4l2m2m.
>
>
> Not only does it have a strange node name, it also uses two nodes: one for
> decoding, another for converting. The capture plane of the decoder stores
> frames in V4L2_PIX_FMT_NV12M format.
> The converter is able to convert it to a different format [1].

Those appear to be two different hardware blocks.
If you can consume NV12M (YUV420 with interleaved UV plane), then I
see no reason why you have to pass the data through the
"/dev/nvhost-vic" device.

We have a similar thing where /dev/video10 is the decoder (stateful
decode), /dev/video11 is the encoder, and /dev/video12 is the ISP
(Image Sensor Pipeline) wrapped in the V4L2 API.
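
As a rough sketch of how a client can tell whether a given node looks like a
stateful H.264 decoder, one can check that it is an M2M device whose OUTPUT
(bitstream) side accepts V4L2_PIX_FMT_H264; this is illustrative only, with
error handling omitted:

    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/videodev2.h>

    static int looks_like_stateful_h264_decoder(const char *node)
    {
        struct v4l2_capability cap = {0};
        struct v4l2_fmtdesc fmt = {0};
        int fd = open(node, O_RDWR), found = 0;

        if (fd < 0)
            return 0;
        if (ioctl(fd, VIDIOC_QUERYCAP, &cap) < 0 ||
            !(cap.device_caps & (V4L2_CAP_VIDEO_M2M | V4L2_CAP_VIDEO_M2M_MPLANE))) {
            close(fd);
            return 0;
        }
        fmt.type = (cap.device_caps & V4L2_CAP_VIDEO_M2M_MPLANE)
                       ? V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE
                       : V4L2_BUF_TYPE_VIDEO_OUTPUT;
        for (fmt.index = 0; ioctl(fd, VIDIOC_ENUM_FMT, &fmt) == 0; fmt.index++)
            if (fmt.pixelformat == V4L2_PIX_FMT_H264)   /* a stateless decoder
                                                           exposes H264_SLICE
                                                           instead */
                found = 1;
        close(fd);
        return found;
    }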

> Could you point me at the documentation for the Pi V4L2 spec?

It just implements the relevant APIs that I've already linked to. If
it doesn't follow the API, then we fix it so that it does.

Stateful H264 implementation is
https://github.com/raspberrypi/linux/tree/rpi-5.10.y/drivers/staging/vc04_services/bcm2835-codec
Stateless HEVC is
https://github.com/raspberrypi/linux/tree/rpi-5.10.y/drivers/staging/media/rpivid

  Dave

> [1] https://docs.nvidia.com/jetson/l4t-multimedia/group__V4L2Conv.html
>
> Andrii
>
> On Mon, Jul 12, 2021 at 6:02 AM Dave Stevenson 
>  wrote:
>>
>> On Sat, 10 Jul 2021 at 00:56, Brad Hards  wrote:
>> >
>> > On Saturday, 10 July 2021 8:53:27 AM AEST Andrii wrote:
>> > > I am working on porting a Kodi player to an NVidia Jetson Nano device. 
>> > > I've
>> > > been developing a decoder for quite some time now, and realized that the
>> > > best approach would be to have it inside of ffmpeg, instead of embedding
>> > > the decoder into Kodi as it heavily relies on FFMPEG. Just wondering if
>> > > there is any effort in making FFMPEG support M2M V4L devices?
>> >
>> > https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c[1]
>> >
>> > I guess that would be the basis for further work as required to meet your 
>> > needs.
>>
>> Do note that there are 2 V4L2 M2M decoder APIs - the stateful API[1] ,
>> and the stateless API [2]. They differ in the amount of bitstream
>> parsing and buffer management that the driver implements vs expecting
>> the client to do.
>>
>> The *_v4l2m2m drivers within FFMPEG support the stateful API (ie the
>> kernel driver has bitstream parsing). For Raspberry Pi we use that to
>> support the (older) H264 implementation, and FFMPEG master does that
>> very well.
>>
>> The Pi HEVC decoder uses the V4L2 stateless API. Stateless HEVC
>> support hasn't been merged to the mainline kernel as yet, so there are
>> downstream patches to support that.
>>
>> A quick Google implies that NVidia already has a stateful V4L2 M2M
>> driver in their vendor kernel. Other than the strange choice of device
>> node name (/dev/nvhost-nvdec), the details at [3] make it look like a
>> normal V4L2 M2M decoder that has a good chance of working against
>> h264_v4l2m2m.
>>
>> [1] 
>> https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-decoder.html
>> [2] 
>> https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-stateless-decoder.html
>> [3] https://docs.nvidia.com/jetson/l4t-multimedia/group__V4L2Dec.html
>>
>>   Dave
>>
>> > Brad
>> >
>> > 
>> > [1] 
>> > https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c


Re: [FFmpeg-devel] FFMPEG for V4L2 M2M devices ?

2021-07-12 Thread Andriy Gelman
On Mon, 12. Jul 11:02, Dave Stevenson wrote:
> On Sat, 10 Jul 2021 at 00:56, Brad Hards  wrote:
> >
> > On Saturday, 10 July 2021 8:53:27 AM AEST Andrii wrote:
> > > I am working on porting a Kodi player to an NVidia Jetson Nano device. 
> > > I've
> > > been developing a decoder for quite some time now, and realized that the
> > > best approach would be to have it inside of ffmpeg, instead of embedding
> > > the decoder into Kodi as it heavily relies on FFMPEG. Just wondering if
> > > there is any effort in making FFMPEG support M2M V4L devices?
> >
> > https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c[1]
> >
> > I guess that would be the basis for further work as required to meet your 
> > needs.
> 
> Do note that there are 2 V4L2 M2M decoder APIs - the stateful API[1] ,
> and the stateless API [2]. They differ in the amount of bitstream
> parsing and buffer management that the driver implements vs expecting
> the client to do.
> 
> The *_v4l2m2m drivers within FFMPEG support the stateful API (ie the
> kernel driver has bitstream parsing). For Raspberry Pi we use that to
> support the (older) H264 implementation, and FFMPEG master does that
> very well.
> 
> The Pi HEVC decoder uses the V4L2 stateless API. Stateless HEVC
> support hasn't been merged to the mainline kernel as yet, so there are
> downstream patches to support that.

> 
> A quick Google implies that NVidia already has a stateful V4L2 M2M
> driver in their vendor kernel. Other than the strange choice of device
> node name (/dev/nvhost-nvdec), the details at [3] make it look like a
> normal V4L2 M2M decoder that has a good chance of working against
> h264_v4l2m2m.

Some time ago I tried to set up the Jetson nano to work with our v4l2m2m code, 
but there were just too many problems. It wasn't properly spec compliant.

For reference here's link to Nvidia's patch to support decoding on the nano in
ffmpeg:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-June/263746.html

-- 
Andriy


Re: [FFmpeg-devel] [PATCH 5/8] sws: add a new scaling API

2021-07-12 Thread Michael Niedermayer
On Mon, Jul 12, 2021 at 01:07:06PM +0200, Anton Khirnov wrote:
[...]
> diff --git a/libswscale/swscale.h b/libswscale/swscale.h
> index 50d6d46553..41eacd2dea 100644
> --- a/libswscale/swscale.h
> +++ b/libswscale/swscale.h
> @@ -30,6 +30,7 @@
>  #include 
>  
>  #include "libavutil/avutil.h"
> +#include "libavutil/frame.h"
>  #include "libavutil/log.h"
>  #include "libavutil/pixfmt.h"
>  #include "version.h"
> @@ -218,6 +219,85 @@ int sws_scale(struct SwsContext *c, const uint8_t *const 
> srcSlice[],
>const int srcStride[], int srcSliceY, int srcSliceH,
>uint8_t *const dst[], const int dstStride[]);
>  
> +/**
> + * Scale source data from src and write the output to dst.
> + *
> + * This is merely a convenience wrapper around
> + * - sws_frame_start()
> + * - sws_send_slice(0, src->height)
> + * - sws_receive_slice(0, dst->height)
> + * - sws_frame_end()
> + *
> + * @param dst The destination frame. See documentation for sws_frame_start() 
> for
> + *more details.
> + * @param src The source frame.
> + *
> + * @return 0 on success, a negative AVERROR code on failure
> + */
> +int sws_scale_frame(struct SwsContext *c, AVFrame *dst, const AVFrame *src);
> +
> +/**
> + * Initialize the scaling process for a given pair of source/destination 
> frames.
> + * Must be called before any calls to sws_send_slice() and 
> sws_receive_slice().
> + *
> + * This function will retain references to src and dst.
> + *
> + * @param dst The destination frame.
> + *
> + *The data buffers may either be already allocated by the caller 
> or
> + *left clear, in which case they will be allocated by the scaler.
> + *The latter may have performance advantages - e.g. in certain 
> cases
> + *some output planes may be references to input planes, rather 
> than
> + *copies.
> + *
> + *Output data will be written into this frame in successful
> + *sws_receive_slice() calls.
> + * @param src The source frame. The data buffers must be allocated, but the
> + *frame data does not have to be ready at this point. Data
> + *availability is then signalled by sws_send_slice().
> + * @return 0 on success, a negative AVERROR code on failure
> + *
> + * @see sws_frame_end()
> + */
> +int sws_frame_start(struct SwsContext *c, AVFrame *dst, const AVFrame *src);
> +
> +/**
> + * Finish the scaling process for a pair of source/destination frames 
> previously
> + * submitted with sws_frame_start(). Must be called after all 
> sws_send_slice()
> + * and sws_receive_slice() calls are done, before any new sws_frame_start()
> + * calls.
> + */
> +void sws_frame_end(struct SwsContext *c);
> +

> +/**
> + * Indicate that a horizontal slice of input data is available in the source
> + * frame previously provided to sws_frame_start(). The slices may be 
> provided in
> + * any order, but may not overlap. For vertically subsampled pixel formats, 
> the
> + * slices must be aligned according to subsampling.
> + *
> + * @param slice_start first row of the slice
> + * @param slice_height number of rows in the slice
> + *
> + * @return 0 on success, a negative AVERROR code on failure.
> + */
> +int sws_send_slice(struct SwsContext *c, unsigned int slice_start,
> +   unsigned int slice_height);

I suggest using a non-zero return value on success.
That could then be extended in the future, for example to provide information
about how many lines have already been consumed and whose memory can be reused.
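
For reference, a minimal usage sketch of the slice API as documented in the
quoted header; sws_receive_slice() is assumed to take the same (start, height)
arguments as sws_send_slice(), and error handling is kept minimal:

    #include <libswscale/swscale.h>

    /* Sketch only: assumes an already-configured SwsContext *c and
     * allocated src/dst frames as described in the documentation above. */
    int scale_one_frame(struct SwsContext *c, AVFrame *dst, const AVFrame *src)
    {
        int ret = sws_frame_start(c, dst, src);
        if (ret < 0)
            return ret;

        /* here the whole input is available at once; slices could equally
         * be fed incrementally as they become ready */
        ret = sws_send_slice(c, 0, src->height);
        if (ret >= 0)
            ret = sws_receive_slice(c, 0, dst->height);

        sws_frame_end(c);
        return ret < 0 ? ret : 0;
    }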

thx

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you think the mosad wants you dead since a long time then you are either
wrong or dead since a long time.




Re: [FFmpeg-devel] [PATCH 8/8] lavfi/vf_scale: pass the thread count to the scaler

2021-07-12 Thread Michael Niedermayer
On Mon, Jul 12, 2021 at 01:07:09PM +0200, Anton Khirnov wrote:
> ---
>  libavfilter/vf_scale.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/libavfilter/vf_scale.c b/libavfilter/vf_scale.c
> index cdff3ab7ed..f676f5d82e 100644
> --- a/libavfilter/vf_scale.c
> +++ b/libavfilter/vf_scale.c
> @@ -543,6 +543,7 @@ static int config_props(AVFilterLink *outlink)
>  av_opt_set_int(*s, "sws_flags", scale->flags, 0);
>  av_opt_set_int(*s, "param0", scale->param[0], 0);
>  av_opt_set_int(*s, "param1", scale->param[1], 0);
> +av_opt_set_int(*s, "threads", ff_filter_get_nb_threads(ctx), 0);
>  if (scale->in_range != AVCOL_RANGE_UNSPECIFIED)
>  av_opt_set_int(*s, "src_range",
> scale->in_range == AVCOL_RANGE_JPEG, 0);
> -- 
> 2.30.2

seems to crash:

-f image2 -vcodec pgmyuv -i tests/vsynth1/01.pgm  -vf format=xyz12le -vcodec 
rawvideo -pix_fmt xyz12le -y file-xyz.j2k

==13394== Thread 35:
==13394== Invalid read of size 2
==13394==at 0x118F1BA: rgb48Toxyz12 (swscale.c:705)
==13394==by 0x1190A9A: scale_internal (swscale.c:1048)
==13394==by 0x11911C7: ff_sws_slice_worker (swscale.c:1206)
==13394==by 0x126205E: run_jobs (slicethread.c:61)
==13394==by 0x1262130: thread_worker (slicethread.c:85)
==13394==by 0xCB4E6DA: start_thread (pthread_create.c:463)
==13394==by 0xCE8771E: clone (clone.S:95)
==13394==  Address 0x370db89e is 608,286 bytes inside a block of size 608,287 
alloc'd
==13394==at 0x4C33E76: memalign (in 
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13394==by 0x4C33F91: posix_memalign (in 
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13394==by 0x123235A: av_malloc (mem.c:87)
==13394==by 0x1218455: av_buffer_alloc (buffer.c:72)
==13394==by 0x12184D4: av_buffer_allocz (buffer.c:85)
==13394==by 0x1218EA5: pool_alloc_buffer (buffer.c:351)
==13394==by 0x1218FED: av_buffer_pool_get (buffer.c:388)
==13394==by 0x2B47E9: ff_frame_pool_get (framepool.c:221)
==13394==by 0x472F03: ff_default_get_video_buffer (video.c:90)
==13394==by 0x472FBD: ff_get_video_buffer (video.c:109)
==13394==by 0x472D0E: ff_null_get_video_buffer (video.c:41)
==13394==by 0x472F9E: ff_get_video_buffer (video.c:106)
==13394==by 0x472D0E: ff_null_get_video_buffer (video.c:41)
==13394==by 0x472F9E: ff_get_video_buffer (video.c:106)
==13394==by 0x3E6FDD: scale_frame (vf_scale.c:755)
==13394==by 0x3E7557: filter_frame (vf_scale.c:838)
==13394==by 0x29B314: ff_filter_frame_framed (avfilter.c:969)
==13394==by 0x29BBCF: ff_filter_frame_to_filter (avfilter.c:1117)
==13394==by 0x29BDDF: ff_filter_activate_default (avfilter.c:1166)
==13394==by 0x29C003: ff_filter_activate (avfilter.c:1324)

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

There will always be a question for which you do not know the correct answer.




[FFmpeg-devel] [PATCH] avcodec/h264_parser: remove key frame tagging heuristics

2021-07-12 Thread James Almer
None of these packets contain keyframes, and tagging them as such can result in
non spec compliant output when remuxing into containers like mp4 and Matroska,
where bogus samples would be marked as Sync Samples.

Some tests are updated to reflect this.

Suggested-by: ffm...@fb.com
Signed-off-by: James Almer 
---
 libavcodec/h264_parser.c   |  8 
 tests/fate-run.sh  |  4 ++--
 tests/fate/ffmpeg.mak  |  2 +-
 tests/fate/lavf-container.mak  | 12 ++--
 tests/fate/matroska.mak|  2 +-
 tests/ref/fate/copy-trac2211-avi   |  2 +-
 tests/ref/fate/matroska-h264-remux |  4 ++--
 tests/ref/fate/segment-mp4-to-ts   | 10 +-
 tests/ref/lavf-fate/h264.mp4   |  4 ++--
 9 files changed, 20 insertions(+), 28 deletions(-)

diff --git a/libavcodec/h264_parser.c b/libavcodec/h264_parser.c
index d3c56cc188..e78c3679fb 100644
--- a/libavcodec/h264_parser.c
+++ b/libavcodec/h264_parser.c
@@ -344,10 +344,6 @@ static inline int parse_nal_units(AVCodecParserContext *s,
 get_ue_golomb_long(&nal.gb);  // skip first_mb_in_slice
 slice_type   = get_ue_golomb_31(&nal.gb);
 s->pict_type = ff_h264_golomb_to_pict_type[slice_type % 5];
-if (p->sei.recovery_point.recovery_frame_cnt >= 0) {
-/* key frame, since recovery_frame_cnt is set */
-s->key_frame = 1;
-}
 pps_id = get_ue_golomb(&nal.gb);
 if (pps_id >= MAX_PPS_COUNT) {
 av_log(avctx, AV_LOG_ERROR,
@@ -370,10 +366,6 @@ static inline int parse_nal_units(AVCodecParserContext *s,
 p->ps.sps = p->ps.pps->sps;
 sps   = p->ps.sps;
 
-// heuristic to detect non marked keyframes
-if (p->ps.sps->ref_frame_count <= 1 && p->ps.pps->ref_count[0] <= 
1 && s->pict_type == AV_PICTURE_TYPE_I)
-s->key_frame = 1;
-
 p->poc.frame_num = get_bits(&nal.gb, sps->log2_max_frame_num);
 
 s->coded_width  = 16 * sps->mb_width;
diff --git a/tests/fate-run.sh b/tests/fate-run.sh
index ba437dfbb8..2117ca387e 100755
--- a/tests/fate-run.sh
+++ b/tests/fate-run.sh
@@ -339,8 +339,8 @@ lavf_container_fate()
 outdir="tests/data/lavf-fate"
 file=${outdir}/lavf.$t
 input="${target_samples}/$1"
-do_avconv $file -auto_conversion_filters $DEC_OPTS $2 -i "$input" 
"$ENC_OPTS -metadata title=lavftest" -vcodec copy -acodec copy
-do_avconv_crc $file -auto_conversion_filters $DEC_OPTS -i 
$target_path/$file $3
+do_avconv $file -auto_conversion_filters $DEC_OPTS $2 -i "$input" 
"$ENC_OPTS -metadata title=lavftest" -vcodec copy -acodec copy %3
+do_avconv_crc $file -auto_conversion_filters $DEC_OPTS -i 
$target_path/$file $4
 }
 
 lavf_image(){
diff --git a/tests/fate/ffmpeg.mak b/tests/fate/ffmpeg.mak
index 4dfb77d250..57d16fba6f 100644
--- a/tests/fate/ffmpeg.mak
+++ b/tests/fate/ffmpeg.mak
@@ -110,7 +110,7 @@ fate-copy-trac4914-avi: CMD = transcode mpegts 
$(TARGET_SAMPLES)/mpeg2/xdcam8mp2
 FATE_STREAMCOPY-$(call ALLYES, H264_DEMUXER AVI_MUXER) += 
fate-copy-trac2211-avi
 fate-copy-trac2211-avi: $(SAMPLES)/h264/bbc2.sample.h264
 fate-copy-trac2211-avi: CMD = transcode "h264 -r 14" 
$(TARGET_SAMPLES)/h264/bbc2.sample.h264\
-  avi "-c:a copy -c:v copy"
+  avi "-c:a copy -c:v copy -copyinkf"
 
 FATE_STREAMCOPY-$(call ENCDEC, APNG, APNG) += fate-copy-apng
 fate-copy-apng: fate-lavf-apng
diff --git a/tests/fate/lavf-container.mak b/tests/fate/lavf-container.mak
index 9e0eed4851..40250badc1 100644
--- a/tests/fate/lavf-container.mak
+++ b/tests/fate/lavf-container.mak
@@ -71,13 +71,13 @@ FATE_LAVF_CONTAINER_FATE = 
$(FATE_LAVF_CONTAINER_FATE-yes:%=fate-lavf-fate-%)
 $(FATE_LAVF_CONTAINER_FATE): REF = 
$(SRC_PATH)/tests/ref/lavf-fate/$(@:fate-lavf-fate-%=%)
 $(FATE_LAVF_CONTAINER_FATE): $(AREF) $(VREF)
 
-fate-lavf-fate-av1.mp4: CMD = lavf_container_fate 
"av1-test-vectors/av1-1-b8-05-mv.ivf" "" "-c:v copy"
-fate-lavf-fate-av1.mkv: CMD = lavf_container_fate 
"av1-test-vectors/av1-1-b8-05-mv.ivf" "" "-c:v copy"
-fate-lavf-fate-h264.mp4: CMD = lavf_container_fate "h264/intra_refresh.h264" 
"" "-c:v copy"
+fate-lavf-fate-av1.mp4: CMD = lavf_container_fate 
"av1-test-vectors/av1-1-b8-05-mv.ivf" "" "" "-c:v copy"
+fate-lavf-fate-av1.mkv: CMD = lavf_container_fate 
"av1-test-vectors/av1-1-b8-05-mv.ivf" "" "" "-c:v copy"
+fate-lavf-fate-h264.mp4: CMD = lavf_container_fate "h264/intra_refresh.h264" 
"" "-copyinkf" "-c:v copy -copyinkf"
 fate-lavf-fate-vp3.ogg: CMD = lavf_container_fate "vp3/coeff_level64.mkv" 
"-idct auto"
-fate-lavf-fate-vp8.ogg: CMD = lavf_container_fate "vp8/RRSF49-short.webm" "" 
"-acodec copy"
-fate-lavf-fate-latm: CMD = lavf_container_fate "aac/al04_44.mp4" "" "-acodec 
copy"
-fate-lavf-fate-mp3: CMD = lavf_container_fate "mp3-conformance/he_32khz.bit" 
"" "-acodec copy"
+fate-lavf-fate-vp8.ogg: CMD = lavf_container_fate "vp8/RRSF49-short.webm" "" 
"" 

Re: [FFmpeg-devel] [PATCH 6/8] lavfi/vf_scale: convert to the frame-based sws API

2021-07-12 Thread Michael Niedermayer
On Mon, Jul 12, 2021 at 01:07:07PM +0200, Anton Khirnov wrote:
> ---
>  libavfilter/vf_scale.c | 73 --
>  1 file changed, 49 insertions(+), 24 deletions(-)

crashes:

 ./ffmpeg  -i ~/tickets/5264/gbrap16.tif -vf 
format=yuva444p,scale=alphablend=checkerboard,format=yuv420p -y file.png

 Stream mapping:
  Stream #0:0 -> #0:0 (tiff (native) -> png (native))
Press [q] to stop, [?] for help
==19419== Invalid read of size 4
==19419==at 0x1223964: av_frame_ref (frame.c:330)
==19419==by 0x1190B34: sws_frame_start (swscale.c:1069)
==19419==by 0x1190EA4: sws_scale_frame (swscale.c:1153)
==19419==by 0x3E7493: scale_frame (vf_scale.c:821)
==19419==by 0x3E752D: filter_frame (vf_scale.c:837)
==19419==by 0x29B314: ff_filter_frame_framed (avfilter.c:969)
==19419==by 0x29BBCF: ff_filter_frame_to_filter (avfilter.c:1117)
==19419==by 0x29BDDF: ff_filter_activate_default (avfilter.c:1166)
==19419==by 0x29C003: ff_filter_activate (avfilter.c:1324)
==19419==by 0x2A0EBB: ff_filter_graph_run_once (avfiltergraph.c:1400)
==19419==by 0x2A2139: push_frame (buffersrc.c:157)
==19419==by 0x2A26B6: av_buffersrc_add_frame_flags (buffersrc.c:225)
==19419==by 0x24FC90: ifilter_send_frame (ffmpeg.c:2241)
==19419==by 0x24FF72: send_frame_to_filters (ffmpeg.c:2315)
==19419==by 0x250D26: decode_video (ffmpeg.c:2512)
==19419==by 0x2517BA: process_input_packet (ffmpeg.c:2674)
==19419==by 0x25799F: process_input (ffmpeg.c:4403)
==19419==by 0x2599A4: transcode_step (ffmpeg.c:4758)
==19419==by 0x259B0C: transcode (ffmpeg.c:4812)
==19419==by 0x25A470: main (ffmpeg.c:5017)
==19419==  Address 0x68 is not stack'd, malloc'd or (recently) free'd
==19419== 
==19419== 
 

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Complexity theory is the science of finding the exact solution to an
approximation. Benchmarking OTOH is finding an approximation of the exact




Re: [FFmpeg-devel] [PATCH 6/8] lavfi/vf_scale: convert to the frame-based sws API

2021-07-12 Thread James Almer

On 7/12/2021 4:39 PM, Michael Niedermayer wrote:

On Mon, Jul 12, 2021 at 01:07:07PM +0200, Anton Khirnov wrote:

---
  libavfilter/vf_scale.c | 73 --
  1 file changed, 49 insertions(+), 24 deletions(-)


crashes:

  ./ffmpeg  -i ~/tickets/5264/gbrap16.tif -vf 
format=yuva444p,scale=alphablend=checkerboard,format=yuv420p -y file.png

  Stream mapping:
   Stream #0:0 -> #0:0 (tiff (native) -> png (native))
Press [q] to stop, [?] for help
==19419== Invalid read of size 4
==19419==at 0x1223964: av_frame_ref (frame.c:330)
==19419==by 0x1190B34: sws_frame_start (swscale.c:1069)
==19419==by 0x1190EA4: sws_scale_frame (swscale.c:1153)
==19419==by 0x3E7493: scale_frame (vf_scale.c:821)
==19419==by 0x3E752D: filter_frame (vf_scale.c:837)
==19419==by 0x29B314: ff_filter_frame_framed (avfilter.c:969)
==19419==by 0x29BBCF: ff_filter_frame_to_filter (avfilter.c:1117)
==19419==by 0x29BDDF: ff_filter_activate_default (avfilter.c:1166)
==19419==by 0x29C003: ff_filter_activate (avfilter.c:1324)
==19419==by 0x2A0EBB: ff_filter_graph_run_once (avfiltergraph.c:1400)
==19419==by 0x2A2139: push_frame (buffersrc.c:157)
==19419==by 0x2A26B6: av_buffersrc_add_frame_flags (buffersrc.c:225)
==19419==by 0x24FC90: ifilter_send_frame (ffmpeg.c:2241)
==19419==by 0x24FF72: send_frame_to_filters (ffmpeg.c:2315)
==19419==by 0x250D26: decode_video (ffmpeg.c:2512)
==19419==by 0x2517BA: process_input_packet (ffmpeg.c:2674)
==19419==by 0x25799F: process_input (ffmpeg.c:4403)
==19419==by 0x2599A4: transcode_step (ffmpeg.c:4758)
==19419==by 0x259B0C: transcode (ffmpeg.c:4812)
==19419==by 0x25A470: main (ffmpeg.c:5017)
==19419==  Address 0x68 is not stack'd, malloc'd or (recently) free'd
==19419==
==19419==


Both c->frame_src and c->frame_dst need to be allocated earlier in 
sws_init_context(). There seem to be some cases where that function will 
return early with a success code.



[FFmpeg-devel] [PATCH] web/security: Add CVE-2021-30123 (never affected a release)

2021-07-12 Thread Michael Niedermayer
Thanks to Jan Ekström for details
---
 src/security | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/security b/src/security
index 935823b..1248018 100644
--- a/src/security
+++ b/src/security
@@ -15,6 +15,7 @@ CVE-2020-21041, 5d9f44da460f781a1604d537d0555b78e29438ba, 
ticket/7989
 CVE-2020-22038, 7c32e9cf93b712f8463573a59ed4e98fd10fa013, ticket/8285
 CVE-2020-22042, 426c16d61a9b5056a157a1a2a057a4e4d13eef84, ticket/8267
 CVE-2020-24020, 584f396132aa19d21bb1e38ad9a5d428869290cb, ticket/8718
+CVE-2021-30123, d6f293353c94c7ce200f6e0975ae3de49787f91f, ticket/8845, never 
affected a release
 CVE-2020-35965, 3e5959b3457f7f1856d997261e6ac672bba49e8b
 CVE-2020-35965, b0a8b40294ea212c1938348ff112ef1b9bf16bb3
 
-- 
2.17.1



Re: [FFmpeg-devel] [PATCH] avcodec/h264_parser: remove key frame tagging heuristics

2021-07-12 Thread Kieran Kunhya
On Mon, 12 Jul 2021 at 20:33, James Almer  wrote:

> None of these packets contain keyframes, and tagging them as such can
> result in
> non spec compliant output when remuxing into containers like mp4 and
> Matroska,
> where bogus samples would be marked as Sync Samples.
>
> Some tests are updated to reflect this.
>
> Suggested-by: ffm...@fb.com
> Signed-off-by: James Almer 
> ---
>  libavcodec/h264_parser.c   |  8 
>  tests/fate-run.sh  |  4 ++--
>  tests/fate/ffmpeg.mak  |  2 +-
>  tests/fate/lavf-container.mak  | 12 ++--
>  tests/fate/matroska.mak|  2 +-
>  tests/ref/fate/copy-trac2211-avi   |  2 +-
>  tests/ref/fate/matroska-h264-remux |  4 ++--
>  tests/ref/fate/segment-mp4-to-ts   | 10 +-
>  tests/ref/lavf-fate/h264.mp4   |  4 ++--
>  9 files changed, 20 insertions(+), 28 deletions(-)
>
> diff --git a/libavcodec/h264_parser.c b/libavcodec/h264_parser.c
> index d3c56cc188..e78c3679fb 100644
> --- a/libavcodec/h264_parser.c
> +++ b/libavcodec/h264_parser.c
> @@ -344,10 +344,6 @@ static inline int
> parse_nal_units(AVCodecParserContext *s,
>  get_ue_golomb_long(&nal.gb);  // skip first_mb_in_slice
>  slice_type   = get_ue_golomb_31(&nal.gb);
>  s->pict_type = ff_h264_golomb_to_pict_type[slice_type % 5];
> -if (p->sei.recovery_point.recovery_frame_cnt >= 0) {
> -/* key frame, since recovery_frame_cnt is set */
> -s->key_frame = 1;
> -}
>

Why remove this, this is a reasonable check for a key frame?

Kieran


Re: [FFmpeg-devel] [PATCH] avcodec/h264_parser: remove key frame tagging heuristics

2021-07-12 Thread James Almer

On 7/12/2021 8:53 PM, Kieran Kunhya wrote:

On Mon, 12 Jul 2021 at 20:33, James Almer  wrote:


None of these packets contain keyframes, and tagging them as such can
result in
non spec compliant output when remuxing into containers like mp4 and
Matroska,
where bogus samples would be marked as Sync Samples.

Some tests are updated to reflect this.

Suggested-by: ffm...@fb.com
Signed-off-by: James Almer 
---
  libavcodec/h264_parser.c   |  8 
  tests/fate-run.sh  |  4 ++--
  tests/fate/ffmpeg.mak  |  2 +-
  tests/fate/lavf-container.mak  | 12 ++--
  tests/fate/matroska.mak|  2 +-
  tests/ref/fate/copy-trac2211-avi   |  2 +-
  tests/ref/fate/matroska-h264-remux |  4 ++--
  tests/ref/fate/segment-mp4-to-ts   | 10 +-
  tests/ref/lavf-fate/h264.mp4   |  4 ++--
  9 files changed, 20 insertions(+), 28 deletions(-)

diff --git a/libavcodec/h264_parser.c b/libavcodec/h264_parser.c
index d3c56cc188..e78c3679fb 100644
--- a/libavcodec/h264_parser.c
+++ b/libavcodec/h264_parser.c
@@ -344,10 +344,6 @@ static inline int
parse_nal_units(AVCodecParserContext *s,
  get_ue_golomb_long(&nal.gb);  // skip first_mb_in_slice
  slice_type   = get_ue_golomb_31(&nal.gb);
  s->pict_type = ff_h264_golomb_to_pict_type[slice_type % 5];
-if (p->sei.recovery_point.recovery_frame_cnt >= 0) {
-/* key frame, since recovery_frame_cnt is set */
-s->key_frame = 1;
-}



Why remove this, this is a reasonable check for a key frame?


Because it isn't something that should be marked as a keyframe in the coded
bitstream in any kind of container, as is the case with mp4 sync samples.




Kieran


Re: [FFmpeg-devel] [PATCH] avcodec/h264_parser: remove key frame tagging heuristics

2021-07-12 Thread Kieran Kunhya
>
> Because it isn't something that should be marked as a keyframe as coded
> bitstream in any kind of container, like it's the case of mp4 sync samples.
>

MPEG-TS Random Access Indicator expects keyframes to be signalled like this.
With intra-refresh and this code removed, there will be no random access
points at all.

Kieran


Re: [FFmpeg-devel] [PATCH] avcodec/h264_parser: remove key frame tagging heuristics

2021-07-12 Thread James Almer

On 7/12/2021 10:01 PM, Kieran Kunhya wrote:


Because it isn't something that should be marked as a keyframe as coded
bitstream in any kind of container, like it's the case of mp4 sync samples.



MPEG-TS Random Access Indicator expects keyframes to be signalled like this.
With intra-refresh and this code removed, there will be no random access
points at all.


If MPEG-TS wants to tag packets containing things other than IDR access
units as RAPs, then it should analyze the bitstream itself in order to
tag them as such in the output.
This parser as is generates invalid output for other containers that
are strict about key frames and that signal recovery points (like those
indicated by the use of this SEI) by other means.




Kieran


Re: [FFmpeg-devel] [PATCH] avcodec/h264_parser: remove key frame tagging heuristics

2021-07-12 Thread Kieran Kunhya
On Tue, 13 Jul 2021, 02:45 James Almer,  wrote:

> On 7/12/2021 10:01 PM, Kieran Kunhya wrote:
> >>
> >> Because it isn't something that should be marked as a keyframe as coded
> >> bitstream in any kind of container, like it's the case of mp4 sync
> samples.
> >>
> >
> > MPEG-TS Random Access Indicator expects keyframes to be signalled like
> this.
> > With intra-refresh and this code removed, there will be no random access
> > points at all.
>
> If MPEG-TS wants to tag packets containing things other than IDR access
> units as RAPs, then it should analyze the bitstream itself in order to
> tag them itself as such in the output.
> This parser as is is generating invalid output for other containers that
> are strict about key frames, and signal recovery points (like those
> indicated by the use of this SEI) by other means.
>

Why not just detect IDR in containers that only care about that (which is a
mistake because of things like open GOP)? Doing that is relatively simple
compared to adding bitstream parsing into MPEG-TS.

Kieran
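
As an illustration of that "relatively simple" check, a sketch that scans an
Annex-B H.264 packet for an IDR slice NAL (nal_unit_type 5); this is an
editorial example, not code from the patch:

    #include <stddef.h>
    #include <stdint.h>

    static int h264_annexb_has_idr(const uint8_t *buf, size_t size)
    {
        for (size_t i = 0; i + 3 < size; i++) {
            /* match a 00 00 01 start code (a 4-byte code ends the same way) */
            if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
                if ((buf[i + 3] & 0x1f) == 5)   /* nal_unit_type 5 = IDR slice */
                    return 1;
                i += 2;
            }
        }
        return 0;
    }

Note that an open-GOP recovery point, which is what the removed heuristic
keyed on, would not be caught by such a check, which is the crux of the
disagreement above.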