Re: [FFmpeg-devel] [PATCH] fate/oggvorbis: Fix tests after fixing AV_PKT_DATA_SKIP_SAMPLES
10 Jul 2021, 02:12 by sunguangy...@gmail.com:
> After fixing AV_PKT_DATA_SKIP_SAMPLES for reading vorbis packets from ogg,
> the actual decoded samples become fewer. Three fate tests are failing:
>
> fate-vorbis-20:
> The samples in 6.ogg are not frame aligned. 6.pcm file was generated by
> ffmpeg before the fix. After the fix, the decoded pcm file does not match
> anymore. Ideally the ref file 6.pcm should be updated but it is probably
> not worth including another copy of the same file, only smaller.
> SIZE_TOLERANCE is added for this test case.
>
> fate-webm-dash-chapters:
> The original vorbis_chapter_extension_demo.ogg is transmuxed to dash-webm.
> The ref file webm-dash-chapters needs to be updated.
>
> fate-vorbis-encode:
> This exposes another bug in the vorbis encoder: initial_padding is not
> correctly set. It is fixed in the previous patch.
>
> Signed-off-by: Guangyu Sun
> ---
> tests/fate/vorbis.mak             | 1 +
> tests/ref/fate/webm-dash-chapters | 4 ++--
> 2 files changed, 3 insertions(+), 2 deletions(-)

Pushed the patchset, thanks
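For readers who have not dealt with this side data before: AV_PKT_DATA_SKIP_SAMPLES
tells the decoder how many samples to drop from the start and the end of the output
produced from a packet, which is why correcting it changes the number of decoded
samples and invalidates the reference files above. A minimal sketch of how a caller
could inspect it on a demuxed packet (the layout follows the documented 10-byte
side-data format; note the size argument of av_packet_get_side_data() is an int in
older FFmpeg releases):

#include <stdio.h>
#include <libavcodec/packet.h>
#include <libavutil/intreadwrite.h>

static void print_skip_samples(const AVPacket *pkt)
{
    size_t size; /* int in older FFmpeg releases */
    const uint8_t *sd = av_packet_get_side_data(pkt, AV_PKT_DATA_SKIP_SAMPLES, &size);

    if (sd && size >= 10) {
        uint32_t skip_start = AV_RL32(sd);     /* samples to skip from the start */
        uint32_t skip_end   = AV_RL32(sd + 4); /* samples to discard from the end */
        printf("skip %u samples at start, %u at end\n", skip_start, skip_end);
    }
}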
Re: [FFmpeg-devel] [PATCH] cafenc: fill in avg. packet size later if unknown
10 Jul 2021, 09:42 by d...@lynne.ee: > 10 Jul 2021, 03:10 by roman.bera...@prusa3d.cz: > >> Frame size of Opus stream was previously presumed here to be 960 samples >> (20ms), however sizes of 120, 240, 480, 1920, and 2880 are also allowed. >> It can also alter on a per-packet basis and even multiple frames may be >> present in a single packet according to the specification, for the sake >> of simplicity however, let us assume that this doesn't occur. >> > > Actually 120ms frames are the maximum, so 5760 samples, but that's > irrelevant to the patch. > > >> if (pb->seekable & AVIO_SEEKABLE_NORMAL) { >> int64_t file_size = avio_tell(pb); >> >> avio_seek(pb, caf->data, SEEK_SET); >> avio_wb64(pb, file_size - caf->data - 8); >> -avio_seek(pb, file_size, SEEK_SET); >> if (!par->block_align) { >> +int packet_size = samples_per_packet(par->codec_id, >> par->channels, par->block_align); >> +if (!packet_size) { >> +packet_size = st->duration / (caf->packets - 1); >> +avio_seek(pb, FRAME_SIZE_OFFSET, SEEK_SET); >> +avio_wb32(pb, packet_size); >> +} >> +avio_seek(pb, file_size, SEEK_SET); >> ffio_wfourcc(pb, "pakt"); >> avio_wb64(pb, caf->size_entries_used + 24); >> avio_wb64(pb, caf->packets); ///< mNumberPackets >> -avio_wb64(pb, caf->packets * samples_per_packet(par->codec_id, >> par->channels, par->block_align)); ///< mNumberValidFrames >> +avio_wb64(pb, caf->packets * packet_size); ///< >> mNumberValidFrames >> avio_wb32(pb, 0); ///< mPrimingFrames >> avio_wb32(pb, 0); ///< mRemainderFrames >> avio_write(pb, caf->pkt_sizes, caf->size_entries_used); >> > > This doesn't move the pointer back to the file end if par->block_align is set. > I think that's fine though, since the function writes the trailer, which > should > mean that nothing more needs to be written. > Patch LGTM. But please, someone yell at Apple to support Opus in MP4, > WebM and OGG, as terrible as that is. > Patch pushed, thanks ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
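As background for the frame-size discussion above: the Opus frame duration is
signalled per packet in the TOC byte, so a muxer cannot assume 20 ms (960 samples
at 48 kHz). A rough sketch of the mapping, based on RFC 6716 Table 2 (this is
illustrative, not code from the patch; a packet may additionally carry several
such frames, which is where the 120 ms / 5760-sample maximum comes from):

/* Samples per Opus frame at 48 kHz, derived from the TOC byte. */
static int opus_samples_per_frame(uint8_t toc)
{
    static const int silk[4]   = { 480, 960, 1920, 2880 }; /* 10/20/40/60 ms */
    static const int hybrid[2] = { 480, 960 };              /* 10/20 ms       */
    static const int celt[4]   = { 120, 240, 480, 960 };    /* 2.5/5/10/20 ms */
    int config = toc >> 3;

    if (config < 12)
        return silk[config & 3];
    if (config < 16)
        return hybrid[config & 1];
    return celt[config & 3];
}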
[FFmpeg-devel] [PATCH v2] libavcodec/libx265: add user data unregistered SEI encoding
MISB ST 0604 and ST 2101 require user data unregistered SEI messages (precision timestamps and sensor identifiers) to be included. That currently isn't supported for libx265. This patch adds support for user data unregistered SEI messages in accordance with ISO/IEC 23008-2:2020 Section D.2.7 The design is based on nvenc, with support finished up at 57de80673cb --- libavcodec/libx265.c | 33 + 1 file changed, 33 insertions(+) diff --git a/libavcodec/libx265.c b/libavcodec/libx265.c index 90658d3d9e..9395120471 100644 --- a/libavcodec/libx265.c +++ b/libavcodec/libx265.c @@ -35,6 +35,7 @@ #include "encode.h" #include "internal.h" #include "packet_internal.h" +#include "sei.h" typedef struct libx265Context { const AVClass *class; @@ -51,6 +52,9 @@ typedef struct libx265Context { char *profile; AVDictionary *x265_opts; +void *sei_data; +int sei_data_size; + /** * If the encoder does not support ROI then warn the first time we * encounter a frame with ROI side data. @@ -78,6 +82,7 @@ static av_cold int libx265_encode_close(AVCodecContext *avctx) libx265Context *ctx = avctx->priv_data; ctx->api->param_free(ctx->params); +av_freep(&ctx->sei_data); if (ctx->encoder) ctx->api->encoder_close(ctx->encoder); @@ -489,6 +494,8 @@ static int libx265_encode_frame(AVCodecContext *avctx, AVPacket *pkt, ctx->api->picture_init(ctx->params, &x265pic); if (pic) { +x265_sei *sei = &x265pic.userSEI; +sei->numPayloads = 0; for (i = 0; i < 3; i++) { x265pic.planes[i] = pic->data[i]; x265pic.stride[i] = pic->linesize[i]; @@ -516,6 +523,32 @@ static int libx265_encode_frame(AVCodecContext *avctx, AVPacket *pkt, memcpy(x265pic.userData, &pic->reordered_opaque, sizeof(pic->reordered_opaque)); } + +for (i = 0; i < pic->nb_side_data; i++) { +AVFrameSideData *side_data = pic->side_data[i]; +void *tmp; +x265_sei_payload *sei_payload; + +if (side_data->type != AV_FRAME_DATA_SEI_UNREGISTERED) +continue; + +tmp = av_fast_realloc(ctx->sei_data, + &ctx->sei_data_size, + (sei->numPayloads + 1) * sizeof(*sei_payload)); +if (!tmp) { +av_freep(&x265pic.userData); +av_freep(&x265pic.quantOffsets); +return AVERROR(ENOMEM); +} +ctx->sei_data = tmp; +sei->payloads = ctx->sei_data; +sei_payload = &sei->payloads[sei->numPayloads]; +sei_payload->payload = side_data->data; +sei_payload->payloadSize = side_data->size; +/* Equal to libx265 USER_DATA_UNREGISTERED */ +sei_payload->payloadType = SEI_TYPE_USER_DATA_UNREGISTERED; +sei->numPayloads++; +} } ret = ctx->api->encoder_encode(ctx->encoder, &nal, &nnal, -- 2.27.0 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
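For anyone wanting to exercise the new code path from the API side: the encoder
picks the payload up from AV_FRAME_DATA_SEI_UNREGISTERED side data on the input
frame, whose documented layout is a 16-byte UUID followed by the message body.
A minimal sketch of attaching one such message before sending the frame to the
encoder (the UUID and message contents are up to the caller, e.g. the
MISB-defined values):

#include <string.h>
#include <libavutil/error.h>
#include <libavutil/frame.h>

static int attach_unregistered_sei(AVFrame *frame, const uint8_t uuid[16],
                                   const uint8_t *msg, size_t msg_len)
{
    AVFrameSideData *sd = av_frame_new_side_data(frame,
                                                 AV_FRAME_DATA_SEI_UNREGISTERED,
                                                 16 + msg_len);
    if (!sd)
        return AVERROR(ENOMEM);

    memcpy(sd->data, uuid, 16);          /* UUID identifying the payload type */
    memcpy(sd->data + 16, msg, msg_len); /* message body */
    return 0;
}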
Re: [FFmpeg-devel] Request for review - x265 User Data Unregistered SEI patch
On Sunday, 11 July 2021 10:01:47 PM AEST Derek Buitenhuis wrote:
> Can you amend the commit message to contain the reasoning from [1]?

Amended.

> A quick review:
> > +void *sei_data;
> > +int sei_data_size;
>
> I don't see sei_data freed anywhere at the end of decoding?

Fixed in v2. Included in the _close().

> > if (pic) {
> >
> > +x265_sei *sei = &(x265pic.userSEI);
>
> Drop the paren for consistency with the rest of the codebase.

Fixed in v2.

> > +tmp = av_fast_realloc(ctx->sei_data,
> > +                      &ctx->sei_data_size,
> > +                      (sei->numPayloads + 1) * sizeof(x265_sei_payload));
>
> Convention in FFmpeg is to do sizeof(*var).

Fixed in v2.

> > +if (!tmp) {
> > +    av_freep(&x265pic.userData);
> > +    av_freep(&x265pic.quantOffsets);
> > +    return AVERROR(ENOMEM);
> > +} else {
>
> This else statement is not needed.

Fixed in v2.

> > +sei_payload = &(sei->payloads[sei->numPayloads]);
>
> Drop the paren.

Fixed in v2.

> > +sei_payload->payloadType = USER_DATA_UNREGISTERED;
>
> I'm surprised x265 has un-namespaced enums... gross.

I took Timo's suggestion in v2, although I conceptually wanted to say
"the x265 value". So there is a comment.

Brad
Re: [FFmpeg-devel] [PATCH v3 0/2] libx264 configure check clean-up
On Sat, Jul 10, 2021 at 8:55 PM James Almer wrote: > > On 7/10/2021 1:26 PM, Jan Ekström wrote: > > On Wed, Jul 7, 2021, 22:01 Jan Ekström wrote: > > > >> Changes compared to v2: > >> - Kept the CONFIG_LIBX264RGB_ENCODER define check for ff_libx264rgb_encoder > >>and the AVClass for libx264rgb. > >> - Removed the libx264rgb removal from this patch set since while I hoped I > >>would be getting the initial two fixups reviewed even if people would > >> oppose > >>the libx264rgb removal, so at least those could get in - that didn't > >> seem > >>to be happening. This way I hope people would be more likely to focus on > >>that bit at first. > >> > >> The patch set contains two improvements to the libx264rgb configure checks, > >> as I found out that for all the time I had been building FFmpeg with a > >> custom > >> prefix and utilizing pkg-config - it never got enabled due to the configure > >> check relying on the header being in the default include paths or in > >> extra-cflags. > >> > >> - The first change fixes libx264rgb enablement without having x264.h > >>in the system default include path, such as with custom prefixes. > >> > >> - The second change removes the separate X264_CSP_BGR check as x264.h > >>has this define unconditionally defined with the required X264_BUILD > >>118 or newer (it was added a few X264_BUILD versions before). > >> > >>This change was checked by bumping the require_cpp_condition > >>check to X264_BUILD >= 255 and checking with both pkg-config > >>as well as by not having PKG_CONFIG_PATH defined as well as > >>making the non-pkg-config check pass with > >>`--extra-cflags="-I/prefix/include" --extra-ldflags="-L/prefix/lib > >> -ldl"` > >>So the X264_BUILD check should properly fail the enablement in > >>case X264_BUILD is older than the one requested in the relevant > >>require_cpp_condition. > >> > >> Best regards, > >> Jan > >> > >> Jan Ekström (2): > >>configure: move x264_csp_bgr check under general libx264 checks > >>{configure,avcodec/libx264}: remove separate x264_csp_bgr check > >> > >> configure| 3 +-- > >> libavcodec/libx264.c | 2 -- > >> 2 files changed, 1 insertion(+), 4 deletions(-) > >> > >> -- > >> 2.31.1 > >> > > > > Ping on this patch set. > > > > These should be relatively straightforward changes that enable x264rgb when > > it is searched through pkg-config, and testable by installing x264 into a > > specific prefix and not having its headers in the default search path (but > > setting PKG_CONFIG_PATH accordingly). > > > > Jan > > Should be ok. > Thanks, applied as 25d28f297b755d3cb6a3e036a1e251148d0e4d5c and f32f56468c6caa03f4ebbf6cf58b2bb7bc775216 . > And removing libx264rgb altogether should be done at some point if you > can get rgb output with the normal wrapper. We purely limit it by the accessible color spaces defined for the AVCodec, so by removing that limitation you most definitely can get RGB output from the normal wrapper. I had that patch as part of v1/2 of this set. I will now post a separate patch for it. If there are some steps that have to be done for it, I'm fine with that (like enabling libx264 to handle RGB in a first commit, then marking libx264rgb for deprecation etc). But the only comments I got last time were: - "-c:v libx264rgb" no longer works. 
(well yes, the idea of the patch was to remove it) - the current libx264 default is more supported by non-libavcodec decoders (which generally with RGB input at the moment is and probably always was 4:4:4 YCbCr - which I am not sure if it is in any way or form a major difference against 4:4:4 + RGB color space metadata?). The original ticket against the backdrop of which libx264rgb was originally split was 100% talking about yuv420p as the default for HW support (https://trac.ffmpeg.org/ticket/658), which is not what the split is doing. Jan ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] FFMPEG for V4L2 M2M devices ?
On Sat, 10 Jul 2021 at 00:56, Brad Hards wrote:
>
> On Saturday, 10 July 2021 8:53:27 AM AEST Andrii wrote:
> > I am working on porting a Kodi player to an NVidia Jetson Nano device. I've
> > been developing a decoder for quite some time now, and realized that the
> > best approach would be to have it inside of ffmpeg, instead of embedding
> > the decoder into Kodi as it heavily relies on FFMPEG. Just wondering if
> > there is any effort in making FFMPEG support M2M V4L devices?
>
> https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c[1]
>
> I guess that would be the basis for further work as required to meet your
> needs.

Do note that there are 2 V4L2 M2M decoder APIs - the stateful API [1], and the
stateless API [2]. They differ in the amount of bitstream parsing and buffer
management that the driver implements vs expecting the client to do.

The *_v4l2m2m drivers within FFMPEG support the stateful API (i.e. the kernel
driver has bitstream parsing). For Raspberry Pi we use that to support the
(older) H264 implementation, and FFMPEG master does that very well.

The Pi HEVC decoder uses the V4L2 stateless API. Stateless HEVC support hasn't
been merged to the mainline kernel as yet, so there are downstream patches to
support that.

A quick Google implies that NVidia already has a stateful V4L2 M2M driver in
their vendor kernel. Other than the strange choice of device node name
(/dev/nvhost-nvdec), the details at [3] make it look like a normal V4L2 M2M
decoder that has a good chance of working against h264_v4l2m2m.

[1] https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-decoder.html
[2] https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-stateless-decoder.html
[3] https://docs.nvidia.com/jetson/l4t-multimedia/group__V4L2Dec.html

Dave

> Brad
>
> [1]
> https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c
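If someone wants to check whether the existing stateful wrapper already covers the
Jetson decoder, the only thing needed on the FFmpeg API side is selecting the M2M
decoder by name instead of letting FFmpeg pick the default software decoder (on the
command line this is simply -c:v h264_v4l2m2m). A minimal sketch, with the fallback
made explicit:

#include <libavcodec/avcodec.h>

/* Prefer the V4L2 M2M stateful H.264 decoder if it was compiled in,
 * otherwise fall back to the default decoder selection. */
static AVCodecContext *open_h264_decoder(void)
{
    const AVCodec *codec = avcodec_find_decoder_by_name("h264_v4l2m2m");
    AVCodecContext *avctx;

    if (!codec)
        codec = avcodec_find_decoder(AV_CODEC_ID_H264);
    if (!codec)
        return NULL;

    avctx = avcodec_alloc_context3(codec);
    if (!avctx)
        return NULL;

    if (avcodec_open2(avctx, codec, NULL) < 0) {
        avcodec_free_context(&avctx);
        return NULL;
    }
    return avctx;
}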
[FFmpeg-devel] [PATCH] cafenc: fill in avg. packet size later if unknown
> This doesn't move the pointer back to the file end if par->block_align is set.
> I think that's fine though, since the function writes the trailer, which should
> mean that nothing more needs to be written.

Is it a convention to leave the pointer positioned at the end of the file?
Re: [FFmpeg-devel] [PATCH 1/2] libavutil/cpu: Adds fast gather detection.
On Fri, Jun 25, 2021 at 1:24 PM Alan Kelly wrote: > On Fri, Jun 25, 2021 at 10:40 AM Lynne wrote: > >> Jun 25, 2021, 09:54 by alankelly-at-google@ffmpeg.org: >> >> > Broadwell and later and Zen3 and later have fast gather instructions. >> > --- >> > Gather requires between 9 and 12 cycles on Haswell, 5 to 7 on >> Broadwell, >> > and 2 to 5 on Skylake and newer. It is also slow on AMD before Zen 3. >> > libavutil/cpu.h | 2 ++ >> > libavutil/x86/cpu.c | 18 -- >> > libavutil/x86/cpu.h | 1 + >> > 3 files changed, 19 insertions(+), 2 deletions(-) >> > >> >> No, we really don't need more FAST/SLOW flags, especially for >> something like this which is just fixable by _not_using_vgather_. >> Take a look at libavutil/x86/tx_float.asm, we only use vgather >> if it's guaranteed to either be faster for what we're gathering or >> is just as fast "slow". If neither is true, we use manual lookups, >> which is actually advantageous since for AVX2 we can interleave >> the lookups that happen in each lane. >> >> Even if we disregard this, I've extensively benchmarked vgather >> on Zen 3, Zen 2, Cascade Lake and Skylake, and there's hardly >> a great vgather improvement to be found in Zen 3 to justify >> using a new CPU flag for this. >> ___ >> ffmpeg-devel mailing list >> ffmpeg-devel@ffmpeg.org >> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel >> >> To unsubscribe, visit link above, or email >> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". >> > > Thanks for your response. I'm not against finding a cleaner way of > enabling/disabling the code which will be protected by this flag. However, > the manual lookups solution proposed will not work in this case, the avx2 > version of hscale will only be faster if fast gathers are available, > otherwise, the ssse3 version should be used. > > I haven't got access to a Zen3 so I can't comment on the performance. I > have tested on a Zen 2 and it is slow. On Broadwell hscale avx2 is about > 10% faster than the ssse3 version and on Skylake about 40% faster, Haswell > has similar performance to Zen2. > > Is there a proxy which could be used for detecting Broadwell or Skylake > and later? AVX512 seems too strict as there are Skylake chips without > AVX512. Thanks > Hi, I will paste the performance figures from the thread for the other part of this patch here so that the justification for this flag is clearer: Skylake Haswell hscale_8_to_15_width4_ssse3 761.2 760 hscale_8_to_15_width4_avx2 468.7 957 hscale_8_to_15_width8_ssse3 1170.7 1032 hscale_8_to_15_width8_avx2 865.7 1979 hscale_8_to_15_width12_ssse3 2172.2 2472 hscale_8_to_15_width12_avx2 1245.7 2901 hscale_8_to_15_width16_ssse3 2244.2 2400 hscale_8_to_15_width16_avx2 1647.2 3681 As you can see, it is catastrophic on Haswell and older chips but the gains on Skylake are impressive. As I don't have performance figures for Zen 3, I can disable this feature on all cpus apart from Broadwell and later as you say that there is no worthwhile improvement on Zen3. Is this OK with you? Thanks ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
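For readers not familiar with the instruction being discussed: an AVX2 gather
(e.g. VPGATHERDD) loads eight 32-bit elements from arbitrary indices in one
instruction, which is what the proposed AVX2 hscale path relies on; per the cycle
counts quoted in this thread it is expensive on Haswell and pre-Zen3 AMD but cheap
on Skylake and newer. A rough intrinsics illustration of the two alternatives being
weighed (this is not the swscale code, just a sketch; build with -mavx2):

#include <immintrin.h>

/* Hardware gather: a single instruction, fast from Broadwell/Skylake on,
 * slow on Haswell and on AMD before Zen 3. */
static __m256i gather8(const int *base, __m256i idx)
{
    return _mm256_i32gather_epi32(base, idx, 4); /* scale: 4 bytes per element */
}

/* Manual lookup: eight scalar loads.  Not as fast as a good gather, but never
 * catastrophic, and the loads for the two 128-bit lanes can be interleaved. */
static __m256i gather8_manual(const int *base, const int idx[8])
{
    return _mm256_setr_epi32(base[idx[0]], base[idx[1]], base[idx[2]], base[idx[3]],
                             base[idx[4]], base[idx[5]], base[idx[6]], base[idx[7]]);
}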
Re: [FFmpeg-devel] [PATCH 24/24] lavfi/vf_scale: implement slice threading
Quoting Michael Niedermayer (2021-07-03 18:27:36)
> On Sat, Jul 03, 2021 at 03:27:36PM +0200, Anton Khirnov wrote:
> > Quoting Michael Niedermayer (2021-06-01 11:35:13)
> > > On Mon, May 31, 2021 at 09:55:15AM +0200, Anton Khirnov wrote:
> > > > ---
> > > >  libavfilter/vf_scale.c | 182 +++--
> > > >  1 file changed, 141 insertions(+), 41 deletions(-)
> > >
> > > breaks: (lower 50% is bright green)
> > > ./ffplay -i mm-short.mpg -an -vf "tinterlace,scale=720:576:interl=1"
> >
> > Fixed locally, but I'm wondering why interlaced scaling is not done by
> > default for interlaced videos.
>
> IIRC the flags were quite unreliable. If we have reliable knowledge about
> interlacing it certainly should be used automatically

You mean there is a significant amount of progressive content that is
flagged as interlaced?

--
Anton Khirnov
Re: [FFmpeg-devel] [PATCH 1/2] libavutil/cpu: Adds fast gather detection.
12 Jul 2021, 11:29 by alankelly-at-google@ffmpeg.org: > On Fri, Jun 25, 2021 at 1:24 PM Alan Kelly wrote: > >> On Fri, Jun 25, 2021 at 10:40 AM Lynne wrote: >> >>> Jun 25, 2021, 09:54 by alankelly-at-google@ffmpeg.org: >>> >>> > Broadwell and later and Zen3 and later have fast gather instructions. >>> > --- >>> > Gather requires between 9 and 12 cycles on Haswell, 5 to 7 on >>> Broadwell, >>> > and 2 to 5 on Skylake and newer. It is also slow on AMD before Zen 3. >>> > libavutil/cpu.h | 2 ++ >>> > libavutil/x86/cpu.c | 18 -- >>> > libavutil/x86/cpu.h | 1 + >>> > 3 files changed, 19 insertions(+), 2 deletions(-) >>> > >>> >>> No, we really don't need more FAST/SLOW flags, especially for >>> something like this which is just fixable by _not_using_vgather_. >>> Take a look at libavutil/x86/tx_float.asm, we only use vgather >>> if it's guaranteed to either be faster for what we're gathering or >>> is just as fast "slow". If neither is true, we use manual lookups, >>> which is actually advantageous since for AVX2 we can interleave >>> the lookups that happen in each lane. >>> >>> Even if we disregard this, I've extensively benchmarked vgather >>> on Zen 3, Zen 2, Cascade Lake and Skylake, and there's hardly >>> a great vgather improvement to be found in Zen 3 to justify >>> using a new CPU flag for this. >>> ___ >>> ffmpeg-devel mailing list >>> ffmpeg-devel@ffmpeg.org >>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel >>> >>> To unsubscribe, visit link above, or email >>> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". >>> >> >> Thanks for your response. I'm not against finding a cleaner way of >> enabling/disabling the code which will be protected by this flag. However, >> the manual lookups solution proposed will not work in this case, the avx2 >> version of hscale will only be faster if fast gathers are available, >> otherwise, the ssse3 version should be used. >> >> I haven't got access to a Zen3 so I can't comment on the performance. I >> have tested on a Zen 2 and it is slow. On Broadwell hscale avx2 is about >> 10% faster than the ssse3 version and on Skylake about 40% faster, Haswell >> has similar performance to Zen2. >> >> Is there a proxy which could be used for detecting Broadwell or Skylake >> and later? AVX512 seems too strict as there are Skylake chips without >> AVX512. Thanks >> > > Hi, > > I will paste the performance figures from the thread for the other part of > this patch here so that the justification for this flag is clearer: > > Skylake Haswell > hscale_8_to_15_width4_ssse3 761.2 760 > hscale_8_to_15_width4_avx2 468.7 957 > hscale_8_to_15_width8_ssse3 1170.7 1032 > hscale_8_to_15_width8_avx2 865.7 1979 > hscale_8_to_15_width12_ssse3 2172.2 2472 > hscale_8_to_15_width12_avx2 1245.7 2901 > hscale_8_to_15_width16_ssse3 2244.2 2400 > hscale_8_to_15_width16_avx2 1647.2 3681 > > As you can see, it is catastrophic on Haswell and older chips but the gains > on Skylake are impressive. > As I don't have performance figures for Zen 3, I can disable this feature > on all cpus apart from Broadwell and later as you say that there is no > worthwhile improvement on Zen3. Is this OK with you? > It's not that catastrophic. Since Haswell CPUs generally don't have large AVX2 gains, could you just exclude Haswell only from EXTERNAL_AVX2_FAST, and require EXTERNAL_AVX2_FAST to enable those functions? 
Re: [FFmpeg-devel] [PATCH] mxfdec.c: fixed frame wrapping detection for MXFGCP1FrameWrappedPicture essence container
sön 2021-07-11 klockan 09:47 -0700 skrev p...@sandflow.com: > From: Pierre-Anthony Lemieux > > Signed-off-by: Pierre-Anthony Lemieux > --- > > Notes: > For JPEG 2000 essence, the MXF input format module currently uses the > value of byte 14 of the essence container UL to determines whether the J2K > essence is clip- (byte 14 is 0x02) > or frame-wrapped (byte 14 is 0x01). This approach does work when the > essence container UL is equal to MXFGCP1FrameWrappedPicture, in which case > the essence is always frame-wrapped. > > libavformat/mxf.h| 3 ++- > libavformat/mxfdec.c | 4 > 2 files changed, 6 insertions(+), 1 deletion(-) > > diff --git a/libavformat/mxf.h b/libavformat/mxf.h > index b1b1fedac7..ca510f5a2f 100644 > --- a/libavformat/mxf.h > +++ b/libavformat/mxf.h > @@ -75,7 +75,8 @@ typedef enum { > NormalWrap = 0, > D10D11Wrap, > RawAWrap, > -RawVWrap > +RawVWrap, > +AlwaysFrameWrap > } MXFWrappingIndicatorType; > > typedef struct MXFLocalTagPair { > diff --git a/libavformat/mxfdec.c b/libavformat/mxfdec.c > index 3bf480a3a6..7024d2ea7d 100644 > --- a/libavformat/mxfdec.c > +++ b/libavformat/mxfdec.c > @@ -1413,6 +1413,7 @@ static void *mxf_resolve_strong_ref(MXFContext *mxf, > UID *strong_ref, enum MXFMe > > static const MXFCodecUL mxf_picture_essence_container_uls[] = { > // video essence container uls > +{ { > 0x06,0x0e,0x2b,0x34,0x04,0x01,0x01,0x07,0x0d,0x01,0x03,0x01,0x02,0x0c,0x06,0x00 > }, 15, AV_CODEC_ID_JPEG2000, NULL, 16, AlwaysFrameWrap }, /* MXF-GC P1 > Frame-Wrapped JPEG 2000 */ > { { > 0x06,0x0e,0x2b,0x34,0x04,0x01,0x01,0x07,0x0d,0x01,0x03,0x01,0x02,0x0c,0x01,0x00 > }, 14, AV_CODEC_ID_JPEG2000, NULL, 14 }, > { { > 0x06,0x0e,0x2b,0x34,0x04,0x01,0x01,0x02,0x0d,0x01,0x03,0x01,0x02,0x10,0x60,0x01 > }, 14, AV_CODEC_ID_H264, NULL, 15 }, /* H.264 */ > { { > 0x06,0x0e,0x2b,0x34,0x04,0x01,0x01,0x02,0x0d,0x01,0x03,0x01,0x02,0x11,0x01,0x00 > }, 14, AV_CODEC_ID_DNXHD, NULL, 14 }, /* VC-3 */ > @@ -1497,6 +1498,9 @@ static MXFWrappingScheme mxf_get_wrapping_kind(UID > *essence_container_ul) > if (val == 0x02) > val = 0x01; > break; > +case AlwaysFrameWrap: > +val = 0x01; > +break; Looks OK. Still passes FATE. /Tomas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-devel] [RFC/PATCH v2] swscale slice threading
Hi, here is a new iteration of $subj. Compared to the first version, threading has been moved into sws using lavu slicethread. There is also a new AVFrame-based API that allows submitting and receiving partial slices (at least API-wise, the implementation will still wait for complete input). There is still no way for the caller to know how much input is required for a given amount of output, but that may be implemented in a new function in the future, if there is a use case for it. The set still needs some polishing, but I am sending it now to see if the general shape of the API is now acceptable. Please comment. -- Anton Khirnov ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
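To make the shape of the proposed API concrete: the simplest entry point added by
the set is a whole-frame scaling call, with the slice submission/reception calls
layered on the same context (see patches 5-7). A usage sketch of the frame-based
call, assuming the sws_scale_frame() name and signature used later in this series;
error reporting is abbreviated:

#include <libavutil/frame.h>
#include <libswscale/swscale.h>

static AVFrame *scale_to(struct SwsContext *sws, const AVFrame *src,
                         int dst_w, int dst_h, enum AVPixelFormat dst_fmt)
{
    AVFrame *dst = av_frame_alloc();
    if (!dst)
        return NULL;

    dst->width  = dst_w;
    dst->height = dst_h;
    dst->format = dst_fmt;

    if (av_frame_get_buffer(dst, 0) < 0 ||
        sws_scale_frame(sws, dst, src) < 0) {
        av_frame_free(&dst);
        return NULL;
    }
    return dst;
}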
[FFmpeg-devel] [PATCH 2/8] FATE: add a test for sliced scaling
--- Makefile | 2 + tests/Makefile| 1 + tests/fate/libswscale.mak | 11 +++ tools/Makefile| 3 +- tools/scale_slice_test.c | 190 ++ 5 files changed, 206 insertions(+), 1 deletion(-) create mode 100644 tools/scale_slice_test.c diff --git a/Makefile b/Makefile index 1e3da6271b..26c9107237 100644 --- a/Makefile +++ b/Makefile @@ -64,6 +64,8 @@ tools/target_io_dem_fuzzer$(EXESUF): tools/target_io_dem_fuzzer.o $(FF_DEP_LIBS) tools/enum_options$(EXESUF): ELIBS = $(FF_EXTRALIBS) tools/enum_options$(EXESUF): $(FF_DEP_LIBS) +tools/scale_slice_test$(EXESUF): $(FF_DEP_LIBS) +tools/scale_slice_test$(EXESUF): ELIBS = $(FF_EXTRALIBS) tools/sofa2wavs$(EXESUF): ELIBS = $(FF_EXTRALIBS) tools/uncoded_frame$(EXESUF): $(FF_DEP_LIBS) tools/uncoded_frame$(EXESUF): ELIBS = $(FF_EXTRALIBS) diff --git a/tests/Makefile b/tests/Makefile index d726484b3a..e42e66d81b 100644 --- a/tests/Makefile +++ b/tests/Makefile @@ -221,6 +221,7 @@ $(FATE_FFPROBE) $(FATE_FFMPEG_FFPROBE) $(FATE_SAMPLES_FFPROBE) $(FATE_SAMPLES_FF $(FATE_SAMPLES_FASTSTART): tools/qt-faststart$(EXESUF) $(FATE_SAMPLES_DUMP_DATA): tools/venc_data_dump$(EXESUF) +$(FATE_SAMPLES_SCALE_SLICE): tools/scale_slice_test$(EXESUF) ifdef SAMPLES FATE += $(FATE_EXTERN) diff --git a/tests/fate/libswscale.mak b/tests/fate/libswscale.mak index 5ec5f34cc4..599d27b0a5 100644 --- a/tests/fate/libswscale.mak +++ b/tests/fate/libswscale.mak @@ -6,6 +6,17 @@ FATE_LIBSWSCALE += fate-sws-floatimg-cmp fate-sws-floatimg-cmp: libswscale/tests/floatimg_cmp$(EXESUF) fate-sws-floatimg-cmp: CMD = run libswscale/tests/floatimg_cmp$(EXESUF) +SWS_SLICE_TEST-$(call DEMDEC, MATROSKA, VP9) += fate-sws-slice-yuv422-12bit-rgb48 +fate-sws-slice-yuv422-12bit-rgb48: CMD = run tools/scale_slice_test$(EXESUF) $(TARGET_SAMPLES)/vp9-test-vectors/vp93-2-20-12bit-yuv422.webm 150 100 rgb48 + +SWS_SLICE_TEST-$(call DEMDEC, IMAGE_BMP_PIPE, BMP) += fate-sws-slice-bgr0-nv12 +fate-sws-slice-bgr0-nv12: CMD = run tools/scale_slice_test$(EXESUF) $(TARGET_SAMPLES)/bmp/test32bf.bmp 32 64 nv12 + +fate-sws-slice: $(SWS_SLICE_TEST-yes) +$(SWS_SLICE_TEST-yes): tools/scale_slice_test$(EXESUF) +$(SWS_SLICE_TEST-yes): REF = /dev/null +FATE_LIBSWSCALE += $(SWS_SLICE_TEST-yes) + FATE_LIBSWSCALE += $(FATE_LIBSWSCALE-yes) FATE-$(CONFIG_SWSCALE) += $(FATE_LIBSWSCALE) fate-libswscale: $(FATE_LIBSWSCALE) diff --git a/tools/Makefile b/tools/Makefile index ec260f254e..f4d1327b9f 100644 --- a/tools/Makefile +++ b/tools/Makefile @@ -1,4 +1,4 @@ -TOOLS = enum_options qt-faststart trasher uncoded_frame +TOOLS = enum_options qt-faststart scale_slice_test trasher uncoded_frame TOOLS-$(CONFIG_LIBMYSOFA) += sofa2wavs TOOLS-$(CONFIG_ZLIB) += cws2fws @@ -18,6 +18,7 @@ tools/target_io_dem_fuzzer.o: tools/target_dem_fuzzer.c $(COMPILE_C) -DIO_FLAT=0 tools/venc_data_dump$(EXESUF): tools/decode_simple.o +tools/scale_slice_test$(EXESUF): tools/decode_simple.o OUTDIRS += tools diff --git a/tools/scale_slice_test.c b/tools/scale_slice_test.c new file mode 100644 index 00..d869eaae74 --- /dev/null +++ b/tools/scale_slice_test.c @@ -0,0 +1,190 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include +#include +#include + +#include "decode_simple.h" + +#include "libavutil/common.h" +#include "libavutil/pixdesc.h" +#include "libavutil/error.h" +#include "libavutil/lfg.h" +#include "libavutil/random_seed.h" +#include "libavutil/video_enc_params.h" + +#include "libavformat/avformat.h" + +#include "libavcodec/avcodec.h" + +#include "libswscale/swscale.h" + +typedef struct PrivData { +unsigned int random_seed; +AVLFGlfg; + +struct SwsContext *scaler; + +int v_shift_dst, h_shift_dst; +int v_shift_src, h_shift_src; + +AVFrame *frame_ref; +AVFrame *frame_dst; +} PrivData; + +static int process_frame(DecodeContext *dc, AVFrame *frame) +{ +PrivData *pd = dc->opaque; +int slice_start = 0; +int ret; + +if (!frame) +return 0; + +if (!pd->scaler) { +pd->scaler = sws_getContext(frame->width, frame->height, frame->format, +pd->frame_ref->width, pd->frame_ref->height, +
[FFmpeg-devel] [PATCH 3/8] lavfi/vf_scale: remove the nb_slices option
It was intended for debugging only and has been superseded by the standalone tool for testing sliced scaling. --- libavfilter/vf_scale.c | 14 -- 1 file changed, 14 deletions(-) diff --git a/libavfilter/vf_scale.c b/libavfilter/vf_scale.c index 71d7fa2890..39ab3a4b28 100644 --- a/libavfilter/vf_scale.c +++ b/libavfilter/vf_scale.c @@ -149,8 +149,6 @@ typedef struct ScaleContext { int force_original_aspect_ratio; int force_divisible_by; -int nb_slices; - int eval_mode; ///< expression evaluation mode } ScaleContext; @@ -794,17 +792,6 @@ scale: ret = scale_slice(scale, out, in, scale->isws[0], 0, (link->h+1)/2, 2, 0); if (ret >= 0) ret = scale_slice(scale, out, in, scale->isws[1], 0, link->h /2, 2, 1); -} else if (scale->nb_slices) { -int i, slice_h, slice_start, slice_end = 0; -const int nb_slices = FFMIN(scale->nb_slices, link->h); -for (i = 0; i < nb_slices; i++) { -slice_start = slice_end; -slice_end = (link->h * (i+1)) / nb_slices; -slice_h = slice_end - slice_start; -ret = scale_slice(scale, out, in, scale->sws, slice_start, slice_h, 1, 0); -if (ret < 0) -break; -} } else { ret = scale_slice(scale, out, in, scale->sws, 0, link->h, 1, 0); } @@ -936,7 +923,6 @@ static const AVOption scale_options[] = { { "force_divisible_by", "enforce that the output resolution is divisible by a defined integer when force_original_aspect_ratio is used", OFFSET(force_divisible_by), AV_OPT_TYPE_INT, { .i64 = 1}, 1, 256, FLAGS }, { "param0", "Scaler param 0", OFFSET(param[0]), AV_OPT_TYPE_DOUBLE, { .dbl = SWS_PARAM_DEFAULT }, INT_MIN, INT_MAX, FLAGS }, { "param1", "Scaler param 1", OFFSET(param[1]), AV_OPT_TYPE_DOUBLE, { .dbl = SWS_PARAM_DEFAULT }, INT_MIN, INT_MAX, FLAGS }, -{ "nb_slices", "set the number of slices (debug purpose only)", OFFSET(nb_slices), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS }, { "eval", "specify when to evaluate expressions", OFFSET(eval_mode), AV_OPT_TYPE_INT, {.i64 = EVAL_MODE_INIT}, 0, EVAL_MODE_NB-1, FLAGS, "eval" }, { "init", "eval expressions once during initialization", 0, AV_OPT_TYPE_CONST, {.i64=EVAL_MODE_INIT}, .flags = FLAGS, .unit = "eval" }, { "frame", "eval expressions during initialization and per-frame", 0, AV_OPT_TYPE_CONST, {.i64=EVAL_MODE_FRAME}, .flags = FLAGS, .unit = "eval" }, -- 2.30.2 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-devel] [PATCH 4/8] lavu/slicethread: return ENOSYS rather than EINVAL in the dummy func
EINVAL is the wrong error code here, since the arguments passed to the function are valid. The error is that the function is not implemented in the build, which corresponds to ENOSYS. --- libavutil/slicethread.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libavutil/slicethread.c b/libavutil/slicethread.c index dfbe551ef2..fd2145040d 100644 --- a/libavutil/slicethread.c +++ b/libavutil/slicethread.c @@ -239,7 +239,7 @@ int avpriv_slicethread_create(AVSliceThread **pctx, void *priv, int nb_threads) { *pctx = NULL; -return AVERROR(EINVAL); +return AVERROR(ENOSYS); } void avpriv_slicethread_execute(AVSliceThread *ctx, int nb_jobs, int execute_main) -- 2.30.2 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
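The distinction matters to callers: returning ENOSYS lets a user of
avpriv_slicethread_create() tell "slice threading not compiled in" apart from a
real failure and quietly fall back to single-threaded operation. A sketch of that
calling pattern (the worker callback is defined elsewhere; the argument order is
assumed to follow the existing callers of this private API):

ret = avpriv_slicethread_create(&ctx->slicethread, ctx, worker, NULL, nb_threads);
if (ret == AVERROR(ENOSYS)) {
    /* Threading support is not compiled in: continue single-threaded
     * instead of treating this as a fatal error. */
    ctx->slicethread = NULL;
    nb_threads = 1;
} else if (ret < 0) {
    return ret; /* genuine error, e.g. ENOMEM */
}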
[FFmpeg-devel] [PATCH 1/8] tools/venc_data_dump: factor out demux/decode code
It can be shared with other simple demux/decode tools. --- tests/ref/fate/source | 1 + tools/Makefile | 2 + tools/decode_simple.c | 157 + tools/decode_simple.h | 53 ++ tools/venc_data_dump.c | 156 +--- 5 files changed, 248 insertions(+), 121 deletions(-) create mode 100644 tools/decode_simple.c create mode 100644 tools/decode_simple.h diff --git a/tests/ref/fate/source b/tests/ref/fate/source index c64bc05241..69dcdc4f27 100644 --- a/tests/ref/fate/source +++ b/tests/ref/fate/source @@ -20,5 +20,6 @@ Headers without standard inclusion guards: compat/djgpp/math.h compat/float/float.h compat/float/limits.h +tools/decode_simple.h Use of av_clip() where av_clip_uintp2() could be used: Use of av_clip() where av_clip_intp2() could be used: diff --git a/tools/Makefile b/tools/Makefile index 82baa8eadb..ec260f254e 100644 --- a/tools/Makefile +++ b/tools/Makefile @@ -17,6 +17,8 @@ tools/target_dem_fuzzer.o: tools/target_dem_fuzzer.c tools/target_io_dem_fuzzer.o: tools/target_dem_fuzzer.c $(COMPILE_C) -DIO_FLAT=0 +tools/venc_data_dump$(EXESUF): tools/decode_simple.o + OUTDIRS += tools clean:: diff --git a/tools/decode_simple.c b/tools/decode_simple.c new file mode 100644 index 00..b679fd7ce6 --- /dev/null +++ b/tools/decode_simple.c @@ -0,0 +1,157 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* shared code for simple demux/decode tools */ + +#include +#include + +#include "decode_simple.h" + +#include "libavformat/avformat.h" + +#include "libavcodec/avcodec.h" +#include "libavcodec/packet.h" + +#include "libavutil/dict.h" +#include "libavutil/error.h" +#include "libavutil/frame.h" + +static int decode_read(DecodeContext *dc, int flush) +{ +const int ret_done = flush ? AVERROR_EOF : AVERROR(EAGAIN); +int ret = 0; + +while (ret >= 0 && + (dc->max_frames == 0 || dc->decoder->frame_number < dc->max_frames)) { +ret = avcodec_receive_frame(dc->decoder, dc->frame); +if (ret < 0) { +if (ret == AVERROR_EOF) { +int err = dc->process_frame(dc, NULL); +if (err < 0) +return err; +} + +return (ret == ret_done) ? 0 : ret; +} + +ret = dc->process_frame(dc, dc->frame); +av_frame_unref(dc->frame); +if (ret < 0) +return ret; + +if (dc->max_frames && dc->decoder->frame_number == dc->max_frames) +return 1; +} + +return (dc->max_frames == 0 || dc->decoder->frame_number < dc->max_frames) ? 
0 : 1; +} + +int ds_run(DecodeContext *dc) +{ +int ret; + +ret = avcodec_open2(dc->decoder, NULL, &dc->decoder_opts); +if (ret < 0) +return ret; + +while (ret >= 0) { +ret = av_read_frame(dc->demuxer, dc->pkt); +if (ret < 0) +goto flush; +if (dc->pkt->stream_index != dc->stream->index) { +av_packet_unref(dc->pkt); +continue; +} + +ret = avcodec_send_packet(dc->decoder, dc->pkt); +if (ret < 0) { +fprintf(stderr, "Error decoding: %d\n", ret); +return ret; +} +av_packet_unref(dc->pkt); + +ret = decode_read(dc, 0); +if (ret < 0) { +fprintf(stderr, "Error decoding: %d\n", ret); +return ret; +} else if (ret > 0) +return 0; +} + +flush: +avcodec_send_packet(dc->decoder, NULL); +ret = decode_read(dc, 1); +if (ret < 0) { +fprintf(stderr, "Error flushing: %d\n", ret); +return ret; +} + +return 0; +} + +void ds_free(DecodeContext *dc) +{ +av_dict_free(&dc->decoder_opts); + +av_frame_free(&dc->frame); +av_packet_free(&dc->pkt); + +avcodec_free_context(&dc->decoder); +avformat_close_input(&dc->demuxer); +} + +int ds_open(DecodeContext *dc, const char *url, int stream_idx) +{ +const AVCodec *codec; +int ret; + +memset(dc, 0, sizeof(*dc)); + +dc->pkt = av_packet_alloc(); +dc->frame = av_frame_alloc(); +if (!dc->pkt || !dc->frame) { +ret = AVERROR(ENOMEM); +goto fail; +} + +ret = avformat_open_
[FFmpeg-devel] [PATCH 7/8] sws: implement slice threading
--- libswscale/options.c | 3 ++ libswscale/swscale.c | 56 libswscale/swscale_internal.h | 14 ++ libswscale/utils.c| 82 +++ 4 files changed, 155 insertions(+) diff --git a/libswscale/options.c b/libswscale/options.c index 7eb2752543..4b71a23e37 100644 --- a/libswscale/options.c +++ b/libswscale/options.c @@ -81,6 +81,9 @@ static const AVOption swscale_options[] = { { "uniform_color", "blend onto a uniform color",0, AV_OPT_TYPE_CONST, { .i64 = SWS_ALPHA_BLEND_UNIFORM},INT_MIN, INT_MAX, VE, "alphablend" }, { "checkerboard","blend onto a checkerboard", 0, AV_OPT_TYPE_CONST, { .i64 = SWS_ALPHA_BLEND_CHECKERBOARD},INT_MIN, INT_MAX, VE, "alphablend" }, +{ "threads", "number of threads", OFFSET(nb_threads), AV_OPT_TYPE_INT, {.i64 = 1 }, 0, INT_MAX, VE, "threads" }, +{ "auto",NULL,0, AV_OPT_TYPE_CONST, {.i64 = 0 },.flags = VE, "threads" }, + { NULL } }; diff --git a/libswscale/swscale.c b/libswscale/swscale.c index 8b32ce5a40..ee57684675 100644 --- a/libswscale/swscale.c +++ b/libswscale/swscale.c @@ -1115,6 +1115,27 @@ int sws_receive_slice(struct SwsContext *c, unsigned int slice_start, c->src_ranges.ranges[0].len == c->srcH)) return AVERROR(EAGAIN); +if (c->slicethread) { +int nb_jobs = c->slice_ctx[0]->dither == SWS_DITHER_ED ? 1 : c->nb_slice_ctx; +int ret = 0; + +c->dst_slice_start = slice_start; +c->dst_slice_height = slice_height; + +avpriv_slicethread_execute(c->slicethread, nb_jobs, 0); + +for (int i = 0; i < c->nb_slice_ctx; i++) { +if (c->slice_err[i] < 0) { +ret = c->slice_err[i]; +break; +} +} + +memset(c->slice_err, 0, c->nb_slice_ctx * sizeof(*c->slice_err)); + +return ret; +} + for (int i = 0; i < FF_ARRAY_ELEMS(dst) && c->frame_dst->data[i]; i++) { dst[i] = c->frame_dst->data[i] + c->frame_dst->linesize[i] * (slice_start >> c->chrDstVSubSample); @@ -1152,6 +1173,41 @@ int attribute_align_arg sws_scale(struct SwsContext *c, int srcSliceH, uint8_t *const dst[], const int dstStride[]) { +if (c->nb_slice_ctx) +c = c->slice_ctx[0]; + return scale_internal(c, srcSlice, srcStride, srcSliceY, srcSliceH, dst, dstStride, 0, c->dstH); } + +void ff_sws_slice_worker(void *priv, int jobnr, int threadnr, + int nb_jobs, int nb_threads) +{ +SwsContext *parent = priv; +SwsContext *c = parent->slice_ctx[threadnr]; + +const int slice_height = FFALIGN(FFMAX((parent->dst_slice_height + nb_jobs - 1) / nb_jobs, 1), + 1 << c->chrDstVSubSample); +const int slice_start = jobnr * slice_height; +const int slice_end= FFMIN((jobnr + 1) * slice_height, parent->dst_slice_height); +int err = 0; + +if (slice_end > slice_start) { +uint8_t *dst[4] = { NULL }; + +for (int i = 0; i < FF_ARRAY_ELEMS(dst) && parent->frame_dst->data[i]; i++) { +const int vshift = (i == 1 || i == 2) ? 
c->chrDstVSubSample : 0; +const ptrdiff_t offset = parent->frame_dst->linesize[i] * +((slice_start + parent->dst_slice_start) >> vshift); + +dst[i] = parent->frame_dst->data[i] + offset; +} + +err = scale_internal(c, (const uint8_t * const *)parent->frame_src->data, + parent->frame_src->linesize, 0, c->srcH, + dst, parent->frame_dst->linesize, + parent->dst_slice_start + slice_start, slice_end - slice_start); +} + +parent->slice_err[threadnr] = err; +} diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h index c1098d6026..6ca44b710e 100644 --- a/libswscale/swscale_internal.h +++ b/libswscale/swscale_internal.h @@ -33,6 +33,7 @@ #include "libavutil/mem_internal.h" #include "libavutil/pixfmt.h" #include "libavutil/pixdesc.h" +#include "libavutil/slicethread.h" #include "libavutil/ppc/util_altivec.h" #define STR(s) AV_TOSTRING(s) // AV_STRINGIFY is too long @@ -300,6 +301,15 @@ typedef struct SwsContext { */ const AVClass *av_class; +AVSliceThread *slicethread; +struct SwsContext **slice_ctx; +int*slice_err; +int nb_slice_ctx; + +// values passed to current sws_receive_slice() call +unsigned int dst_slice_start; +unsigned int dst_slice_height; + /** * Note that src, dst, srcStride, dstStride will be copied in the * sws_scale() wrapper so they can be freely modified here. @@ -325,6 +335,7 @@ typedef
[FFmpeg-devel] [PATCH 6/8] lavfi/vf_scale: convert to the frame-based sws API
--- libavfilter/vf_scale.c | 73 -- 1 file changed, 49 insertions(+), 24 deletions(-) diff --git a/libavfilter/vf_scale.c b/libavfilter/vf_scale.c index 39ab3a4b28..cdff3ab7ed 100644 --- a/libavfilter/vf_scale.c +++ b/libavfilter/vf_scale.c @@ -620,29 +620,54 @@ static int request_frame_ref(AVFilterLink *outlink) return ff_request_frame(outlink->src->inputs[1]); } -static int scale_slice(ScaleContext *scale, AVFrame *out_buf, AVFrame *cur_pic, struct SwsContext *sws, int y, int h, int mul, int field) +static void frame_offset(AVFrame *frame, int dir, int is_pal) { -const uint8_t *in[4]; -uint8_t *out[4]; -int in_stride[4],out_stride[4]; -int i; - -for (i=0; i<4; i++) { -int vsub= ((i+1)&2) ? scale->vsub : 0; -ptrdiff_t in_offset = ((y>>vsub)+field) * cur_pic->linesize[i]; -ptrdiff_t out_offset =field * out_buf->linesize[i]; - in_stride[i] = cur_pic->linesize[i] * mul; -out_stride[i] = out_buf->linesize[i] * mul; - in[i] = FF_PTR_ADD(cur_pic->data[i], in_offset); -out[i] = FF_PTR_ADD(out_buf->data[i], out_offset); -} -if (scale->input_is_pal) - in[1] = cur_pic->data[1]; -if (scale->output_is_pal) -out[1] = out_buf->data[1]; +for (int i = 0; i < 4 && frame->data[i]; i++) { +if (i == 1 && is_pal) +break; +frame->data[i] += frame->linesize[i] * dir; +} +} + +static int scale_field(ScaleContext *scale, AVFrame *dst, AVFrame *src, + int field) +{ +int orig_h_src = src->height; +int orig_h_dst = dst->height; +int ret; + +// offset the data pointers for the bottom field +if (field) { +frame_offset(src, 1, scale->input_is_pal); +frame_offset(dst, 1, scale->output_is_pal); +} + +// take every second line +for (int i = 0; i < 4; i++) { +src->linesize[i] *= 2; +dst->linesize[i] *= 2; +} +src->height /= 2; +dst->height /= 2; -return sws_scale(sws, in, in_stride, y/mul, h, - out,out_stride); +ret = sws_scale_frame(scale->isws[field], dst, src); +if (ret < 0) +return ret; + +// undo the changes we made above +for (int i = 0; i < 4; i++) { +src->linesize[i] /= 2; +dst->linesize[i] /= 2; +} +src->height = orig_h_src; +dst->height = orig_h_dst; + +if (field) { +frame_offset(src, -1, scale->input_is_pal); +frame_offset(dst, -1, scale->output_is_pal); +} + +return 0; } static int scale_frame(AVFilterLink *link, AVFrame *in, AVFrame **frame_out) @@ -789,11 +814,11 @@ scale: INT_MAX); if (scale->interlaced>0 || (scale->interlaced<0 && in->interlaced_frame)) { -ret = scale_slice(scale, out, in, scale->isws[0], 0, (link->h+1)/2, 2, 0); +ret = scale_field(scale, out, in, 0); if (ret >= 0) -ret = scale_slice(scale, out, in, scale->isws[1], 0, link->h /2, 2, 1); +ret = scale_field(scale, out, in, 1); } else { -ret = scale_slice(scale, out, in, scale->sws, 0, link->h, 1, 0); +ret = sws_scale_frame(scale->sws, out, in); } av_frame_free(&in); -- 2.30.2 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-devel] [PATCH 8/8] lavfi/vf_scale: pass the thread count to the scaler
--- libavfilter/vf_scale.c | 1 + 1 file changed, 1 insertion(+) diff --git a/libavfilter/vf_scale.c b/libavfilter/vf_scale.c index cdff3ab7ed..f676f5d82e 100644 --- a/libavfilter/vf_scale.c +++ b/libavfilter/vf_scale.c @@ -543,6 +543,7 @@ static int config_props(AVFilterLink *outlink) av_opt_set_int(*s, "sws_flags", scale->flags, 0); av_opt_set_int(*s, "param0", scale->param[0], 0); av_opt_set_int(*s, "param1", scale->param[1], 0); +av_opt_set_int(*s, "threads", ff_filter_get_nb_threads(ctx), 0); if (scale->in_range != AVCOL_RANGE_UNSPECIFIED) av_opt_set_int(*s, "src_range", scale->in_range == AVCOL_RANGE_JPEG, 0); -- 2.30.2 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
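For library users, requesting the new threaded scaling once this set lands is a
matter of setting the "threads" option added in patch 7/8 before initializing the
context (0 selects "auto"); whether threading actually kicks in is then decided by
the swscale implementation. A sketch using the public AVOption names that vf_scale
itself sets (option names as in the quoted patches):

#include <libavutil/opt.h>
#include <libavutil/pixfmt.h>
#include <libswscale/swscale.h>

static struct SwsContext *alloc_threaded_scaler(int src_w, int src_h,
                                                enum AVPixelFormat src_fmt,
                                                int dst_w, int dst_h,
                                                enum AVPixelFormat dst_fmt)
{
    struct SwsContext *sws = sws_alloc_context();
    if (!sws)
        return NULL;

    av_opt_set_int(sws, "srcw",       src_w,        0);
    av_opt_set_int(sws, "srch",       src_h,        0);
    av_opt_set_int(sws, "src_format", src_fmt,      0);
    av_opt_set_int(sws, "dstw",       dst_w,        0);
    av_opt_set_int(sws, "dsth",       dst_h,        0);
    av_opt_set_int(sws, "dst_format", dst_fmt,      0);
    av_opt_set_int(sws, "sws_flags",  SWS_BILINEAR, 0);
    av_opt_set_int(sws, "threads",    0,            0); /* 0 = auto */

    if (sws_init_context(sws, NULL, NULL) < 0) {
        sws_freeContext(sws);
        return NULL;
    }
    return sws;
}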
[FFmpeg-devel] [PATCH 5/8] sws: add a new scaling API
--- libswscale/swscale.c | 263 ++ libswscale/swscale.h | 80 +++ libswscale/swscale_internal.h | 19 +++ libswscale/utils.c| 70 + 4 files changed, 374 insertions(+), 58 deletions(-) diff --git a/libswscale/swscale.c b/libswscale/swscale.c index 61dfcb4dff..8b32ce5a40 100644 --- a/libswscale/swscale.c +++ b/libswscale/swscale.c @@ -236,13 +236,16 @@ static void lumRangeFromJpeg16_c(int16_t *_dst, int width) av_log(c, AV_LOG_DEBUG, __VA_ARGS__) static int swscale(SwsContext *c, const uint8_t *src[], - int srcStride[], int srcSliceY, - int srcSliceH, uint8_t *dst[], int dstStride[]) + int srcStride[], int srcSliceY, int srcSliceH, + uint8_t *dst[], int dstStride[], + int dstSliceY, int dstSliceH) { +const int scale_dst = dstSliceY > 0 || dstSliceH < c->dstH; + /* load a few things into local vars to make the code more readable? * and faster */ const int dstW = c->dstW; -const int dstH = c->dstH; +int dstH = c->dstH; const enum AVPixelFormat dstFormat = c->dstFormat; const int flags = c->flags; @@ -331,10 +334,15 @@ static int swscale(SwsContext *c, const uint8_t *src[], } } -/* Note the user might start scaling the picture in the middle so this - * will not get executed. This is not really intended but works - * currently, so people might do it. */ -if (srcSliceY == 0) { +if (scale_dst) { +dstY = dstSliceY; +dstH = dstY + dstSliceH; +lastInLumBuf = -1; +lastInChrBuf = -1; +} else if (srcSliceY == 0) { +/* Note the user might start scaling the picture in the middle so this + * will not get executed. This is not really intended but works + * currently, so people might do it. */ dstY = 0; lastInLumBuf = -1; lastInChrBuf = -1; @@ -352,8 +360,8 @@ static int swscale(SwsContext *c, const uint8_t *src[], srcSliceY, srcSliceH, chrSrcSliceY, chrSrcSliceH, 1); ff_init_slice_from_src(vout_slice, (uint8_t**)dst, dstStride, c->dstW, -dstY, dstH, dstY >> c->chrDstVSubSample, -AV_CEIL_RSHIFT(dstH, c->chrDstVSubSample), 0); +dstY, dstSliceH, dstY >> c->chrDstVSubSample, +AV_CEIL_RSHIFT(dstSliceH, c->chrDstVSubSample), scale_dst); if (srcSliceY == 0) { hout_slice->plane[0].sliceY = lastInLumBuf + 1; hout_slice->plane[1].sliceY = lastInChrBuf + 1; @@ -373,7 +381,7 @@ static int swscale(SwsContext *c, const uint8_t *src[], // First line needed as input const int firstLumSrcY = FFMAX(1 - vLumFilterSize, vLumFilterPos[dstY]); -const int firstLumSrcY2 = FFMAX(1 - vLumFilterSize, vLumFilterPos[FFMIN(dstY | ((1 << c->chrDstVSubSample) - 1), dstH - 1)]); +const int firstLumSrcY2 = FFMAX(1 - vLumFilterSize, vLumFilterPos[FFMIN(dstY | ((1 << c->chrDstVSubSample) - 1), c->dstH - 1)]); // First line needed as input const int firstChrSrcY = FFMAX(1 - vChrFilterSize, vChrFilterPos[chrDstY]); @@ -477,7 +485,7 @@ static int swscale(SwsContext *c, const uint8_t *src[], c->chrDither8 = ff_dither_8x8_128[chrDstY & 7]; c->lumDither8 = ff_dither_8x8_128[dstY& 7]; } -if (dstY >= dstH - 2) { +if (dstY >= c->dstH - 2) { /* hmm looks like we can't use MMX here without overwriting * this array's tail */ ff_sws_init_output_funcs(c, &yuv2plane1, &yuv2planeX, &yuv2nv12cX, @@ -491,21 +499,22 @@ static int swscale(SwsContext *c, const uint8_t *src[], desc[i].process(c, &desc[i], dstY, 1); } if (isPlanar(dstFormat) && isALPHA(dstFormat) && !needAlpha) { +int offset = lastDstY - dstSliceY; int length = dstW; int height = dstY - lastDstY; if (is16BPS(dstFormat) || isNBPS(dstFormat)) { const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat); -fillPlane16(dst[3], dstStride[3], length, height, lastDstY, +fillPlane16(dst[3], dstStride[3], 
length, height, offset, 1, desc->comp[3].depth, isBE(dstFormat)); } else if (is32BPS(dstFormat)) { const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat); -fillPlane32(dst[3], dstStride[3], length, height, lastDstY, +fillPlane32(dst[3], dstStride[3], length, height, offset, 1, desc->comp[3].depth, isBE(dstFormat), desc->flags & AV_PIX_FMT_FLAG_FLOAT); } else -fillPlane(dst[3], dstStride[3], length, height, lastDstY, 255); +fillPlane(dst[3], dstStride[3], length, height,
Re: [FFmpeg-devel] [PATCH 24/24] lavfi/vf_scale: implement slice threading
On Mon, Jul 12, 2021 at 12:35 PM Anton Khirnov wrote:
>
> Quoting Michael Niedermayer (2021-07-03 18:27:36)
> > On Sat, Jul 03, 2021 at 03:27:36PM +0200, Anton Khirnov wrote:
> > > Quoting Michael Niedermayer (2021-06-01 11:35:13)
> > > > On Mon, May 31, 2021 at 09:55:15AM +0200, Anton Khirnov wrote:
> > > > > ---
> > > > >  libavfilter/vf_scale.c | 182 +++--
> > > > >  1 file changed, 141 insertions(+), 41 deletions(-)
> > > >
> > > > breaks: (lower 50% is bright green)
> > > > ./ffplay -i mm-short.mpg -an -vf "tinterlace,scale=720:576:interl=1"
> > >
> > > Fixed locally, but I'm wondering why interlaced scaling is not done by
> > > default for interlaced videos.
> >
> > IIRC the flags were quite unreliable. If we have reliable knowledge about
> > interlacing it certainly should be used automatically
>
> You mean there is a significant amount of progressive content that is
> flagged as interlaced?
>

In my experience it's not that bad. But I deal with everyday content, not
fringe stuff.

- Hendrik
Re: [FFmpeg-devel] [PATCH 24/24] lavfi/vf_scale: implement slice threading
On Mon, Jul 12, 2021 at 12:34:55PM +0200, Anton Khirnov wrote:
> Quoting Michael Niedermayer (2021-07-03 18:27:36)
> > On Sat, Jul 03, 2021 at 03:27:36PM +0200, Anton Khirnov wrote:
> > > Quoting Michael Niedermayer (2021-06-01 11:35:13)
> > > > On Mon, May 31, 2021 at 09:55:15AM +0200, Anton Khirnov wrote:
> > > > > ---
> > > > >  libavfilter/vf_scale.c | 182 +++--
> > > > >  1 file changed, 141 insertions(+), 41 deletions(-)
> > > >
> > > > breaks: (lower 50% is bright green)
> > > > ./ffplay -i mm-short.mpg -an -vf "tinterlace,scale=720:576:interl=1"
> > >
> > > Fixed locally, but I'm wondering why interlaced scaling is not done by
> > > default for interlaced videos.
> >
> > IIRC the flags were quite unreliable. If we have reliable knowledge about
> > interlacing it certainly should be used automatically
>
> You mean there is a significant amount of progressive content that is
> flagged as interlaced?

Yes, that's from my memory though, this may have changed of course.
IIRC one source of this is progressive material intended for interlaced
displays.

Ideally, IMHO, we should reliably autodetect whether the material is
progressive/interlaced/telecined, set the flags accordingly, and then also
automatically do the optimal thing the flags call for, without user
intervention.

thx

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If a bugfix only changes things apparently unrelated to the bug with no
further explanation, that is a good sign that the bugfix is wrong.
Re: [FFmpeg-devel] [PATCH 1/2] libavutil/cpu: Adds fast gather detection.
On 7/12/2021 7:46 AM, Lynne wrote: 12 Jul 2021, 11:29 by alankelly-at-google@ffmpeg.org: On Fri, Jun 25, 2021 at 1:24 PM Alan Kelly wrote: On Fri, Jun 25, 2021 at 10:40 AM Lynne wrote: Jun 25, 2021, 09:54 by alankelly-at-google@ffmpeg.org: Broadwell and later and Zen3 and later have fast gather instructions. --- Gather requires between 9 and 12 cycles on Haswell, 5 to 7 on Broadwell, and 2 to 5 on Skylake and newer. It is also slow on AMD before Zen 3. libavutil/cpu.h | 2 ++ libavutil/x86/cpu.c | 18 -- libavutil/x86/cpu.h | 1 + 3 files changed, 19 insertions(+), 2 deletions(-) No, we really don't need more FAST/SLOW flags, especially for something like this which is just fixable by _not_using_vgather_. Take a look at libavutil/x86/tx_float.asm, we only use vgather if it's guaranteed to either be faster for what we're gathering or is just as fast "slow". If neither is true, we use manual lookups, which is actually advantageous since for AVX2 we can interleave the lookups that happen in each lane. Even if we disregard this, I've extensively benchmarked vgather on Zen 3, Zen 2, Cascade Lake and Skylake, and there's hardly a great vgather improvement to be found in Zen 3 to justify using a new CPU flag for this. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". Thanks for your response. I'm not against finding a cleaner way of enabling/disabling the code which will be protected by this flag. However, the manual lookups solution proposed will not work in this case, the avx2 version of hscale will only be faster if fast gathers are available, otherwise, the ssse3 version should be used. I haven't got access to a Zen3 so I can't comment on the performance. I have tested on a Zen 2 and it is slow. On Broadwell hscale avx2 is about 10% faster than the ssse3 version and on Skylake about 40% faster, Haswell has similar performance to Zen2. Is there a proxy which could be used for detecting Broadwell or Skylake and later? AVX512 seems too strict as there are Skylake chips without AVX512. Thanks Hi, I will paste the performance figures from the thread for the other part of this patch here so that the justification for this flag is clearer: Skylake Haswell hscale_8_to_15_width4_ssse3 761.2 760 hscale_8_to_15_width4_avx2 468.7 957 hscale_8_to_15_width8_ssse3 1170.7 1032 hscale_8_to_15_width8_avx2 865.7 1979 hscale_8_to_15_width12_ssse3 2172.2 2472 hscale_8_to_15_width12_avx2 1245.7 2901 hscale_8_to_15_width16_ssse3 2244.2 2400 hscale_8_to_15_width16_avx2 1647.2 3681 As you can see, it is catastrophic on Haswell and older chips but the gains on Skylake are impressive. As I don't have performance figures for Zen 3, I can disable this feature on all cpus apart from Broadwell and later as you say that there is no worthwhile improvement on Zen3. Is this OK with you? It's not that catastrophic. Since Haswell CPUs generally don't have large AVX2 gains, could you just exclude Haswell only from EXTERNAL_AVX2_FAST, and require EXTERNAL_AVX2_FAST to enable those functions? And disable all non gather AVX2 asm functions on Haswell? No. And it's a lie that Haswell doesn't have large gains with AVX2. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". 
Re: [FFmpeg-devel] [PATCH 2/2] avformat/movenc: add support for TTML muxing
On Tue, 22 Jun 2021, Jan Ekström wrote: From: Jan Ekström Includes basic support for both the ISMV ('dfxp') and MP4 ('stpp') methods. This initial version also foregoes fragmentation support as this eases the initial review. Hmm, I'm not sure I understand here, this seems to add at least some coe in mov_flush_fragment, so there's some initial support for fragmentation present still - can you elaborate? Signed-off-by: Jan Ekström --- libavformat/Makefile | 2 +- libavformat/isom.h| 3 + libavformat/movenc.c | 180 +++- libavformat/movenc.h | 6 + libavformat/movenc_ttml.c | 243 ++ libavformat/movenc_ttml.h | 31 + 6 files changed, 462 insertions(+), 3 deletions(-) create mode 100644 libavformat/movenc_ttml.c create mode 100644 libavformat/movenc_ttml.h diff --git a/libavformat/Makefile b/libavformat/Makefile index c9ef564523..931ad4ac45 100644 --- a/libavformat/Makefile +++ b/libavformat/Makefile @@ -337,7 +337,7 @@ OBJS-$(CONFIG_MOV_DEMUXER) += mov.o mov_chan.o mov_esds.o \ qtpalette.o replaygain.o OBJS-$(CONFIG_MOV_MUXER) += movenc.o av1.o avc.o hevc.o vpcc.o \ movenchint.o mov_chan.o rtp.o \ -movenccenc.o rawutils.o +movenccenc.o movenc_ttml.o rawutils.o OBJS-$(CONFIG_MP2_MUXER) += rawenc.o OBJS-$(CONFIG_MP3_DEMUXER) += mp3dec.o replaygain.o OBJS-$(CONFIG_MP3_MUXER) += mp3enc.o rawenc.o id3v2enc.o diff --git a/libavformat/isom.h b/libavformat/isom.h index ac1b3f3d56..34a58c79b7 100644 --- a/libavformat/isom.h +++ b/libavformat/isom.h @@ -387,4 +387,7 @@ static inline enum AVCodecID ff_mov_get_lpcm_codec_id(int bps, int flags) return ff_get_pcm_codec_id(bps, flags & 1, flags & 2, flags & 4 ? -1 : 0); } +#define MOV_ISMV_TTML_TAG MKTAG('d', 'f', 'x', 'p') +#define MOV_MP4_TTML_TAG MKTAG('s', 't', 'p', 'p') + #endif /* AVFORMAT_ISOM_H */ diff --git a/libavformat/movenc.c b/libavformat/movenc.c index 04f3e94158..d4efb6217f 100644 --- a/libavformat/movenc.c +++ b/libavformat/movenc.c @@ -56,6 +56,8 @@ #include "hevc.h" #include "rtpenc.h" #include "mov_chan.h" +#include "movenc_ttml.h" +#include "ttmlenc.h" #include "vpcc.h" static const AVOption options[] = { @@ -120,6 +122,7 @@ static const AVClass flavor ## _muxer_class = {\ }; static int get_moov_size(AVFormatContext *s); +static int mov_write_single_packet(AVFormatContext *s, AVPacket *pkt); static int utf8len(const uint8_t *b) { @@ -1788,7 +1791,29 @@ static int mov_write_subtitle_tag(AVIOContext *pb, MOVTrack *track) if (track->par->codec_id == AV_CODEC_ID_DVD_SUBTITLE) mov_write_esds_tag(pb, track); -else if (track->par->extradata_size) +else if (track->par->codec_id == AV_CODEC_ID_TTML) { +switch (track->par->codec_tag) { +case MOV_ISMV_TTML_TAG: +// ye olde ISMV dfxp requires no extradata. 
Nit: I'd prefer a more formal/serious wording in the comment than "ye olde" :P +break; +case MOV_MP4_TTML_TAG: +// As specified in 14496-30, XMLSubtitleSampleEntry +// Namespace +avio_put_str(pb, "http://www.w3.org/ns/ttml";); +// Empty schema_location +avio_w8(pb, 0); +// Empty auxiliary_mime_types +avio_w8(pb, 0); +break; +default: +av_log(NULL, AV_LOG_ERROR, + "Unknown codec tag '%s' utilized for TTML stream with " + "index %d (track id %d)!\n", + av_fourcc2str(track->par->codec_tag), track->st->index, + track->track_id); +return AVERROR(EINVAL); +} +} else if (track->par->extradata_size) avio_write(pb, track->par->extradata, track->par->extradata_size); if (track->mode == MODE_MP4 && @@ -5254,6 +5279,71 @@ static int mov_flush_fragment_interleaving(AVFormatContext *s, MOVTrack *track) return 0; } +static int mov_write_squashed_packet(AVFormatContext *s, MOVTrack *track) +{ +AVPacket *squashed_packet = ((MOVMuxContext *)s->priv_data)->pkt; Nit: Maybe spell out the intermediate MOVMuxContext pointer to a separate variable for clarity, even if it's used only once. +int ret = AVERROR_BUG; + +switch (track->st->codecpar->codec_id) { +case AV_CODEC_ID_TTML: +{ +int we_had_packets = !!track->squashed_packet_queue; Nit: We don't really need the strict 0/1 value of we_had_packets here, so we don't need the double negation. And maybe drop the "we_" prefix? + +if ((ret = ff_mov_generate_squashed_ttml_packet(s, track, squashed_packet)) < 0) { +goto finish_squash; +} + +// We have generated a padding packet (no a
Re: [FFmpeg-devel] [PATCH 1/1] libavformat/rtsp.c: Reply to GET_PARAMETER requests
On Sat, 26 Jun 2021, Martin Storsjö wrote: On Fri, 25 Jun 2021, Hayden Myers wrote: Some encoders send GET_PARAMETER requests as a keep-alive mechanism. If the client doesn't reply with an OK message, the encoder will close the session. This was encountered with the impath i5110 encoder, when the RTSP Keep-Alive checkbox is enabled under streaming settings. Alternatively one may set the X-No-Keepalive: 1 header, but this is more of a workaround. It's better practice to respond to an encoder's keep-alive request, than disable the mechanism which may be manufacturer specific. Signed-off-by: Hayden Myers --- libavformat/rtsp.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/libavformat/rtsp.c b/libavformat/rtsp.c index 9f509a229f..dc660368f0 100644 --- a/libavformat/rtsp.c +++ b/libavformat/rtsp.c @@ -1259,7 +1259,9 @@ start: char base64buf[AV_BASE64_SIZE(sizeof(buf))]; const char* ptr = buf; - if (!strcmp(reply->reason, "OPTIONS")) { + if (!strcmp(reply->reason, "OPTIONS") || + !strcmp(reply->reason, "GET_PARAMETER")) { + snprintf(buf, sizeof(buf), "RTSP/1.0 200 OK\r\n"); if (reply->seq) av_strlcatf(buf, sizeof(buf), "CSeq: %d\r\n", reply->seq); -- LGTM, this sounds and seems reasonable to me (although untested in practice). Pushed this patch now. Do note that the patch was badly mangled (the extra PNG attachments I think?) which made git unable to automatically apply it from the mail message, so I had to retype the patch manually. // Martin ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
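For context, the keep-alive exchange this patch handles looks roughly like the following (the URL, CSeq and Session values are made up for illustration); the server-issued GET_PARAMETER only has to be answered with a 200 OK that echoes the CSeq, which is exactly what the added snprintf/av_strlcatf lines build:

    S->C:  GET_PARAMETER rtsp://camera.example/stream RTSP/1.0
           CSeq: 42
           Session: 12345678

    C->S:  RTSP/1.0 200 OK
           CSeq: 42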
Re: [FFmpeg-devel] [PATCH 1/2] libavutil/cpu: Adds fast gather detection.
12 Jul 2021, 13:53 by jamr...@gmail.com: > On 7/12/2021 7:46 AM, Lynne wrote: > >> 12 Jul 2021, 11:29 by alankelly-at-google@ffmpeg.org: >> >>> On Fri, Jun 25, 2021 at 1:24 PM Alan Kelly wrote: >>> On Fri, Jun 25, 2021 at 10:40 AM Lynne wrote: > Jun 25, 2021, 09:54 by alankelly-at-google@ffmpeg.org: > >> Broadwell and later and Zen3 and later have fast gather instructions. >> --- >> Gather requires between 9 and 12 cycles on Haswell, 5 to 7 on >> > Broadwell, > >> and 2 to 5 on Skylake and newer. It is also slow on AMD before Zen 3. >> libavutil/cpu.h | 2 ++ >> libavutil/x86/cpu.c | 18 -- >> libavutil/x86/cpu.h | 1 + >> 3 files changed, 19 insertions(+), 2 deletions(-) >> > > No, we really don't need more FAST/SLOW flags, especially for > something like this which is just fixable by _not_using_vgather_. > Take a look at libavutil/x86/tx_float.asm, we only use vgather > if it's guaranteed to either be faster for what we're gathering or > is just as fast "slow". If neither is true, we use manual lookups, > which is actually advantageous since for AVX2 we can interleave > the lookups that happen in each lane. > > Even if we disregard this, I've extensively benchmarked vgather > on Zen 3, Zen 2, Cascade Lake and Skylake, and there's hardly > a great vgather improvement to be found in Zen 3 to justify > using a new CPU flag for this. > ___ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". > Thanks for your response. I'm not against finding a cleaner way of enabling/disabling the code which will be protected by this flag. However, the manual lookups solution proposed will not work in this case, the avx2 version of hscale will only be faster if fast gathers are available, otherwise, the ssse3 version should be used. I haven't got access to a Zen3 so I can't comment on the performance. I have tested on a Zen 2 and it is slow. On Broadwell hscale avx2 is about 10% faster than the ssse3 version and on Skylake about 40% faster, Haswell has similar performance to Zen2. Is there a proxy which could be used for detecting Broadwell or Skylake and later? AVX512 seems too strict as there are Skylake chips without AVX512. Thanks >>> >>> Hi, >>> >>> I will paste the performance figures from the thread for the other part of >>> this patch here so that the justification for this flag is clearer: >>> >>> Skylake Haswell >>> hscale_8_to_15_width4_ssse3 761.2 760 >>> hscale_8_to_15_width4_avx2 468.7 957 >>> hscale_8_to_15_width8_ssse3 1170.7 1032 >>> hscale_8_to_15_width8_avx2 865.7 1979 >>> hscale_8_to_15_width12_ssse3 2172.2 2472 >>> hscale_8_to_15_width12_avx2 1245.7 2901 >>> hscale_8_to_15_width16_ssse3 2244.2 2400 >>> hscale_8_to_15_width16_avx2 1647.2 3681 >>> >>> As you can see, it is catastrophic on Haswell and older chips but the gains >>> on Skylake are impressive. >>> As I don't have performance figures for Zen 3, I can disable this feature >>> on all cpus apart from Broadwell and later as you say that there is no >>> worthwhile improvement on Zen3. Is this OK with you? >>> >> >> It's not that catastrophic. Since Haswell CPUs generally don't have >> large AVX2 gains, could you just exclude Haswell only from >> EXTERNAL_AVX2_FAST, and require EXTERNAL_AVX2_FAST >> to enable those functions? >> > > And disable all non gather AVX2 asm functions on Haswell? No. And it's a lie > that Haswell doesn't have large gains with AVX2. 
It won't disable ALL of the AVX2, but it'll affect a few random components, the most prominent of which is some (not all) hevc assembly. But I think I'd rather just not do anything at all. Performance of vgather even on Haswell is still above 2x the C version, and we barely have any vgathers in our code. And Haswell use is in decline too. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
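To make the vgather-versus-manual-lookup trade-off discussed in this thread concrete, here is a rough AVX2 intrinsics sketch; it is purely illustrative and is not the hand-written assembly FFmpeg actually uses in libavutil/x86/tx_float.asm or in the proposed hscale code:

    #include <stdint.h>
    #include <immintrin.h>

    /* Variant 1: hardware gather. A single instruction, but its latency varies
     * widely across microarchitectures (the thread quotes 9-12 cycles on
     * Haswell, 5-7 on Broadwell, 2-5 on Skylake and newer). */
    static __m256i lut_gather(const int32_t *table, __m256i idx)
    {
        return _mm256_i32gather_epi32((const int *)table, idx, 4);
    }

    /* Variant 2: manual lookups. More instructions, but only cheap scalar loads,
     * and the per-lane work can be interleaved/scheduled freely. */
    static __m256i lut_manual(const int32_t *table, const int32_t idx[8])
    {
        __m128i lo = _mm_setr_epi32(table[idx[0]], table[idx[1]],
                                    table[idx[2]], table[idx[3]]);
        __m128i hi = _mm_setr_epi32(table[idx[4]], table[idx[5]],
                                    table[idx[6]], table[idx[7]]);
        return _mm256_inserti128_si256(_mm256_castsi128_si256(lo), hi, 1);
    }

Which variant wins depends on the CPU generation, which is exactly why the patch wants a capability flag and why the alternative position is to simply avoid gathers.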
Re: [FFmpeg-devel] FFMPEG for V4L2 M2M devices ?
> > A quick Google implies that NVidia already has a stateful V4L2 M2M > driver in their vendor kernel. Other than the strange choice of device > node name (/dev/nvhost-nvdec), the details at [3] make it look like a > normal V4L2 M2M decoder that has a good chance of working against > h264_v4l2m2m. Not only does it have a strange node name, it also uses two nodes. One for decoding, another for converting. Capture plane of the decoder stores frames in V4L2_PIX_FMT_NV12M format. Converter able to convert it to a different format[1]. Could you point me at documentation of Pi V4L2 spec? [1] https://docs.nvidia.com/jetson/l4t-multimedia/group__V4L2Conv.html Andrii On Mon, Jul 12, 2021 at 6:02 AM Dave Stevenson < dave.steven...@raspberrypi.com> wrote: > On Sat, 10 Jul 2021 at 00:56, Brad Hards wrote: > > > > On Saturday, 10 July 2021 8:53:27 AM AEST Andrii wrote: > > > I am working on porting a Kodi player to an NVidia Jetson Nano device. > I've > > > been developing a decoder for quite some time now, and realized that > the > > > best approach would be to have it inside of ffmpeg, instead of > embedding > > > the decoder into Kodi as it heavily relies on FFMPEG. Just wondering if > > > there is any effort in making FFMPEG suppring M2M V4L devices ? > > > > > https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c[1] > > > > I guess that would be the basis for further work as required to meet > your needs. > > Do note that there are 2 V4L2 M2M decoder APIs - the stateful API[1] , > and the stateless API [2]. They differ in the amount of bitstream > parsing and buffer management that the driver implements vs expecting > the client to do. > > The *_v4l2m2m drivers within FFMPEG support the stateful API (ie the > kernel driver has bitstream parsing). For Raspberry Pi we use that to > support the (older) H264 implementation, and FFMPEG master does that > very well. > > The Pi HEVC decoder uses the V4L2 stateless API. Stateless HEVC > support hasn't been merged to the mainline kernel as yet, so there are > downstream patches to support that. > > A quick Google implies that NVidia already has a stateful V4L2 M2M > driver in their vendor kernel. Other than the strange choice of device > node name (/dev/nvhost-nvdec), the details at [3] make it look like a > normal V4L2 M2M decoder that has a good chance of working against > h264_v4l2m2m. > > [1] > https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-decoder.html > [2] > https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-stateless-decoder.html > [3] https://docs.nvidia.com/jetson/l4t-multimedia/group__V4L2Dec.html > > Dave > > > Brad > > > > > > [1] > https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c > > ___ > > ffmpeg-devel mailing list > > ffmpeg-devel@ffmpeg.org > > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > > > To unsubscribe, visit link above, or email > > ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". > ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] FFMPEG for V4L2 M2M devices ?
On Mon, 12 Jul 2021 at 14:51, Andrii wrote: >> >> A quick Google implies that NVidia already has a stateful V4L2 M2M >> driver in their vendor kernel. Other than the strange choice of device >> node name (/dev/nvhost-nvdec), the details at [3] make it look like a >> normal V4L2 M2M decoder that has a good chance of working against >> h264_v4l2m2m. > > > Not only does it have a strange node name, it also uses two nodes. One for > decoding, another for converting. Capture plane of the decoder stores frames > in V4L2_PIX_FMT_NV12M format. > Converter able to convert it to a different format[1]. Those appear to be two different hardware blocks. If you can consume NV12M (YUV420 with interleaved UV plane), then I see no reason why you have to pass the data through the "/dev/nvhost-vic" device. We have a similar thing where /dev/video10 is the decoder (stateful decode), /dev/video11 is the encoder, and /dev/video12 is the ISP (Image Sensor Pipeline) wrapped in the V4L2 API. > Could you point me at documentation of Pi V4L2 spec? It just implements the relevant APIs that I've already linked to. If it doesn't follow the API, then we fix it so that it does. Stateful H264 implementation is https://github.com/raspberrypi/linux/tree/rpi-5.10.y/drivers/staging/vc04_services/bcm2835-codec Stateless HEVC is https://github.com/raspberrypi/linux/tree/rpi-5.10.y/drivers/staging/media/rpivid Dave > [1] https://docs.nvidia.com/jetson/l4t-multimedia/group__V4L2Conv.html > > Andrii > > On Mon, Jul 12, 2021 at 6:02 AM Dave Stevenson > wrote: >> >> On Sat, 10 Jul 2021 at 00:56, Brad Hards wrote: >> > >> > On Saturday, 10 July 2021 8:53:27 AM AEST Andrii wrote: >> > > I am working on porting a Kodi player to an NVidia Jetson Nano device. >> > > I've >> > > been developing a decoder for quite some time now, and realized that the >> > > best approach would be to have it inside of ffmpeg, instead of embedding >> > > the decoder into Kodi as it heavily relies on FFMPEG. Just wondering if >> > > there is any effort in making FFMPEG suppring M2M V4L devices ? >> > >> > https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c[1] >> > >> > I guess that would be the basis for further work as required to meet your >> > needs. >> >> Do note that there are 2 V4L2 M2M decoder APIs - the stateful API[1] , >> and the stateless API [2]. They differ in the amount of bitstream >> parsing and buffer management that the driver implements vs expecting >> the client to do. >> >> The *_v4l2m2m drivers within FFMPEG support the stateful API (ie the >> kernel driver has bitstream parsing). For Raspberry Pi we use that to >> support the (older) H264 implementation, and FFMPEG master does that >> very well. >> >> The Pi HEVC decoder uses the V4L2 stateless API. Stateless HEVC >> support hasn't been merged to the mainline kernel as yet, so there are >> downstream patches to support that. >> >> A quick Google implies that NVidia already has a stateful V4L2 M2M >> driver in their vendor kernel. Other than the strange choice of device >> node name (/dev/nvhost-nvdec), the details at [3] make it look like a >> normal V4L2 M2M decoder that has a good chance of working against >> h264_v4l2m2m. 
>> >> [1] >> https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-decoder.html >> [2] >> https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-stateless-decoder.html >> [3] https://docs.nvidia.com/jetson/l4t-multimedia/group__V4L2Dec.html >> >> Dave >> >> > Brad >> > >> > >> > [1] >> > https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c >> > ___ >> > ffmpeg-devel mailing list >> > ffmpeg-devel@ffmpeg.org >> > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel >> > >> > To unsubscribe, visit link above, or email >> > ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] FFMPEG for V4L2 M2M devices ?
On Mon, 12. Jul 11:02, Dave Stevenson wrote: > On Sat, 10 Jul 2021 at 00:56, Brad Hards wrote: > > > > On Saturday, 10 July 2021 8:53:27 AM AEST Andrii wrote: > > > I am working on porting a Kodi player to an NVidia Jetson Nano device. > > > I've > > > been developing a decoder for quite some time now, and realized that the > > > best approach would be to have it inside of ffmpeg, instead of embedding > > > the decoder into Kodi as it heavily relies on FFMPEG. Just wondering if > > > there is any effort in making FFMPEG suppring M2M V4L devices ? > > > > https://git.ffmpeg.org/gitweb/ffmpeg.git/blob_plain/HEAD:/libavcodec/v4l2_m2m.c[1] > > > > I guess that would be the basis for further work as required to meet your > > needs. > > Do note that there are 2 V4L2 M2M decoder APIs - the stateful API[1] , > and the stateless API [2]. They differ in the amount of bitstream > parsing and buffer management that the driver implements vs expecting > the client to do. > > The *_v4l2m2m drivers within FFMPEG support the stateful API (ie the > kernel driver has bitstream parsing). For Raspberry Pi we use that to > support the (older) H264 implementation, and FFMPEG master does that > very well. > > The Pi HEVC decoder uses the V4L2 stateless API. Stateless HEVC > support hasn't been merged to the mainline kernel as yet, so there are > downstream patches to support that. > > A quick Google implies that NVidia already has a stateful V4L2 M2M > driver in their vendor kernel. Other than the strange choice of device > node name (/dev/nvhost-nvdec), the details at [3] make it look like a > normal V4L2 M2M decoder that has a good chance of working against > h264_v4l2m2m. Some time ago I tried to set up the Jetson nano to work with our v4l2m2m code, but there were just too many problems. It wasn't properly spec compliant. For reference here's link to Nvidia's patch to support decoding on the nano in ffmpeg: http://ffmpeg.org/pipermail/ffmpeg-devel/2020-June/263746.html -- Andriy ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
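As a rough illustration of the stateful API described in this thread (the device path, coded-buffer size and error handling below are placeholders; libavcodec/v4l2_m2m*.c does the real probing far more thoroughly), the client side boils down to opening the node, checking for the M2M capability, and configuring the coded bitstream on the OUTPUT queue:

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/videodev2.h>

    static int probe_stateful_decoder(const char *node) /* e.g. "/dev/video10" */
    {
        struct v4l2_capability cap;
        struct v4l2_format fmt;
        int fd = open(node, O_RDWR | O_NONBLOCK);

        if (fd < 0)
            return -1;

        /* A memory-to-memory codec advertises one of the M2M capabilities. */
        if (ioctl(fd, VIDIOC_QUERYCAP, &cap) < 0 ||
            !(cap.capabilities & (V4L2_CAP_VIDEO_M2M | V4L2_CAP_VIDEO_M2M_MPLANE)))
            goto fail;

        /* Stateful decoders take the raw bitstream on the OUTPUT queue and hand
         * back decoded frames (e.g. V4L2_PIX_FMT_NV12M) on the CAPTURE queue. */
        memset(&fmt, 0, sizeof(fmt));
        fmt.type                              = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
        fmt.fmt.pix_mp.pixelformat            = V4L2_PIX_FMT_H264;
        fmt.fmt.pix_mp.num_planes             = 1;
        fmt.fmt.pix_mp.plane_fmt[0].sizeimage = 1024 * 1024; /* arbitrary coded-buffer size */
        if (ioctl(fd, VIDIOC_S_FMT, &fmt) < 0)
            goto fail;

        return fd;

    fail:
        close(fd);
        return -1;
    }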
Re: [FFmpeg-devel] [PATCH 5/8] sws: add a new scaling API
On Mon, Jul 12, 2021 at 01:07:06PM +0200, Anton Khirnov wrote: [...] > diff --git a/libswscale/swscale.h b/libswscale/swscale.h > index 50d6d46553..41eacd2dea 100644 > --- a/libswscale/swscale.h > +++ b/libswscale/swscale.h > @@ -30,6 +30,7 @@ > #include > > #include "libavutil/avutil.h" > +#include "libavutil/frame.h" > #include "libavutil/log.h" > #include "libavutil/pixfmt.h" > #include "version.h" > @@ -218,6 +219,85 @@ int sws_scale(struct SwsContext *c, const uint8_t *const > srcSlice[], >const int srcStride[], int srcSliceY, int srcSliceH, >uint8_t *const dst[], const int dstStride[]); > > +/** > + * Scale source data from src and write the output to dst. > + * > + * This is merely a convenience wrapper around > + * - sws_frame_start() > + * - sws_send_slice(0, src->height) > + * - sws_receive_slice(0, dst->height) > + * - sws_frame_end() > + * > + * @param dst The destination frame. See documentation for sws_frame_start() > for > + *more details. > + * @param src The source frame. > + * > + * @return 0 on success, a negative AVERROR code on failure > + */ > +int sws_scale_frame(struct SwsContext *c, AVFrame *dst, const AVFrame *src); > + > +/** > + * Initialize the scaling process for a given pair of source/destination > frames. > + * Must be called before any calls to sws_send_slice() and > sws_receive_slice(). > + * > + * This function will retain references to src and dst. > + * > + * @param dst The destination frame. > + * > + *The data buffers may either be already allocated by the caller > or > + *left clear, in which case they will be allocated by the scaler. > + *The latter may have performance advantages - e.g. in certain > cases > + *some output planes may be references to input planes, rather > than > + *copies. > + * > + *Output data will be written into this frame in successful > + *sws_receive_slice() calls. > + * @param src The source frame. The data buffers must be allocated, but the > + *frame data does not have to be ready at this point. Data > + *availability is then signalled by sws_send_slice(). > + * @return 0 on success, a negative AVERROR code on failure > + * > + * @see sws_frame_end() > + */ > +int sws_frame_start(struct SwsContext *c, AVFrame *dst, const AVFrame *src); > + > +/** > + * Finish the scaling process for a pair of source/destination frames > previously > + * submitted with sws_frame_start(). Must be called after all > sws_send_slice() > + * and sws_receive_slice() calls are done, before any new sws_frame_start() > + * calls. > + */ > +void sws_frame_end(struct SwsContext *c); > + > +/** > + * Indicate that a horizontal slice of input data is available in the source > + * frame previously provided to sws_frame_start(). The slices may be > provided in > + * any order, but may not overlap. For vertically subsampled pixel formats, > the > + * slices must be aligned according to subsampling. > + * > + * @param slice_start first row of the slice > + * @param slice_height number of rows in the slice > + * > + * @return 0 on success, a negative AVERROR code on failure. > + */ > +int sws_send_slice(struct SwsContext *c, unsigned int slice_start, > + unsigned int slice_height); I suggest to use non 0 on success. That could then be extended in the future for example to provide information about how many lines have already been consumed and its memory be reused thx [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB If you think the mosad wants you dead since a long time then you are either wrong or dead since a long time. 
___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
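For readers following the review: the equivalence documented for sws_scale_frame() expands to roughly the following caller-side sequence against the proposed API. This is only a sketch; it assumes sws_receive_slice() follows the same 0-on-success/negative-on-error convention as the other calls, which is the very convention being questioned above.

    #include <libswscale/swscale.h>

    static int scale_whole_frame(struct SwsContext *c, AVFrame *dst, const AVFrame *src)
    {
        int ret = sws_frame_start(c, dst, src);
        if (ret < 0)
            return ret;

        /* The whole input is available at once... */
        ret = sws_send_slice(c, 0, src->height);
        if (ret >= 0)
            /* ...so the whole output can be requested at once, too. */
            ret = sws_receive_slice(c, 0, dst->height);

        sws_frame_end(c);

        return ret < 0 ? ret : 0;
    }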
Re: [FFmpeg-devel] [PATCH 8/8] lavfi/vf_scale: pass the thread count to the scaler
On Mon, Jul 12, 2021 at 01:07:09PM +0200, Anton Khirnov wrote: > --- > libavfilter/vf_scale.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/libavfilter/vf_scale.c b/libavfilter/vf_scale.c > index cdff3ab7ed..f676f5d82e 100644 > --- a/libavfilter/vf_scale.c > +++ b/libavfilter/vf_scale.c > @@ -543,6 +543,7 @@ static int config_props(AVFilterLink *outlink) > av_opt_set_int(*s, "sws_flags", scale->flags, 0); > av_opt_set_int(*s, "param0", scale->param[0], 0); > av_opt_set_int(*s, "param1", scale->param[1], 0); > +av_opt_set_int(*s, "threads", ff_filter_get_nb_threads(ctx), 0); > if (scale->in_range != AVCOL_RANGE_UNSPECIFIED) > av_opt_set_int(*s, "src_range", > scale->in_range == AVCOL_RANGE_JPEG, 0); > -- > 2.30.2 seems to crash: -f image2 -vcodec pgmyuv -i tests/vsynth1/01.pgm -vf format=xyz12le -vcodec rawvideo -pix_fmt xyz12le -y file-xyz.j2k ==13394== Thread 35: ==13394== Invalid read of size 2 ==13394==at 0x118F1BA: rgb48Toxyz12 (swscale.c:705) ==13394==by 0x1190A9A: scale_internal (swscale.c:1048) ==13394==by 0x11911C7: ff_sws_slice_worker (swscale.c:1206) ==13394==by 0x126205E: run_jobs (slicethread.c:61) ==13394==by 0x1262130: thread_worker (slicethread.c:85) ==13394==by 0xCB4E6DA: start_thread (pthread_create.c:463) ==13394==by 0xCE8771E: clone (clone.S:95) ==13394== Address 0x370db89e is 608,286 bytes inside a block of size 608,287 alloc'd ==13394==at 0x4C33E76: memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==13394==by 0x4C33F91: posix_memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==13394==by 0x123235A: av_malloc (mem.c:87) ==13394==by 0x1218455: av_buffer_alloc (buffer.c:72) ==13394==by 0x12184D4: av_buffer_allocz (buffer.c:85) ==13394==by 0x1218EA5: pool_alloc_buffer (buffer.c:351) ==13394==by 0x1218FED: av_buffer_pool_get (buffer.c:388) ==13394==by 0x2B47E9: ff_frame_pool_get (framepool.c:221) ==13394==by 0x472F03: ff_default_get_video_buffer (video.c:90) ==13394==by 0x472FBD: ff_get_video_buffer (video.c:109) ==13394==by 0x472D0E: ff_null_get_video_buffer (video.c:41) ==13394==by 0x472F9E: ff_get_video_buffer (video.c:106) ==13394==by 0x472D0E: ff_null_get_video_buffer (video.c:41) ==13394==by 0x472F9E: ff_get_video_buffer (video.c:106) ==13394==by 0x3E6FDD: scale_frame (vf_scale.c:755) ==13394==by 0x3E7557: filter_frame (vf_scale.c:838) ==13394==by 0x29B314: ff_filter_frame_framed (avfilter.c:969) ==13394==by 0x29BBCF: ff_filter_frame_to_filter (avfilter.c:1117) ==13394==by 0x29BDDF: ff_filter_activate_default (avfilter.c:1166) ==13394==by 0x29C003: ff_filter_activate (avfilter.c:1324) [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB There will always be a question for which you do not know the correct answer. signature.asc Description: PGP signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-devel] [PATCH] avcodec/h264_parser: remove key frame tagging heuristics
None of these packets contain keyframes, and tagging them as such can result in non spec compliant output when remuxing into containers like mp4 and Matroska, where bogus samples would be marked as Sync Samples. Some tests are updated to reflect this. Suggested-by: ffm...@fb.com Signed-off-by: James Almer --- libavcodec/h264_parser.c | 8 tests/fate-run.sh | 4 ++-- tests/fate/ffmpeg.mak | 2 +- tests/fate/lavf-container.mak | 12 ++-- tests/fate/matroska.mak| 2 +- tests/ref/fate/copy-trac2211-avi | 2 +- tests/ref/fate/matroska-h264-remux | 4 ++-- tests/ref/fate/segment-mp4-to-ts | 10 +- tests/ref/lavf-fate/h264.mp4 | 4 ++-- 9 files changed, 20 insertions(+), 28 deletions(-) diff --git a/libavcodec/h264_parser.c b/libavcodec/h264_parser.c index d3c56cc188..e78c3679fb 100644 --- a/libavcodec/h264_parser.c +++ b/libavcodec/h264_parser.c @@ -344,10 +344,6 @@ static inline int parse_nal_units(AVCodecParserContext *s, get_ue_golomb_long(&nal.gb); // skip first_mb_in_slice slice_type = get_ue_golomb_31(&nal.gb); s->pict_type = ff_h264_golomb_to_pict_type[slice_type % 5]; -if (p->sei.recovery_point.recovery_frame_cnt >= 0) { -/* key frame, since recovery_frame_cnt is set */ -s->key_frame = 1; -} pps_id = get_ue_golomb(&nal.gb); if (pps_id >= MAX_PPS_COUNT) { av_log(avctx, AV_LOG_ERROR, @@ -370,10 +366,6 @@ static inline int parse_nal_units(AVCodecParserContext *s, p->ps.sps = p->ps.pps->sps; sps = p->ps.sps; -// heuristic to detect non marked keyframes -if (p->ps.sps->ref_frame_count <= 1 && p->ps.pps->ref_count[0] <= 1 && s->pict_type == AV_PICTURE_TYPE_I) -s->key_frame = 1; - p->poc.frame_num = get_bits(&nal.gb, sps->log2_max_frame_num); s->coded_width = 16 * sps->mb_width; diff --git a/tests/fate-run.sh b/tests/fate-run.sh index ba437dfbb8..2117ca387e 100755 --- a/tests/fate-run.sh +++ b/tests/fate-run.sh @@ -339,8 +339,8 @@ lavf_container_fate() outdir="tests/data/lavf-fate" file=${outdir}/lavf.$t input="${target_samples}/$1" -do_avconv $file -auto_conversion_filters $DEC_OPTS $2 -i "$input" "$ENC_OPTS -metadata title=lavftest" -vcodec copy -acodec copy -do_avconv_crc $file -auto_conversion_filters $DEC_OPTS -i $target_path/$file $3 +do_avconv $file -auto_conversion_filters $DEC_OPTS $2 -i "$input" "$ENC_OPTS -metadata title=lavftest" -vcodec copy -acodec copy %3 +do_avconv_crc $file -auto_conversion_filters $DEC_OPTS -i $target_path/$file $4 } lavf_image(){ diff --git a/tests/fate/ffmpeg.mak b/tests/fate/ffmpeg.mak index 4dfb77d250..57d16fba6f 100644 --- a/tests/fate/ffmpeg.mak +++ b/tests/fate/ffmpeg.mak @@ -110,7 +110,7 @@ fate-copy-trac4914-avi: CMD = transcode mpegts $(TARGET_SAMPLES)/mpeg2/xdcam8mp2 FATE_STREAMCOPY-$(call ALLYES, H264_DEMUXER AVI_MUXER) += fate-copy-trac2211-avi fate-copy-trac2211-avi: $(SAMPLES)/h264/bbc2.sample.h264 fate-copy-trac2211-avi: CMD = transcode "h264 -r 14" $(TARGET_SAMPLES)/h264/bbc2.sample.h264\ - avi "-c:a copy -c:v copy" + avi "-c:a copy -c:v copy -copyinkf" FATE_STREAMCOPY-$(call ENCDEC, APNG, APNG) += fate-copy-apng fate-copy-apng: fate-lavf-apng diff --git a/tests/fate/lavf-container.mak b/tests/fate/lavf-container.mak index 9e0eed4851..40250badc1 100644 --- a/tests/fate/lavf-container.mak +++ b/tests/fate/lavf-container.mak @@ -71,13 +71,13 @@ FATE_LAVF_CONTAINER_FATE = $(FATE_LAVF_CONTAINER_FATE-yes:%=fate-lavf-fate-%) $(FATE_LAVF_CONTAINER_FATE): REF = $(SRC_PATH)/tests/ref/lavf-fate/$(@:fate-lavf-fate-%=%) $(FATE_LAVF_CONTAINER_FATE): $(AREF) $(VREF) -fate-lavf-fate-av1.mp4: CMD = lavf_container_fate "av1-test-vectors/av1-1-b8-05-mv.ivf" "" "-c:v 
copy" -fate-lavf-fate-av1.mkv: CMD = lavf_container_fate "av1-test-vectors/av1-1-b8-05-mv.ivf" "" "-c:v copy" -fate-lavf-fate-h264.mp4: CMD = lavf_container_fate "h264/intra_refresh.h264" "" "-c:v copy" +fate-lavf-fate-av1.mp4: CMD = lavf_container_fate "av1-test-vectors/av1-1-b8-05-mv.ivf" "" "" "-c:v copy" +fate-lavf-fate-av1.mkv: CMD = lavf_container_fate "av1-test-vectors/av1-1-b8-05-mv.ivf" "" "" "-c:v copy" +fate-lavf-fate-h264.mp4: CMD = lavf_container_fate "h264/intra_refresh.h264" "" "-copyinkf" "-c:v copy -copyinkf" fate-lavf-fate-vp3.ogg: CMD = lavf_container_fate "vp3/coeff_level64.mkv" "-idct auto" -fate-lavf-fate-vp8.ogg: CMD = lavf_container_fate "vp8/RRSF49-short.webm" "" "-acodec copy" -fate-lavf-fate-latm: CMD = lavf_container_fate "aac/al04_44.mp4" "" "-acodec copy" -fate-lavf-fate-mp3: CMD = lavf_container_fate "mp3-conformance/he_32khz.bit" "" "-acodec copy" +fate-lavf-fate-vp8.ogg: CMD = lavf_container_fate "vp8/RRSF49-short.webm" "" ""
Re: [FFmpeg-devel] [PATCH 6/8] lavfi/vf_scale: convert to the frame-based sws API
On Mon, Jul 12, 2021 at 01:07:07PM +0200, Anton Khirnov wrote: > --- > libavfilter/vf_scale.c | 73 -- > 1 file changed, 49 insertions(+), 24 deletions(-) crashes: ./ffmpeg -i ~/tickets/5264/gbrap16.tif -vf format=yuva444p,scale=alphablend=checkerboard,format=yuv420p -y file.png Stream mapping: Stream #0:0 -> #0:0 (tiff (native) -> png (native)) Press [q] to stop, [?] for help ==19419== Invalid read of size 4 ==19419==at 0x1223964: av_frame_ref (frame.c:330) ==19419==by 0x1190B34: sws_frame_start (swscale.c:1069) ==19419==by 0x1190EA4: sws_scale_frame (swscale.c:1153) ==19419==by 0x3E7493: scale_frame (vf_scale.c:821) ==19419==by 0x3E752D: filter_frame (vf_scale.c:837) ==19419==by 0x29B314: ff_filter_frame_framed (avfilter.c:969) ==19419==by 0x29BBCF: ff_filter_frame_to_filter (avfilter.c:1117) ==19419==by 0x29BDDF: ff_filter_activate_default (avfilter.c:1166) ==19419==by 0x29C003: ff_filter_activate (avfilter.c:1324) ==19419==by 0x2A0EBB: ff_filter_graph_run_once (avfiltergraph.c:1400) ==19419==by 0x2A2139: push_frame (buffersrc.c:157) ==19419==by 0x2A26B6: av_buffersrc_add_frame_flags (buffersrc.c:225) ==19419==by 0x24FC90: ifilter_send_frame (ffmpeg.c:2241) ==19419==by 0x24FF72: send_frame_to_filters (ffmpeg.c:2315) ==19419==by 0x250D26: decode_video (ffmpeg.c:2512) ==19419==by 0x2517BA: process_input_packet (ffmpeg.c:2674) ==19419==by 0x25799F: process_input (ffmpeg.c:4403) ==19419==by 0x2599A4: transcode_step (ffmpeg.c:4758) ==19419==by 0x259B0C: transcode (ffmpeg.c:4812) ==19419==by 0x25A470: main (ffmpeg.c:5017) ==19419== Address 0x68 is not stack'd, malloc'd or (recently) free'd ==19419== ==19419== [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Complexity theory is the science of finding the exact solution to an approximation. Benchmarking OTOH is finding an approximation of the exact signature.asc Description: PGP signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 6/8] lavfi/vf_scale: convert to the frame-based sws API
On 7/12/2021 4:39 PM, Michael Niedermayer wrote: On Mon, Jul 12, 2021 at 01:07:07PM +0200, Anton Khirnov wrote: --- libavfilter/vf_scale.c | 73 -- 1 file changed, 49 insertions(+), 24 deletions(-) crashes: ./ffmpeg -i ~/tickets/5264/gbrap16.tif -vf format=yuva444p,scale=alphablend=checkerboard,format=yuv420p -y file.png Stream mapping: Stream #0:0 -> #0:0 (tiff (native) -> png (native)) Press [q] to stop, [?] for help ==19419== Invalid read of size 4 ==19419==at 0x1223964: av_frame_ref (frame.c:330) ==19419==by 0x1190B34: sws_frame_start (swscale.c:1069) ==19419==by 0x1190EA4: sws_scale_frame (swscale.c:1153) ==19419==by 0x3E7493: scale_frame (vf_scale.c:821) ==19419==by 0x3E752D: filter_frame (vf_scale.c:837) ==19419==by 0x29B314: ff_filter_frame_framed (avfilter.c:969) ==19419==by 0x29BBCF: ff_filter_frame_to_filter (avfilter.c:1117) ==19419==by 0x29BDDF: ff_filter_activate_default (avfilter.c:1166) ==19419==by 0x29C003: ff_filter_activate (avfilter.c:1324) ==19419==by 0x2A0EBB: ff_filter_graph_run_once (avfiltergraph.c:1400) ==19419==by 0x2A2139: push_frame (buffersrc.c:157) ==19419==by 0x2A26B6: av_buffersrc_add_frame_flags (buffersrc.c:225) ==19419==by 0x24FC90: ifilter_send_frame (ffmpeg.c:2241) ==19419==by 0x24FF72: send_frame_to_filters (ffmpeg.c:2315) ==19419==by 0x250D26: decode_video (ffmpeg.c:2512) ==19419==by 0x2517BA: process_input_packet (ffmpeg.c:2674) ==19419==by 0x25799F: process_input (ffmpeg.c:4403) ==19419==by 0x2599A4: transcode_step (ffmpeg.c:4758) ==19419==by 0x259B0C: transcode (ffmpeg.c:4812) ==19419==by 0x25A470: main (ffmpeg.c:5017) ==19419== Address 0x68 is not stack'd, malloc'd or (recently) free'd ==19419== ==19419== Both c->frame_src and c->frame_dst need to be allocated earlier in sws_init_context(). There seem to be some cases where that function will return early with a success code. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
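In code, that suggestion amounts to something along these lines near the top of sws_init_context(), before any of the early-return success paths (the placement is a guess based on the trace above, not a tested fix):

    c->frame_src = av_frame_alloc();
    c->frame_dst = av_frame_alloc();
    if (!c->frame_src || !c->frame_dst)
        return AVERROR(ENOMEM);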
[FFmpeg-devel] [PATCH] web/security: Add CVE-2021-30123 (never affected a release)
Thanks to Jan Ekström for details --- src/security | 1 + 1 file changed, 1 insertion(+) diff --git a/src/security b/src/security index 935823b..1248018 100644 --- a/src/security +++ b/src/security @@ -15,6 +15,7 @@ CVE-2020-21041, 5d9f44da460f781a1604d537d0555b78e29438ba, ticket/7989 CVE-2020-22038, 7c32e9cf93b712f8463573a59ed4e98fd10fa013, ticket/8285 CVE-2020-22042, 426c16d61a9b5056a157a1a2a057a4e4d13eef84, ticket/8267 CVE-2020-24020, 584f396132aa19d21bb1e38ad9a5d428869290cb, ticket/8718 +CVE-2021-30123, d6f293353c94c7ce200f6e0975ae3de49787f91f, ticket/8845, never affected a release CVE-2020-35965, 3e5959b3457f7f1856d997261e6ac672bba49e8b CVE-2020-35965, b0a8b40294ea212c1938348ff112ef1b9bf16bb3 -- 2.17.1 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] avcodec/h264_parser: remove key frame tagging heuristics
On Mon, 12 Jul 2021 at 20:33, James Almer wrote: > None of these packets contain keyframes, and tagging them as such can > result in > non spec compliant output when remuxing into containers like mp4 and > Matroska, > where bogus samples would be marked as Sync Samples. > > Some tests are updated to reflect this. > > Suggested-by: ffm...@fb.com > Signed-off-by: James Almer > --- > libavcodec/h264_parser.c | 8 > tests/fate-run.sh | 4 ++-- > tests/fate/ffmpeg.mak | 2 +- > tests/fate/lavf-container.mak | 12 ++-- > tests/fate/matroska.mak| 2 +- > tests/ref/fate/copy-trac2211-avi | 2 +- > tests/ref/fate/matroska-h264-remux | 4 ++-- > tests/ref/fate/segment-mp4-to-ts | 10 +- > tests/ref/lavf-fate/h264.mp4 | 4 ++-- > 9 files changed, 20 insertions(+), 28 deletions(-) > > diff --git a/libavcodec/h264_parser.c b/libavcodec/h264_parser.c > index d3c56cc188..e78c3679fb 100644 > --- a/libavcodec/h264_parser.c > +++ b/libavcodec/h264_parser.c > @@ -344,10 +344,6 @@ static inline int > parse_nal_units(AVCodecParserContext *s, > get_ue_golomb_long(&nal.gb); // skip first_mb_in_slice > slice_type = get_ue_golomb_31(&nal.gb); > s->pict_type = ff_h264_golomb_to_pict_type[slice_type % 5]; > -if (p->sei.recovery_point.recovery_frame_cnt >= 0) { > -/* key frame, since recovery_frame_cnt is set */ > -s->key_frame = 1; > -} > Why remove this, this is a reasonable check for a key frame? Kieran ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] avcodec/h264_parser: remove key frame tagging heuristics
On 7/12/2021 8:53 PM, Kieran Kunhya wrote: On Mon, 12 Jul 2021 at 20:33, James Almer wrote: None of these packets contain keyframes, and tagging them as such can result in non spec compliant output when remuxing into containers like mp4 and Matroska, where bogus samples would be marked as Sync Samples. Some tests are updated to reflect this. Suggested-by: ffm...@fb.com Signed-off-by: James Almer --- libavcodec/h264_parser.c | 8 tests/fate-run.sh | 4 ++-- tests/fate/ffmpeg.mak | 2 +- tests/fate/lavf-container.mak | 12 ++-- tests/fate/matroska.mak| 2 +- tests/ref/fate/copy-trac2211-avi | 2 +- tests/ref/fate/matroska-h264-remux | 4 ++-- tests/ref/fate/segment-mp4-to-ts | 10 +- tests/ref/lavf-fate/h264.mp4 | 4 ++-- 9 files changed, 20 insertions(+), 28 deletions(-) diff --git a/libavcodec/h264_parser.c b/libavcodec/h264_parser.c index d3c56cc188..e78c3679fb 100644 --- a/libavcodec/h264_parser.c +++ b/libavcodec/h264_parser.c @@ -344,10 +344,6 @@ static inline int parse_nal_units(AVCodecParserContext *s, get_ue_golomb_long(&nal.gb); // skip first_mb_in_slice slice_type = get_ue_golomb_31(&nal.gb); s->pict_type = ff_h264_golomb_to_pict_type[slice_type % 5]; -if (p->sei.recovery_point.recovery_frame_cnt >= 0) { -/* key frame, since recovery_frame_cnt is set */ -s->key_frame = 1; -} Why remove this, this is a reasonable check for a key frame? Because it isn't something that should be marked as a keyframe as coded bitstream in any kind of container, like it's the case of mp4 sync samples. Kieran ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] avcodec/h264_parser: remove key frame tagging heuristics
> > Because it isn't something that should be marked as a keyframe as coded > bitstream in any kind of container, like it's the case of mp4 sync samples. > MPEG-TS Random Access Indicator expects keyframes to be signalled like this. With intra-refresh and this code removed, there will be no random access points at all. Kieran ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] avcodec/h264_parser: remove key frame tagging heuristics
On 7/12/2021 10:01 PM, Kieran Kunhya wrote: Because it isn't something that should be marked as a keyframe as coded bitstream in any kind of container, like it's the case of mp4 sync samples. MPEG-TS Random Access Indicator expects keyframes to be signalled like this. With intra-refresh and this code removed, there will be no random access points at all. If MPEG-TS wants to tag packets containing things other than IDR access units as RAPs, then it should analyze the bitstream itself in order to tag them itself as such in the output. This parser as is is generating invalid output for other containers that are strict about key frames, and signal recovery points (like those indicated by the use of this SEI) by other means. Kieran ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] avcodec/h264_parser: remove key frame tagging heuristics
On Tue, 13 Jul 2021, 02:45 James Almer wrote:

> On 7/12/2021 10:01 PM, Kieran Kunhya wrote:
> >>
> >> Because it isn't something that should be marked as a keyframe as coded
> >> bitstream in any kind of container, like it's the case of mp4 sync samples.
> >>
> >
> > MPEG-TS Random Access Indicator expects keyframes to be signalled like this.
> > With intra-refresh and this code removed, there will be no random access
> > points at all.
>
> If MPEG-TS wants to tag packets containing things other than IDR access
> units as RAPs, then it should analyze the bitstream itself in order to
> tag them itself as such in the output.
> This parser as is is generating invalid output for other containers that
> are strict about key frames, and signal recovery points (like those
> indicated by the use of this SEI) by other means.

Why not just detect IDR in containers that only care about that (which is a mistake because of things like open GOP)? Doing that is relatively simple compared to adding bitstream parsing into MPEG-TS.

Kieran

___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
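For reference, the container-level IDR detection mentioned here is cheap. A minimal sketch for length-prefixed H.264 sample data (assuming AVCC-style 4-byte NAL length fields) could look like the following; note it deliberately ignores recovery points and open-GOP streams, which is precisely the behaviour being argued over:

    #include <stdint.h>

    /* Return 1 if any NAL unit in the sample is an IDR slice (nal_unit_type 5). */
    static int sample_has_idr(const uint8_t *data, int size)
    {
        while (size >= 5) {
            uint32_t nal_size = ((uint32_t)data[0] << 24) | (data[1] << 16) |
                                (data[2] <<  8) |  data[3];
            if (nal_size == 0 || nal_size > (uint32_t)(size - 4))
                break;
            if ((data[4] & 0x1f) == 5)
                return 1;
            data += 4 + nal_size;
            size -= 4 + nal_size;
        }
        return 0;
    }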