Re: [FFmpeg-devel] [PATCH v2] avcodec/dxva2: add support for HEVC RExt DXVA profiles

2024-11-14 Thread Hendrik Leppkes
On Thu, Nov 14, 2024 at 10:57 AM Steve Lhomme  wrote:
>
> Hi,
>
> For the record we have been running this in VLC for quite some time,
> only for Intel hardware.
> https://code.videolan.org/videolan/vlc/-/blob/3.0.x/contrib/src/ffmpeg/0001-avcodec-dxva2_hevc-add-support-for-parsing-HEVC-Rang.patch?ref_type=heads
> https://code.videolan.org/videolan/vlc/-/blob/3.0.x/contrib/src/ffmpeg/0002-avcodec-hevcdec-allow-HEVC-444-8-10-12-bits-decoding.patch?ref_type=heads
> https://code.videolan.org/videolan/vlc/-/blob/3.0.x/contrib/src/ffmpeg/0003-avcodec-hevcdec-allow-HEVC-422-10-12-bits-decoding-w.patch?ref_type=heads
>
> It seems the MS GUID are not the same as Intel...
>

The GUIDs and the struct itself are not identical to Intel's
proprietary format. Luckily, Intel supports the official MS version in
their latest drivers, so the proprietary version can just be buried
and forgotten.

- Hendrik
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH v3 1/3] checkasm/diracdsp: test add_dirac_obmc

2024-11-14 Thread Ronald S. Bultje
Hi,

On Thu, Nov 14, 2024 at 10:18 AM James Almer  wrote:

> On 11/14/2024 11:30 AM, Kyosuke Kawakami wrote:
> > Signed-off-by: Kyosuke Kawakami 
> > ---
> >   tests/checkasm/Makefile   |  1 +
> >   tests/checkasm/checkasm.c |  3 ++
> >   tests/checkasm/checkasm.h |  1 +
> >   tests/checkasm/diracdsp.c | 86 +++
> >   tests/fate/checkasm.mak   |  1 +
> >   5 files changed, 92 insertions(+)
> >   create mode 100644 tests/checkasm/diracdsp.c
>
> [...]
>
> > diff --git a/tests/checkasm/diracdsp.c b/tests/checkasm/diracdsp.c
> > new file mode 100644
> > index 00..8833c2d223
> > --- /dev/null
> > +++ b/tests/checkasm/diracdsp.c
> > @@ -0,0 +1,86 @@
> > +/*
> > + * Copyright (c) 2024 Kyosuke Kawakami
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along
> > + * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
> > + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
> > + */
> > +
> > +#include "checkasm.h"
> > +
> > +#include "libavcodec/diracdsp.h"
> > +
> > +#include "libavutil/intreadwrite.h"
> > +#include "libavutil/mem_internal.h"
> > +
> > +#define RANDOMIZE_DESTS(name, size) \
> > +do {\
> > +int i;  \
> > +for (i = 0; i < size; ++i) {\
> > +uint16_t r = rnd(); \
> > +AV_WN16A(name##0 + i, r);   \
> > +AV_WN16A(name##1 + i, r);   \
> > +}   \
> > +} while (0)
> > +
> > +#define RANDOMIZE_BUFFER8(name, size) \
> > +do {  \
> > +int i;\
> > +for (i = 0; i < size; ++i) {  \
> > +uint8_t r = rnd();\
> > +name[i] = r;  \
> > +} \
> > +} while (0)
> > +
> > +#define OBMC_STRIDE 32
> > +#define XBLEN_MAX 32
> > +#define YBLEN_MAX 64
> > +
> > +static void check_add_obmc(size_t func_index, int xblen)
> > +{
> > +LOCAL_ALIGNED_8(uint8_t, src, [XBLEN_MAX * YBLEN_MAX]);
> > +LOCAL_ALIGNED_16(uint16_t, dst0, [XBLEN_MAX * YBLEN_MAX]);
> > +LOCAL_ALIGNED_16(uint16_t, dst1, [XBLEN_MAX * YBLEN_MAX]);
>
> The loads in the asm functions use movdqu, so I assume the buffers in
> the decoder are not 16 byte aligned. To ensure future implementations
> don't mistakenly use aligned loads, you could make this be:
>
> LOCAL_ALIGNED_16(uint16_t, _dst0, [XBLEN_MAX * YBLEN_MAX + 4]);
> LOCAL_ALIGNED_16(uint16_t, _dst1, [XBLEN_MAX * YBLEN_MAX + 4]);
> uint16_t *dst0 = _dst0 + 4, *dst1 = _dst1 + 4;
>
> Using LOCAL_ALIGNED_8() could also end up with a 16 byte aligned buffer,
> so the above will make sure the buffer is 8 byte aligned.
>
> > +LOCAL_ALIGNED_8(uint8_t, obmc_weight, [XBLEN_MAX * YBLEN_MAX]);
> > +
> > +int yblen;
> > +DiracDSPContext h;
> > +
> > +ff_diracdsp_init(&h);
> > +
> > +if (check_func(h.add_dirac_obmc[func_index], "diracdsp.add_dirac_obmc_%d", xblen)) {
> > +declare_func(void, uint16_t*, const uint8_t*, int, const uint8_t *, int);
> > +
> > +yblen = 1 + (rnd() % YBLEN_MAX);
>
> Use YBLEN_MAX directly. No real gain in using randomized height, and
> this way every --bench run will give wildly different results.
>
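
To illustrate the alignment suggestion quoted above, here is a minimal standalone sketch. It is not checkasm code: C11 `_Alignas` stands in for `LOCAL_ALIGNED_16`, and `offset_from_16` and `misaligned_dst` are hypothetical helper names. It shows that shifting a 16-byte aligned `uint16_t` buffer by 4 elements gives a pointer that sits 8 bytes past a 16-byte boundary, so an aligned 16-byte load on it would fault.

```c
#include <assert.h>
#include <stdint.h>

#define BUF_ELEMS (32 * 64) /* XBLEN_MAX * YBLEN_MAX in the patch */

/* Byte offset of p from the previous 16-byte boundary. */
static unsigned offset_from_16(const void *p)
{
    return (unsigned)((uintptr_t)p & 15);
}

/* Hypothetical helper: returns a pointer into 16-byte aligned storage,
 * shifted by 4 uint16_t elements (4 * 2 = 8 bytes). */
static uint16_t *misaligned_dst(void)
{
    static _Alignas(16) uint16_t backing[BUF_ELEMS + 4];

    assert(offset_from_16(backing) == 0); /* storage itself is aligned */
    return backing + 4;                   /* 8-byte aligned, never 16 */
}
```

Calling `offset_from_16(misaligned_dst())` returns 8 regardless of where the backing storage is placed, which is the property the suggested `_dst0 + 4` offset relies on.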

The bench should use max_height, but the test should use a randomized
height, IMO.

Ronald
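
A rough sketch of the split described in this reply, assuming a hypothetical helper `pick_yblen` that is not part of checkasm: correctness runs get a random height for coverage, while benchmark runs always get the maximum so timings stay comparable between runs.

```c
#include <assert.h>
#include <stdlib.h>

#define YBLEN_MAX 64 /* same value as in the patch */

/* Hypothetical helper: choose the block height for one call of the
 * function under test. */
static int pick_yblen(int benchmarking)
{
    int h;

    if (benchmarking)
        h = YBLEN_MAX;               /* fixed work per call, stable timings */
    else
        h = 1 + rand() % YBLEN_MAX;  /* 1..YBLEN_MAX, varied coverage */

    assert(h >= 1 && h <= YBLEN_MAX);
    return h;
}
```

In checkasm terms this would presumably mean passing the random height to `call_ref`/`call_new` and `YBLEN_MAX` to `bench_new`; that mapping is an assumption, not something stated in the thread.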


Re: [FFmpeg-devel] [PATCH v6 12/12] avfilter/vf_scale: switch to new swscale API

2024-11-14 Thread compn
On Fri, 15 Nov 2024 00:49:06 +0100
Michael Niedermayer  wrote:

> On Fri, Nov 15, 2024 at 12:11:34AM +0100, Niklas Haas wrote:
> > On Fri, 15 Nov 2024 00:00:10 +0100 Michael Niedermayer
> >  wrote:  
> > > On Tue, Nov 12, 2024 at 10:50:46AM +0100, Niklas Haas wrote:  
> > > > From: Niklas Haas 
> > > > 
> > > > Most logic from this filter has been co-opted into swscale
> > > > itself, allowing the resulting filter to be substantially
> > > > simpler as it no longer has to worry about context
> > > > initialization, interlacing, etc.
> > > > 
> > > > Sponsored-by: Sovereign Tech Fund
> > > > Signed-off-by: Niklas Haas 
> > > > ---
> > > >  libavfilter/vf_scale.c | 354 +
> > > >  1 file changed, 72 insertions(+), 282 deletions(-)
> > > 
> > > > ./ffmpeg -i foreman_cif.y4m -vf scale=out_v_chr_pos=0:out_h_chr_pos=0 -f null -
> > > > 
> > > > [fc#-1 @ 0x55eec4ea3300] Error applying option 'out_v_chr_pos' to filter 'scale': Option not found
> > > > Error opening output file -.
> > > > Error opening output files: Option not found
> > 
> > I mean, this change is basically intentional. But I suppose I
> > should add backwards compatibility code to at least round it to the
> > nearest similar value.  
> 
> yes, it should do something better than just failing with a message
> that confuses the user
> 
> also [fc#-1 @ 0x55b30de5f000] Error applying option 'in_v_chr_pos' to
> filter 'scale': Option not found and others

do the docs on the scale filter have to be updated?

https://ffmpeg.org/ffmpeg-filters.html#Options-2

-compn


[FFmpeg-devel] [PATCH v3 07/10] ffv1enc: expose ff_ffv1_write_extradata

2024-11-14 Thread Lynne via ffmpeg-devel
---
 libavcodec/ffv1enc.c | 6 --
 libavcodec/ffv1enc.h | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/libavcodec/ffv1enc.c b/libavcodec/ffv1enc.c
index 0ef26db30a..8c0f649b8d 100644
--- a/libavcodec/ffv1enc.c
+++ b/libavcodec/ffv1enc.c
@@ -393,8 +393,10 @@ static void write_header(FFV1Context *f)
 }
 }
 
-static int write_extradata(FFV1Context *f)
+av_cold int ff_ffv1_write_extradata(AVCodecContext *avctx)
 {
+FFV1Context *f = avctx->priv_data;
+
 RangeCoder c;
 uint8_t state[CONTEXT_SIZE];
 int i, j, k;
@@ -741,7 +743,7 @@ av_cold int ff_ffv1_encode_init(AVCodecContext *avctx)
 if ((ret = encode_determine_slices(avctx)) < 0)
 return ret;
 
-if ((ret = write_extradata(s)) < 0)
+if ((ret = ff_ffv1_write_extradata(avctx)) < 0)
 return ret;
 }
 
diff --git a/libavcodec/ffv1enc.h b/libavcodec/ffv1enc.h
index c062af0bf5..6850243ac1 100644
--- a/libavcodec/ffv1enc.h
+++ b/libavcodec/ffv1enc.h
@@ -26,5 +26,6 @@
 #include "avcodec.h"
 
 av_cold int ff_ffv1_encode_init(AVCodecContext *avctx);
+av_cold int ff_ffv1_write_extradata(AVCodecContext *avctx);
 
 #endif /* AVCODEC_FFV1ENC_H */
-- 
2.45.2.753.g447d99e1c3b


[FFmpeg-devel] [PATCH v3 04/10] vulkan: fix printing descriptors to shader for shaders with no descriptors

2024-11-14 Thread Lynne via ffmpeg-devel
---
 libavutil/vulkan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavutil/vulkan.c b/libavutil/vulkan.c
index 159165a19d..2813bc1af9 100644
--- a/libavutil/vulkan.c
+++ b/libavutil/vulkan.c
@@ -2135,7 +2135,7 @@ print:
 /* Write shader info */
 for (int i = 0; i < nb; i++) {
 const struct descriptor_props *prop = &descriptor_props[desc[i].type];
-GLSLA("layout (set = %i, binding = %i", shd->nb_descriptor_sets - 1, i);
+GLSLA("layout (set = %i, binding = %i", FFMAX(shd->nb_descriptor_sets - 1, 0), i);
 
 if (desc[i].mem_layout)
 GLSLA(", %s", desc[i].mem_layout);
-- 
2.45.2.753.g447d99e1c3b
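
The effect of the clamp in this patch can be sketched in isolation. The helper name `layout_set_index` is hypothetical and only models the integer passed to the `%i` of the `set =` qualifier; the `FFMAX` definition mirrors the libavutil macro. When `nb_descriptor_sets` is 0, the unclamped expression evaluates to -1, and the clamp turns that into 0.

```c
#include <assert.h>

/* Same shape as libavutil's FFMAX macro. */
#define FFMAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper: the index printed as "set = N" in the GLSL
 * layout qualifier for a shader with nb_descriptor_sets sets. */
static int layout_set_index(int nb_descriptor_sets)
{
    assert(nb_descriptor_sets >= 0);
    return FFMAX(nb_descriptor_sets - 1, 0);
}
```

For any count of 1 or more the result is unchanged from the old expression, so only the zero case behaves differently.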


[FFmpeg-devel] [PATCH v3 01/10] hwcontext_vulkan: explicitly wait when uploading

2024-11-14 Thread Lynne via ffmpeg-devel
---
 libavutil/hwcontext_vulkan.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index 0b52ad5112..0c9047f4c6 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -4200,13 +4200,12 @@ static int vulkan_transfer_frame(AVHWFramesContext *hwfc,
 }
 
 err = ff_vk_exec_submit(&p->vkctx, exec);
-if (err < 0) {
+if (err < 0)
 ff_vk_exec_discard_deps(&p->vkctx, exec);
-} else if (!upload) {
-ff_vk_exec_wait(&p->vkctx, exec);
-if (!host_mapped)
-err = copy_buffer_data(hwfc, bufs[0], swf, region, planes, 0);
-}
+
+ff_vk_exec_wait(&p->vkctx, exec);
+if (!upload && !host_mapped)
+err = copy_buffer_data(hwfc, bufs[0], swf, region, planes, 0);
 
 end:
 for (int i = 0; i < nb_bufs; i++)
-- 
2.45.2.753.g447d99e1c3b


[FFmpeg-devel] [PATCH v3 10/10] ffv1enc: add a Vulkan encoder

2024-11-14 Thread Lynne via ffmpeg-devel
This commit implements a standard, compliant, version 3 and version 4
FFv1 encoder, entirely in Vulkan. The encoder is written in standard
GLSL and requires a Vulkan 1.3 supporting GPU with the BDA extension.

The encoder can use any number of slices, but nominally should use
32x32 slices (1024 in total) to maximize parallelism.

All features are supported, as well as all pixel formats.
This includes:
 - Rice
 - Range coding with a custom quantization table
 - PCM encoding

CRC calculation is also massively parallelized on the GPU.

Encoding of unaligned dimensions on subsampled data requires
version 4, or requires oversizing the image to 64-pixel alignment
and cropping out the padding via container flags.
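
As a small worked example of the 64-pixel oversizing mentioned above (a sketch only; `ALIGN_UP`, `struct padded_dim` and `pad_to_64` are hypothetical names, not taken from the encoder):

```c
#include <assert.h>

/* Round x up to the next multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

struct padded_dim {
    int coded; /* dimension actually handed to the encoder */
    int crop;  /* padding to crop away at the container level */
};

/* Hypothetical helper: pad one image dimension to 64-pixel alignment. */
static struct padded_dim pad_to_64(int size)
{
    struct padded_dim d;

    assert(size > 0);
    d.coded = ALIGN_UP(size, 64);
    d.crop  = d.coded - size;
    return d;
}
```

A 1920x1080 frame would be coded as 1920x1088 under this scheme, with 8 rows of padding to be cropped by the container.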
---
 configure  |1 +
 libavcodec/Makefile|1 +
 libavcodec/allcodecs.c |1 +
 libavcodec/ffv1enc.c   |2 +-
 libavcodec/ffv1enc_vulkan.c| 1598 
 libavcodec/vulkan/Makefile |8 +
 libavcodec/vulkan/common.comp  |  182 +++
 libavcodec/vulkan/ffv1_common.comp |   75 ++
 libavcodec/vulkan/ffv1_enc.comp|   67 +
 libavcodec/vulkan/ffv1_enc_ac.comp |   83 ++
 libavcodec/vulkan/ffv1_enc_common.comp |  101 ++
 libavcodec/vulkan/ffv1_enc_rct.comp|   85 ++
 libavcodec/vulkan/ffv1_enc_rgb.comp|   83 ++
 libavcodec/vulkan/ffv1_enc_setup.comp  |  153 +++
 libavcodec/vulkan/ffv1_enc_vlc.comp|  112 ++
 libavcodec/vulkan/ffv1_reset.comp  |   55 +
 libavcodec/vulkan/ffv1_vlc.comp|  122 ++
 libavcodec/vulkan/rangecoder.comp  |  190 +++
 18 files changed, 2918 insertions(+), 1 deletion(-)
 create mode 100644 libavcodec/ffv1enc_vulkan.c
 create mode 100644 libavcodec/vulkan/common.comp
 create mode 100644 libavcodec/vulkan/ffv1_common.comp
 create mode 100644 libavcodec/vulkan/ffv1_enc.comp
 create mode 100644 libavcodec/vulkan/ffv1_enc_ac.comp
 create mode 100644 libavcodec/vulkan/ffv1_enc_common.comp
 create mode 100644 libavcodec/vulkan/ffv1_enc_rct.comp
 create mode 100644 libavcodec/vulkan/ffv1_enc_rgb.comp
 create mode 100644 libavcodec/vulkan/ffv1_enc_setup.comp
 create mode 100644 libavcodec/vulkan/ffv1_enc_vlc.comp
 create mode 100644 libavcodec/vulkan/ffv1_reset.comp
 create mode 100644 libavcodec/vulkan/ffv1_vlc.comp
 create mode 100644 libavcodec/vulkan/rangecoder.comp

diff --git a/configure b/configure
index 0e9ed6dc3c..d0954fc7d9 100755
--- a/configure
+++ b/configure
@@ -2951,6 +2951,7 @@ exr_decoder_deps="zlib"
 exr_encoder_deps="zlib"
 ffv1_decoder_select="rangecoder"
 ffv1_encoder_select="rangecoder"
+ffv1_vulkan_encoder_select="vulkan spirv_compiler"
 ffvhuff_decoder_select="huffyuv_decoder"
 ffvhuff_encoder_select="huffyuv_encoder"
 fic_decoder_select="golomb"
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 676ff542af..a6e0e0b55e 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -370,6 +370,7 @@ OBJS-$(CONFIG_EXR_ENCODER) += exrenc.o float2half.o
 OBJS-$(CONFIG_FASTAUDIO_DECODER)   += fastaudio.o
 OBJS-$(CONFIG_FFV1_DECODER)+= ffv1dec.o ffv1.o
 OBJS-$(CONFIG_FFV1_ENCODER)+= ffv1enc.o ffv1.o
+OBJS-$(CONFIG_FFV1_VULKAN_ENCODER) += ffv1enc.o ffv1.o ffv1enc_vulkan.o
 OBJS-$(CONFIG_FFWAVESYNTH_DECODER) += ffwavesynth.o
 OBJS-$(CONFIG_FIC_DECODER) += fic.o
 OBJS-$(CONFIG_FITS_DECODER)+= fitsdec.o fits.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index d8a5866435..0b559dfc58 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -116,6 +116,7 @@ extern const FFCodec ff_escape130_decoder;
 extern const FFCodec ff_exr_encoder;
 extern const FFCodec ff_exr_decoder;
 extern const FFCodec ff_ffv1_encoder;
+extern const FFCodec ff_ffv1_vulkan_encoder;
 extern const FFCodec ff_ffv1_decoder;
 extern const FFCodec ff_ffvhuff_encoder;
 extern const FFCodec ff_ffvhuff_decoder;
diff --git a/libavcodec/ffv1enc.c b/libavcodec/ffv1enc.c
index 032c69a060..d785189fa9 100644
--- a/libavcodec/ffv1enc.c
+++ b/libavcodec/ffv1enc.c
@@ -885,7 +885,7 @@ av_cold int ff_ffv1_encode_setup_plane_info(AVCodecContext *avctx,
 }
 av_assert0(s->bits_per_raw_sample >= 8);
 
-return av_pix_fmt_get_chroma_sub_sample (avctx->pix_fmt, &s->chroma_h_shift, &s->chroma_v_shift);
+return av_pix_fmt_get_chroma_sub_sample (pix_fmt, &s->chroma_h_shift, &s->chroma_v_shift);
 }
 
 static int encode_init_internal(AVCodecContext *avctx)
diff --git a/libavcodec/ffv1enc_vulkan.c b/libavcodec/ffv1enc_vulkan.c
new file mode 100644
index 00..e367b7dd6f
--- /dev/null
+++ b/libavcodec/ffv1enc_vulkan.c
@@ -0,0 +1,1598 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+

Re: [FFmpeg-devel] [PATCH] doc/infra: Document gitolite

2024-11-14 Thread Michael Niedermayer
On Wed, Nov 13, 2024 at 07:31:00PM +0100, Michael Niedermayer wrote:
> Signed-off-by: Michael Niedermayer 
> ---
>  doc/infra.txt | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/doc/infra.txt b/doc/infra.txt
> index 4ef6ccf736d..dfb13eda7b1 100644
> --- a/doc/infra.txt
> +++ b/doc/infra.txt
> @@ -73,6 +73,9 @@ Github mirrors are redundantly synced by multiple people
>  
>  You need a new git repository related to FFmpeg ? contact root at ffmpeg.org
>  
> +git repositories are managed by gitolite, every change to permissions is
> +logged, including when, what and by whom
> +

will apply


[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Dictatorship: All citizens are under surveillance, all their steps and
actions recorded, for the politicians to enforce control.
Democracy: All politicians are under surveillance, all their steps and
actions recorded, for the citizens to enforce control.




[FFmpeg-devel] [PATCH] mpegts.c: pat_cb(): ensure all PIDs are valid

2024-11-14 Thread Scott Theisen
originally from:
https://github.com/MythTV/mythtv/commit/a1d4d112c3f962a85ddd6248592421171fc8331c
referencing:
https://code.mythtv.org/trac/ticket/1887

ISO/IEC 13818-1:2021 specifies a valid range of [0x0010, 0x1FFE] in
§ 2.4.4.6 Semantic definition of fields in program association section
and Table 2-3 – PID table
---
 libavformat/mpegts.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
index 78ab7f7efe..6d5dc3050b 100644
--- a/libavformat/mpegts.c
+++ b/libavformat/mpegts.c
@@ -2580,6 +2580,12 @@ static void pat_cb(MpegTSFilter *filter, const uint8_t *section, int section_len
 break;
 
 av_log(ts->stream, AV_LOG_TRACE, "sid=0x%x pid=0x%x\n", sid, pmt_pid);
+if (pmt_pid <= 0x000F || pmt_pid >= 0x1FFF)
+{
+av_log(ts->stream, AV_LOG_ERROR, "Invalid PAT ignored "
+   "MPEG Program Number=0x%x pid=0x%x\n", sid, pmt_pid);
+return;
+}
 
 if (sid == 0x0000) {
 /* NIT info */
-- 
2.43.0
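
The range check in this patch can be sketched as a standalone predicate. The names `pmt_pid_is_valid`, `pat_entry_pid`, `PMT_PID_MIN` and `PMT_PID_MAX` are hypothetical; the bounds are the [0x0010, 0x1FFE] range quoted from ISO/IEC 13818-1 in the commit message, and the 13-bit mask reflects the PID field width in a PAT entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PMT_PID_MIN 0x0010
#define PMT_PID_MAX 0x1FFE

/* True if pid lies in the range the commit message cites as valid. */
static bool pmt_pid_is_valid(unsigned pid)
{
    return pid >= PMT_PID_MIN && pid <= PMT_PID_MAX;
}

/* Extract the 13-bit PID from the two bytes that follow
 * program_number in a PAT entry (3 reserved bits, then the PID). */
static unsigned pat_entry_pid(const uint8_t *p)
{
    assert(p != NULL);
    return (((unsigned)p[0] << 8) | p[1]) & 0x1FFF;
}
```

The patch's condition `pmt_pid <= 0x000F || pmt_pid >= 0x1FFF` is the negation of `pmt_pid_is_valid` for any 13-bit value.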



[FFmpeg-devel] [PATCH 1/2] libavformat/mpegts.h: add DVB descriptor_tag values already in use

2024-11-14 Thread Scott Theisen
---
 libavformat/mpegts.h | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/libavformat/mpegts.h b/libavformat/mpegts.h
index 14ae312c50..729c8b07b9 100644
--- a/libavformat/mpegts.h
+++ b/libavformat/mpegts.h
@@ -165,6 +165,17 @@
 #define METADATA_DESCRIPTOR  0x26
 #define METADATA_STD_DESCRIPTOR  0x27
 
+/* DVB descriptor tag values [0x40, 0x7F] from
+   ETSI EN 300 468 Table 12: Possible locations of descriptors */
+#define SERVICE_DESCRIPTOR   0x48
+#define STREAM_IDENTIFIER_DESCRIPTOR 0x52
+#define TELETEXT_DESCRIPTOR  0x56
+#define SUBTITLING_DESCRIPTOR0x59
+#define AC3_DESCRIPTOR   0x6A // AC-3_descriptor
+#define ENHANCED_AC3_DESCRIPTOR  0x7A // enhanced_AC-3_descriptor
+#define DTS_DESCRIPTOR   0x7B
+#define EXTENSION_DESCRIPTOR 0x7F
+
 typedef struct MpegTSContext MpegTSContext;
 
 MpegTSContext *avpriv_mpegts_parse_open(AVFormatContext *s);
-- 
2.43.0
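
For context on how such tag constants are typically used, here is a minimal sketch of scanning a descriptor loop, where each descriptor is a tag byte, a length byte, and then that many payload bytes. `find_descriptor` is a hypothetical helper and is not how mpegts.c is structured; only the three tag values are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STREAM_IDENTIFIER_DESCRIPTOR 0x52
#define AC3_DESCRIPTOR               0x6A
#define DTS_DESCRIPTOR               0x7B

/* Hypothetical helper: return a pointer to the first descriptor with
 * the given tag (pointing at its tag byte), or NULL if none is found
 * or the loop is truncated before a match. */
static const uint8_t *find_descriptor(const uint8_t *buf, size_t len,
                                      uint8_t tag)
{
    size_t pos = 0;

    assert(buf != NULL || len == 0);
    while (pos + 2 <= len) {
        uint8_t cur_tag = buf[pos];
        size_t  cur_len = buf[pos + 1];

        if (pos + 2 + cur_len > len)
            break;                 /* payload runs past the buffer */
        if (cur_tag == tag)
            return buf + pos;
        pos += 2 + cur_len;        /* skip header and payload */
    }
    return NULL;
}
```

Named constants make call sites such as `find_descriptor(buf, len, AC3_DESCRIPTOR)` self-describing, where a bare `0x6A` would need a comment.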



[FFmpeg-devel] libavformat/mpegts: demux DVB VBI data

2024-11-14 Thread Scott Theisen


These changes are originally from MythTV.


[FFmpeg-devel] [PATCH v3 02/10] hwcontext_vulkan: fix planar RGB images

2024-11-14 Thread Lynne via ffmpeg-devel
They were non-working for quite a while.
---
 libavutil/hwcontext_vulkan.c | 25 +
 libavutil/vulkan.c   | 27 ---
 libavutil/vulkan.h   |  5 +
 3 files changed, 38 insertions(+), 19 deletions(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index 0c9047f4c6..c4704a3402 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -326,10 +326,10 @@ static const struct FFVkFormatEntry {
 { VK_FORMAT_R32G32B32A32_UINT, AV_PIX_FMT_RGBA128, VK_IMAGE_ASPECT_COLOR_BIT, 1, 1, 1, { VK_FORMAT_R32G32B32A32_UINT } },
 
 /* Planar RGB */
-{ VK_FORMAT_R8_UNORM,   AV_PIX_FMT_GBRAP,    VK_IMAGE_ASPECT_COLOR_BIT, 1, 4, 4, { VK_FORMAT_R8_UNORM,   VK_FORMAT_R8_UNORM,   VK_FORMAT_R8_UNORM,   VK_FORMAT_R8_UNORM   } },
-{ VK_FORMAT_R16_UNORM,  AV_PIX_FMT_GBRAP16,  VK_IMAGE_ASPECT_COLOR_BIT, 1, 4, 4, { VK_FORMAT_R16_UNORM,  VK_FORMAT_R16_UNORM,  VK_FORMAT_R16_UNORM,  VK_FORMAT_R16_UNORM  } },
-{ VK_FORMAT_R32_SFLOAT, AV_PIX_FMT_GBRPF32,  VK_IMAGE_ASPECT_COLOR_BIT, 1, 3, 3, { VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT } },
-{ VK_FORMAT_R32_SFLOAT, AV_PIX_FMT_GBRAPF32, VK_IMAGE_ASPECT_COLOR_BIT, 1, 4, 4, { VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT } },
+{ VK_FORMAT_R8_UNORM,   AV_PIX_FMT_GBRAP,    VK_IMAGE_ASPECT_COLOR_BIT, 4, 4, 4, { VK_FORMAT_R8_UNORM,   VK_FORMAT_R8_UNORM,   VK_FORMAT_R8_UNORM,   VK_FORMAT_R8_UNORM   } },
+{ VK_FORMAT_R16_UNORM,  AV_PIX_FMT_GBRAP16,  VK_IMAGE_ASPECT_COLOR_BIT, 4, 4, 4, { VK_FORMAT_R16_UNORM,  VK_FORMAT_R16_UNORM,  VK_FORMAT_R16_UNORM,  VK_FORMAT_R16_UNORM  } },
+{ VK_FORMAT_R32_SFLOAT, AV_PIX_FMT_GBRPF32,  VK_IMAGE_ASPECT_COLOR_BIT, 3, 3, 3, { VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT } },
+{ VK_FORMAT_R32_SFLOAT, AV_PIX_FMT_GBRAPF32, VK_IMAGE_ASPECT_COLOR_BIT, 4, 4, 4, { VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT } },
 
 /* Two-plane 420 YUV at 8, 10, 12 and 16 bits */
 { VK_FORMAT_G8_B8R8_2PLANE_420_UNORM,  AV_PIX_FMT_NV12, ASPECT_2PLANE, 2, 1, 2, { VK_FORMAT_R8_UNORM,  VK_FORMAT_R8G8_UNORM   } },
@@ -482,8 +482,14 @@ static int vkfmt_from_pixfmt2(AVHWDeviceContext *dev_ctx, enum AVPixelFormat p,
 if (basics_primary &&
 !(disable_multiplane && vk_formats_list[i].vk_planes > 1) &&
 (!need_storage || (need_storage && (storage_primary | storage_secondary)))) {
-if (fmts)
-fmts[0] = vk_formats_list[i].vkf;
+if (fmts) {
+if (vk_formats_list[i].nb_images > 1) {
+for (int j = 0; j < vk_formats_list[i].nb_images_fallback; j++)
+fmts[j] = vk_formats_list[i].fallback[j];
+} else {
+fmts[0] = vk_formats_list[i].vkf;
+}
+}
 if (nb_images)
 *nb_images = 1;
 if (aspect)
@@ -4096,10 +4102,6 @@ static int vulkan_transfer_frame(AVHWFramesContext *hwfc,
 const int planes = av_pix_fmt_count_planes(swf->format);
 const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(swf->format);
 const int nb_images = ff_vk_count_images(hwf_vk);
-static const VkImageAspectFlags plane_aspect[] = { VK_IMAGE_ASPECT_COLOR_BIT,
-   VK_IMAGE_ASPECT_PLANE_0_BIT,
-   VK_IMAGE_ASPECT_PLANE_1_BIT,
-   VK_IMAGE_ASPECT_PLANE_2_BIT, };
 
 VkImageMemoryBarrier2 img_bar[AV_NUM_DATA_POINTERS];
 int nb_img_bar = 0;
@@ -4182,8 +4184,7 @@ static int vulkan_transfer_frame(AVHWFramesContext *hwfc,
 
 uint32_t orig_stride = region[i].bufferRowLength;
 region[i].bufferRowLength /= desc->comp[i].step;
-region[i].imageSubresource.aspectMask = plane_aspect[(planes != nb_images) +
- i*(planes != nb_images)];
+region[i].imageSubresource.aspectMask = ff_vk_aspect_flag(hwf, i);
 
 if (upload)
 vk->CmdCopyBufferToImage(cmd_buf, vkbuf->buf,
diff --git a/libavutil/vulkan.c b/libavutil/vulkan.c
index 2c71312d78..918287a933 100644
--- a/libavutil/vulkan.c
+++ b/libavutil/vulkan.c
@@ -1273,6 +1273,23 @@ int ff_vk_init_sampler(FFVulkanContext *s, VkSampler *sampler,
 return 0;
 }
 
+VkImageAspectFlags ff_vk_aspect_flag(AVFrame *f, int p)
+{
+AVVkFrame *vkf = (AVVkFrame *)f->data[0];
+AVHWFramesContext *hwfc = (AVHWFramesContext *)f->hw_frames_ctx->data;
+int nb_images = ff_vk_count_images(vkf);
+int nb_planes = av_pix_fmt_count_planes(hwfc->sw_format);
+
+static const Vk

[FFmpeg-devel] [PATCH 2/2] libavformat/mpegts: demux DVB VBI data

2024-11-14 Thread Scott Theisen
From: ulmus-scott 

DVB VBI data is defined in ETSI EN 301 775 and can include EBU teletext data
as defined in ETSI EN 300 472.

ETSI EN 300 468 defines teletext_descriptor, VBI_data_descriptor, and
VBI_teletext_descriptor, which has the same definition as, but different use
from, teletext_descriptor.
---
 libavcodec/codec_desc.c | 6 ++
 libavcodec/codec_id.h   | 1 +
 libavformat/mpegts.c| 3 +++
 libavformat/mpegts.h| 2 ++
 4 files changed, 12 insertions(+)

diff --git a/libavcodec/codec_desc.c b/libavcodec/codec_desc.c
index aeac75a6c5..63dbc3f155 100644
--- a/libavcodec/codec_desc.c
+++ b/libavcodec/codec_desc.c
@@ -3720,6 +3720,12 @@ static const AVCodecDescriptor codec_descriptors[] = {
 .name  = "lcevc",
 .long_name = NULL_IF_CONFIG_SMALL("LCEVC (Low Complexity Enhancement Video Coding) / MPEG-5 LCEVC / MPEG-5 part 2"),
 },
+{
+.id= AV_CODEC_ID_DVB_VBI,
+.type  = AVMEDIA_TYPE_DATA,
+.name  = "dvb_vbi",
+.long_name = NULL_IF_CONFIG_SMALL("DVB VBI data"),
+},
 {
 .id= AV_CODEC_ID_MPEG2TS,
 .type  = AVMEDIA_TYPE_DATA,
diff --git a/libavcodec/codec_id.h b/libavcodec/codec_id.h
index 6bfaa02601..e7e984379c 100644
--- a/libavcodec/codec_id.h
+++ b/libavcodec/codec_id.h
@@ -596,6 +596,7 @@ enum AVCodecID {
 AV_CODEC_ID_BIN_DATA,
 AV_CODEC_ID_SMPTE_2038,
 AV_CODEC_ID_LCEVC,
+AV_CODEC_ID_DVB_VBI,
 
 
 AV_CODEC_ID_PROBE = 0x19000, ///< codec_id is not known (like AV_CODEC_ID_NONE) but lavf should attempt to identify it
diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
index 78ab7f7efe..07b5ba996d 100644
--- a/libavformat/mpegts.c
+++ b/libavformat/mpegts.c
@@ -890,6 +890,8 @@ static const StreamType DESC_types[] = {
 { 0x6a, AVMEDIA_TYPE_AUDIO,    AV_CODEC_ID_AC3  }, /* AC-3 descriptor */
 { 0x7a, AVMEDIA_TYPE_AUDIO,    AV_CODEC_ID_EAC3 }, /* E-AC-3 descriptor */
 { 0x7b, AVMEDIA_TYPE_AUDIO,    AV_CODEC_ID_DTS  },
+{ 0x45, AVMEDIA_TYPE_DATA,     AV_CODEC_ID_DVB_VBI  }, /* VBI_DATA_DESCRIPTOR */
+{ 0x46, AVMEDIA_TYPE_DATA,     AV_CODEC_ID_DVB_VBI  }, /* VBI_TELETEXT_DESCRIPTOR */
 { 0x56, AVMEDIA_TYPE_SUBTITLE, AV_CODEC_ID_DVB_TELETEXT },
 { 0x59, AVMEDIA_TYPE_SUBTITLE, AV_CODEC_ID_DVB_SUBTITLE }, /* subtitling descriptor */
 { 0 },
@@ -1887,6 +1889,7 @@ int ff_parse_mpeg2_descriptor(AVFormatContext *fc, AVStream *st, int stream_type
 }
 }
 break;
+case VBI_TELETEXT_DESCRIPTOR:
 case 0x56: /* DVB teletext descriptor */
 {
 uint8_t *extradata = NULL;
diff --git a/libavformat/mpegts.h b/libavformat/mpegts.h
index 729c8b07b9..d1fc4210ae 100644
--- a/libavformat/mpegts.h
+++ b/libavformat/mpegts.h
@@ -167,6 +167,8 @@
 
 /* DVB descriptor tag values [0x40, 0x7F] from
ETSI EN 300 468 Table 12: Possible locations of descriptors */
+#define VBI_DATA_DESCRIPTOR  0x45
+#define VBI_TELETEXT_DESCRIPTOR  0x46
 #define SERVICE_DESCRIPTOR   0x48
 #define STREAM_IDENTIFIER_DESCRIPTOR 0x52
 #define TELETEXT_DESCRIPTOR  0x56
-- 
2.43.0
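
The table-driven mapping that this patch extends can be sketched as follows. All `sketch_`/`SK_` names are hypothetical stand-ins for the real `StreamType`, `DESC_types` and `AV_CODEC_ID_*` identifiers; only the tag values and their pairing come from the diff. Both 0x45 and 0x46 resolve to the same DVB VBI codec, and a zero tag terminates the table.

```c
#include <assert.h>

/* Hypothetical stand-ins for the AV_CODEC_ID_* values in the patch. */
enum sketch_codec {
    SK_NONE, SK_AC3, SK_EAC3, SK_DTS,
    SK_DVB_VBI, SK_DVB_TELETEXT, SK_DVB_SUBTITLE
};

struct sketch_desc_type {
    unsigned          tag;
    enum sketch_codec codec;
};

/* Same tag-to-codec pairs as DESC_types after the patch. */
static const struct sketch_desc_type sketch_desc_types[] = {
    { 0x6a, SK_AC3          },
    { 0x7a, SK_EAC3         },
    { 0x7b, SK_DTS          },
    { 0x45, SK_DVB_VBI      }, /* VBI_DATA_DESCRIPTOR */
    { 0x46, SK_DVB_VBI      }, /* VBI_TELETEXT_DESCRIPTOR */
    { 0x56, SK_DVB_TELETEXT },
    { 0x59, SK_DVB_SUBTITLE },
    { 0,    SK_NONE         }, /* terminator */
};

/* Linear lookup; returns SK_NONE for tags not in the table. */
static enum sketch_codec codec_for_tag(unsigned tag)
{
    assert(tag != 0); /* 0 is reserved as the table terminator */
    for (const struct sketch_desc_type *t = sketch_desc_types; t->tag; t++)
        if (t->tag == tag)
            return t->codec;
    return SK_NONE;
}
```

A tag such as 0x48 (the service descriptor) is absent from the table, so the lookup reports no codec for it.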



[FFmpeg-devel] [PATCH v3 05/10] .gitignore: add exclusions for shader .c files

2024-11-14 Thread Lynne via ffmpeg-devel
---
 .gitignore | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.gitignore b/.gitignore
index e810d11107..9cfc78b414 100644
--- a/.gitignore
+++ b/.gitignore
@@ -41,3 +41,5 @@
 /src
 /mapfile
 /tools/python/__pycache__/
+/libavcodec/vulkan/*.c
+/libavfilter/vulkan/*.c
-- 
2.45.2.753.g447d99e1c3b


[FFmpeg-devel] [PATCH v3 08/10] ffv1enc: move plane info init into a separate function

2024-11-14 Thread Lynne via ffmpeg-devel
---
 libavcodec/ffv1enc.c | 39 ++-
 libavcodec/ffv1enc.h |  2 ++
 2 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/libavcodec/ffv1enc.c b/libavcodec/ffv1enc.c
index 8c0f649b8d..f8fbcb7486 100644
--- a/libavcodec/ffv1enc.c
+++ b/libavcodec/ffv1enc.c
@@ -768,22 +768,14 @@ av_cold int ff_ffv1_encode_init(AVCodecContext *avctx)
 return 0;
 }
 
-static int encode_init_internal(AVCodecContext *avctx)
+av_cold int ff_ffv1_encode_setup_plane_info(AVCodecContext *avctx,
+enum AVPixelFormat pix_fmt)
 {
-int ret;
 FFV1Context *s = avctx->priv_data;
-const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(avctx->pix_fmt);
-
-if ((ret = ff_ffv1_common_init(avctx)) < 0)
-return ret;
-
-if (s->ac == 1) // Compatbility with common command line usage
-s->ac = AC_RANGE_CUSTOM_TAB;
-else if (s->ac == AC_RANGE_DEFAULT_TAB_FORCE)
-s->ac = AC_RANGE_DEFAULT_TAB;
+const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
 
 s->plane_count = 3;
-switch(avctx->pix_fmt) {
+switch(pix_fmt) {
 case AV_PIX_FMT_GRAY9:
 case AV_PIX_FMT_YUV444P9:
 case AV_PIX_FMT_YUV422P9:
@@ -911,11 +903,32 @@ static int encode_init_internal(AVCodecContext *avctx)
 s->version = FFMAX(s->version, 1);
 break;
 default:
-av_log(avctx, AV_LOG_ERROR, "format not supported\n");
+av_log(avctx, AV_LOG_ERROR, "format %s not supported\n",
+   av_get_pix_fmt_name(pix_fmt));
 return AVERROR(ENOSYS);
 }
 av_assert0(s->bits_per_raw_sample >= 8);
 
+return av_pix_fmt_get_chroma_sub_sample (avctx->pix_fmt, &s->chroma_h_shift, &s->chroma_v_shift);
+}
+
+static int encode_init_internal(AVCodecContext *avctx)
+{
+int ret;
+FFV1Context *s = avctx->priv_data;
+
+if ((ret = ff_ffv1_common_init(avctx)) < 0)
+return ret;
+
+if (s->ac == 1) // Compatbility with common command line usage
+s->ac = AC_RANGE_CUSTOM_TAB;
+else if (s->ac == AC_RANGE_DEFAULT_TAB_FORCE)
+s->ac = AC_RANGE_DEFAULT_TAB;
+
+ret = ff_ffv1_encode_setup_plane_info(avctx, avctx->pix_fmt);
+if (ret < 0)
+return ret;
+
 if (s->bits_per_raw_sample > 8) {
 if (s->ac == AC_GOLOMB_RICE) {
 av_log(avctx, AV_LOG_INFO,
diff --git a/libavcodec/ffv1enc.h b/libavcodec/ffv1enc.h
index 6850243ac1..82c7f9da19 100644
--- a/libavcodec/ffv1enc.h
+++ b/libavcodec/ffv1enc.h
@@ -27,5 +27,7 @@
 
 av_cold int ff_ffv1_encode_init(AVCodecContext *avctx);
 av_cold int ff_ffv1_write_extradata(AVCodecContext *avctx);
+av_cold int ff_ffv1_encode_setup_plane_info(AVCodecContext *avctx,
+enum AVPixelFormat pix_fmt);
 
 #endif /* AVCODEC_FFV1ENC_H */
-- 
2.45.2.753.g447d99e1c3b


[FFmpeg-devel] [PATCH v3 03/10] vulkan: add support for 10-bit planar RGB

2024-11-14 Thread Lynne via ffmpeg-devel
---
 libavutil/hwcontext_vulkan.c | 1 +
 libavutil/vulkan.c   | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index c4704a3402..0a176a7058 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -327,6 +327,7 @@ static const struct FFVkFormatEntry {
 
 /* Planar RGB */
 { VK_FORMAT_R8_UNORM,   AV_PIX_FMT_GBRAP,    VK_IMAGE_ASPECT_COLOR_BIT, 4, 4, 4, { VK_FORMAT_R8_UNORM,   VK_FORMAT_R8_UNORM,   VK_FORMAT_R8_UNORM,   VK_FORMAT_R8_UNORM   } },
+{ VK_FORMAT_R16_UNORM,  AV_PIX_FMT_GBRP10,   VK_IMAGE_ASPECT_COLOR_BIT, 3, 3, 3, { VK_FORMAT_R16_UNORM,  VK_FORMAT_R16_UNORM,  VK_FORMAT_R16_UNORM } },
 { VK_FORMAT_R16_UNORM,  AV_PIX_FMT_GBRAP16,  VK_IMAGE_ASPECT_COLOR_BIT, 4, 4, 4, { VK_FORMAT_R16_UNORM,  VK_FORMAT_R16_UNORM,  VK_FORMAT_R16_UNORM,  VK_FORMAT_R16_UNORM  } },
 { VK_FORMAT_R32_SFLOAT, AV_PIX_FMT_GBRPF32,  VK_IMAGE_ASPECT_COLOR_BIT, 3, 3, 3, { VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT } },
 { VK_FORMAT_R32_SFLOAT, AV_PIX_FMT_GBRAPF32, VK_IMAGE_ASPECT_COLOR_BIT, 4, 4, 4, { VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT, VK_FORMAT_R32_SFLOAT } },
diff --git a/libavutil/vulkan.c b/libavutil/vulkan.c
index 918287a933..159165a19d 100644
--- a/libavutil/vulkan.c
+++ b/libavutil/vulkan.c
@@ -1298,6 +1298,7 @@ int ff_vk_mt_is_np_rgb(enum AVPixelFormat pix_fmt)
 pix_fmt == AV_PIX_FMT_RGBA64 || pix_fmt == AV_PIX_FMT_RGB565 ||
 pix_fmt == AV_PIX_FMT_BGR565 || pix_fmt == AV_PIX_FMT_BGR0   ||
 pix_fmt == AV_PIX_FMT_0BGR   || pix_fmt == AV_PIX_FMT_RGB0   ||
+pix_fmt == AV_PIX_FMT_GBRP10  ||
 pix_fmt == AV_PIX_FMT_GBRAP   || pix_fmt == AV_PIX_FMT_GBRAP16 ||
 pix_fmt == AV_PIX_FMT_GBRPF32 || pix_fmt == AV_PIX_FMT_GBRAPF32 ||
 pix_fmt == AV_PIX_FMT_X2RGB10 || pix_fmt == AV_PIX_FMT_X2BGR10 ||
@@ -1391,6 +1392,7 @@ const char *ff_vk_shader_rep_fmt(enum AVPixelFormat pix_fmt,
 };
 case AV_PIX_FMT_GRAY16:
 case AV_PIX_FMT_GBRAP16:
+case AV_PIX_FMT_GBRP10:
 case AV_PIX_FMT_YUV420P10:
 case AV_PIX_FMT_YUV420P12:
 case AV_PIX_FMT_YUV420P16:
-- 
2.45.2.753.g447d99e1c3b


[FFmpeg-devel] [PATCH v3 06/10] ffv1enc: split off encoder initialization into a separate function

2024-11-14 Thread Lynne via ffmpeg-devel
---
 libavcodec/ffv1enc.c | 401 ++-
 libavcodec/ffv1enc.h |  30 
 2 files changed, 240 insertions(+), 191 deletions(-)
 create mode 100644 libavcodec/ffv1enc.h

diff --git a/libavcodec/ffv1enc.c b/libavcodec/ffv1enc.c
index 7a6c718b41..0ef26db30a 100644
--- a/libavcodec/ffv1enc.c
+++ b/libavcodec/ffv1enc.c
@@ -39,6 +39,7 @@
 #include "put_golomb.h"
 #include "rangecoder.h"
 #include "ffv1.h"
+#include "ffv1enc.h"
 
 static const int8_t quant5_10bit[256] = {
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  1,  1,  1,  1,  1,
@@ -513,16 +514,42 @@ static int sort_stt(FFV1Context *s, uint8_t stt[256])
 return print;
 }
 
-static av_cold int encode_init(AVCodecContext *avctx)
+
+static int encode_determine_slices(AVCodecContext *avctx)
 {
 FFV1Context *s = avctx->priv_data;
-const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(avctx->pix_fmt);
-int i, j, k, m, ret;
-
-if ((ret = ff_ffv1_common_init(avctx)) < 0)
-return ret;
+int plane_count = 1 + 2*s->chroma_planes + s->transparency;
+int max_h_slices = AV_CEIL_RSHIFT(avctx->width , s->chroma_h_shift);
+int max_v_slices = AV_CEIL_RSHIFT(avctx->height, s->chroma_v_shift);
+s->num_v_slices = (avctx->width > 352 || avctx->height > 288 || !avctx->slices) ? 2 : 1;
+s->num_v_slices = FFMIN(s->num_v_slices, max_v_slices);
+for (; s->num_v_slices < 32; s->num_v_slices++) {
+for (s->num_h_slices = s->num_v_slices; s->num_h_slices < 2*s->num_v_slices; s->num_h_slices++) {
+int maxw = (avctx->width  + s->num_h_slices - 1) / s->num_h_slices;
+int maxh = (avctx->height + s->num_v_slices - 1) / s->num_v_slices;
+if (s->num_h_slices > max_h_slices || s->num_v_slices > max_v_slices)
+continue;
+if (maxw * maxh * (int64_t)(s->bits_per_raw_sample+1) * plane_count > 8<<24)
+continue;
+if (s->version < 4)
+if (  ff_need_new_slices(avctx->width , s->num_h_slices, s->chroma_h_shift)
+||ff_need_new_slices(avctx->height, s->num_v_slices, s->chroma_v_shift))
+continue;
+if (avctx->slices == s->num_h_slices * s->num_v_slices && avctx->slices <= MAX_SLICES || !avctx->slices)
+return 0;
+}
+}
+av_log(avctx, AV_LOG_ERROR,
+   "Unsupported number %d of slices requested, please specify a "
+   "supported number with -slices (ex:4,6,9,12,16, ...)\n",
+   avctx->slices);
+return AVERROR(ENOSYS);
+}
 
-s->version = 0;
+av_cold int ff_ffv1_encode_init(AVCodecContext *avctx)
+{
+FFV1Context *s = avctx->priv_data;
+int i, j, k, m, ret;
 
 if ((avctx->flags & (AV_CODEC_FLAG_PASS1 | AV_CODEC_FLAG_PASS2)) ||
 avctx->slices > 1)
@@ -569,153 +596,6 @@ static av_cold int encode_init(AVCodecContext *avctx)
 return AVERROR_INVALIDDATA;
 }
 
-if (s->ac == 1) // Compatbility with common command line usage
-s->ac = AC_RANGE_CUSTOM_TAB;
-else if (s->ac == AC_RANGE_DEFAULT_TAB_FORCE)
-s->ac = AC_RANGE_DEFAULT_TAB;
-
-s->plane_count = 3;
-switch(avctx->pix_fmt) {
-case AV_PIX_FMT_GRAY9:
-case AV_PIX_FMT_YUV444P9:
-case AV_PIX_FMT_YUV422P9:
-case AV_PIX_FMT_YUV420P9:
-case AV_PIX_FMT_YUVA444P9:
-case AV_PIX_FMT_YUVA422P9:
-case AV_PIX_FMT_YUVA420P9:
-if (!avctx->bits_per_raw_sample)
-s->bits_per_raw_sample = 9;
-case AV_PIX_FMT_GRAY10:
-case AV_PIX_FMT_YUV444P10:
-case AV_PIX_FMT_YUV440P10:
-case AV_PIX_FMT_YUV420P10:
-case AV_PIX_FMT_YUV422P10:
-case AV_PIX_FMT_YUVA444P10:
-case AV_PIX_FMT_YUVA422P10:
-case AV_PIX_FMT_YUVA420P10:
-if (!avctx->bits_per_raw_sample && !s->bits_per_raw_sample)
-s->bits_per_raw_sample = 10;
-case AV_PIX_FMT_GRAY12:
-case AV_PIX_FMT_YUV444P12:
-case AV_PIX_FMT_YUV440P12:
-case AV_PIX_FMT_YUV420P12:
-case AV_PIX_FMT_YUV422P12:
-case AV_PIX_FMT_YUVA444P12:
-case AV_PIX_FMT_YUVA422P12:
-if (!avctx->bits_per_raw_sample && !s->bits_per_raw_sample)
-s->bits_per_raw_sample = 12;
-case AV_PIX_FMT_GRAY14:
-case AV_PIX_FMT_YUV444P14:
-case AV_PIX_FMT_YUV420P14:
-case AV_PIX_FMT_YUV422P14:
-if (!avctx->bits_per_raw_sample && !s->bits_per_raw_sample)
-s->bits_per_raw_sample = 14;
-s->packed_at_lsb = 1;
-case AV_PIX_FMT_GRAY16:
-case AV_PIX_FMT_YUV444P16:
-case AV_PIX_FMT_YUV422P16:
-case AV_PIX_FMT_YUV420P16:
-case AV_PIX_FMT_YUVA444P16:
-case AV_PIX_FMT_YUVA422P16:
-case AV_PIX_FMT_YUVA420P16:
-if (!avctx->bits_per_raw_sample && !s->bits_per_raw_sample) {
-s->bits_per_raw_sample = 16;
-} else if (!s->bits_per_raw_sample) {
-s->bits_per_raw_sample = avctx->bits_per_raw_sample;
-}
-if (s->bits_per_raw_sample <= 8) {
-av_

[FFmpeg-devel] [PATCH v3 09/10] ffv1enc: move slice allocation out of generic encode init

2024-11-14 Thread Lynne via ffmpeg-devel
---
 libavcodec/ffv1enc.c | 48 ++--
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/libavcodec/ffv1enc.c b/libavcodec/ffv1enc.c
index f8fbcb7486..032c69a060 100644
--- a/libavcodec/ffv1enc.c
+++ b/libavcodec/ffv1enc.c
@@ -739,32 +739,8 @@ av_cold int ff_ffv1_encode_init(AVCodecContext *avctx)
 /* Disable slices when the version doesn't support them */
 s->num_h_slices = 1;
 s->num_v_slices = 1;
-} else {
-if ((ret = encode_determine_slices(avctx)) < 0)
-return ret;
-
-if ((ret = ff_ffv1_write_extradata(avctx)) < 0)
-return ret;
-}
-
-if ((ret = ff_ffv1_init_slice_contexts(s)) < 0)
-return ret;
-s->slice_count = s->max_slice_count;
-
-for (int j = 0; j < s->slice_count; j++) {
-for (int i = 0; i < s->plane_count; i++) {
-PlaneContext *const p = &s->slices[j].plane[i];
-
-p->quant_table_index = s->context_model;
-p->context_count = s->context_count[p->quant_table_index];
-}
-
-ff_build_rac_states(&s->slices[j].c, 0.05 * (1LL << 32), 256 - 8);
 }
 
-if ((ret = ff_ffv1_init_slices_state(s)) < 0)
-return ret;
-
 return 0;
 }
 
@@ -943,6 +919,30 @@ static int encode_init_internal(AVCodecContext *avctx)
 if (ret < 0)
 return ret;
 
+if ((ret = encode_determine_slices(avctx)) < 0)
+return ret;
+
+if ((ret = ff_ffv1_write_extradata(avctx)) < 0)
+return ret;
+
+if ((ret = ff_ffv1_init_slice_contexts(s)) < 0)
+return ret;
+s->slice_count = s->max_slice_count;
+
+for (int j = 0; j < s->slice_count; j++) {
+for (int i = 0; i < s->plane_count; i++) {
+PlaneContext *const p = &s->slices[j].plane[i];
+
+p->quant_table_index = s->context_model;
+p->context_count = s->context_count[p->quant_table_index];
+}
+
+ff_build_rac_states(&s->slices[j].c, 0.05 * (1LL << 32), 256 - 8);
+}
+
+if ((ret = ff_ffv1_init_slices_state(s)) < 0)
+return ret;
+
 #define STATS_OUT_SIZE 1024 * 1024 * 6
 if (avctx->flags & AV_CODEC_FLAG_PASS1) {
 avctx->stats_out = av_mallocz(STATS_OUT_SIZE);
-- 
2.45.2.753.g447d99e1c3b


[FFmpeg-devel] [PATCH v4 0/5] avcodec/x86/diracdsp: migrate last remaining MMX function to SSE2

2024-11-14 Thread Kyosuke Kawakami
This patch series migrates the last remaining MMX function in
diracdsp to SSE2.

Changes from v3 are:
- Use correct register load/use counts
- Fix garbage value issue on Windows
- Use constant yblen on checkasm benchmark
- Test that functions accept unaligned buffers

Thanks to James and Ronald for feedback.



[FFmpeg-devel] [PATCH v4 1/5] checkasm/diracdsp: test add_dirac_obmc

2024-11-14 Thread Kyosuke Kawakami
Signed-off-by: Kyosuke Kawakami 
---
 tests/checkasm/Makefile   |  1 +
 tests/checkasm/checkasm.c |  3 ++
 tests/checkasm/checkasm.h |  1 +
 tests/checkasm/diracdsp.c | 91 +++
 tests/fate/checkasm.mak   |  1 +
 5 files changed, 97 insertions(+)
 create mode 100644 tests/checkasm/diracdsp.c

diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
index ae324ced3f..c7268d836e 100644
--- a/tests/checkasm/Makefile
+++ b/tests/checkasm/Makefile
@@ -29,6 +29,7 @@ AVCODECOBJS-$(CONFIG_AAC_DECODER)   += aacpsdsp.o \
 AVCODECOBJS-$(CONFIG_AAC_ENCODER)   += aacencdsp.o
 AVCODECOBJS-$(CONFIG_ALAC_DECODER)  += alacdsp.o
 AVCODECOBJS-$(CONFIG_DCA_DECODER)   += synth_filter.o
+AVCODECOBJS-$(CONFIG_DIRAC_DECODER) += diracdsp.o
 AVCODECOBJS-$(CONFIG_EXR_DECODER)   += exrdsp.o
 AVCODECOBJS-$(CONFIG_FLAC_DECODER)  += flacdsp.o
 AVCODECOBJS-$(CONFIG_HUFFYUV_DECODER)   += huffyuvdsp.o
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index c9d2b5faf1..fb307af0ae 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -138,6 +138,9 @@ static const struct {
 #if CONFIG_DCA_DECODER
 { "synth_filter", checkasm_check_synth_filter },
 #endif
+#if CONFIG_DIRAC_DECODER
+{ "diracdsp", checkasm_check_diracdsp },
+#endif
 #if CONFIG_EXR_DECODER
 { "exrdsp", checkasm_check_exrdsp },
 #endif
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index 866eef01e9..0ba5c3040d 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -84,6 +84,7 @@ void checkasm_check_blend(void);
 void checkasm_check_blockdsp(void);
 void checkasm_check_bswapdsp(void);
 void checkasm_check_colorspace(void);
+void checkasm_check_diracdsp(void);
 void checkasm_check_exrdsp(void);
 void checkasm_check_fdctdsp(void);
 void checkasm_check_fixed_dsp(void);
diff --git a/tests/checkasm/diracdsp.c b/tests/checkasm/diracdsp.c
new file mode 100644
index 00..e7dbbe184b
--- /dev/null
+++ b/tests/checkasm/diracdsp.c
@@ -0,0 +1,91 @@
+/*
+ * Copyright (c) 2024 Kyosuke Kawakami
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include "checkasm.h"
+
+#include "libavcodec/diracdsp.h"
+
+#include "libavutil/intreadwrite.h"
+#include "libavutil/mem_internal.h"
+
+#define RANDOMIZE_DESTS(name, size) \
+do {\
+int i;  \
+for (i = 0; i < size; ++i) {\
+uint16_t r = rnd(); \
+AV_WN16A(name##0 + i, r);   \
+AV_WN16A(name##1 + i, r);   \
+}   \
+} while (0)
+
+#define RANDOMIZE_BUFFER8(name, size) \
+do {  \
+int i;\
+for (i = 0; i < size; ++i) {  \
+uint8_t r = rnd();\
+name[i] = r;  \
+} \
+} while (0)
+
+#define OBMC_STRIDE 32
+#define XBLEN_MAX 32
+#define YBLEN_MAX 64
+
+static void check_add_obmc(size_t func_index, int xblen)
+{
+LOCAL_ALIGNED_8(uint8_t, src, [XBLEN_MAX * YBLEN_MAX]);
+LOCAL_ALIGNED_16(uint16_t, _dst0, [XBLEN_MAX * YBLEN_MAX + 4]);
+LOCAL_ALIGNED_16(uint16_t, _dst1, [XBLEN_MAX * YBLEN_MAX + 4]);
+LOCAL_ALIGNED_8(uint8_t, obmc_weight, [XBLEN_MAX * YBLEN_MAX]);
+
+// Ensure that they accept unaligned buffer.
+// Not using LOCAL_ALIGNED_8 because it might make 16 byte aligned buffer.
+uint16_t *dst0 = _dst0 + 4;
+uint16_t *dst1 = _dst1 + 4;
+
+int yblen;
+DiracDSPContext h;
+
+ff_diracdsp_init(&h);
+
+if (check_func(h.add_dirac_obmc[func_index], "diracdsp.add_dirac_obmc_%d", xblen)) {
+declare_func(void, uint16_t*, const uint8_t*, int, const uint8_t *, int);
+
+RANDOMIZE_BUFFER8(src, YBLEN_MAX * xblen);
+RANDOMIZE_DESTS(dst, YBLEN_MAX * xblen);
+RANDOMIZE_BUFFER8(obmc_weight, YBLEN_MAX * OBMC_STRIDE);
+
+yblen = 1 + (rnd() % YBLEN_MAX);
+call_ref(dst0, src, xblen, obmc_weight, yblen);
+call_new(dst1, src, xblen, obmc_weight, yblen);
+

[FFmpeg-devel] [PATCH v4 4/5] avcodec/x86/diracdsp: migrate last remaining MMX function to SSE2

2024-11-14 Thread Kyosuke Kawakami
The add_dirac_obmc8_mmx function was the only MMX function left. This
patch migrates it to SSE2.

Here are the checkasm benchmark results:

diracdsp.add_dirac_obmc_8_c:    2299.1 ( 1.00x)
diracdsp.add_dirac_obmc_8_mmx:   237.6 ( 9.68x)
diracdsp.add_dirac_obmc_8_sse2:  109.1 (21.07x)

Signed-off-by: Kyosuke Kawakami 
---
 libavcodec/x86/diracdsp.asm| 24 
 libavcodec/x86/diracdsp_init.c | 10 +++---
 2 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
index a653fa04de..6ae7f888b3 100644
--- a/libavcodec/x86/diracdsp.asm
+++ b/libavcodec/x86/diracdsp.asm
@@ -228,7 +228,7 @@ cglobal add_dirac_obmc%1_%2, 5,5,5, dst, src, stride, obmc, yblen
 punpckhbw   m1, m4
 mova        m2, [obmcq+i]
 mova        m3, m2
-   punpcklbw   m2, m4
+punpcklbw   m2, m4
 punpckhbw   m3, m4
 pmullw  m0, m2
 pmullw  m1, m3
@@ -248,9 +248,6 @@ cglobal add_dirac_obmc%1_%2, 5,5,5, dst, src, stride, obmc, yblen
 RET
 %endm
 
-INIT_MMX
-ADD_OBMC 8, mmx
-
 INIT_XMM
 PUT_RECT sse2
 ADD_RECT sse2
@@ -259,6 +256,25 @@ HPEL_FILTER sse2
 ADD_OBMC 32, sse2
 ADD_OBMC 16, sse2
 
+cglobal add_dirac_obmc8_sse2, 5,5,4, dst, src, stride, obmc, yblen
+pxor        m3, m3
+movsxdifnidn strideq, strided
+.loop:
+movh        m0, [srcq]
+punpcklbw   m0, m3
+movh        m1, [obmcq]
+punpcklbw   m1, m3
+pmullw      m0, m1
+movu        m1, [dstq]
+paddw       m0, m1
+movu        [dstq], m0
+lea         srcq, [srcq+strideq]
+lea         dstq, [dstq+2*strideq]
+add         obmcq, 32
+sub         yblend, 1
+jg          .loop
+RET
+
 INIT_XMM sse4
 
 ; void dequant_subband_32(uint8_t *src, uint8_t *dst, ptrdiff_t stride, const int qf, const int qs, int tot_v, int tot_h)
diff --git a/libavcodec/x86/diracdsp_init.c b/libavcodec/x86/diracdsp_init.c
index f678759dc0..08247133e1 100644
--- a/libavcodec/x86/diracdsp_init.c
+++ b/libavcodec/x86/diracdsp_init.c
@@ -24,8 +24,7 @@
 
 void ff_add_rect_clamped_sse2(uint8_t *, const uint16_t *, int, const int16_t *, int, int, int);
 
-void ff_add_dirac_obmc8_mmx(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
-
+void ff_add_dirac_obmc8_sse2(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
 void ff_add_dirac_obmc16_sse2(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
 void ff_add_dirac_obmc32_sse2(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
 
@@ -94,15 +93,12 @@ void ff_diracdsp_init_x86(DiracDSPContext* c)
 #if HAVE_X86ASM
 int mm_flags = av_get_cpu_flags();
 
-if (EXTERNAL_MMX(mm_flags)) {
-c->add_dirac_obmc[0] = ff_add_dirac_obmc8_mmx;
-}
-
 if (EXTERNAL_SSE2(mm_flags)) {
 c->dirac_hpel_filter = dirac_hpel_filter_sse2;
 c->add_rect_clamped = ff_add_rect_clamped_sse2;
 c->put_signed_rect_clamped[0] = (void *)ff_put_signed_rect_clamped_sse2;
 
+c->add_dirac_obmc[0] = ff_add_dirac_obmc8_sse2;
 c->add_dirac_obmc[1] = ff_add_dirac_obmc16_sse2;
 c->add_dirac_obmc[2] = ff_add_dirac_obmc32_sse2;
 
@@ -116,5 +112,5 @@ void ff_diracdsp_init_x86(DiracDSPContext* c)
 c->dequant_subband[1] = ff_dequant_subband_32_sse4;
 c->put_signed_rect_clamped[1] = ff_put_signed_rect_clamped_10_sse4;
 }
-#endif
+#endif // HAVE_X86ASM
 }
-- 
2.47.0
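
For readers who do not read x86inc assembly, the row loop of the new 8-pixel function can be modelled in scalar C. This is only a sketch derived from the asm above, not code from the patch: the name add_obmc8_ref is hypothetical, and the fixed weight stride of 32 is taken from the "add obmcq, 32" instruction. pmullw keeps the low 16 bits of each product and paddw wraps, which the uint16_t cast reproduces.

```c
#include <stdint.h>

#define OBMC_STRIDE 32  /* the asm advances obmcq by 32 bytes per row */

/* Hypothetical scalar model of the 8-wide loop: dst[x] += src[x] * obmc[x].
 * Like the asm, it assumes yblen >= 1. */
static void add_obmc8_ref(uint16_t *dst, const uint8_t *src, int stride,
                          const uint8_t *obmc_weight, int yblen)
{
    for (int y = 0; y < yblen; y++) {
        for (int x = 0; x < 8; x++)
            dst[x] = (uint16_t)(dst[x] + src[x] * obmc_weight[x]);
        src         += stride;       /* bytes */
        dst         += stride;       /* uint16_t elements, i.e. 2*stride bytes */
        obmc_weight += OBMC_STRIDE;
    }
}
```

The SSE2 version handles all eight columns of a row with one movh load per input, which is likely why it only needs four vector registers.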



[FFmpeg-devel] [PATCH v4 2/5] avcodec/x86/diracdsp: fix wrong register load/use count

2024-11-14 Thread Kyosuke Kawakami
---
 libavcodec/x86/diracdsp.asm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
index e5e2b11846..03b929da76 100644
--- a/libavcodec/x86/diracdsp.asm
+++ b/libavcodec/x86/diracdsp.asm
@@ -216,7 +216,7 @@ cglobal add_rect_clamped_%1, 7,9,3, dst, src, stride, idwt, idwt_stride, w, h
 
 %macro ADD_OBMC 2
 ; void add_obmc(uint16_t *dst, uint8_t *src, int stride, uint8_t *obmc_weight, int yblen)
-cglobal add_dirac_obmc%1_%2, 6,6,5, dst, src, stride, obmc, yblen
+cglobal add_dirac_obmc%1_%2, 5,5,5, dst, src, stride, obmc, yblen
 pxor        m4, m4
 .loop:
 %assign i 0
-- 
2.47.0



[FFmpeg-devel] [PATCH v4 3/5] avcodec/x86/diracdsp: cast stride argument

2024-11-14 Thread Kyosuke Kawakami
Signed-off-by: Kyosuke Kawakami 
---
 libavcodec/x86/diracdsp.asm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
index 03b929da76..a653fa04de 100644
--- a/libavcodec/x86/diracdsp.asm
+++ b/libavcodec/x86/diracdsp.asm
@@ -218,6 +218,7 @@ cglobal add_rect_clamped_%1, 7,9,3, dst, src, stride, idwt, idwt_stride, w, h
 ; void add_obmc(uint16_t *dst, uint8_t *src, int stride, uint8_t *obmc_weight, int yblen)
 cglobal add_dirac_obmc%1_%2, 5,5,5, dst, src, stride, obmc, yblen
 pxor        m4, m4
+movsxdifnidn strideq, strided
 .loop:
 %assign i 0
 %rep %1 / mmsize
-- 
2.47.0



[FFmpeg-devel] [PATCH v4 5/5] avcodec/x86/diracdsp_init: remove unused macro

2024-11-14 Thread Kyosuke Kawakami
The PIXFUNC macro has been unused since d29a9c2aa68fc3eb6d61ff95c698e29316037583.

Signed-off-by: Kyosuke Kawakami 
---
 libavcodec/x86/diracdsp_init.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/libavcodec/x86/diracdsp_init.c b/libavcodec/x86/diracdsp_init.c
index 08247133e1..ef01ebdf2e 100644
--- a/libavcodec/x86/diracdsp_init.c
+++ b/libavcodec/x86/diracdsp_init.c
@@ -56,11 +56,6 @@ void ff_dequant_subband_32_sse4(uint8_t *src, uint8_t *dst, ptrdiff_t stride, co
 }  \
 }
 
-#define PIXFUNC(PFX, IDX, EXT) \
-/*MMXDISABLEDc->PFX ## _dirac_pixels_tab[0][IDX] = PFX ## _dirac_pixels8_ ## EXT;*/  \
-c->PFX ## _dirac_pixels_tab[1][IDX] = PFX ## _dirac_pixels16_ ## EXT; \
-c->PFX ## _dirac_pixels_tab[2][IDX] = PFX ## _dirac_pixels32_ ## EXT
-
 #define DIRAC_PIXOP(OPNAME, EXT)\
 static void OPNAME ## _dirac_pixels16_ ## EXT(uint8_t *dst, const uint8_t *src[5], \
                                               int stride, int h) \
-- 
2.47.0



Re: [FFmpeg-devel] [PATCH 3/3] lavc/pthread_frame: rework the logic for updating thread contexts

2024-11-14 Thread Michael Niedermayer
On Wed, Nov 13, 2024 at 02:06:58PM +0100, Anton Khirnov wrote:
> Propagating decoder state between per-thread contexts with frame
> threading currently works as follows:
> 0)  Every frame thread has its own "child" decoder context,
> 1)  Frame thread T0 decodes the frame header and updates its context
> accordingly. At most one frame thread can be in this stage at any
> given time.
> 2)  Frame thread T0 calls ff_thread_finish_setup() to indicate that
> header decoding is done.
> 3a) Frame thread T0 proceeds with decoding frame data.
> 3b) The main thread calls the decoder's update_thread_context()
> callback, transferring T0's state to the next thread T1.
> 
> Since 3a) and 3b) run concurrently, during 3a) T0 must not write to any
> context variables accessed by update_thread_context(), otherwise a data
> race occurs. This approach turns out to be highly fragile in practice,
> as developers are either not aware of this constraint, or fail to keep
> it in mind while modifying decoders.
> 
> This commit aims to eliminate the possibility of such races by changing
> the logic in the following way:
> * child decoders are no longer permanently bound to worker threads, but
>   are instead assigned dynamically only while decoding a single frame; a
>   different decoder may be assigned to the same thread for a later frame
> * with N frame threads, N+1 child decoder contexts are allocated
>   (instead of N, as before), so at any time at least one decoder is
>   idle (unassigned)
> * when a frame thread calls ff_thread_finish_setup(), its context state
>   is immediately and synchronously transferred to the idle context
> * the idle context is then assigned to the next frame thread, whose
>   previous decoder now becomes idle
> 
> With this approach, improperly updating a decoder context after
> ff_thread_finish_setup() transforms from a race into a deterministic
> failure to propagate the relevant variables to following frame threads,
> which
> * should be much easier to debug
> * is no longer UB
> ---
>  libavcodec/pthread_frame.c | 142 +
>  1 file changed, 81 insertions(+), 61 deletions(-)
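
The hand-off described in the quoted commit message can be sketched as a small single-threaded model. This is an illustration only, not the code in pthread_frame.c: ToyDecoder, ToyPool, pool_init and finish_setup are hypothetical names, real worker threads and locking are omitted, and the model assumes the next thread has already finished with its previous decoder when the hand-off happens.

```c
#include <string.h>

#define NB_THREADS 2                  /* N frame threads (illustrative value) */
#define NB_CTX     (NB_THREADS + 1)   /* N+1 child decoder contexts */

typedef struct ToyDecoder { int header_state; } ToyDecoder;

typedef struct ToyPool {
    ToyDecoder ctx[NB_CTX];
    int assigned[NB_THREADS];  /* context index currently bound to each thread */
    int idle;                  /* index of the one unassigned context */
} ToyPool;

static void pool_init(ToyPool *p)
{
    memset(p, 0, sizeof(*p));
    for (int t = 0; t < NB_THREADS; t++)
        p->assigned[t] = t;
    p->idle = NB_THREADS;
}

/* Models the moment thread `cur` signals that header decoding is done:
 * its state is copied synchronously into the idle context, the idle
 * context is given to the next thread, and that thread's previous
 * context becomes the new idle one. */
static void finish_setup(ToyPool *p, int cur)
{
    int next         = (cur + 1) % NB_THREADS;
    int prev_of_next = p->assigned[next];

    p->ctx[p->idle]   = p->ctx[p->assigned[cur]];
    p->assigned[next] = p->idle;
    p->idle           = prev_of_next;
}
```

In this model, a write to the current thread's context after finish_setup() is simply not seen by the next thread, which mirrors the "deterministic failure to propagate" the commit message describes in place of a data race.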

Segfaults:

[h264 @ 0xa213ac0] ==3430921== Thread 2 av:h264:df0:
==3430921== Invalid read of size 4
==3430921==    at 0xC64D63: ff_thread_report_progress (pthread_frame.c:666)
==3430921==    by 0x1100CD3: h264_field_start (h264_slice.c:1472)
==3430921==    by 0x11038FE: ff_h264_queue_decode_slice (h264_slice.c:2153)
==3430921==    by 0xA6AA24: decode_nal_units (h264dec.c:657)
==3430921==    by 0xA6C03F: h264_decode_frame (h264dec.c:1070)
==3430921==    by 0x99FC82: decode_simple_internal (decode.c:443)
==3430921==    by 0x9A01E1: decode_simple_receive_frame (decode.c:613)
==3430921==    by 0x9A038B: ff_decode_receive_frame_internal (decode.c:649)
==3430921==    by 0xC63BD4: frame_worker_thread (pthread_frame.c:317)
==3430921==    by 0x692C608: start_thread (pthread_create.c:477)
==3430921==    by 0x6A66352: clone (clone.S:95)
==3430921==  Address 0x134 is not stack'd, malloc'd or (recently) free'd
==3430921==
==3430921==
==3430921== Process terminating with default action of signal 11 (SIGSEGV)
==3430921==  Access not within mapped region at address 0x134
==3430921==    at 0xC64D63: ff_thread_report_progress (pthread_frame.c:666)
==3430921==    by 0x1100CD3: h264_field_start (h264_slice.c:1472)
==3430921==    by 0x11038FE: ff_h264_queue_decode_slice (h264_slice.c:2153)
==3430921==    by 0xA6AA24: decode_nal_units (h264dec.c:657)
==3430921==    by 0xA6C03F: h264_decode_frame (h264dec.c:1070)
==3430921==    by 0x99FC82: decode_simple_internal (decode.c:443)
==3430921==    by 0x9A01E1: decode_simple_receive_frame (decode.c:613)
==3430921==    by 0x9A038B: ff_decode_receive_frame_internal (decode.c:649)
==3430921==    by 0xC63BD4: frame_worker_thread (pthread_frame.c:317)
==3430921==    by 0x692C608: start_thread (pthread_create.c:477)
==3430921==    by 0x6A66352: clone (clone.S:95)
==3430921==  If you believe this happened as a result of a stack
==3430921==  overflow in your program's main thread (unlikely but
==3430921==  possible), you can try to increase the size of the
==3430921==  main thread stack using the --main-stacksize= flag.
==3430921==  The main thread stack size used in this run was 8388608.

Testcase:
-threads 2 -i ~/tickets/2927/h264_dead.avi  -threads 1 -f null -
(if sample is not on trac or server, i can provide it)

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I do not agree with what you have to say, but I'll defend to the death your
right to say it. -- Voltaire




Re: [FFmpeg-devel] [PATCH v4 0/5] avcodec/x86/diracdsp: migrate last remaining MMX function to SSE2

2024-11-14 Thread Ronald S. Bultje
Hi,

On Thu, Nov 14, 2024 at 1:28 PM Kyosuke Kawakami 
wrote:

> This patch series migrates the last remaining MMX function in
> diracdsp to SSE2.
>
> Changes from v3 are:
> - Use correct register load/use counts
> - Fix garbage value issue on Windows
> - Use constant yblen on checkasm benchmark
> - Test that functions accept unaligned buffers
>
> Thanks to James and Ronald for feedback.
>

Patchset (still) LGTM. I'm again planning to merge tomorrow (end-of-day),
unless other reviews come in.

Ronald


Re: [FFmpeg-devel] [PATCH v6 12/12] avfilter/vf_scale: switch to new swscale API

2024-11-14 Thread Michael Niedermayer
On Fri, Nov 15, 2024 at 12:11:34AM +0100, Niklas Haas wrote:
> On Fri, 15 Nov 2024 00:00:10 +0100 Michael Niedermayer 
>  wrote:
> > On Tue, Nov 12, 2024 at 10:50:46AM +0100, Niklas Haas wrote:
> > > From: Niklas Haas 
> > > 
> > > Most logic from this filter has been co-opted into swscale itself,
> > > allowing the resulting filter to be substantially simpler as it no
> > > longer has to worry about context initialization, interlacing, etc.
> > > 
> > > Sponsored-by: Sovereign Tech Fund
> > > Signed-off-by: Niklas Haas 
> > > ---
> > >  libavfilter/vf_scale.c | 354 +
> > >  1 file changed, 72 insertions(+), 282 deletions(-)
> > 
> > ./ffmpeg -i foreman_cif.y4m  -vf scale=out_v_chr_pos=0:out_h_chr_pos=0 -f null -
> > 
> > [fc#-1 @ 0x55eec4ea3300] Error applying option 'out_v_chr_pos' to filter 'scale': Option not found
> > Error opening output file -.
> > Error opening output files: Option not found
> 
> I mean, this change is basically intentional. But I suppose I should add
> backwards compatibility code to at least round it to the nearest similar 
> value.

yes, it should do something better than just failing with a message that 
confuses the user

also [fc#-1 @ 0x55b30de5f000] Error applying option 'in_v_chr_pos' to filter 'scale': Option not found
and others

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I know you won't believe me, but the highest form of Human Excellence is
to question oneself and others. -- Socrates




Re: [FFmpeg-devel] [PATCH v6 12/12] avfilter/vf_scale: switch to new swscale API

2024-11-14 Thread Michael Niedermayer
On Tue, Nov 12, 2024 at 10:50:46AM +0100, Niklas Haas wrote:
> From: Niklas Haas 
> 
> Most logic from this filter has been co-opted into swscale itself,
> allowing the resulting filter to be substantially simpler as it no
> longer has to worry about context initialization, interlacing, etc.
> 
> Sponsored-by: Sovereign Tech Fund
> Signed-off-by: Niklas Haas 
> ---
>  libavfilter/vf_scale.c | 354 +
>  1 file changed, 72 insertions(+), 282 deletions(-)

breaks:
./ffmpeg -i laraShadow_dl.flv  -alphablend checkerboard -qscale 2 -bitexact -t 0.5 -y file-alphablend-yuv420.avi

Any video with some transparency should work as a test case.
After the patch, the output video is all black.

thx
[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No great genius has ever existed without some touch of madness. -- Aristotle




Re: [FFmpeg-devel] [PATCH v6 12/12] avfilter/vf_scale: switch to new swscale API

2024-11-14 Thread Michael Niedermayer
On Tue, Nov 12, 2024 at 10:50:46AM +0100, Niklas Haas wrote:
> From: Niklas Haas 
> 
> Most logic from this filter has been co-opted into swscale itself,
> allowing the resulting filter to be substantially simpler as it no
> longer has to worry about context initialization, interlacing, etc.
> 
> Sponsored-by: Sovereign Tech Fund
> Signed-off-by: Niklas Haas 
> ---
>  libavfilter/vf_scale.c | 354 +
>  1 file changed, 72 insertions(+), 282 deletions(-)

./ffmpeg -i foreman_cif.y4m  -vf scale=out_v_chr_pos=0:out_h_chr_pos=0 -f null -

[fc#-1 @ 0x55eec4ea3300] Error applying option 'out_v_chr_pos' to filter 'scale': Option not found
Error opening output file -.
Error opening output files: Option not found

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Nations do behave wisely once they have exhausted all other alternatives. 
-- Abba Eban




[FFmpeg-devel] [PATCH] swscale/ppc: remove AltiVec acceleration for YUV->RGB conversions

2024-11-14 Thread Sean McGovern
Even on a reasonably modern POWER9 (ppc64le), it does not function correctly.
---
 libswscale/ppc/Makefile  |   1 -
 libswscale/ppc/swscale_altivec.c |  25 -
 libswscale/ppc/swscale_vsx.c |   1 -
 libswscale/ppc/yuv2rgb_altivec.c | 866 ---
 libswscale/ppc/yuv2rgb_altivec.h |  51 --
 libswscale/utils.c   |   6 -
 libswscale/yuv2rgb.c |   4 +-
 7 files changed, 1 insertion(+), 953 deletions(-)
 delete mode 100644 libswscale/ppc/yuv2rgb_altivec.c
 delete mode 100644 libswscale/ppc/yuv2rgb_altivec.h

diff --git a/libswscale/ppc/Makefile b/libswscale/ppc/Makefile
index 0a31a3025b..c37440a9a6 100644
--- a/libswscale/ppc/Makefile
+++ b/libswscale/ppc/Makefile
@@ -1,4 +1,3 @@
 OBJS += ppc/swscale_altivec.o   \
-ppc/yuv2rgb_altivec.o   \
 ppc/yuv2yuv_altivec.o   \
 ppc/swscale_vsx.o
diff --git a/libswscale/ppc/swscale_altivec.c b/libswscale/ppc/swscale_altivec.c
index 836aaab1f8..da21517633 100644
--- a/libswscale/ppc/swscale_altivec.c
+++ b/libswscale/ppc/swscale_altivec.c
@@ -28,7 +28,6 @@
 #include "libswscale/swscale_internal.h"
 #include "libavutil/attributes.h"
 #include "libavutil/cpu.h"
-#include "yuv2rgb_altivec.h"
 #include "libavutil/ppc/util_altivec.h"
 
 #if HAVE_ALTIVEC
@@ -254,30 +253,6 @@ av_cold void ff_sws_init_swscale_ppc(SwsInternal *c)
 c->yuv2plane1 = yuv2plane1_floatLE_altivec;
 }
 
-/* The following list of supported dstFormat values should
- * match what's found in the body of ff_yuv2packedX_altivec() */
-if (!(c->flags & (SWS_BITEXACT | SWS_FULL_CHR_H_INT)) && !c->needAlpha) {
-switch (c->dstFormat) {
-case AV_PIX_FMT_ABGR:
-c->yuv2packedX = ff_yuv2abgr_X_altivec;
-break;
-case AV_PIX_FMT_BGRA:
-c->yuv2packedX = ff_yuv2bgra_X_altivec;
-break;
-case AV_PIX_FMT_ARGB:
-c->yuv2packedX = ff_yuv2argb_X_altivec;
-break;
-case AV_PIX_FMT_RGBA:
-c->yuv2packedX = ff_yuv2rgba_X_altivec;
-break;
-case AV_PIX_FMT_BGR24:
-c->yuv2packedX = ff_yuv2bgr24_X_altivec;
-break;
-case AV_PIX_FMT_RGB24:
-c->yuv2packedX = ff_yuv2rgb24_X_altivec;
-break;
-}
-}
 #endif /* HAVE_ALTIVEC */
 
 ff_sws_init_swscale_vsx(c);
diff --git a/libswscale/ppc/swscale_vsx.c b/libswscale/ppc/swscale_vsx.c
index f83bb96ec9..1065908034 100644
--- a/libswscale/ppc/swscale_vsx.c
+++ b/libswscale/ppc/swscale_vsx.c
@@ -29,7 +29,6 @@
 #include "libavutil/attributes.h"
 #include "libavutil/cpu.h"
 #include "libavutil/mem_internal.h"
-#include "yuv2rgb_altivec.h"
 #include "libavutil/ppc/util_altivec.h"
 
 #if HAVE_VSX
diff --git a/libswscale/ppc/yuv2rgb_altivec.c b/libswscale/ppc/yuv2rgb_altivec.c
deleted file mode 100644
index 9db305f43f..00
--- a/libswscale/ppc/yuv2rgb_altivec.c
+++ /dev/null
@@ -1,866 +0,0 @@
-/*
- * AltiVec acceleration for colorspace conversion
- *
- * copyright (C) 2004 Marc Hoffman 
- *
- * This file is part of FFmpeg.
- *
- * FFmpeg is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2.1 of the License, or (at your option) any later version.
- *
- * FFmpeg is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with FFmpeg; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
- */
-
-/*
- * Convert I420 YV12 to RGB in various formats,
- * it rejects images that are not in 420 formats,
- * it rejects images that don't have widths of multiples of 16,
- * it rejects images that don't have heights of multiples of 2.
- * Reject defers to C simulation code.
- *
- * Lots of optimizations to be done here.
- *
- * 1. Need to fix saturation code. I just couldn't get it to fly with packs
- * and adds, so we currently use max/min to clip.
- *
- * 2. The inefficient use of chroma loading needs a bit of brushing up.
- *
- * 3. Analysis of pipeline stalls needs to be done. Use shark to identify
- * pipeline stalls.
- *
- *
- * MODIFIED to calculate coeffs from currently selected color space.
- * MODIFIED core to be a macro where you specify the output format.
- * ADDED UYVY conversion which is never called due to some thing in swscale.
- * CORRECTED algorithim selection to be strict on input formats.
- * ADDED runtime detection of AltiVec.
- *
- * ADDED altivec_yuv2packedX vertical scl + RGB converter
- *
- * March 2

Re: [FFmpeg-devel] [PATCH v6 12/12] avfilter/vf_scale: switch to new swscale API

2024-11-14 Thread Niklas Haas
On Fri, 15 Nov 2024 00:00:10 +0100 Michael Niedermayer wrote:
> On Tue, Nov 12, 2024 at 10:50:46AM +0100, Niklas Haas wrote:
> > From: Niklas Haas 
> > 
> > Most logic from this filter has been co-opted into swscale itself,
> > allowing the resulting filter to be substantially simpler as it no
> > longer has to worry about context initialization, interlacing, etc.
> > 
> > Sponsored-by: Sovereign Tech Fund
> > Signed-off-by: Niklas Haas 
> > ---
> >  libavfilter/vf_scale.c | 354 +
> >  1 file changed, 72 insertions(+), 282 deletions(-)
> 
> ./ffmpeg -i foreman_cif.y4m  -vf scale=out_v_chr_pos=0:out_h_chr_pos=0 -f null -
> 
> [fc#-1 @ 0x55eec4ea3300] Error applying option 'out_v_chr_pos' to filter 'scale': Option not found
> Error opening output file -.
> Error opening output files: Option not found

I mean, this change is basically intentional. But I suppose I should add
backwards compatibility code to at least round it to the nearest similar value.

> 
> thx
> 
> [...]
> -- 
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> Nations do behave wisely once they have exhausted all other alternatives. 
> -- Abba Eban
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH v2 3/6] ffv1enc: split off encoder initialization into a separate function

2024-11-14 Thread Michael Niedermayer
On Thu, Nov 14, 2024 at 08:21:02AM +0100, Lynne via ffmpeg-devel wrote:
> On 11/14/24 00:46, Michael Niedermayer wrote:
> 
> > On Mon, Nov 11, 2024 at 04:40:15AM +0100, Lynne via ffmpeg-devel wrote:
> > > ---
> > >   libavcodec/ffv1enc.c | 354 +++
> > >   libavcodec/ffv1enc.h |  30 
> > >   2 files changed, 217 insertions(+), 167 deletions(-)
> > >   create mode 100644 libavcodec/ffv1enc.h
> > > 
> > > diff --git a/libavcodec/ffv1enc.c b/libavcodec/ffv1enc.c
> > > index 7a6c718b41..ca2c9f32e2 100644
> > > --- a/libavcodec/ffv1enc.c
> > > +++ b/libavcodec/ffv1enc.c
> > [...]
> > 
> > > @@ -873,7 +907,7 @@ static av_cold int encode_init(AVCodecContext *avctx)
> > >   continue;
> > >   if (maxw * maxh * (int64_t)(s->bits_per_raw_sample+1) * plane_count > 8<<24)
> > >   continue;
> > > -if (s->version < 4)
> > > +if (avctx->level < 4)
> > >   if (  ff_need_new_slices(avctx->width , s->num_h_slices, s->chroma_h_shift)
> > >   ||ff_need_new_slices(avctx->height, s->num_v_slices, s->chroma_v_shift))
> > >   continue;
> > avctx->level is read-only from the point of view of the encoder,
> > while s->level can be (and sometimes is) changed by the encoder.
> > So in cases where version is adjusted across 4, level would be wrong.
> > It may be that this doesn't occur ATM, but it's still not correct.
> 
> 
> It cannot happen, not with the way the code is written. This is functionally
> correct.

It is semantically incorrect: s->version is the version actually in use,
avctx->level is what the user specified.

If the user does not specify a level, then avctx->level will be -99, while
s->version can be arbitrary and is the correct field to use.
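
To illustrate the distinction, here is a minimal sketch, not the actual ffv1enc code. UserSettings, EncoderState, pick_version() and bump_version() are hypothetical names, and the default of version 3 is an assumption made only for this example; the value -99 for an unset level is the one quoted above.

```c
#include <assert.h>

#define LEVEL_UNSET (-99)   /* value of the user-facing level when nothing was requested */

/* Hypothetical stand-ins: UserSettings plays the role of AVCodecContext,
 * EncoderState the role of the encoder's private context. */
typedef struct UserSettings { int level;   } UserSettings;
typedef struct EncoderState { int version; } EncoderState;

/* Assumed policy for this sketch: use version 3 when no level was requested. */
static void pick_version(EncoderState *s, const UserSettings *u)
{
    assert(s && u);
    s->version = (u->level == LEVEL_UNSET) ? 3 : u->level;
}

/* The encoder may raise the version later; the user's request is left untouched. */
static void bump_version(EncoderState *s, int min_version)
{
    assert(s);
    if (s->version < min_version)
        s->version = min_version;
}

/* Gate on the version in use (the form argued for above)... */
static int check_on_version(const EncoderState *s) { return s->version < 4; }

/* ...versus gating on the user's request, which can disagree with it. */
static int check_on_level(const UserSettings *u)   { return u->level < 4; }
```

With an unset level and a version raised to 4, check_on_version() is false while check_on_level() is still true, which is the mismatch described above.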


thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you fake or manipulate statistics in a paper in physics you will never
get a job again.
If you fake or manipulate statistics in a paper in medicine you will get
a job for life at the pharma industry.




[FFmpeg-devel] [PATCH] libavformat/mpegts.c: add support for ATSC E-AC-3 streams

2024-11-14 Thread Scott Theisen
ATSC A/52:2018 Digital Audio Compression (AC-3, E-AC-3), Annex G
defines stream_type 0x87 for E-AC-3 bit streams.
---
 libavformat/mpegts.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
index 78ab7f7efe..177e610e53 100644
--- a/libavformat/mpegts.c
+++ b/libavformat/mpegts.c
@@ -847,6 +847,7 @@ static const StreamType SCTE_types[] = {
 /* ATSC ? */
 static const StreamType MISC_types[] = {
 { 0x81, AVMEDIA_TYPE_AUDIO, AV_CODEC_ID_AC3 },
+{ 0x87, AVMEDIA_TYPE_AUDIO, AV_CODEC_ID_EAC3 },
 { 0x8a, AVMEDIA_TYPE_AUDIO, AV_CODEC_ID_DTS },
 { 0 },
 };
-- 
2.43.0
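
For readers unfamiliar with these tables, the following is a minimal sketch of how a zero-terminated stream_type table can be searched. It is not the mpegts.c implementation: StreamTypeEntry, lookup_codec() and the CODEC_* tags are hypothetical stand-ins for the real types and AV_CODEC_ID_* values; only the three stream_type values come from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical codec tags; CODEC_NONE means "no match". */
enum { CODEC_NONE = 0, CODEC_AC3, CODEC_EAC3, CODEC_DTS };

typedef struct StreamTypeEntry {
    unsigned stream_type;
    int      codec;
} StreamTypeEntry;

/* Same three rows as MISC_types after the patch; a zero entry ends the table. */
static const StreamTypeEntry misc_types[] = {
    { 0x81, CODEC_AC3  },
    { 0x87, CODEC_EAC3 },   /* the row added by the patch */
    { 0x8a, CODEC_DTS  },
    { 0 },
};

/* Linear scan until the terminating zero entry. */
static int lookup_codec(const StreamTypeEntry *table, unsigned stream_type)
{
    assert(table != NULL);
    for (; table->stream_type; table++)
        if (table->stream_type == stream_type)
            return table->codec;
    return CODEC_NONE;
}
```

Before the patch, a lookup of 0x87 in such a table would fall through to the "no match" result.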



[FFmpeg-devel] [PATCH] vulkan_encode_h264/5: Fix uninitialized return value in write_extra_headers

2024-11-14 Thread David Rosca
---
 libavcodec/vulkan_encode_h264.c | 1 +
 libavcodec/vulkan_encode_h265.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/libavcodec/vulkan_encode_h264.c b/libavcodec/vulkan_encode_h264.c
index cdc87fb4ca..fdd3cc8a55 100644
--- a/libavcodec/vulkan_encode_h264.c
+++ b/libavcodec/vulkan_encode_h264.c
@@ -1311,6 +1311,7 @@ static int write_extra_headers(AVCodecContext *avctx,
 if (err < 0)
 goto fail;
 } else {
+err = 0;
 *data_len = 0;
 }
 
diff --git a/libavcodec/vulkan_encode_h265.c b/libavcodec/vulkan_encode_h265.c
index cd50f2f756..8c25cacded 100644
--- a/libavcodec/vulkan_encode_h265.c
+++ b/libavcodec/vulkan_encode_h265.c
@@ -1471,6 +1471,7 @@ static int write_extra_headers(AVCodecContext *avctx,
 if (err < 0)
 goto fail;
 } else {
+err = 0;
 *data_len = 0;
 }
 
-- 
2.47.0
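
The pattern being fixed can be sketched as follows. This is a simplified illustration under assumptions, not the real write_extra_headers(): write_headers() and write_extra() are hypothetical, and the shape of the fail path is guessed from the hunk context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper standing in for the real header writer. */
static int write_headers(size_t *data_len)
{
    *data_len = 4;
    return 0;
}

static int write_extra(int have_headers, size_t *data_len)
{
    int err;   /* not initialised at declaration, as in the function being patched */

    assert(data_len != NULL);
    if (have_headers) {
        err = write_headers(data_len);
        if (err < 0)
            goto fail;
    } else {
        err = 0;        /* the one-line fix: without it, the return below reads an indeterminate value */
        *data_len = 0;
    }

fail:
    return err;
}
```

Initialising err at its declaration would also work; assigning it in the else branch keeps the change local to the path that was missing it.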



Re: [FFmpeg-devel] [PATCH v2] avcodec/dxva2: add support for HEVC RExt DXVA profiles

2024-11-14 Thread Steve Lhomme

Hi,

For the record we have been running this in VLC for quite some time, 
only for Intel hardware.

https://code.videolan.org/videolan/vlc/-/blob/3.0.x/contrib/src/ffmpeg/0001-avcodec-dxva2_hevc-add-support-for-parsing-HEVC-Rang.patch?ref_type=heads
https://code.videolan.org/videolan/vlc/-/blob/3.0.x/contrib/src/ffmpeg/0002-avcodec-hevcdec-allow-HEVC-444-8-10-12-bits-decoding.patch?ref_type=heads
https://code.videolan.org/videolan/vlc/-/blob/3.0.x/contrib/src/ffmpeg/0003-avcodec-hevcdec-allow-HEVC-422-10-12-bits-decoding-w.patch?ref_type=heads

It seems the MS GUIDs are not the same as Intel's...

On 13/11/2024 04:25, Cameron Gutman wrote:

Microsoft has formally standardized DXVA GUIDs for HEVC Range Extension
profiles in the Windows 11 24H2 SDK. They are supported by Intel GPUs
starting with Tiger Lake.

Like VDPAU and VAAPI, DXVA has separate GUIDs for each RExt profile, so
we must parse the SPS like those hwaccels do to figure out which one to
use when creating our decoder.

The new RExt profiles are supported with DXVA2 and D3D11VA (depending on
driver support).
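
The diff below keeps the base and Range Extension picture parameters in one union and passes a size that depends on the profile. A minimal sketch of that pattern follows; BaseParams, ExtParams and their fields are hypothetical, and the assumption that the extended struct begins with the base struct is mine, not something stated in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameter blocks; the extended one starts with the base one. */
typedef struct BaseParams { unsigned width, height, flags; } BaseParams;
typedef struct ExtParams  { BaseParams params; unsigned range_ext_flags; } ExtParams;

typedef struct PictureContext {
    union {                 /* both views share the same starting address */
        BaseParams pp;
        ExtParams  ppext;
    };
    unsigned slice_count;
} PictureContext;

/* Size handed to the driver: the larger struct only for RExt streams. */
static size_t params_size(int is_rext)
{
    return is_rext ? sizeof(ExtParams) : sizeof(BaseParams);
}

/* The same pointer can be submitted in both cases. */
static const void *params_ptr(const PictureContext *ctx)
{
    assert(ctx != NULL);
    return &ctx->pp;
}
```

Because the base struct sits at offset 0 of the extended one, code that only knows the base layout can keep reading through pp.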

Signed-off-by: Cameron Gutman 
---
v2 addresses some off-list feedback regarding the duplicate definition
of some DXVA GUIDs and incorrect handling of the pixel format of the
hw_frames_ctx when FFmpeg creates one internally.
---
  libavcodec/d3d12va_hevc.c   |  21 ---
  libavcodec/dxva2.c  |  77 +---
  libavcodec/dxva2_hevc.c | 115 ++--
  libavcodec/dxva2_internal.h |  47 ++-
  libavcodec/hevc/hevcdec.c   |  28 +
  5 files changed, 266 insertions(+), 22 deletions(-)

diff --git a/libavcodec/d3d12va_hevc.c b/libavcodec/d3d12va_hevc.c
index 7686f0eb6c..f4272cec1e 100644
--- a/libavcodec/d3d12va_hevc.c
+++ b/libavcodec/d3d12va_hevc.c
@@ -33,12 +33,15 @@
  #define MAX_SLICES 256
  
  typedef struct HEVCDecodePictureContext {

-    DXVA_PicParams_HEVC    pp;
-    DXVA_Qmatrix_HEVC      qm;
-    unsigned               slice_count;
-    DXVA_Slice_HEVC_Short  slice_short[MAX_SLICES];
-    const uint8_t         *bitstream;
-    unsigned               bitstream_size;
+    union {
+        DXVA_PicParams_HEVC             pp;
+        ff_DXVA_PicParams_HEVC_RangeExt ppext;
+    };
+    DXVA_Qmatrix_HEVC               qm;
+    unsigned                        slice_count;
+    DXVA_Slice_HEVC_Short           slice_short[MAX_SLICES];
+    const uint8_t                   *bitstream;
+    unsigned                        bitstream_size;
  } HEVCDecodePictureContext;
  
  static void fill_slice_short(DXVA_Slice_HEVC_Short *slice, unsigned position, unsigned size)

@@ -62,7 +65,7 @@ static int d3d12va_hevc_start_frame(AVCodecContext *avctx, av_unused const uint8
  
  ctx->used_mask = 0;
  
-ff_dxva2_hevc_fill_picture_parameters(avctx, (AVDXVAContext *)ctx, &ctx_pic->pp);
+ff_dxva2_hevc_fill_picture_parameters(avctx, (AVDXVAContext *)ctx, &ctx_pic->ppext);
  
  ff_dxva2_hevc_fill_scaling_lists(avctx, (AVDXVAContext *)ctx, &ctx_pic->qm);
  
@@ -152,11 +155,13 @@ static int d3d12va_hevc_end_frame(AVCodecContext *avctx)

  HEVCDecodePictureContext *ctx_pic = h->cur_frame->hwaccel_picture_private;
  
  int scale = ctx_pic->pp.dwCodingParamToolFlags & 1;

+int rext  = avctx->profile == AV_PROFILE_HEVC_REXT;
  
  if (ctx_pic->slice_count <= 0 || ctx_pic->bitstream_size <= 0)

  return -1;
  
-return ff_d3d12va_common_end_frame(avctx, h->cur_frame->f, &ctx_pic->pp, sizeof(ctx_pic->pp),

+return ff_d3d12va_common_end_frame(avctx, h->cur_frame->f, &ctx_pic->pp,
+   rext ? sizeof(ctx_pic->ppext) : sizeof(ctx_pic->pp),
 scale ? &ctx_pic->qm : NULL, scale ? sizeof(ctx_pic->qm) : 0, update_input_arguments);
  }
  
diff --git a/libavcodec/dxva2.c b/libavcodec/dxva2.c

index 22ecd5acaf..ed6036e099 100644
--- a/libavcodec/dxva2.c
+++ b/libavcodec/dxva2.c
@@ -48,7 +48,6 @@ DEFINE_GUID(ff_DXVA2_ModeVP9_VLD_Profile0,0x463707f8,0xa1d0,0x4585,0x87,0x6d,0x8
  DEFINE_GUID(ff_DXVA2_ModeVP9_VLD_10bit_Profile2,0xa4c749ef,0x6ecf,0x48aa,0x84,0x48,0x50,0xa7,0xa1,0x16,0x5f,0xf7);
  DEFINE_GUID(ff_DXVA2_ModeAV1_VLD_Profile0,0xb8be4ccb,0xcf53,0x46ba,0x8d,0x59,0xd6,0xb8,0xa6,0xda,0x5d,0x2a);
  DEFINE_GUID(ff_DXVA2_NoEncrypt,  0x1b81beD0, 0xa0c7,0x11d3,0xb9,0x84,0x00,0xc0,0x4f,0x2e,0x73,0xc5);
-DEFINE_GUID(ff_GUID_NULL,0x00000000, 0x0000,0x0000,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00);
  DEFINE_GUID(ff_IID_IDirectXVideoDecoderService, 0xfc51a551,0xd5e7,0x11d9,0xaf,0x55,0x00,0x05,0x4e,0x43,0xff,0x02);
  
  typedef struct dxva_mode {

@@ -70,6 +69,8 @@ static const int prof_hevc_main[]= {AV_PROFILE_HEVC_MAIN,
  AV_PROFILE_UNKNOWN};
  static const int prof_hevc_main10[]  = {AV_PROFILE_HEVC_MAIN_10,
  AV_PROFILE_UNKNOWN};
+static const int prof_hevc_rext[]= {AV_PROFILE_HEVC_REXT,
+   

Re: [FFmpeg-devel] [PATCH v2] libavfi/vf_drawtext: fix memory management when destroying font face

2024-11-14 Thread Leandro Santiago
Just as a ping, to know whether anyone is able to have a look at this patch :-)

Please let me know if you folks need extra context or info.

31 Oct 2024 21:48:31 Leandro Santiago :

> Ref https://trac.ffmpeg.org/ticket/11152
> 
> According to harfbuzz docs, hb_ft_font_set_funcs() does not need to be
> called, as, quoted:
> 
> ```
> An #hb_font_t object created with hb_ft_font_create()
> is preconfigured for FreeType font functions and does not
> require this function to be used.
> ```
> 
> Using this function seems to cause memory management issues between
> harfbuzz and freetype, and could be eliminated.
> 
> This commit also calls hb_ft_font_changed() when the underlying FT_Face
> changes size, as stated in the HarfBuzz docs:
> 
> ```
> HarfBuzz also provides a utility function called hb_ft_font_changed() that
> you should call whenever you have altered the properties of your underlying
> FT_Face, as well as a hb_ft_get_face() that you can call on an hb_font_t
> font object to fetch its underlying FT_Face.
> ```
> 
> Finally, the execution order between hb_font_destroy() and
> hb_buffer_destroy() is flipped to match the order of creation of
> the respective objects.
> 
> Signed-off-by: Leandro Santiago 
> ---
> libavfilter/vf_drawtext.c | 15 ---
> 1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/libavfilter/vf_drawtext.c b/libavfilter/vf_drawtext.c
> index 2b0a21a4b4..cf99b4e59e 100644
> --- a/libavfilter/vf_drawtext.c
> +++ b/libavfilter/vf_drawtext.c
> @@ -445,6 +445,7 @@ static int glyph_cmp(const void *key, const void *b)
> static av_cold int set_fontsize(AVFilterContext *ctx, unsigned int fontsize)
> {
>  int err;
> +    int line;
>  DrawTextContext *s = ctx->priv;
> 
>  if ((err = FT_Set_Pixel_Sizes(s->face, 0, fontsize))) {
> @@ -453,6 +454,12 @@ static av_cold int set_fontsize(AVFilterContext *ctx, unsigned int fontsize)
>  return AVERROR(EINVAL);
>  }
> 
> +    // Whenever the underlying FT_Face changes, harfbuzz has to be notified of the change.
> +    for (line = 0; line < s->line_count; line++) {
> +  TextLine *cur_line = &s->lines[line];
> +  hb_ft_font_changed(cur_line->hb_data.font);
> +    }
> +
>  s->fontsize = fontsize;
> 
>  return 0;
> @@ -1365,15 +1372,17 @@ static int shape_text_hb(DrawTextContext *s, HarfbuzzData* hb, const char* text,
>  if(!hb_buffer_allocation_successful(hb->buf)) {
>  return AVERROR(ENOMEM);
>  }
> +
>  hb_buffer_set_direction(hb->buf, HB_DIRECTION_LTR);
>  hb_buffer_set_script(hb->buf, HB_SCRIPT_LATIN);
>  hb_buffer_set_language(hb->buf, hb_language_from_string("en", -1));
>  hb_buffer_guess_segment_properties(hb->buf);
> -    hb->font = hb_ft_font_create(s->face, NULL);
> +
> +    hb->font = hb_ft_font_create_referenced(s->face);
>  if(hb->font == NULL) {
>  return AVERROR(ENOMEM);
>  }
> -    hb_ft_font_set_funcs(hb->font);
> +
>  hb_buffer_add_utf8(hb->buf, text, textLen, 0, -1);
>  hb_shape(hb->font, hb->buf, NULL, 0);
>  hb->glyph_info = hb_buffer_get_glyph_infos(hb->buf, &hb->glyph_count);
> @@ -1384,8 +1393,8 @@ static int shape_text_hb(DrawTextContext *s, HarfbuzzData* hb, const char* text,
> 
> static void hb_destroy(HarfbuzzData *hb)
> {
> -    hb_buffer_destroy(hb->buf);
>  hb_font_destroy(hb->font);
> +    hb_buffer_destroy(hb->buf);
>  hb->buf = NULL;
>  hb->font = NULL;
>  hb->glyph_info = NULL;
> -- 
> 2.46.1


Re: [FFmpeg-devel] [PATCH v3 1/3] checkasm/diracdsp: test add_dirac_obmc

2024-11-14 Thread James Almer

On 11/14/2024 11:30 AM, Kyosuke Kawakami wrote:

Signed-off-by: Kyosuke Kawakami 
---
  tests/checkasm/Makefile   |  1 +
  tests/checkasm/checkasm.c |  3 ++
  tests/checkasm/checkasm.h |  1 +
  tests/checkasm/diracdsp.c | 86 +++
  tests/fate/checkasm.mak   |  1 +
  5 files changed, 92 insertions(+)
  create mode 100644 tests/checkasm/diracdsp.c


[...]


diff --git a/tests/checkasm/diracdsp.c b/tests/checkasm/diracdsp.c
new file mode 100644
index 00..8833c2d223
--- /dev/null
+++ b/tests/checkasm/diracdsp.c
@@ -0,0 +1,86 @@
+/*
+ * Copyright (c) 2024 Kyosuke Kawakami
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include "checkasm.h"
+
+#include "libavcodec/diracdsp.h"
+
+#include "libavutil/intreadwrite.h"
+#include "libavutil/mem_internal.h"
+
+#define RANDOMIZE_DESTS(name, size) \
+do {\
+int i;  \
+for (i = 0; i < size; ++i) {\
+uint16_t r = rnd(); \
+AV_WN16A(name##0 + i, r);   \
+AV_WN16A(name##1 + i, r);   \
+}   \
+} while (0)
+
+#define RANDOMIZE_BUFFER8(name, size) \
+do {  \
+int i;\
+for (i = 0; i < size; ++i) {  \
+uint8_t r = rnd();\
+name[i] = r;  \
+} \
+} while (0)
+
+#define OBMC_STRIDE 32
+#define XBLEN_MAX 32
+#define YBLEN_MAX 64
+
+static void check_add_obmc(size_t func_index, int xblen)
+{
+LOCAL_ALIGNED_8(uint8_t, src, [XBLEN_MAX * YBLEN_MAX]);
+LOCAL_ALIGNED_16(uint16_t, dst0, [XBLEN_MAX * YBLEN_MAX]);
+LOCAL_ALIGNED_16(uint16_t, dst1, [XBLEN_MAX * YBLEN_MAX]);


The loads in the asm functions use movdqu, so I assume the buffers in 
the decoder are not 16 byte aligned. To ensure future implementations 
don't mistakenly use aligned loads, you could make this be:


LOCAL_ALIGNED_16(uint16_t, _dst0, [XBLEN_MAX * YBLEN_MAX + 4]);
LOCAL_ALIGNED_16(uint16_t, _dst1, [XBLEN_MAX * YBLEN_MAX + 4]);
uint16_t *dst0 = _dst0 + 4, *dst1 = _dst1 + 4;

Using LOCAL_ALIGNED_8() could also end up with a 16 byte aligned buffer, 
so the above will make sure the buffer is 8 byte aligned.
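
As a rough illustration of the pointer-offset trick, the sketch below uses plain C11 alignas instead of FFmpeg's LOCAL_ALIGNED_16 macro. storage, addr_mod16() and misaligned_view() are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define BUF_LEN (32 * 64)   /* XBLEN_MAX * YBLEN_MAX from the test above */

/* Address of p modulo 16. */
static unsigned addr_mod16(const void *p)
{
    return (unsigned)((uintptr_t)p & 15);
}

/* 16-byte aligned backing storage; the 4 extra elements make room for the offset. */
static alignas(16) uint16_t storage[BUF_LEN + 4];

/* Skipping 4 uint16_t elements (8 bytes) gives a pointer that is 8-byte
 * aligned but never 16-byte aligned, so an aligned 16-byte load would fault. */
static uint16_t *misaligned_view(void)
{
    uint16_t *p = storage + 4;
    assert(addr_mod16(p) == 8);
    return p;
}
```

A test buffer set up this way makes any accidental use of an aligned load in a future implementation fail immediately rather than pass by luck.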



+LOCAL_ALIGNED_8(uint8_t, obmc_weight, [XBLEN_MAX * YBLEN_MAX]);
+
+int yblen;
+DiracDSPContext h;
+
+ff_diracdsp_init(&h);
+
+if (check_func(h.add_dirac_obmc[func_index], "diracdsp.add_dirac_obmc_%d", xblen)) {
+declare_func(void, uint16_t*, const uint8_t*, int, const uint8_t *, int);
+
+yblen = 1 + (rnd() % YBLEN_MAX);


Use YBLEN_MAX directly. No real gain in using randomized height, and 
this way every --bench run will give wildly different results.



+RANDOMIZE_BUFFER8(src, yblen * xblen);
+RANDOMIZE_DESTS(dst, yblen * xblen);
+RANDOMIZE_BUFFER8(obmc_weight, yblen * OBMC_STRIDE);
+
+call_ref(dst0, src, xblen, obmc_weight, yblen);
+call_new(dst1, src, xblen, obmc_weight, yblen);
+if (memcmp(dst0, dst1, yblen * xblen))
+fail();
+
+bench_new(dst1, src, xblen, obmc_weight, yblen);
+}
+}
+
+void checkasm_check_diracdsp(void)
+{
+check_add_obmc(0, 8);
+check_add_obmc(1, 16);
+check_add_obmc(2, 32);
+report("diracdsp");
+}
diff --git a/tests/fate/checkasm.mak b/tests/fate/checkasm.mak
index d1396cb641..8a2c04e1cd 100644
--- a/tests/fate/checkasm.mak
+++ b/tests/fate/checkasm.mak
@@ -7,6 +7,7 @@ FATE_CHECKASM = fate-checkasm-aacencdsp \
  fate-checkasm-av_tx \
  fate-checkasm-blockdsp  \
  fate-checkasm-bswapdsp  \
+fate-checkasm-diracdsp  \
  fate-checkasm-exrdsp\
  fate-checkasm-fdctdsp   \
  fate-checkasm-fixed_dsp \





Re: [FFmpeg-devel] [PATCH v3 2/3] avcodec/x86/diracdsp: migrate last remaining MMX function to SSE2

2024-11-14 Thread James Almer

On 11/14/2024 11:30 AM, Kyosuke Kawakami wrote:

The add_dirac_obmc8_mmx function was the only MMX function left. This
patch migrates it to SSE2.

Here are the checkasm benchmark results:

diracdsp.add_dirac_obmc_8_c:    2299.1 ( 1.00x)
diracdsp.add_dirac_obmc_8_mmx:   237.6 ( 9.68x)
diracdsp.add_dirac_obmc_8_sse2:  109.1 (21.07x)

Signed-off-by: Kyosuke Kawakami 
---
  libavcodec/x86/diracdsp.asm| 23 +++
  libavcodec/x86/diracdsp_init.c | 10 +++---
  2 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
index e5e2b11846..e708400b66 100644
--- a/libavcodec/x86/diracdsp.asm
+++ b/libavcodec/x86/diracdsp.asm
@@ -227,7 +227,7 @@ cglobal add_dirac_obmc%1_%2, 6,6,5, dst, src, stride, obmc, yblen
     punpckhbw   m1, m4
     mova        m2, [obmcq+i]
     mova        m3, m2
-   punpcklbw   m2, m4
+    punpcklbw   m2, m4
     punpckhbw   m3, m4
     pmullw      m0, m2
     pmullw      m1, m3
@@ -247,9 +247,6 @@ cglobal add_dirac_obmc%1_%2, 6,6,5, dst, src, stride, obmc, yblen
  RET
  %endm
  
-INIT_MMX

-ADD_OBMC 8, mmx
-
  INIT_XMM
  PUT_RECT sse2
  ADD_RECT sse2
@@ -258,6 +255,24 @@ HPEL_FILTER sse2
  ADD_OBMC 32, sse2
  ADD_OBMC 16, sse2
  
+cglobal add_dirac_obmc8_sse2, 6,6,5, dst, src, stride, obmc, yblen


You're loading 5 gpr and using 5 too, not 6.


+    pxor        m4, m4


Add...

movsxdifnidn strideq, strided

...here, otherwise the tests will fail on Windows x86_64 (Upper 32 bits 
of the register are garbage).


And while at it, also make these changes to the other two ADD_OBMC 
functions in the macro above.
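
The effect of the missing sign extension can be modelled in C. This is only an illustration of the arithmetic, assuming a 64-bit register whose upper half is left over from earlier code; stride_raw(), stride_sign_extended() and row_offset() are hypothetical helpers, not FFmpeg functions.

```c
#include <assert.h>
#include <stdint.h>

/* Uses the whole 64-bit register as the stride: wrong when the upper half is garbage. */
static int64_t stride_raw(uint64_t reg)
{
    return (int64_t)reg;
}

/* Keeps the low 32 bits and sign-extends them to 64 bits, as movsxd does.
 * The narrowing conversion wraps on the usual two's-complement targets. */
static int64_t stride_sign_extended(uint64_t reg)
{
    return (int64_t)(int32_t)(uint32_t)reg;
}

/* Byte offset of a row when the stride arrives in a possibly dirty register. */
static int64_t row_offset(uint64_t stride_reg, int rows)
{
    assert(rows >= 0);
    return stride_sign_extended(stride_reg) * rows;
}
```

With a register value such as 0xDEADBEEF00000020, the sign-extended stride is 32, whereas the raw value would send the address computation far out of bounds.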



+.loop:
+    movh        m0, [srcq]
+    punpcklbw   m0, m4
+    movh        m1, [obmcq]
+    punpcklbw   m1, m4
+    pmullw      m0, m1
+    movu        m1, [dstq]
+    paddw       m0, m1
+    movu        [dstq], m0
+    lea         srcq, [srcq+strideq]
+    lea         dstq, [dstq+2*strideq]
+    add         obmcq, 32
+    sub         yblend, 1
+    jg          .loop
+    RET
+
  INIT_XMM sse4
  
  ; void dequant_subband_32(uint8_t *src, uint8_t *dst, ptrdiff_t stride, const int qf, const int qs, int tot_v, int tot_h)

diff --git a/libavcodec/x86/diracdsp_init.c b/libavcodec/x86/diracdsp_init.c
index f678759dc0..08247133e1 100644
--- a/libavcodec/x86/diracdsp_init.c
+++ b/libavcodec/x86/diracdsp_init.c
@@ -24,8 +24,7 @@
  
  void ff_add_rect_clamped_sse2(uint8_t *, const uint16_t *, int, const int16_t *, int, int, int);
  
-void ff_add_dirac_obmc8_mmx(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);

-
+void ff_add_dirac_obmc8_sse2(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
  void ff_add_dirac_obmc16_sse2(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
  void ff_add_dirac_obmc32_sse2(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
  
@@ -94,15 +93,12 @@ void ff_diracdsp_init_x86(DiracDSPContext* c)

  #if HAVE_X86ASM
  int mm_flags = av_get_cpu_flags();
  
-if (EXTERNAL_MMX(mm_flags)) {

-c->add_dirac_obmc[0] = ff_add_dirac_obmc8_mmx;
-}
-
  if (EXTERNAL_SSE2(mm_flags)) {
  c->dirac_hpel_filter = dirac_hpel_filter_sse2;
  c->add_rect_clamped = ff_add_rect_clamped_sse2;
  c->put_signed_rect_clamped[0] = (void *)ff_put_signed_rect_clamped_sse2;
  
+c->add_dirac_obmc[0] = ff_add_dirac_obmc8_sse2;

  c->add_dirac_obmc[1] = ff_add_dirac_obmc16_sse2;
  c->add_dirac_obmc[2] = ff_add_dirac_obmc32_sse2;
  
@@ -116,5 +112,5 @@ void ff_diracdsp_init_x86(DiracDSPContext* c)

  c->dequant_subband[1] = ff_dequant_subband_32_sse4;
  c->put_signed_rect_clamped[1] = ff_put_signed_rect_clamped_10_sse4;
  }
-#endif
+#endif // HAVE_X86ASM
  }






Re: [FFmpeg-devel] patchwork.ffmpeg.org is down

2024-11-14 Thread Michael Niedermayer
Hi

On Wed, Nov 13, 2024 at 08:42:26PM -0500, Sean McGovern wrote:
> Hi,
> 
> On Wed, Nov 13, 2024, 20:33 Dennis Mungai  wrote:
> 
> > Currently returning 504 Gateway Timeout for the last ~3 hours.
> >
> 
> Confirming I see the same error here.

Saw it too yesterday; I believe BtbN was/is working on improving the
patchwork setup (it seems to be working ATM)

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No human being will ever know the Truth, for even if they happen to say it
by chance, they would not even know they had done so. -- Xenophanes




Re: [FFmpeg-devel] [PATCH v3 1/3] checkasm/diracdsp: test add_dirac_obmc

2024-11-14 Thread James Almer

On 11/14/2024 1:29 PM, Ronald S. Bultje wrote:

Hi,

On Thu, Nov 14, 2024 at 10:18 AM James Almer  wrote:


On 11/14/2024 11:30 AM, Kyosuke Kawakami wrote:

Signed-off-by: Kyosuke Kawakami 
---
   tests/checkasm/Makefile   |  1 +
   tests/checkasm/checkasm.c |  3 ++
   tests/checkasm/checkasm.h |  1 +
   tests/checkasm/diracdsp.c | 86 +++
   tests/fate/checkasm.mak   |  1 +
   5 files changed, 92 insertions(+)
   create mode 100644 tests/checkasm/diracdsp.c


[...]


diff --git a/tests/checkasm/diracdsp.c b/tests/checkasm/diracdsp.c
new file mode 100644
index 00..8833c2d223
--- /dev/null
+++ b/tests/checkasm/diracdsp.c
@@ -0,0 +1,86 @@
+/*
+ * Copyright (c) 2024 Kyosuke Kawakami
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include "checkasm.h"
+
+#include "libavcodec/diracdsp.h"
+
+#include "libavutil/intreadwrite.h"
+#include "libavutil/mem_internal.h"
+
+#define RANDOMIZE_DESTS(name, size) \
+do {\
+int i;  \
+for (i = 0; i < size; ++i) {\
+uint16_t r = rnd(); \
+AV_WN16A(name##0 + i, r);   \
+AV_WN16A(name##1 + i, r);   \
+}   \
+} while (0)
+
+#define RANDOMIZE_BUFFER8(name, size) \
+do {  \
+int i;\
+for (i = 0; i < size; ++i) {  \
+uint8_t r = rnd();\
+name[i] = r;  \
+} \
+} while (0)
+
+#define OBMC_STRIDE 32
+#define XBLEN_MAX 32
+#define YBLEN_MAX 64
+
+static void check_add_obmc(size_t func_index, int xblen)
+{
+LOCAL_ALIGNED_8(uint8_t, src, [XBLEN_MAX * YBLEN_MAX]);
+LOCAL_ALIGNED_16(uint16_t, dst0, [XBLEN_MAX * YBLEN_MAX]);
+LOCAL_ALIGNED_16(uint16_t, dst1, [XBLEN_MAX * YBLEN_MAX]);


The loads in the asm functions use movdqu, so I assume the buffers in
the decoder are not 16 byte aligned. To ensure future implementations
don't mistakenly use aligned loads, you could make this be:

LOCAL_ALIGNED_16(uint16_t, _dst0, [XBLEN_MAX * YBLEN_MAX + 4]);
LOCAL_ALIGNED_16(uint16_t, _dst1, [XBLEN_MAX * YBLEN_MAX + 4]);
uint16_t *dst0 = _dst0 + 4, *dst1 = _dst1 + 4;

Using LOCAL_ALIGNED_8() could also end up with a 16 byte aligned buffer,
so the above will make sure the buffer is 8 byte aligned.


+LOCAL_ALIGNED_8(uint8_t, obmc_weight, [XBLEN_MAX * YBLEN_MAX]);
+
+int yblen;
+DiracDSPContext h;
+
+ff_diracdsp_init(&h);
+
+if (check_func(h.add_dirac_obmc[func_index], "diracdsp.add_dirac_obmc_%d", xblen)) {
+declare_func(void, uint16_t*, const uint8_t*, int, const uint8_t *, int);

+
+yblen = 1 + (rnd() % YBLEN_MAX);


Use YBLEN_MAX directly. No real gain in using randomized height, and
this way every --bench run will give wildly different results.



The bench should use max_height, but the test should use a randomized
height, IMO.


Ah, good point. That sounds better.
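
The split agreed on here could look roughly like the sketch below: a random height for the correctness check and a fixed maximum for timing. test_yblen() and bench_yblen() are hypothetical helpers for illustration; only YBLEN_MAX and the 1 + (rnd() % YBLEN_MAX) expression come from the patch.

```c
#include <assert.h>

#define YBLEN_MAX 64

/* Height for the correctness check: any value in [1, YBLEN_MAX],
 * derived from a caller-supplied random number. */
static int test_yblen(unsigned random_value)
{
    int yblen = 1 + (int)(random_value % YBLEN_MAX);
    assert(yblen >= 1 && yblen <= YBLEN_MAX);
    return yblen;
}

/* Height for benchmarking: always the maximum, so timings from
 * different --bench runs stay comparable. */
static int bench_yblen(void)
{
    return YBLEN_MAX;
}
```

In the test, call_ref()/call_new() would then use the first value and bench_new() the second.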





Re: [FFmpeg-devel] [PATCH v2 1/3] checkasm/diracdsp: test add_dirac_obmc

2024-11-14 Thread Ronald S. Bultje
Hi,

thanks for adding the test! This looks pretty good. Minor suggestion:

On Wed, Nov 13, 2024 at 5:39 PM Kyosuke Kawakami 
wrote:

> +#define RANDOMIZE_DESTS(name, size) \
> +do {\
> +int i;  \
> +for (i = 0; i < size; ++i) {\
> +uint16_t r = rnd(); \
> +AV_WN16A(name##0 + i, r);   \
> +AV_WN16A(name##1 + i, r);   \
> +}   \
> +} while (0)
> +
> +#define RANDOMIZE_BUFFER8(name, size) \
> +do {  \
> +int i;\
> +for (i = 0; i < size; ++i) {  \
> +uint8_t r = rnd();\
> +name[i] = r;  \
> +} \
> +} while (0)
> +
> +#define OBMC_STRIDE 32
> +
> +#define CHECK_ADD_OBMC(func_index, yblen, xblen) \
> +static void check_add_obmc ## xblen(void) \
> +{ \
> +LOCAL_ALIGNED_8(uint8_t, src, [yblen * xblen]); \
> +LOCAL_ALIGNED_16(uint16_t, dst0, [yblen * xblen]); \
> +LOCAL_ALIGNED_16(uint16_t, dst1, [yblen * xblen]); \
> +LOCAL_ALIGNED_8(uint8_t, obmc_weight, [yblen * OBMC_STRIDE]); \
> +DiracDSPContext h; \
> +ff_diracdsp_init(&h); \
> +if (check_func(h.add_dirac_obmc[func_index], "diracdsp.add_dirac_obmc_%d", xblen)) { \
> +declare_func(void, uint16_t*, const uint8_t*, int, const uint8_t *, int); \
> +RANDOMIZE_BUFFER8(src, yblen * xblen); \
> +RANDOMIZE_DESTS(dst, yblen * xblen); \
> +RANDOMIZE_BUFFER8(obmc_weight, yblen * OBMC_STRIDE); \
> +call_ref(dst0, src, xblen, obmc_weight, yblen); \
> +call_new(dst1, src, xblen, obmc_weight, yblen); \
> +if (memcmp(dst0, dst1, yblen * xblen)) \
> +fail(); \
> +bench_new(dst1, src, xblen, obmc_weight, yblen); \
> +} \
> +} \
> +
> +CHECK_ADD_OBMC(0, 64, 8)
> +CHECK_ADD_OBMC(1, 64, 16)
> +CHECK_ADD_OBMC(2, 64, 32)
> +
> +void checkasm_check_diracdsp(void)
> +{
> +check_add_obmc8();
> +check_add_obmc16();
> +check_add_obmc32();
> +report("diracdsp");
> +}
>

In terms of binary size, the above will triplicate nearly identical code in
the binary, and that doesn't really seem necessary. You should be able to
write the above out as a function and call it with appropriate xblen and
func_index arguments (check_add_obmc(0, 8); check_add_obmc(1, 16);
check_add_obmc(2, 32)) without duplicating these in the binary. Done for
many tests, this reduces binary size and compile time significantly.

You can also get bonus points if you randomize yblen, since the bitstream
allows multiple values for this (at least 4, 12, 16, 24, but possibly any
number), so maybe 1 + (random % 64) would be a good choice?

If you don't care for the above, we can merge this as-is, they're merely
suggestions, let me know what you think.

Ronald


Re: [FFmpeg-devel] [PATCH v2 2/3] avcodec/x86/diracdsp: migrate last remaining MMX function to SSE2

2024-11-14 Thread Ronald S. Bultje
Hi,

On Wed, Nov 13, 2024 at 5:44 PM Kyosuke Kawakami 
wrote:

> The add_dirac_obmc8_mmx function was the only MMX function left. This
> patch migrates it to SSE2.
>
> Here is checkasm benchmark results:
>
> diracdsp.add_dirac_obmc_8_c:2299.1 ( 1.00x)
> diracdsp.add_dirac_obmc_8_mmx:   237.6 ( 9.68x)
> diracdsp.add_dirac_obmc_8_sse2:  109.1 (21.07x)
>
> Signed-off-by: Kyosuke Kawakami 
> ---
>  libavcodec/x86/diracdsp.asm| 23 +++
>  libavcodec/x86/diracdsp_init.c | 10 +++---
>  2 files changed, 22 insertions(+), 11 deletions(-)
>

LGTM, thanks.

Ronald


Re: [FFmpeg-devel] [PATCH v2 1/3] checkasm/diracdsp: test add_dirac_obmc

2024-11-14 Thread Kyosuke Kawakami
Thanks for the feedback!

On Thu, Nov 14, 2024 at 9:40 PM Ronald S. Bultje  wrote:

> In terms of binary size, the above will triplicate nearly identical code in 
> the binary, and that doesn't really seem necessary. You should be able to 
> write the above out as a function and call it with appropriate xblen and 
> func_index arguments (check_add_obmc(0, 8); check_add_obmc(1, 16); 
> check_add_obmc(2, 32)) without duplicating these in the binary. Done for many 
> tests, this reduces binary size and compile time significantly.

Sounds good. I'm going to implement these since I have some free time now.

> You can also get bonus points if you randomize yblen, since the bitstream 
> allows multiple values for this (at least 4, 12, 16, 24, but possibly any 
> number), so maybe 1 + (random % 64) would be a good choice?

I'll try this one too.

Thanks.
Kyosuke

On Thu, Nov 14, 2024 at 9:40 PM Ronald S. Bultje  wrote:
>
> Hi,
>
> thanks for adding the test! This looks pretty good. Minor suggestion:
>
> On Wed, Nov 13, 2024 at 5:39 PM Kyosuke Kawakami  
> wrote:
>>
>> +#define RANDOMIZE_DESTS(name, size) \
>> +do {\
>> +int i;  \
>> +for (i = 0; i < size; ++i) {\
>> +uint16_t r = rnd(); \
>> +AV_WN16A(name##0 + i, r);   \
>> +AV_WN16A(name##1 + i, r);   \
>> +}   \
>> +} while (0)
>> +
>> +#define RANDOMIZE_BUFFER8(name, size) \
>> +do {  \
>> +int i;\
>> +for (i = 0; i < size; ++i) {  \
>> +uint8_t r = rnd();\
>> +name[i] = r;  \
>> +} \
>> +} while (0)
>> +
>> +#define OBMC_STRIDE 32
>> +
>> +#define CHECK_ADD_OBMC(func_index, yblen, xblen) \
>> +static void check_add_obmc ## xblen(void) \
>> +{ \
>> +LOCAL_ALIGNED_8(uint8_t, src, [yblen * xblen]); \
>> +LOCAL_ALIGNED_16(uint16_t, dst0, [yblen * xblen]); \
>> +LOCAL_ALIGNED_16(uint16_t, dst1, [yblen * xblen]); \
>> +LOCAL_ALIGNED_8(uint8_t, obmc_weight, [yblen * OBMC_STRIDE]); \
>> +DiracDSPContext h; \
>> +ff_diracdsp_init(&h); \
>> +if (check_func(h.add_dirac_obmc[func_index], "diracdsp.add_dirac_obmc_%d", xblen)) { \
>> +declare_func(void, uint16_t*, const uint8_t*, int, const uint8_t *, int); \
>> +RANDOMIZE_BUFFER8(src, yblen * xblen); \
>> +RANDOMIZE_DESTS(dst, yblen * xblen); \
>> +RANDOMIZE_BUFFER8(obmc_weight, yblen * OBMC_STRIDE); \
>> +call_ref(dst0, src, xblen, obmc_weight, yblen); \
>> +call_new(dst1, src, xblen, obmc_weight, yblen); \
>> +if (memcmp(dst0, dst1, yblen * xblen)) \
>> +fail(); \
>> +bench_new(dst1, src, xblen, obmc_weight, yblen); \
>> +} \
>> +} \
>> +
>> +CHECK_ADD_OBMC(0, 64, 8)
>> +CHECK_ADD_OBMC(1, 64, 16)
>> +CHECK_ADD_OBMC(2, 64, 32)
>> +
>> +void checkasm_check_diracdsp(void)
>> +{
>> +check_add_obmc8();
>> +check_add_obmc16();
>> +check_add_obmc32();
>> +report("diracdsp");
>> +}
>
>
> In terms of binary size, the above will triplicate nearly identical code in 
> the binary, and that doesn't really seem necessary. You should be able to 
> write the above out as a function and call it with appropriate xblen and 
> func_index arguments (check_add_obmc(0, 8); check_add_obmc(1, 16); 
> check_add_obmc(2, 32)) without duplicating these in the binary. Done for many 
> tests, this reduces binary size and compile time significantly.
>
> You can also get bonus points if you randomize yblen, since the bitstream 
> allows multiple values for this (at least 4, 12, 16, 24, but possibly an

Re: [FFmpeg-devel] [PATCH 3/3] avcodec/hevc: Add wasm simd128 idct

2024-11-14 Thread Zhao Zhili
> On 13 November 2024 at 16:16:36 GMT+02:00, Zhao Zhili wrote:
> >From: Zhao Zhili
> >
> >We don't use intrinsics normally for performance reasons. However,
> >WASM isn't machine instruction.
> 
> I don't think that's relevant to the matter.
> 
> What may be relevant is that WASM is an infinite register ISA, so that the 
> compiler's register allocator will never run out and spill.
> 
> But then again, it will have to be recompiled to a finite register machine 
> ISA, with probably 8, 16 or 32 vector registers, and if we break whatever the 
> machine's limit is, performance will suck.
> 
> So I actually expect that assembler will be about as relevant for WASM as it 
> is for the machine ISA with the fewest vector registers... on that 
> architecture.
> 
> But in any case, I cannot agree with that unsubstantiated statement. Unless 
> proven otherwise, assembler is relevant here too.

OK, I can remove this misleading statement from the commit message. I still
have no interest in writing WASM in handwritten assembly. Does anyone
have a plan for WASM SIMD support? If the answer is yes, I can stop here.
This work is only a proof of concept.


[FFmpeg-devel] [PATCH v3 0/3] avcodec/x86/diracdsp: migrate last remaining MMX function to SSE2

2024-11-14 Thread Kyosuke Kawakami
This patch series migrates the last remaining MMX function in
diracdsp to SSE2.

Changes from v2 are:
- Rewrite tests with a normal function instead of a macro
- Fix typo in a commit message


GIT: [PATCH v3 1/3] checkasm/diracdsp: test add_dirac_obmc
GIT: [PATCH v3 2/3] avcodec/x86/diracdsp: migrate last remaining MMX
GIT: [PATCH v3 3/3] avcodec/x86/diracdsp_init: remove unused macro


[FFmpeg-devel] [PATCH v3 0/3] avcodec/x86/diracdsp: migrate last remaining MMX function to SSE2

2024-11-14 Thread Kyosuke Kawakami
This patch series migrates the last remaining MMX function in
diracdsp to SSE2.

Changes from v2 are:
- Rewrite tests with a normal function instead of a macro
- Fix typo in a commit message



[FFmpeg-devel] [PATCH v3 3/3] avcodec/x86/diracdsp_init: remove unused macro

2024-11-14 Thread Kyosuke Kawakami
The PIXFUNC macro has been unused since commit d29a9c2aa68fc3eb6d61ff95c698e29316037583.

Signed-off-by: Kyosuke Kawakami 
---
 libavcodec/x86/diracdsp_init.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/libavcodec/x86/diracdsp_init.c b/libavcodec/x86/diracdsp_init.c
index 08247133e1..ef01ebdf2e 100644
--- a/libavcodec/x86/diracdsp_init.c
+++ b/libavcodec/x86/diracdsp_init.c
@@ -56,11 +56,6 @@ void ff_dequant_subband_32_sse4(uint8_t *src, uint8_t *dst, ptrdiff_t stride, co
 } \
 }
 
-#define PIXFUNC(PFX, IDX, EXT) \
-/*MMXDISABLEDc->PFX ## _dirac_pixels_tab[0][IDX] = PFX ## _dirac_pixels8_ ## EXT;*/ \
-c->PFX ## _dirac_pixels_tab[1][IDX] = PFX ## _dirac_pixels16_ ## EXT; \
-c->PFX ## _dirac_pixels_tab[2][IDX] = PFX ## _dirac_pixels32_ ## EXT
-
 #define DIRAC_PIXOP(OPNAME, EXT)\
 static void OPNAME ## _dirac_pixels16_ ## EXT(uint8_t *dst, const uint8_t *src[5], \
   int stride, int h) \
-- 
2.47.0



[FFmpeg-devel] [PATCH v3 2/3] avcodec/x86/diracdsp: migrate last remaining MMX function to SSE2

2024-11-14 Thread Kyosuke Kawakami
The add_dirac_obmc8_mmx function was the only MMX function left. This
patch migrates it to SSE2.

Here are the checkasm benchmark results:

diracdsp.add_dirac_obmc_8_c:2299.1 ( 1.00x)
diracdsp.add_dirac_obmc_8_mmx:   237.6 ( 9.68x)
diracdsp.add_dirac_obmc_8_sse2:  109.1 (21.07x)

Signed-off-by: Kyosuke Kawakami 
---
 libavcodec/x86/diracdsp.asm| 23 +++
 libavcodec/x86/diracdsp_init.c | 10 +++---
 2 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
index e5e2b11846..e708400b66 100644
--- a/libavcodec/x86/diracdsp.asm
+++ b/libavcodec/x86/diracdsp.asm
@@ -227,7 +227,7 @@ cglobal add_dirac_obmc%1_%2, 6,6,5, dst, src, stride, obmc, yblen
     punpckhbw   m1, m4
     mova        m2, [obmcq+i]
     mova        m3, m2
-   punpcklbw   m2, m4
+    punpcklbw   m2, m4
     punpckhbw   m3, m4
     pmullw      m0, m2
     pmullw      m1, m3
@@ -247,9 +247,6 @@ cglobal add_dirac_obmc%1_%2, 6,6,5, dst, src, stride, obmc, yblen
 RET
 %endm
 
-INIT_MMX
-ADD_OBMC 8, mmx
-
 INIT_XMM
 PUT_RECT sse2
 ADD_RECT sse2
@@ -258,6 +255,24 @@ HPEL_FILTER sse2
 ADD_OBMC 32, sse2
 ADD_OBMC 16, sse2
 
+cglobal add_dirac_obmc8_sse2, 6,6,5, dst, src, stride, obmc, yblen
+    pxor        m4, m4
+.loop:
+    movh        m0, [srcq]
+    punpcklbw   m0, m4
+    movh        m1, [obmcq]
+    punpcklbw   m1, m4
+    pmullw      m0, m1
+    movu        m1, [dstq]
+    paddw       m0, m1
+    movu        [dstq], m0
+    lea         srcq, [srcq+strideq]
+    lea         dstq, [dstq+2*strideq]
+    add         obmcq, 32
+    sub         yblend, 1
+    jg          .loop
+    RET
+
 INIT_XMM sse4
 
 ; void dequant_subband_32(uint8_t *src, uint8_t *dst, ptrdiff_t stride, const int qf, const int qs, int tot_v, int tot_h)
diff --git a/libavcodec/x86/diracdsp_init.c b/libavcodec/x86/diracdsp_init.c
index f678759dc0..08247133e1 100644
--- a/libavcodec/x86/diracdsp_init.c
+++ b/libavcodec/x86/diracdsp_init.c
@@ -24,8 +24,7 @@
 
 void ff_add_rect_clamped_sse2(uint8_t *, const uint16_t *, int, const int16_t *, int, int, int);
 
-void ff_add_dirac_obmc8_mmx(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
-
+void ff_add_dirac_obmc8_sse2(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
 void ff_add_dirac_obmc16_sse2(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
 void ff_add_dirac_obmc32_sse2(uint16_t *dst, const uint8_t *src, int stride, const uint8_t *obmc_weight, int yblen);
 
@@ -94,15 +93,12 @@ void ff_diracdsp_init_x86(DiracDSPContext* c)
 #if HAVE_X86ASM
 int mm_flags = av_get_cpu_flags();
 
-if (EXTERNAL_MMX(mm_flags)) {
-c->add_dirac_obmc[0] = ff_add_dirac_obmc8_mmx;
-}
-
 if (EXTERNAL_SSE2(mm_flags)) {
 c->dirac_hpel_filter = dirac_hpel_filter_sse2;
 c->add_rect_clamped = ff_add_rect_clamped_sse2;
 c->put_signed_rect_clamped[0] = (void *)ff_put_signed_rect_clamped_sse2;
 
+c->add_dirac_obmc[0] = ff_add_dirac_obmc8_sse2;
 c->add_dirac_obmc[1] = ff_add_dirac_obmc16_sse2;
 c->add_dirac_obmc[2] = ff_add_dirac_obmc32_sse2;
 
@@ -116,5 +112,5 @@ void ff_diracdsp_init_x86(DiracDSPContext* c)
 c->dequant_subband[1] = ff_dequant_subband_32_sse4;
 c->put_signed_rect_clamped[1] = ff_put_signed_rect_clamped_10_sse4;
 }
-#endif
+#endif // HAVE_X86ASM
 }
-- 
2.47.0
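
For readers who do not read x86 assembly, the new add_dirac_obmc8_sse2 loop in the patch above can be modelled in scalar C roughly as follows. This is only a sketch read off the assembly, not FFmpeg code: the names add_dirac_obmc8_model and model_example are invented for illustration, and the model assumes yblen is at least 1, as the assembly loop always runs once.

```c
#include <assert.h>
#include <stdint.h>

#define OBMC_STRIDE 32  /* row pitch of the weight table, from "add obmcq, 32" */

/* Scalar model of the 8-wide kernel: dst[x] += src[x] * weight[x] per row.
 * stride counts elements for both src (uint8_t) and dst (uint16_t), which
 * matches srcq += stride and dstq += 2*stride bytes in the assembly. */
static void add_dirac_obmc8_model(uint16_t *dst, const uint8_t *src, int stride,
                                  const uint8_t *obmc_weight, int yblen)
{
    assert(yblen > 0);  /* the asm loop body always executes at least once */
    for (int y = 0; y < yblen; y++) {
        for (int x = 0; x < 8; x++) {
            /* pmullw and paddw keep only the low 16 bits, so wrap explicitly */
            dst[x] = (uint16_t)(dst[x] + src[x] * obmc_weight[x]);
        }
        src         += stride;
        dst         += stride;
        obmc_weight += OBMC_STRIDE;
    }
}

/* Tiny usage example: one row, weights of 2, dst preset to 1. */
static uint16_t model_example(void)
{
    uint8_t  src[8]         = { 1, 2, 3, 4, 5, 6, 7, 8 };
    uint8_t  w[OBMC_STRIDE] = { 2, 2, 2, 2, 2, 2, 2, 2 };
    uint16_t dst[8]         = { 1, 1, 1, 1, 1, 1, 1, 1 };

    add_dirac_obmc8_model(dst, src, 8, w, 1);
    return dst[7];  /* 1 + 8 * 2 */
}
```

Because only the low 16 bits of each product and sum are kept, the model wraps modulo 65536 in the same way as the SIMD instructions.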



[FFmpeg-devel] [PATCH v3 1/3] checkasm/diracdsp: test add_dirac_obmc

2024-11-14 Thread Kyosuke Kawakami
Signed-off-by: Kyosuke Kawakami 
---
 tests/checkasm/Makefile   |  1 +
 tests/checkasm/checkasm.c |  3 ++
 tests/checkasm/checkasm.h |  1 +
 tests/checkasm/diracdsp.c | 86 +++
 tests/fate/checkasm.mak   |  1 +
 5 files changed, 92 insertions(+)
 create mode 100644 tests/checkasm/diracdsp.c

diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
index ae324ced3f..c7268d836e 100644
--- a/tests/checkasm/Makefile
+++ b/tests/checkasm/Makefile
@@ -29,6 +29,7 @@ AVCODECOBJS-$(CONFIG_AAC_DECODER)   += aacpsdsp.o \
 AVCODECOBJS-$(CONFIG_AAC_ENCODER)   += aacencdsp.o
 AVCODECOBJS-$(CONFIG_ALAC_DECODER)  += alacdsp.o
 AVCODECOBJS-$(CONFIG_DCA_DECODER)   += synth_filter.o
+AVCODECOBJS-$(CONFIG_DIRAC_DECODER) += diracdsp.o
 AVCODECOBJS-$(CONFIG_EXR_DECODER)   += exrdsp.o
 AVCODECOBJS-$(CONFIG_FLAC_DECODER)  += flacdsp.o
 AVCODECOBJS-$(CONFIG_HUFFYUV_DECODER)   += huffyuvdsp.o
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index c9d2b5faf1..fb307af0ae 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -138,6 +138,9 @@ static const struct {
 #if CONFIG_DCA_DECODER
 { "synth_filter", checkasm_check_synth_filter },
 #endif
+#if CONFIG_DIRAC_DECODER
+{ "diracdsp", checkasm_check_diracdsp },
+#endif
 #if CONFIG_EXR_DECODER
 { "exrdsp", checkasm_check_exrdsp },
 #endif
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index 866eef01e9..0ba5c3040d 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -84,6 +84,7 @@ void checkasm_check_blend(void);
 void checkasm_check_blockdsp(void);
 void checkasm_check_bswapdsp(void);
 void checkasm_check_colorspace(void);
+void checkasm_check_diracdsp(void);
 void checkasm_check_exrdsp(void);
 void checkasm_check_fdctdsp(void);
 void checkasm_check_fixed_dsp(void);
diff --git a/tests/checkasm/diracdsp.c b/tests/checkasm/diracdsp.c
new file mode 100644
index 00..8833c2d223
--- /dev/null
+++ b/tests/checkasm/diracdsp.c
@@ -0,0 +1,86 @@
+/*
+ * Copyright (c) 2024 Kyosuke Kawakami
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include "checkasm.h"
+
+#include "libavcodec/diracdsp.h"
+
+#include "libavutil/intreadwrite.h"
+#include "libavutil/mem_internal.h"
+
+#define RANDOMIZE_DESTS(name, size)             \
+    do {                                        \
+        int i;                                  \
+        for (i = 0; i < size; ++i) {            \
+            uint16_t r = rnd();                 \
+            AV_WN16A(name##0 + i, r);           \
+            AV_WN16A(name##1 + i, r);           \
+        }                                       \
+    } while (0)
+
+#define RANDOMIZE_BUFFER8(name, size)           \
+    do {                                        \
+        int i;                                  \
+        for (i = 0; i < size; ++i) {            \
+            uint8_t r = rnd();                  \
+            name[i] = r;                        \
+        }                                       \
+    } while (0)
+
+#define OBMC_STRIDE 32
+#define XBLEN_MAX 32
+#define YBLEN_MAX 64
+
+static void check_add_obmc(size_t func_index, int xblen)
+{
+    LOCAL_ALIGNED_8(uint8_t, src, [XBLEN_MAX * YBLEN_MAX]);
+    LOCAL_ALIGNED_16(uint16_t, dst0, [XBLEN_MAX * YBLEN_MAX]);
+    LOCAL_ALIGNED_16(uint16_t, dst1, [XBLEN_MAX * YBLEN_MAX]);
+    LOCAL_ALIGNED_8(uint8_t, obmc_weight, [XBLEN_MAX * YBLEN_MAX]);
+
+    int yblen;
+    DiracDSPContext h;
+
+    ff_diracdsp_init(&h);
+
+    if (check_func(h.add_dirac_obmc[func_index], "diracdsp.add_dirac_obmc_%d", xblen)) {
+        declare_func(void, uint16_t*, const uint8_t*, int, const uint8_t *, int);
+
+        yblen = 1 + (rnd() % YBLEN_MAX);
+        RANDOMIZE_BUFFER8(src, yblen * xblen);
+        RANDOMIZE_DESTS(dst, yblen * xblen);
+        RANDOMIZE_BUFFER8(obmc_weight, yblen * OBMC_STRIDE);
+
+        call_ref(dst0, src, xblen, obmc_weight, yblen);
+        call_new(dst1, src, xblen, obmc_weight, yblen);
+        if (memcmp(dst0, dst1, yblen * xblen))
+            fail();
+
+        bench_new(dst1, src, xblen, obmc_weight, yblen);
+    }
+}
+
+void checkasm_check_diracdsp(void)
+{
+check_add_obmc(0, 8);
+check_add_obm

Re: [FFmpeg-devel] [PATCH v3 1/3] checkasm/diracdsp: test add_dirac_obmc

2024-11-14 Thread Ronald S. Bultje
Hi,

On Thu, Nov 14, 2024 at 9:32 AM Kyosuke Kawakami 
wrote:

> Signed-off-by: Kyosuke Kawakami 
> ---
>  tests/checkasm/Makefile   |  1 +
>  tests/checkasm/checkasm.c |  3 ++
>  tests/checkasm/checkasm.h |  1 +
>  tests/checkasm/diracdsp.c | 86 +++
>  tests/fate/checkasm.mak   |  1 +
>  5 files changed, 92 insertions(+)
>  create mode 100644 tests/checkasm/diracdsp.c
>

LGTM. I'll merge this tomorrow if there are no further comments.

Ronald


Re: [FFmpeg-devel] [PATCH v3 0/3] avcodec/x86/diracdsp: migrate last remaining MMX function to SSE2

2024-11-14 Thread Kyosuke Kawakami
Sorry, I sent the cover letter twice by mistake. Please ignore this one.

Kyosuke