date:20240615

Re: [FFmpeg-devel] [PATCH 1/2] avcodec/jpeg2000dec: Add support for placeholder passes, CAP, and CPF markers

2024-06-15 Thread WATANABE Osamu

Hi Pierre,

Here is the list of supported J2K reference codestreams.
p0_10.j2k
p1_01.j2k
p1_02.j2k
p1_03.j2k
p1_04.j2k
p1_05.j2k
p1_06.j2k
ds0_hm_15_b8.j2k
ds0_ht_02_b11.j2k
ds0_ht_02_b12.j2k
ds0_ht_03_b11.j2k
ds0_ht_03_b14.j2k
ds0_ht_04_b11.j2k
ds0_ht_04_b12.j2k
ds0_ht_05_b11.j2k
ds0_ht_05_b12.j2k
ds0_ht_07_b11.j2k
ds0_ht_07_b15.j2k
ds0_ht_07_b16.j2k
ds0_ht_08_b11.j2k
ds0_ht_08_b15.j2k
ds0_ht_08_b16.j2k
ds0_ht_09_b11.j2k
ds0_ht_10_b11.j2k
ds0_ht_11_b10.j2k
ds0_ht_12_b11.j2k
ds0_ht_14_b11.j2k
ds0_ht_15_b11.j2k
ds0_ht_15_b14.j2k
ds0_ht_16_b11.j2k
ds1_ht_01_b11.j2k
ds1_ht_01_b12.j2k
ds1_ht_02_b11.j2k
ds1_ht_02_b12.j2k
ds1_ht_03_b11.j2k
ds1_ht_03_b12.j2k
ds1_ht_04_b9.j2k
ds1_ht_05_b11.j2k
ds1_ht_06_b11.j2k
hifi_ht1_02.j2k
hifi_p1_02.j2k

FYI - the following codestreams are excluded from the list due to their 
unsupported pixel format in FFMPEG.
p0_06.j2k
p0_13.j2k
p1_07.j2k
ds0_ht_06_b11.j2k
ds0_ht_06_b15.j2k
ds0_ht_06_b18.j2k
ds0_hm_06_b11.j2k
ds0_hm_06_b18.j2k
ds0_ht_13_b11.j2k
ds1_ht_07_b11.j2k

Separate color components should be decodable, and I can provide them if 
necessary.

Best,
Osamu

> On Jun 15, 2024, at 15:00, Pierre-Anthony Lemieux  wrote:
> 
> Hi Osamu,
> 
> Can you provide the list of J2K reference codestreams that this code
> allows FFMPEG to support -- not including those already supported [1]?
> 
> [1] https://github.com/FFmpeg/FFmpeg/blob/master/tests/fate/jpeg2000.mak
> 
> Best,
> 
> -- Pierre
> 
> On Fri, Jun 14, 2024 at 8:15 PM Osamu Watanabe
>  wrote:
>> 
>> Signed-off-by: Osamu Watanabe 
>> ---
>> libavcodec/jpeg2000.h  |  10 +
>> libavcodec/jpeg2000dec.c   | 454 ++---
>> libavcodec/jpeg2000dec.h   |   7 +
>> libavcodec/jpeg2000htdec.c | 225 ++
>> libavcodec/jpeg2000htdec.h |   2 +-
>> 5 files changed, 518 insertions(+), 180 deletions(-)
>> 
>> diff --git a/libavcodec/jpeg2000.h b/libavcodec/jpeg2000.h
>> index d004c08f10..93221d90ca 100644
>> --- a/libavcodec/jpeg2000.h
>> +++ b/libavcodec/jpeg2000.h
>> @@ -37,12 +37,14 @@
>> 
>> enum Jpeg2000Markers {
>> JPEG2000_SOC = 0xff4f, // start of codestream
>> +JPEG2000_CAP = 0xff50, // extended capabilities
>> JPEG2000_SIZ = 0xff51, // image and tile size
>> JPEG2000_COD,  // coding style default
>> JPEG2000_COC,  // coding style component
>> JPEG2000_TLM = 0xff55, // tile-part length, main header
>> JPEG2000_PLM = 0xff57, // packet length, main header
>> JPEG2000_PLT,  // packet length, tile-part header
>> +JPEG2000_CPF,  // corresponding profile
>> JPEG2000_QCD = 0xff5c, // quantization default
>> JPEG2000_QCC,  // quantization component
>> JPEG2000_RGN,  // region of interest
>> @@ -58,6 +60,12 @@ enum Jpeg2000Markers {
>> JPEG2000_EOC = 0xffd9, // end of codestream
>> };
>> 
>> +enum JPEG2000_Ccap15_b14_15_params {
>> +HTJ2K_HTONLY = 0,  // HTONLY, bit 14 and 15 are 0
>> +HTJ2K_HTDECLARED,  // HTDECLARED, bit 14 = 1 and bit 15 = 0
>> +HTJ2K_MIXED = 3,   // MIXED, bit 14 and 15 are 1
>> +};
>> +
>> #define JPEG2000_SOP_FIXED_BYTES 0xFF910004
>> #define JPEG2000_SOP_BYTE_LENGTH 6
>> 
>> @@ -192,6 +200,8 @@ typedef struct Jpeg2000Cblk {
>> /* specific to HT code-blocks */
>> int zbp;
>> int pass_lengths[2];
>> +uint8_t modes; // copy of SPcod/SPcoc field to parse HT-MIXED mode
>> +uint8_t ht_plhd; // are we looking for HT placeholder passes?
>> } Jpeg2000Cblk; // code block
>> 
>> typedef struct Jpeg2000Prec {
>> diff --git a/libavcodec/jpeg2000dec.c b/libavcodec/jpeg2000dec.c
>> index d15502a527..d299c67cc7 100644
>> --- a/libavcodec/jpeg2000dec.c
>> +++ b/libavcodec/jpeg2000dec.c
>> @@ -54,6 +54,15 @@
>> #define HAD_COC 0x01
>> #define HAD_QCC 0x02
>> 
>> +// Values of flag for placeholder passes
>> +enum HT_PLHD_STATUS {
>> +HT_PLHD_OFF,
>> +HT_PLHD_ON
>> +};
>> +
>> +#define HT_MIXED 0x80 // bit 7 of SPcod/SPcoc
>> +
>> +
>> /* get_bits functions for JPEG2000 packet bitstream
>>  * It is a get_bit function with a bit-stuffing routine. If the value of the
>>  * byte is 0xFF, the next byte includes an extra zero bit stuffed into the 
>> MSB.
>> @@ -382,6 +391,9 @@ static int get_siz(Jpeg2000DecoderContext *s)
>> } else if (ncomponents == 1 && s->precision == 8) {
>> s->avctx->pix_fmt = AV_PIX_FMT_GRAY8;
>> i = 0;
>> +} else if (ncomponents == 1 && s->precision == 12) {
>> +s->avctx->pix_fmt = AV_PIX_FMT_GRAY16LE;
>> +i = 0;
>> }
>> }
>> 
>> @@ -408,6 +420,73 @@ static int get_siz(Jpeg2000DecoderContext *s)
>> s->avctx->bits_per_raw_sample = s->precision;
>> return 0;
>> }
>> +/* get extended capabilities (CAP) marker segment */
>> +static int get_cap(Jpeg2000DecoderContext *s, Jpeg2000CodingStyle *c)
>> +{
>> +uint32_t Pcap;
>> +uint16_t Ccap_i[32] = { 0 };
>> +uint16_t Ccap_15;
>> +uint8_t P;
>> +
>> +if (bytestream2_get_bytes_left(&s-
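
The enum added to jpeg2000.h in the quoted patch describes bits 14 and 15 of the Ccap15 field as a two-bit value. A minimal sketch of how such a value could be extracted, assuming bit 14 is the less significant of the two bits (as the enum comments suggest); `ht_mode_from_ccap15` is a hypothetical helper and is not taken from the patch:

```c
#include <stdint.h>

/* Mirrors the enum from the quoted patch. */
enum JPEG2000_Ccap15_b14_15_params {
    HTJ2K_HTONLY = 0,   /* bits 14 and 15 are 0 */
    HTJ2K_HTDECLARED,   /* bit 14 = 1, bit 15 = 0 */
    HTJ2K_MIXED = 3,    /* bits 14 and 15 are 1 */
};

/* Hypothetical helper: returns the two-bit field taken from bits 14-15,
 * or -1 for the value 2, which the enum does not list. */
static int ht_mode_from_ccap15(uint16_t ccap15)
{
    int p = (ccap15 >> 14) & 0x3;
    return p == 2 ? -1 : p;
}
```

How the decoder in the patch reacts to each mode is not shown in this excerpt, so the sketch stops at classification.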

Re: [FFmpeg-devel] [PATCH v2] movenc: Add an option for hiding fragments at the end

2024-06-15 Thread Gyan Doshi




On 2024-06-15 03:54 am, Dennis Sädtler via ffmpeg-devel wrote:

On 2024-06-14 13:23, Gyan Doshi wrote:



On 2024-06-14 04:35 pm, Timo Rothenpieler wrote:

On 14/06/2024 12:44, Martin Storsjö wrote:

On Fri, 14 Jun 2024, Gyan Doshi wrote:


On 2024-06-14 02:18 am, Martin Storsjö wrote:

On Thu, 13 Jun 2024, Gyan Doshi wrote:


On 2024-06-13 06:20 pm, Martin Storsjö wrote:


I'd otherwise want to push this, but I'm not entirely satisfied 
with the option name quite yet. I'm pondering if we should call 
it "hybrid_fragmented" - any opinions, Dennis or Timo?


How about `resilient_mode` or `recoverable`?
I agree that the how is secondary.


Those are good suggestions as well - but I think I prefer 
"hybrid_fragmented" still.


In theory, I guess one could implement resilient writing in a 
number of different ways, whereas the hybrid 
fragmented/non-fragmented only is one.


So with a couple other voices agreeing with the name 
"hybrid_fragmented", I'll post a new patch with the option in 
that form - hopefully you don't object to it.


The term hybrid is not applicable here. The fragmented state is 
transient during writing and contingent in the finished artifact 
depending on how the writing process concluded.
Hybrid implies both modes are available, e.g. a hybrid vehicle can use 
both types of energy sources. The artifact here will be one _or_ 
the other.


Sure, the file itself is either or, but the process of writing will 
have utilized both. TBH, I don't see it as such a black-or-white 
thing.


What do the others who have chimed in on the thread think, compared 
to calling it "recoverable" or "resilient_mode"?


I don't have a super strong opinion on it, but out of the options 
provided, I'd prefer the hybrid_ one, since there's a good chance 
it'll become an established term now that OBS presents it quite 
publicly visible.


The OBS dev intends to change the term:

"Come up with a better name than "Hybrid MP4" that hopefully won't 
confuse users"
https://github.com/obsproject/obs-studio/pull/10608#issuecomment-2095222024 



Regards,
Gyan


Now that it's merged and in the hands of users I don't have any 
intention of changing the name any more.
We had some chats about it, but nobody suggested anything that 
people agreed was better, so it stuck.


While "resilient" certainly fits, it could equally apply to regular 
fragmented MP4 (e.g. vMix uses that terminology for fMP4 if I'm not 
mistaken).
The important attribute with this approach is that it's resilient 
*and* compatible, and I'm still not sure how to get that across in 
name alone.


How about `failsafe`?

Regards,
Gyan

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH v6 1/4] doc: Explain what "context" means

2024-06-15 Thread Stefano Sabatini

On date Thursday 2024-06-13 15:20:38 +0100, Andrew Sayers wrote:
> On Wed, Jun 12, 2024 at 10:52:00PM +0200, Stefano Sabatini wrote:
[...]
> > > +@section Context_general “Context” as a general concept
> [...]
> > I'd skip all this part, as we assume the reader is already familiar
> > with C language and with data encapsulation through struct, if he is
> > not this is not the right place where to teach about C language
> > fundamentals.
> 
> I disagree, for a reason I've been looking for an excuse to mention :)
> 

> Let's assume 90% of people who use FFmpeg already know something in the doc.
> You could say that part of the doc is useless to 90% of the audience.
> Or you could say that 90% of FFmpeg users are not our audience.
> 
> Looking at it the second way means you need to spend more time on "routing" -
> linking to the document in ways that (only) attract your target audience,
> making a table of contents with headings that aid skip-readers, etc.
> But once you've routed people around the bits they don't care about,
> it's fine to have documentation that's only needed by a minority.
> 

> Also, less interesting but equally important - context is not a C language
> fundamental, it's more like an emergent property of large C projects.  A
> developer that came here without knowing e.g. what a struct is could read
> any of the online tutorials that explain the concept better than we could.
> I'd be happy to link to a good tutorial about contexts if we found one,
> but we have to meet people where they are, and this is the best solution
> I've been able to find.

A context is just another name for a struct that keeps the state of an
entity operated on by several functions (in other words, an object and
its methods); it is mostly the specific jargon used by FFmpeg (and by
other C projects as well). In addition to this, we provide some generic
utilities (logging and AVOptions) which can be used through AVClass.
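
As a minimal illustration of that struct-plus-functions pattern (every name below is hypothetical and none of it is FFmpeg API), a context might look like this:

```c
#include <stdlib.h>

/* Hypothetical context: a struct holding the state of one entity. */
typedef struct CounterContext {
    int  step;   /* configuration set at allocation time */
    long total;  /* running state updated by the functions below */
} CounterContext;

/* Allocate and initialise a context; returns NULL on allocation failure. */
CounterContext *counter_alloc(int step)
{
    CounterContext *ctx = calloc(1, sizeof(*ctx));
    if (ctx)
        ctx->step = step;
    return ctx;
}

/* Operate on the context: the state lives in the struct, not in globals. */
long counter_advance(CounterContext *ctx)
{
    ctx->total += ctx->step;
    return ctx->total;
}

/* Free the context and reset the caller's pointer. */
void counter_free(CounterContext **ctx)
{
    free(*ctx);
    *ctx = NULL;
}
```

The functions play the role of methods, and the first argument plays the role of the object.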

Giving a longer explanation makes this appear more complicated than it
actually is. My point is that providing more information than needed
produces a wall-of-text effect ("I need to read through all this to
understand it? Nah, I'd rather give up"), which discourages readers.

> 
> > 
> > > +
> > > +When reading code that *is* explicitly described in terms of contexts,
> > > +remember that the term's meaning is guaranteed by *the project's 
> > > community*,
> > > +not *the language it's written in*.  That means guarantees may be more 
> > > flexible
> > > +and change more over time.  For example, programming languages that use
> > > +[encapsulation](https://en.wikipedia.org/wiki/Encapsulation_(computer_programming))
> > > +will simply refuse to compile code that violates its rules about access,
> > > +while communities can put up with special cases if they improve code 
> > > quality.
> > > +
> > 
> > This looks a bit vague so I'd rather drop this.

I mean, if you read for the first time:
| [the context] term's meaning is guaranteed by *the project's
| community*, not the language it's written in.
| That means guarantees may be more flexible and change more over time.

it's very hard to figure out what these guarantees are about, and this
might apply to every specific language and to every specific term,
that's why I consider this "vague".
 
[...]
> > > +Some functions fit awkwardly within FFmpeg's context idiom, so they send 
> > > mixed
> > > +signals.  For example, av_ambient_viewing_environment_create_side_data() 
> > > creates
> > > +an AVAmbientViewingEnvironment context, then adds it to the side-data of 
> > > an
> > > +AVFrame context.  So its name hints at one context, its parameter hints 
> > > at
> > > +another, and its documentation is silent on the issue.  You might prefer 
> > > to
> > > +think of such functions as not having a context, or as “receiving” one 
> > > context
> > > +and “producing” another.
> > 
> > I'd skip this paragraph. In fact, I think that API makes perfect
> > sense, OOP languages adopt such constructs all the time, for example
> > this could be a static module/class constructor. In other words, we
> > are not telling anywhere that all functions should take a "context" as
> > its first argument, and the documentation specify exactly how this
> > works, if you feel this is not clear or silent probably this is a sign
> > that that function documentation should be extended.
> 

> That would be fine if it were just this function, but FFmpeg is littered
> with special cases that don't quite fit.

I still fail to see the general rule for which this is creating a
special case. If this is a special case, what is this special case
for?

> Another example might be swr_alloc_set_opts2(), which can take an
> SwrContext in a way that resembles a context, or can take NULL and
> allocate a new SwrContext.  And yes, we could document that edge
> case, and the next one, and the one after that. But even if we
> documented every litt

[FFmpeg-devel] [PATCH 1/2] swscale/aarch64: Add bgr24 to yuv

2024-06-15 Thread Zhao Zhili

From: Zhao Zhili 

Test on Apple M1 with kperf

bgr24_to_uv_8_c: 41.5
bgr24_to_uv_8_neon: 41.8
bgr24_to_uv_128_c: 133.5
bgr24_to_uv_128_neon: 94.3
bgr24_to_uv_1080_c: 960.5
bgr24_to_uv_1080_neon: 751.0
bgr24_to_uv_1920_c: 1695.3
bgr24_to_uv_1920_neon: 1357.3
bgr24_to_uv_half_8_c: 45.0
bgr24_to_uv_half_8_neon: 11.0
bgr24_to_uv_half_128_c: 130.5
bgr24_to_uv_half_128_neon: 51.8
bgr24_to_uv_half_1080_c: 877.3
bgr24_to_uv_half_1080_neon: 414.0
bgr24_to_uv_half_1920_c: 1540.0
bgr24_to_uv_half_1920_neon: 695.0
bgr24_to_y_8_c: 24.3
bgr24_to_y_8_neon: 12.8
bgr24_to_y_128_c: 94.3
bgr24_to_y_128_neon: 47.5
bgr24_to_y_1080_c: 611.5
bgr24_to_y_1080_neon: 437.5
bgr24_to_y_1920_c: 1077.3
bgr24_to_y_1920_neon: 765.3
---
 libswscale/aarch64/input.S   | 79 
 libswscale/aarch64/swscale.c | 32 +--
 2 files changed, 80 insertions(+), 31 deletions(-)

diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
index 33afa34111..2b956fe5c2 100644
--- a/libswscale/aarch64/input.S
+++ b/libswscale/aarch64/input.S
@@ -20,7 +20,7 @@
 
 #include "libavutil/aarch64/asm.S"
 
-.macro rgb24_to_yuv_load_rgb, src
+.macro rgb_to_yuv_load_rgb src
 ld3 { v16.16b, v17.16b, v18.16b }, [\src]
 uxtl    v19.8h, v16.8b // v19: r
 uxtl    v20.8h, v17.8b // v20: g
@@ -30,7 +30,7 @@
 uxtl2   v24.8h, v18.16b// v24: b
 .endm
 
-.macro rgb24_to_yuv_product, r, g, b, dst1, dst2, dst, coef0, coef1, coef2, 
right_shift
+.macro rgb_to_yuv_product r, g, b, dst1, dst2, dst, coef0, coef1, coef2, 
right_shift
 mov \dst1\().16b, v6.16b// dst1 = 
const_offset
 mov \dst2\().16b, v6.16b// dst2 = 
const_offset
 smlal   \dst1\().4s, \coef0\().4h, \r\().4h // dst1 += rx 
* r
@@ -43,12 +43,20 @@
 sqshrn2 \dst\().8h, \dst2\().4s, \right_shift   // 
dst_higher_half = dst2 >> right_shift
 .endm
 
-function ff_rgb24ToY_neon, export=1
+.macro rgbToY bgr
 cmp w4, #0  // check width > 0
+.if \bgr
+ldr w12, [x5]   // w12: ry
+ldr w11, [x5, #4]   // w11: gy
+ldr w10, [x5, #8]   // w10: by
+.else
 ldp w10, w11, [x5]  // w10: ry, w11: gy
 ldr w12, [x5, #8]   // w12: by
+.endif
 b.le    3f
 
+// The following comments assume RGB order. The logic for RGB and BGR 
is the same.
+
 mov w9, #256// w9 = 1 << (RGB2YUV_SHIFT - 
7)
 movk    w9, #8, lsl #16 // w9 += 32 << (RGB2YUV_SHIFT 
- 1)
 dup v6.4s, w9   // w9: const_offset
@@ -59,9 +67,9 @@ function ff_rgb24ToY_neon, export=1
 dup v2.8h, w12
 b.lt    2f
 1:
-rgb24_to_yuv_load_rgb x1
-rgb24_to_yuv_product v19, v20, v21, v25, v26, v16, v0, v1, v2, #9
-rgb24_to_yuv_product v22, v23, v24, v27, v28, v17, v0, v1, v2, #9
+rgb_to_yuv_load_rgb x1
+rgb_to_yuv_product v19, v20, v21, v25, v26, v16, v0, v1, v2, #9
+rgb_to_yuv_product v22, v23, v24, v27, v28, v17, v0, v1, v2, #9
 sub w4, w4, #16 // width -= 16
 add x1, x1, #48 // src += 48
 cmp w4, #16 // width >= 16 ?
@@ -83,12 +91,29 @@ function ff_rgb24ToY_neon, export=1
 cbnz    w4, 2b
 3:
 ret
+.endm
+
+function ff_rgb24ToY_neon, export=1
+rgbToY  bgr=0
+endfunc
+
+function ff_bgr24ToY_neon, export=1
+rgbToY  bgr=1
 endfunc
 
-.macro rgb24_load_uv_coeff half
+.macro rgb_load_uv_coeff half, bgr
+.if \bgr
+ldr w12, [x6, #12]
+ldr w11, [x6, #16]
+ldr w10, [x6, #20]
+ldr w15, [x6, #24]
+ldr w14, [x6, #28]
+ldr w13, [x6, #32]
+.else
 ldp w10, w11, [x6, #12] // w10: ru, w11: gu
 ldp w12, w13, [x6, #20] // w12: bu, w13: rv
 ldp w14, w15, [x6, #28] // w14: gv, w15: bv
+.endif
 .if \half
 mov w9, #512
 movk    w9, #128, lsl #16   // w9: const_offset
@@ -105,21 +130,22 @@ endfunc
 dup v6.4s, w9
 .endm
 
-function ff_rgb24ToUV_half_neon, export=1
+.macro rgbToUV_half bgr
 cmp w5, #0  // check width > 0
 b.le    3f
 
 cmp w5, #8
-rgb24_load_uv_coeff half=1
+rgb_load_uv_coeff half=1, bgr=\bgr
 b.lt    2f
+// The following comments assume RGB order. The logic for RGB and BGR 
is the same.
 1:
 ld3 { v16.16b, v17.16
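
For readers following the assembly, here is a scalar sketch of what the Y conversion in the patch above appears to compute for BGR24 input. It assumes `RGB2YUV_SHIFT` is 15 (inferred from the patch comment `w9 = 1 << (RGB2YUV_SHIFT - 7)` together with the constant 256) and that the coefficient table starts with ry, gy, by; `bgr24_to_y_ref` is a hypothetical name, not a function from libswscale:

```c
#include <stdint.h>

#define RGB2YUV_SHIFT 15 /* assumption inferred from the patch comments */

/* Hypothetical scalar reference: one 16-bit luma value per BGR24 pixel. */
static void bgr24_to_y_ref(int16_t *dst, const uint8_t *src, int width,
                           const int32_t *rgb2yuv)
{
    const int32_t ry = rgb2yuv[0], gy = rgb2yuv[1], by = rgb2yuv[2];
    /* const_offset from the assembly: 32 << 14 plus a rounding term of 256 */
    const int32_t offset = (32 << (RGB2YUV_SHIFT - 1)) +
                           (1 << (RGB2YUV_SHIFT - 7));

    for (int i = 0; i < width; i++) {
        int b = src[3 * i + 0]; /* BGR order: blue is stored first */
        int g = src[3 * i + 1];
        int r = src[3 * i + 2];
        dst[i] = (int16_t)((ry * r + gy * g + by * b + offset) >> 9);
    }
}
```

The `bgr=1` branch of the macro reaches the same result by loading the three coefficients into swapped registers, so the vector loop itself stays unchanged.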

[FFmpeg-devel] [PATCH 2/2] swscale/aarch64: Add bgra/rgba to yuv

2024-06-15 Thread Zhao Zhili

From: Zhao Zhili 

Test on Apple M1 with kperf

bgra_to_uv_8_c: 13.4
bgra_to_uv_8_neon: 37.4
bgra_to_uv_128_c: 155.9
bgra_to_uv_128_neon: 91.7
bgra_to_uv_1080_c: 1173.2
bgra_to_uv_1080_neon: 822.7
bgra_to_uv_1920_c: 2078.2
bgra_to_uv_1920_neon: 1437.7
bgra_to_uv_half_8_c: 17.9
bgra_to_uv_half_8_neon: 37.4
bgra_to_uv_half_128_c: 103.9
bgra_to_uv_half_128_neon: 73.9
bgra_to_uv_half_1080_c: 850.2
bgra_to_uv_half_1080_neon: 484.2
bgra_to_uv_half_1920_c: 1479.2
bgra_to_uv_half_1920_neon: 824.2
bgra_to_y_8_c: 8.2
bgra_to_y_8_neon: 18.2
bgra_to_y_128_c: 101.4
bgra_to_y_128_neon: 74.9
bgra_to_y_1080_c: 739.4
bgra_to_y_1080_neon: 613.4
bgra_to_y_1920_c: 1298.7
bgra_to_y_1920_neon: 918.7
---
 libswscale/aarch64/input.S   | 81 +++-
 libswscale/aarch64/swscale.c | 16 +++
 2 files changed, 86 insertions(+), 11 deletions(-)

diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
index 2b956fe5c2..37f1158504 100644
--- a/libswscale/aarch64/input.S
+++ b/libswscale/aarch64/input.S
@@ -20,8 +20,12 @@
 
 #include "libavutil/aarch64/asm.S"
 
-.macro rgb_to_yuv_load_rgb src
+.macro rgb_to_yuv_load_rgb src, element=3
+.if \element == 3
 ld3 { v16.16b, v17.16b, v18.16b }, [\src]
+.else
+ld4 { v16.16b, v17.16b, v18.16b, v19.16b }, [\src]
+.endif
 uxtl    v19.8h, v16.8b // v19: r
 uxtl    v20.8h, v17.8b // v20: g
 uxtl    v21.8h, v18.8b // v21: b
@@ -43,7 +47,7 @@
 sqshrn2 \dst\().8h, \dst2\().4s, \right_shift   // 
dst_higher_half = dst2 >> right_shift
 .endm
 
-.macro rgbToY bgr
+.macro rgbToY bgr, element=3
 cmp w4, #0  // check width > 0
 .if \bgr
 ldr w12, [x5]   // w12: ry
@@ -67,11 +71,15 @@
 dup v2.8h, w12
 b.lt    2f
 1:
-rgb_to_yuv_load_rgb x1
+rgb_to_yuv_load_rgb x1, \element
 rgb_to_yuv_product v19, v20, v21, v25, v26, v16, v0, v1, v2, #9
 rgb_to_yuv_product v22, v23, v24, v27, v28, v17, v0, v1, v2, #9
 sub w4, w4, #16 // width -= 16
+.if \element == 3
 add x1, x1, #48 // src += 48
+.else
+add x1, x1, #64
+.endif
 cmp w4, #16 // width >= 16 ?
 stp q16, q17, [x0], #32 // store to dst
 b.ge    1b
@@ -86,7 +94,7 @@
 smaddl  x13, w15, w12, x13  // x13 += by * b
 asr w13, w13, #9// x13 >>= 9
 sub w4, w4, #1  // width--
-add x1, x1, #3  // src += 3
+add x1, x1, \element
 strh    w13, [x0], #2   // store to dst
 cbnz    w4, 2b
 3:
@@ -101,6 +109,14 @@ function ff_bgr24ToY_neon, export=1
 rgbToY  bgr=1
 endfunc
 
+function ff_rgba32ToY_neon, export=1
+rgbToY  bgr=0, element=4
+endfunc
+
+function ff_bgra32ToY_neon, export=1
+rgbToY  bgr=1, element=4
+endfunc
+
 .macro rgb_load_uv_coeff half, bgr
 .if \bgr
 ldr w12, [x6, #12]
@@ -130,7 +146,7 @@ endfunc
 dup v6.4s, w9
 .endm
 
-.macro rgbToUV_half bgr
+.macro rgbToUV_half bgr, element=3
 cmp w5, #0  // check width > 0
 b.le    3f
 
@@ -139,7 +155,11 @@ endfunc
 b.lt    2f
 // The following comments assume RGB order. The logic for RGB and BGR 
is the same.
 1:
+.if \element == 3
 ld3 { v16.16b, v17.16b, v18.16b }, [x3]
+.else
+ld4 { v16.16b, v17.16b, v18.16b, v19.16b }, [x3]
+.endif
 uaddlp  v19.8h, v16.16b // v19: r
 uaddlp  v20.8h, v17.16b // v20: g
 uaddlp  v21.8h, v18.16b // v21: b
@@ -147,7 +167,11 @@ endfunc
 rgb_to_yuv_product v19, v20, v21, v22, v23, v16, v0, v1, v2, #10
 rgb_to_yuv_product v19, v20, v21, v24, v25, v17, v3, v4, v5, #10
 sub w5, w5, #8  // width -= 8
-add x3, x3, #48 // src += 48
+.if \element == 3
+add x3, x3, #48
+.else
+add x3, x3, #64
+.endif
 cmp w5, #8  // width >= 8 ?
 str q16, [x0], #16  // store dst_u
 str q17, [x1], #16  // store dst_v
@@ -155,9 +179,10 @@ endfunc
 cbz w5, 3f
 2:
 ldrb    w2, [x3]                // w2: r1
-ldrb    w4, [x3, #3]            // w4: r2
+ldrb    w4, [x3, \element]      // w4: r2
 add w2, w2, w4  // w2 = r1 + r2
 
+.if \element == 3

[FFmpeg-devel] [PATCH v4 3/4] lavc/vp9dsp: R-V V mc tap h v

2024-06-15 Thread uk7b

From: sunyuechi 

                                           C908    X60
vp9_avg_8tap_smooth_4h_8bpp_c          :   12.7   11.2
vp9_avg_8tap_smooth_4h_8bpp_rvv_i32    :    4.7    4.2
vp9_avg_8tap_smooth_4v_8bpp_c          :   29.7   12.5
vp9_avg_8tap_smooth_4v_8bpp_rvv_i32    :    4.7    4.2
vp9_avg_8tap_smooth_8h_8bpp_c          :   48.7   42.2
vp9_avg_8tap_smooth_8h_8bpp_rvv_i32    :    9.5    8.5
vp9_avg_8tap_smooth_8v_8bpp_c          :   49.7   45.5
vp9_avg_8tap_smooth_8v_8bpp_rvv_i32    :    9.5    8.5
vp9_avg_8tap_smooth_16h_8bpp_c         :  192.0  166.5
vp9_avg_8tap_smooth_16h_8bpp_rvv_i32   :   21.7   19.5
vp9_avg_8tap_smooth_16v_8bpp_c         :  191.2  175.2
vp9_avg_8tap_smooth_16v_8bpp_rvv_i32   :   21.2   19.0
vp9_avg_8tap_smooth_32h_8bpp_c         :  780.2  663.2
vp9_avg_8tap_smooth_32h_8bpp_rvv_i32   :   68.2   60.5
vp9_avg_8tap_smooth_32v_8bpp_c         :  770.0  685.7
vp9_avg_8tap_smooth_32v_8bpp_rvv_i32   :   67.0   59.5
vp9_avg_8tap_smooth_64h_8bpp_c         : 3116.2 2648.2
vp9_avg_8tap_smooth_64h_8bpp_rvv_i32   :  270.7  120.7
vp9_avg_8tap_smooth_64v_8bpp_c         : 3058.5 2731.7
vp9_avg_8tap_smooth_64v_8bpp_rvv_i32   :  266.5  119.0
vp9_put_8tap_smooth_4h_8bpp_c          :   11.0    9.7
vp9_put_8tap_smooth_4h_8bpp_rvv_i32    :    4.2    3.7
vp9_put_8tap_smooth_4v_8bpp_c          :   11.7   10.5
vp9_put_8tap_smooth_4v_8bpp_rvv_i32    :    4.0    3.7
vp9_put_8tap_smooth_8h_8bpp_c          :   42.0   37.5
vp9_put_8tap_smooth_8h_8bpp_rvv_i32    :    8.5    7.7
vp9_put_8tap_smooth_8v_8bpp_c          :   43.5   38.5
vp9_put_8tap_smooth_8v_8bpp_rvv_i32    :    8.7    7.7
vp9_put_8tap_smooth_16h_8bpp_c         :  181.7  147.2
vp9_put_8tap_smooth_16h_8bpp_rvv_i32   :   20.0   18.0
vp9_put_8tap_smooth_16v_8bpp_c         :  168.5  149.7
vp9_put_8tap_smooth_16v_8bpp_rvv_i32   :   19.7   17.5
vp9_put_8tap_smooth_32h_8bpp_c         :  675.0  586.5
vp9_put_8tap_smooth_32h_8bpp_rvv_i32   :   65.2   58.0
vp9_put_8tap_smooth_32v_8bpp_c         :  664.7  591.2
vp9_put_8tap_smooth_32v_8bpp_rvv_i32   :   64.0   57.0
vp9_put_8tap_smooth_64h_8bpp_c         : 2696.2 2339.0
vp9_put_8tap_smooth_64h_8bpp_rvv_i32   :  259.7  115.7
vp9_put_8tap_smooth_64v_8bpp_c         : 2691.0 2348.5
vp9_put_8tap_smooth_64v_8bpp_rvv_i32   :  255.5  114.0
---
 libavcodec/riscv/vp9_mc_rvv.S  | 200 +
 libavcodec/riscv/vp9dsp.h  |  72 
 libavcodec/riscv/vp9dsp_init.c |  38 ++-
 3 files changed, 285 insertions(+), 25 deletions(-)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index 5241562531..5e81301aa5 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -36,6 +36,18 @@
 .endif
 .endm
 
+.macro vsetvlstatic16 len
+.ifc \len,4
+vsetvli zero, zero, e16, mf2, ta, ma
+.elseif \len == 8
+vsetvli zero, zero, e16, m1, ta, ma
+.elseif \len == 16
+vsetvli zero, zero, e16, m2, ta, ma
+.else
+vsetvli zero, zero, e16, m4, ta, ma
+.endif
+.endm
+
 .macro copy_avg len
 func ff_vp9_avg\len\()_rvv, zve32x
 csrwi   vxrm, 0
@@ -181,8 +193,196 @@ func ff_\op\()_vp9_bilin_64hv_rvv, zve32x
 endfunc
 .endm
 
+.equ ff_vp9_subpel_filters_smooth, ff_vp9_subpel_filters
+.equ ff_vp9_subpel_filters_regular, ff_vp9_subpel_filters + 16*8*2
+.equ ff_vp9_subpel_filters_sharp, ff_vp9_subpel_filters + 16*8*2*2
+
+.macro epel_filter name, type, regtype
+lla \regtype\()2, ff_vp9_subpel_filters_\name
+
+.ifc \type,v
+slli    \regtype\()0, a6, 4
+.else
+slli    \regtype\()0, a5, 4
+.endif
+add \regtype\()0, \regtype\()0, \regtype\()2
+
+lh  \regtype\()1, 2(\regtype\()0)
+lh  \regtype\()2, 4(\regtype\()0)
+lh  \regtype\()3, 6(\regtype\()0)
+lh  \regtype\()4, 8(\regtype\()0)
+lh  \regtype\()5, 10(\regtype\()0)
+lh  \regtype\()6, 12(\regtype\()0)
+
+.ifc \regtype,t
+lh  a7, 14(\regtype\()0)
+.else
+lh  s7, 14(\regtype\()0)
+.endif
+lh  \regtype\()0, 0(\regtype\()0)
+.endm
+
+.macro epel_load dst, len, op, name, type, from_mem, regtype
+.ifc \from_mem, 1
+vle8.v  v22, (a2)
+.ifc \type,v
+add a5, a3, a2
+sub a2, a2, a3
+vle8.v  v24, (a5)
+vle8.v  v20, (a2)
+sh1add

[FFmpeg-devel] [PATCH v4 4/4] lavc/vp9dsp: R-V V mc tap hv

2024-06-15 Thread uk7b

From: sunyuechi 

                                           C908    X60
vp9_avg_8tap_smooth_4hv_8bpp_c         :   32.0   28.0
vp9_avg_8tap_smooth_4hv_8bpp_rvv_i32   :   15.0   13.2
vp9_avg_8tap_smooth_8hv_8bpp_c         :   98.0   86.2
vp9_avg_8tap_smooth_8hv_8bpp_rvv_i32   :   23.7   21.2
vp9_avg_8tap_smooth_16hv_8bpp_c        :  355.7  297.0
vp9_avg_8tap_smooth_16hv_8bpp_rvv_i32  :   47.0   41.5
vp9_avg_8tap_smooth_32hv_8bpp_c        : 1272.7 1099.7
vp9_avg_8tap_smooth_32hv_8bpp_rvv_i32  :  134.7  119.7
vp9_avg_8tap_smooth_64hv_8bpp_c        : 4937.0 4224.2
vp9_avg_8tap_smooth_64hv_8bpp_rvv_i32  :  528.5  228.5
vp9_put_8tap_smooth_4hv_8bpp_c         :   30.2   26.7
vp9_put_8tap_smooth_4hv_8bpp_rvv_i32   :   30.5   12.5
vp9_put_8tap_smooth_8hv_8bpp_c         :   91.5   81.2
vp9_put_8tap_smooth_8hv_8bpp_rvv_i32   :   22.7   20.2
vp9_put_8tap_smooth_16hv_8bpp_c        :  313.2  277.5
vp9_put_8tap_smooth_16hv_8bpp_rvv_i32  :   45.2   40.2
vp9_put_8tap_smooth_32hv_8bpp_c        : 1166.7 1022.2
vp9_put_8tap_smooth_32hv_8bpp_rvv_i32  :  131.7  117.2
vp9_put_8tap_smooth_64hv_8bpp_c        : 4560.5 3961.7
vp9_put_8tap_smooth_64hv_8bpp_rvv_i32  :  517.0  223.2
---
 libavcodec/riscv/vp9_mc_rvv.S  | 75 ++
 libavcodec/riscv/vp9dsp_init.c |  8 
 2 files changed, 83 insertions(+)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index 5e81301aa5..474c9035ae 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -373,6 +373,77 @@ func 
ff_\op\()_vp9_8tap_\name\()_\len\()\type\()_rvv\vlen\(), zve32x
 endfunc
 .endm
 
+#if __riscv_xlen == 64
+.macro epel_hv_once len, name, op
+sub a2, a2, a3
+sub a2, a2, a3
+sub a2, a2, a3
+.irp n,0,2,4,6,8,10,12,14
+epel_load_inc   v\n, \len, put, \name, h, 1, t
+.endr
+addi    a4, a4, -1
+1:
+addi    a4, a4, -1
+epel_load   v30, \len, \op, \name, v, 0, s
+vse8.v  v30, (a0)
+vmv.v.v v0, v2
+vmv.v.v v2, v4
+vmv.v.v v4, v6
+vmv.v.v v6, v8
+vmv.v.v v8, v10
+vmv.v.v v10, v12
+vmv.v.v v12, v14
+epel_load   v14, \len, put, \name, h, 1, t
+add a2, a2, a3
+add a0, a0, a1
+bnez    a4, 1b
+epel_load   v30, \len, \op, \name, v, 0, s
+vse8.v  v30, (a0)
+.endm
+
+.macro epel_hv op, name, len, vlen
+func ff_\op\()_vp9_8tap_\name\()_\len\()hv_rvv\vlen\(), zve32x
+addi    sp, sp, -64
+.irp n,0,1,2,3,4,5,6,7
+sd  s\n, \n\()<<3(sp)
+.endr
+.if \len == 64 && \vlen < 256
+addi    sp, sp, -48
+.irp n,0,1,2,3,4,5
+sd  a\n, \n\()<<3(sp)
+.endr
+.endif
+.ifc \op,avg
+csrwi   vxrm, 0
+.endif
+epel_filter \name, h, t
+epel_filter \name, v, s
+.if \vlen < 256
+vsetvlstatic8   \len, a6, 32, m2
+.else
+vsetvlstatic8   \len, a6, 64, m2
+.endif
+epel_hv_once    \len, \name, \op
+.if \len == 64 && \vlen < 256
+.irp n,0,1,2,3,4,5
+ld  a\n, \n\()<<3(sp)
+.endr
+addi    sp, sp, 48
+addi    a0, a0, 32
+addi    a2, a2, 32
+epel_filter \name, h, t
+epel_hv_once    \len, \name, \op
+.endif
+.irp n,0,1,2,3,4,5,6,7
+ld  s\n, \n\()<<3(sp)
+.endr
+addi    sp, sp, 64
+
+ret
+endfunc
+.endm
+#endif
+
 .irp len, 64, 32, 16, 8, 4
 copy_avg \len
 .irp op, put, avg
@@ -381,6 +452,10 @@ endfunc
 epel \len, \op, \name, \type, 128
 epel \len, \op, \name, \type, 256
 .endr
+#if __riscv_xlen == 64
+epel_hv \op, \name, \len, 128
+epel_hv \op, \name, \len, 256
+#endif
 .endr
 .endr
 .endr
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index 3669070fca..7b090c9889 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -119,6 +119,10 @@ static av_cold void vp9dsp_mc_init_riscv(VP9DSPContext 
*dsp, int bpp)
 if (flags & AV_CPU_FLAG_RVB_ADDR) {
 init_subpel2(0, 0, 1, v, put, 128);
 init_subpel2(1, 0, 1, v, avg, 128);
+# if __riscv_xlen == 64
+init_subpel2(0, 1, 1, hv, put, 128);
+init_subpel2(1, 1, 1, hv, avg, 128);
+# endif

[FFmpeg-devel] [PATCH v4 1/4] lavc/vp9dsp: R-V V mc bilin h v

2024-06-15 Thread uk7b

From: sunyuechi 

                                   C908    X60
vp9_avg_bilin_4h_8bpp_c        :    5.5    4.7
vp9_avg_bilin_4h_8bpp_rvv_i32  :    1.7    1.5
vp9_avg_bilin_4v_8bpp_c        :    5.5    4.7
vp9_avg_bilin_4v_8bpp_rvv_i32  :    1.5    1.2
vp9_avg_bilin_8h_8bpp_c        :   20.0   17.7
vp9_avg_bilin_8h_8bpp_rvv_i32  :    3.0    2.7
vp9_avg_bilin_8v_8bpp_c        :   20.7   18.7
vp9_avg_bilin_8v_8bpp_rvv_i32  :    3.0    2.7
vp9_avg_bilin_16h_8bpp_c       :   78.2   69.7
vp9_avg_bilin_16h_8bpp_rvv_i32 :    7.0    6.2
vp9_avg_bilin_16v_8bpp_c       :   98.5   73.2
vp9_avg_bilin_16v_8bpp_rvv_i32 :    7.0    6.0
vp9_avg_bilin_32h_8bpp_c       :  325.5  275.5
vp9_avg_bilin_32h_8bpp_rvv_i32 :   23.0   20.5
vp9_avg_bilin_32v_8bpp_c       :  342.2  290.0
vp9_avg_bilin_32v_8bpp_rvv_i32 :   21.7   19.5
vp9_avg_bilin_64h_8bpp_c       : 1263.7 1095.7
vp9_avg_bilin_64h_8bpp_rvv_i32 :   91.2   81.2
vp9_avg_bilin_64v_8bpp_c       : 1331.7 1155.2
vp9_avg_bilin_64v_8bpp_rvv_i32 :   91.2   81.0
vp9_put_bilin_4h_8bpp_c        :    4.5    4.0
vp9_put_bilin_4h_8bpp_rvv_i32  :    1.0    1.0
vp9_put_bilin_4v_8bpp_c        :    4.7    4.2
vp9_put_bilin_4v_8bpp_rvv_i32  :    1.0    1.0
vp9_put_bilin_8h_8bpp_c        :   16.7   15.0
vp9_put_bilin_8h_8bpp_rvv_i32  :    2.2    2.0
vp9_put_bilin_8v_8bpp_c        :   17.5   15.7
vp9_put_bilin_8v_8bpp_rvv_i32  :    2.2    2.0
vp9_put_bilin_16h_8bpp_c       :   65.2   58.0
vp9_put_bilin_16h_8bpp_rvv_i32 :    6.0    5.5
vp9_put_bilin_16v_8bpp_c       :   69.2   61.7
vp9_put_bilin_16v_8bpp_rvv_i32 :    5.7    5.2
vp9_put_bilin_32h_8bpp_c       :  273.2  229.0
vp9_put_bilin_32h_8bpp_rvv_i32 :   19.7   17.7
vp9_put_bilin_32v_8bpp_c       :  290.5  243.7
vp9_put_bilin_32v_8bpp_rvv_i32 :   18.7   16.7
vp9_put_bilin_64h_8bpp_c       : 1040.5  910.5
vp9_put_bilin_64h_8bpp_rvv_i32 :   82.5   73.0
vp9_put_bilin_64v_8bpp_c       : 1108.5  971.0
vp9_put_bilin_64v_8bpp_rvv_i32 :   82.2   73.2
---
 libavcodec/riscv/vp9_mc_rvv.S  | 114 +
 libavcodec/riscv/vp9dsp.h  |  12 ++--
 libavcodec/riscv/vp9dsp_init.c |  21 ++
 3 files changed, 141 insertions(+), 6 deletions(-)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index 7cb38ec94a..fb7377048a 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -53,6 +53,120 @@ func ff_vp9_avg\len\()_rvv, zve32x
 endfunc
 .endm
 
+.macro bilin_load_h dst, op, mn
+addi    t5, a2, 1
+vle8.v  v8, (a2)
+vle8.v  v0, (t5)
+vwmulu.vx   v16, v0, \mn
+vwmaccsu.vx v16, t1, v8
+vwadd.wx    v16, v16, t4
+vnsra.wi    v16, v16, 4
+vadd.vv \dst, v16, v8
+.ifc \op,avg
+vle8.v  v16, (a0)
+vaaddu.vv   \dst, \dst, v16
+.endif
+.endm
+
+.macro bilin_h_v op, type, mn
+func ff_\op\()_vp9_bilin_64\type\()_rvv, zve32x
+vsetvlstatic8   64, t0, 64
+.ifc \op,avg
+csrwi   vxrm, 0
+.endif
+li  t4, 8
+neg t1, \mn
+1:
+addi    a4, a4, -1
+.ifc \type,v
+add t5, a2, a3
+.else
+addi    t5, a2, 1
+.endif
+vle8.v  v8, (a2)
+vle8.v  v0, (t5)
+vwmulu.vx   v16, v0, \mn
+vwmaccsu.vx v16, t1, v8
+vwadd.wx    v16, v16, t4
+vnsra.wi    v16, v16, 4
+vadd.vv v0, v16, v8
+.ifc \op,avg
+vle8.v  v16, (a0)
+vaaddu.vv   v0, v0, v16
+.endif
+vse8.v  v0, (a0)
+add a2, a2, a3
+add a0, a0, a1
+bnez    a4, 1b
+ret
+
+.Lbilin_\type\op:
+.ifc \op,avg
+csrwi   vxrm, 0
+.endif
+li  t4, 8
+neg t1, \mn
+1:
+addi    a4, a4, -2
+add t6, a0, a1
+add t0, a2, a3
+vle8.v  v8, (a2)
+vle8.v  v4, (t0)
+.ifc \type,v
+add t2, t0, a3
+vwmulu.vx   v16, v4, \mn
+.else
+addi    t3, a2, 1
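
For reference, the vector sequence in `bilin_h_v` above (widening multiply, multiply-accumulate with the negated weight, add 8, narrowing shift by 4, add the first sample) appears to implement the usual bilinear blend. A scalar sketch under that reading; `bilin_ref` and its parameters are hypothetical names, and the blend is written in an algebraically equivalent form that avoids shifting negative values:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference for the h and v bilinear cases.
 * step is 1 for horizontal filtering and src_stride for vertical;
 * mxy is the subpel weight in sixteenths (a5 or a6 in the assembly). */
static void bilin_ref(uint8_t *dst, ptrdiff_t dst_stride,
                      const uint8_t *src, ptrdiff_t src_stride,
                      int w, int h, int mxy, ptrdiff_t step, int avg)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int a = src[x], b = src[x + step];
            /* Same value as a + ((mxy * (b - a) + 8) >> 4). */
            int v = ((16 - mxy) * a + mxy * b + 8) >> 4;
            /* avg variant: rounding average with dst, as with vxrm = 0. */
            dst[x] = (uint8_t)(avg ? (dst[x] + v + 1) >> 1 : v);
        }
        dst += dst_stride;
        src += src_stride;
    }
}
```

Passing `step` as a parameter mirrors how the macro picks `a2 + 1` or `a2 + a3` for the second load depending on `\type`.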
+

[FFmpeg-devel] [PATCH v4 2/4] lavc/vp9dsp: R-V V mc bilin hv

2024-06-15 Thread uk7b

From: sunyuechi 

                                   C908    X60
vp9_avg_bilin_4hv_8bpp_c       :   10.7    9.5
vp9_avg_bilin_4hv_8bpp_rvv_i32 :    4.0    3.5
vp9_avg_bilin_8hv_8bpp_c       :   38.5   34.2
vp9_avg_bilin_8hv_8bpp_rvv_i32 :    7.2    6.5
vp9_avg_bilin_16hv_8bpp_c      :  147.2  130.5
vp9_avg_bilin_16hv_8bpp_rvv_i32:   14.5   12.7
vp9_avg_bilin_32hv_8bpp_c      :  574.2  509.7
vp9_avg_bilin_32hv_8bpp_rvv_i32:   42.5   38.0
vp9_avg_bilin_64hv_8bpp_c      : 2321.2 2017.7
vp9_avg_bilin_64hv_8bpp_rvv_i32:  163.5  131.0
vp9_put_bilin_4hv_8bpp_c       :   10.0    8.7
vp9_put_bilin_4hv_8bpp_rvv_i32 :    3.5    3.0
vp9_put_bilin_8hv_8bpp_c       :   35.2   31.2
vp9_put_bilin_8hv_8bpp_rvv_i32 :    6.5    5.7
vp9_put_bilin_16hv_8bpp_c      :  134.0  119.0
vp9_put_bilin_16hv_8bpp_rvv_i32:   12.7   11.5
vp9_put_bilin_32hv_8bpp_c      :  538.5  464.2
vp9_put_bilin_32hv_8bpp_rvv_i32:   39.7   35.2
vp9_put_bilin_64hv_8bpp_c      : 2111.7 1833.2
vp9_put_bilin_64hv_8bpp_rvv_i32:  138.5  122.5
---
 libavcodec/riscv/vp9_mc_rvv.S  | 38 +-
 libavcodec/riscv/vp9dsp_init.c | 10 +
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index fb7377048a..5241562531 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -147,6 +147,40 @@ func ff_\op\()_vp9_bilin_64\type\()_rvv, zve32x
 endfunc
 .endm
 
+.macro bilin_hv op
+func ff_\op\()_vp9_bilin_64hv_rvv, zve32x
+vsetvlstatic8   64, t0, 64
+.Lbilin_hv\op:
+.ifc \op,avg
+csrwi   vxrm, 0
+.endif
+neg t1, a5
+neg t2, a6
+li  t4, 8
+bilin_load_h    v24, put, a5
+add a2, a2, a3
+1:
+addi    a4, a4, -1
+bilin_load_h    v4, put, a5
+vwmulu.vx   v16, v4, a6
+vwmaccsu.vx v16, t2, v24
+vwadd.wx    v16, v16, t4
+vnsra.wi    v16, v16, 4
+vadd.vv v0, v16, v24
+.ifc \op,avg
+vle8.v  v16, (a0)
+vaaddu.vv   v0, v0, v16
+.endif
+vse8.v  v0, (a0)
+vmv.v.v v24, v4
+add a2, a2, a3
+add a0, a0, a1
+bnez    a4, 1b
+
+ret
+endfunc
+.endm
+
 .irp len, 64, 32, 16, 8, 4
 copy_avg \len
 .endr
@@ -155,6 +189,8 @@ bilin_h_v  put, h, a5
 bilin_h_v  avg, h, a5
 bilin_h_v  put, v, a6
 bilin_h_v  avg, v, a6
+bilin_hv   put
+bilin_hv   avg
 
 .macro func_bilin_h_v len, op, type
 func ff_\op\()_vp9_bilin_\len\()\type\()_rvv, zve32x
@@ -165,7 +201,7 @@ endfunc
 
 .irp len, 32, 16, 8, 4
 .irp op, put, avg
-.irp type, h, v
+.irp type, h, v, hv
 func_bilin_h_v \len, \op, \type
 .endr
 .endr
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index 9606d8545f..b3700dfb08 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -83,6 +83,16 @@ static av_cold void vp9dsp_mc_init_riscv(VP9DSPContext *dsp, int bpp)
 dsp->mc[4][FILTER_BILINEAR ][0][1][0] = ff_put_vp9_bilin_4h_rvv;
 dsp->mc[4][FILTER_BILINEAR ][1][0][1] = ff_avg_vp9_bilin_4v_rvv;
 dsp->mc[4][FILTER_BILINEAR ][1][1][0] = ff_avg_vp9_bilin_4h_rvv;
+dsp->mc[0][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_64hv_rvv;
+dsp->mc[0][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bilin_64hv_rvv;
+dsp->mc[1][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_32hv_rvv;
+dsp->mc[1][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bilin_32hv_rvv;
+dsp->mc[2][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_16hv_rvv;
+dsp->mc[2][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bilin_16hv_rvv;
+dsp->mc[3][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_8hv_rvv;
+dsp->mc[3][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bilin_8hv_rvv;
+dsp->mc[4][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_4hv_rvv;
+dsp->mc[4][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bilin_4hv_rvv;
 
 #undef init_fpel
 }
-- 
2.45.2

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH v4 1/4] lavc/vp9dsp: R-V V mc bilin h v

2024-06-15 Thread flow gg

Just like in VP8, the unroll has been updated.

On Sat, Jun 15, 2024 at 19:51,  wrote:

> From: sunyuechi 
>
>                                    C908     X60
> vp9_avg_bilin_4h_8bpp_c       :     5.5     4.7
> vp9_avg_bilin_4h_8bpp_rvv_i32 :     1.7     1.5
> vp9_avg_bilin_4v_8bpp_c       :     5.5     4.7
> vp9_avg_bilin_4v_8bpp_rvv_i32 :     1.5     1.2
> vp9_avg_bilin_8h_8bpp_c       :    20.0    17.7
> vp9_avg_bilin_8h_8bpp_rvv_i32 :     3.0     2.7
> vp9_avg_bilin_8v_8bpp_c       :    20.7    18.7
> vp9_avg_bilin_8v_8bpp_rvv_i32 :     3.0     2.7
> vp9_avg_bilin_16h_8bpp_c      :    78.2    69.7
> vp9_avg_bilin_16h_8bpp_rvv_i32:     7.0     6.2
> vp9_avg_bilin_16v_8bpp_c      :    98.5    73.2
> vp9_avg_bilin_16v_8bpp_rvv_i32:     7.0     6.0
> vp9_avg_bilin_32h_8bpp_c      :   325.5   275.5
> vp9_avg_bilin_32h_8bpp_rvv_i32:    23.0    20.5
> vp9_avg_bilin_32v_8bpp_c      :   342.2   290.0
> vp9_avg_bilin_32v_8bpp_rvv_i32:    21.7    19.5
> vp9_avg_bilin_64h_8bpp_c      :  1263.7  1095.7
> vp9_avg_bilin_64h_8bpp_rvv_i32:    91.2    81.2
> vp9_avg_bilin_64v_8bpp_c      :  1331.7  1155.2
> vp9_avg_bilin_64v_8bpp_rvv_i32:    91.2    81.0
> vp9_put_bilin_4h_8bpp_c       :     4.5     4.0
> vp9_put_bilin_4h_8bpp_rvv_i32 :     1.0     1.0
> vp9_put_bilin_4v_8bpp_c       :     4.7     4.2
> vp9_put_bilin_4v_8bpp_rvv_i32 :     1.0     1.0
> vp9_put_bilin_8h_8bpp_c       :    16.7    15.0
> vp9_put_bilin_8h_8bpp_rvv_i32 :     2.2     2.0
> vp9_put_bilin_8v_8bpp_c       :    17.5    15.7
> vp9_put_bilin_8v_8bpp_rvv_i32 :     2.2     2.0
> vp9_put_bilin_16h_8bpp_c      :    65.2    58.0
> vp9_put_bilin_16h_8bpp_rvv_i32:     6.0     5.5
> vp9_put_bilin_16v_8bpp_c      :    69.2    61.7
> vp9_put_bilin_16v_8bpp_rvv_i32:     5.7     5.2
> vp9_put_bilin_32h_8bpp_c      :   273.2   229.0
> vp9_put_bilin_32h_8bpp_rvv_i32:    19.7    17.7
> vp9_put_bilin_32v_8bpp_c      :   290.5   243.7
> vp9_put_bilin_32v_8bpp_rvv_i32:    18.7    16.7
> vp9_put_bilin_64h_8bpp_c      :  1040.5   910.5
> vp9_put_bilin_64h_8bpp_rvv_i32:    82.5    73.0
> vp9_put_bilin_64v_8bpp_c      :  1108.5   971.0
> vp9_put_bilin_64v_8bpp_rvv_i32:    82.2    73.2
> ---
>  libavcodec/riscv/vp9_mc_rvv.S  | 114 +
>  libavcodec/riscv/vp9dsp.h  |  12 ++--
>  libavcodec/riscv/vp9dsp_init.c |  21 ++
>  3 files changed, 141 insertions(+), 6 deletions(-)
>
> diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
> index 7cb38ec94a..fb7377048a 100644
> --- a/libavcodec/riscv/vp9_mc_rvv.S
> +++ b/libavcodec/riscv/vp9_mc_rvv.S
> @@ -53,6 +53,120 @@ func ff_vp9_avg\len\()_rvv, zve32x
>  endfunc
>  .endm
>
> +.macro bilin_load_h dst, op, mn
> +addi    t5, a2, 1
> +vle8.v  v8, (a2)
> +vle8.v  v0, (t5)
> +vwmulu.vx   v16, v0, \mn
> +vwmaccsu.vx v16, t1, v8
> +vwadd.wx    v16, v16, t4
> +vnsra.wi    v16, v16, 4
> +vadd.vv \dst, v16, v8
> +.ifc \op,avg
> +vle8.v  v16, (a0)
> +vaaddu.vv   \dst, \dst, v16
> +.endif
> +.endm
> +
> +.macro bilin_h_v op, type, mn
> +func ff_\op\()_vp9_bilin_64\type\()_rvv, zve32x
> +vsetvlstatic8   64, t0, 64
> +.ifc \op,avg
> +csrwi   vxrm, 0
> +.endif
> +li  t4, 8
> +neg t1, \mn
> +1:
> +addi    a4, a4, -1
> +.ifc \type,v
> +add t5, a2, a3
> +.else
> +addi    t5, a2, 1
> +.endif
> +vle8.v  v8, (a2)
> +vle8.v  v0, (t5)
> +vwmulu.vx   v16, v0, \mn
> +vwmaccsu.vx v16, t1, v8
> +vwadd.wx    v16, v16, t4
> +vnsra.wi    v16, v16, 4
> +vadd.vv v0, v16, v8
> +.ifc \op,avg
> +vle8.v  v16, (a0)
> +vaaddu.vv   v0, v0, v16
> +.endif
> +vse8.v  v0, (a0)
> +add a2, a2, a3
> +add a0, a0, a1
> +bnez    a4, 1b
> +ret
> +
> +.Lbilin_\type\op:
> +.ifc \op,avg
> +csrwi   vxrm, 0
> +.endif
> +li  t4, 8
> +neg t1, \mn
> +1:
> +addi

Re: [FFmpeg-devel] [PATCH v4 2/4] lavc/vp9dsp: R-V V mc bilin hv

2024-06-15 Thread flow gg

> Copying vectors is rarely justified - mostly only before destructive
> instructions such as FMA.

It is slightly different from VP8. In VP8, many scalar values are positive,
so the related calculations can be easily replaced. However, in this
context of VP9, since t2 is a negative number, vwmaccsu is required.
Therefore, unlike the logic in VP8, we cannot use vwmulu.vx before
bilin_load to avoid vmv.
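
As an illustration only, the arithmetic under discussion can be sketched in
scalar C. This is not code from the patch: bilin_ref and bilin_split are
made-up names, and the split form is an assumed scalar reading of the
vwmulu/vwmaccsu/vwadd.wx/vnsra/vadd sequence quoted below. It shows why the
second multiply needs a signed scalar (the negated weight) against an unsigned
pixel.

```c
#include <stdint.h>

/* Reference form of the bilinear filter: a + ((m * (b - a) + 8) >> 4),
 * with weight m in 0..15 and 8-bit pixels a, b. */
static inline uint8_t bilin_ref(uint8_t a, uint8_t b, int m)
{
    return (uint8_t)(a + ((m * (b - a) + 8) >> 4));
}

/* Split form (illustrative): m*b as an unsigned widening product, then
 * accumulate (-m)*a, i.e. signed scalar times unsigned pixel, then add the
 * rounding constant, shift right by 4, narrow to 8 bits, and add a back.
 * Assumes >> on a negative int16_t is an arithmetic shift, which is what
 * mainstream compilers do. */
static inline uint8_t bilin_split(uint8_t a, uint8_t b, int m)
{
    int16_t acc = (int16_t)(m * b);           /* at most 15 * 255 = 3825 */
    acc = (int16_t)(acc + (-m) * a);          /* the negative weight */
    acc = (int16_t)(acc + 8);                 /* rounding */
    uint8_t delta = (uint8_t)(acc >> 4);      /* keep the low 8 bits */
    return (uint8_t)(a + delta);              /* 8-bit wrapping add */
}
```

Both functions give the same result for every 8-bit a, b and every weight from
0 to 15, since the narrowing and the final add are both taken modulo 256.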


On Sat, Jun 15, 2024 at 19:51,  wrote:

> From: sunyuechi 
>
>                                     C908     X60
> vp9_avg_bilin_4hv_8bpp_c       :    10.7     9.5
> vp9_avg_bilin_4hv_8bpp_rvv_i32 :     4.0     3.5
> vp9_avg_bilin_8hv_8bpp_c       :    38.5    34.2
> vp9_avg_bilin_8hv_8bpp_rvv_i32 :     7.2     6.5
> vp9_avg_bilin_16hv_8bpp_c      :   147.2   130.5
> vp9_avg_bilin_16hv_8bpp_rvv_i32:    14.5    12.7
> vp9_avg_bilin_32hv_8bpp_c      :   574.2   509.7
> vp9_avg_bilin_32hv_8bpp_rvv_i32:    42.5    38.0
> vp9_avg_bilin_64hv_8bpp_c      :  2321.2  2017.7
> vp9_avg_bilin_64hv_8bpp_rvv_i32:   163.5   131.0
> vp9_put_bilin_4hv_8bpp_c       :    10.0     8.7
> vp9_put_bilin_4hv_8bpp_rvv_i32 :     3.5     3.0
> vp9_put_bilin_8hv_8bpp_c       :    35.2    31.2
> vp9_put_bilin_8hv_8bpp_rvv_i32 :     6.5     5.7
> vp9_put_bilin_16hv_8bpp_c      :   134.0   119.0
> vp9_put_bilin_16hv_8bpp_rvv_i32:    12.7    11.5
> vp9_put_bilin_32hv_8bpp_c      :   538.5   464.2
> vp9_put_bilin_32hv_8bpp_rvv_i32:    39.7    35.2
> vp9_put_bilin_64hv_8bpp_c      :  2111.7  1833.2
> vp9_put_bilin_64hv_8bpp_rvv_i32:   138.5   122.5
> ---
>  libavcodec/riscv/vp9_mc_rvv.S  | 38 +-
>  libavcodec/riscv/vp9dsp_init.c | 10 +
>  2 files changed, 47 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
> index fb7377048a..5241562531 100644
> --- a/libavcodec/riscv/vp9_mc_rvv.S
> +++ b/libavcodec/riscv/vp9_mc_rvv.S
> @@ -147,6 +147,40 @@ func ff_\op\()_vp9_bilin_64\type\()_rvv, zve32x
>  endfunc
>  .endm
>
> +.macro bilin_hv op
> +func ff_\op\()_vp9_bilin_64hv_rvv, zve32x
> +vsetvlstatic8   64, t0, 64
> +.Lbilin_hv\op:
> +.ifc \op,avg
> +csrwi   vxrm, 0
> +.endif
> +neg t1, a5
> +neg t2, a6
> +li  t4, 8
> +bilin_load_h    v24, put, a5
> +add a2, a2, a3
> +1:
> +addi    a4, a4, -1
> +bilin_load_h    v4, put, a5
> +vwmulu.vx   v16, v4, a6
> +vwmaccsu.vx v16, t2, v24
> +vwadd.wx    v16, v16, t4
> +vnsra.wi    v16, v16, 4
> +vadd.vv v0, v16, v24
> +.ifc \op,avg
> +vle8.v  v16, (a0)
> +vaaddu.vv   v0, v0, v16
> +.endif
> +vse8.v  v0, (a0)
> +vmv.v.v v24, v4
> +add a2, a2, a3
> +add a0, a0, a1
> +bnez    a4, 1b
> +
> +ret
> +endfunc
> +.endm
> +
>  .irp len, 64, 32, 16, 8, 4
>  copy_avg \len
>  .endr
> @@ -155,6 +189,8 @@ bilin_h_v  put, h, a5
>  bilin_h_v  avg, h, a5
>  bilin_h_v  put, v, a6
>  bilin_h_v  avg, v, a6
> +bilin_hv   put
> +bilin_hv   avg
>
>  .macro func_bilin_h_v len, op, type
>  func ff_\op\()_vp9_bilin_\len\()\type\()_rvv, zve32x
> @@ -165,7 +201,7 @@ endfunc
>
>  .irp len, 32, 16, 8, 4
>  .irp op, put, avg
> -.irp type, h, v
> +.irp type, h, v, hv
>  func_bilin_h_v \len, \op, \type
>  .endr
>  .endr
> diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
> index 9606d8545f..b3700dfb08 100644
> --- a/libavcodec/riscv/vp9dsp_init.c
> +++ b/libavcodec/riscv/vp9dsp_init.c
> @@ -83,6 +83,16 @@ static av_cold void vp9dsp_mc_init_riscv(VP9DSPContext *dsp, int bpp)
>  dsp->mc[4][FILTER_BILINEAR ][0][1][0] = ff_put_vp9_bilin_4h_rvv;
>  dsp->mc[4][FILTER_BILINEAR ][1][0][1] = ff_avg_vp9_bilin_4v_rvv;
>  dsp->mc[4][FILTER_BILINEAR ][1][1][0] = ff_avg_vp9_bilin_4h_rvv;
> +dsp->mc[0][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_64hv_rvv;
> +dsp->mc[0][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bilin_64hv_rvv;
> +dsp->mc[1][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_32hv_rvv;
> +dsp->mc[1][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bilin_32hv_rvv;
> +dsp->mc[2][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_16hv_rvv;
> +dsp->mc[2][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bilin_16hv_rvv;
> +dsp->mc[3][FILTER_BILINEAR ][0][1][1] = ff_put_vp9_bilin_8hv_rvv;
> +dsp->mc[3][FILTER_BILINEAR ][1][1][1] = ff_avg_vp9_bil

Re: [FFmpeg-devel] [PATCH v4 3/4] lavc/vp9dsp: R-V V mc tap h v

2024-06-15 Thread flow gg

> You can directly LLA filters + 16 * 8 * 2 and save one add. Same below.
> You can also use .equ to alias the filter addresses, and avoid if's.

> That's a lot of address dependencies, which is going to hurt performance.
> It might help to just spill more S registers if needed.

> This can be done in 3 instructions, even without mul. Of course you'll
> again need a spare register.

Okay, I have updated all of these.

> Use a macro parameter for the stride register.

Doing this will reduce one if-else statement in this patch, but in the next
patch, it will lead to adding multiple if-else statements. I think we can
leave it unchanged.
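
For reference, the address arithmetic behind the LLA/.equ suggestion can be
sketched in C. This is only an illustration: subpel_row_offset and the enum
names are hypothetical, and the bank order (smooth, regular, sharp) and the
16 x 8 layout of 16-bit coefficients are assumptions read off the .equ lines
and the "slli ..., 4" in the quoted patch below.

```c
#include <stddef.h>
#include <stdint.h>

/* Bank order assumed from the .equ aliases: +0, +16*8*2, +16*8*2*2. */
enum { BANK_SMOOTH = 0, BANK_REGULAR = 1, BANK_SHARP = 2 };

/* Byte offset of the 8-coefficient row for a filter bank and a subpel
 * position mx (0..15), relative to the start of the table. */
static inline size_t subpel_row_offset(int bank, int mx)
{
    const size_t row_bytes  = 8 * sizeof(int16_t);  /* 16 bytes, hence slli by 4 */
    const size_t bank_bytes = 16 * row_bytes;       /* 16*8*2 = 256 bytes */
    return (size_t)bank * bank_bytes + (size_t)mx * row_bytes;
}
```

Because the bank part is a compile-time constant, it can be folded into the
symbol address, which is why aliasing the three bank addresses saves the add
and the conditionals.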

On Sat, Jun 15, 2024 at 19:51,  wrote:

> From: sunyuechi 
>
>                                          C908     X60
> vp9_avg_8tap_smooth_4h_8bpp_c       :    12.7    11.2
> vp9_avg_8tap_smooth_4h_8bpp_rvv_i32 :     4.7     4.2
> vp9_avg_8tap_smooth_4v_8bpp_c       :    29.7    12.5
> vp9_avg_8tap_smooth_4v_8bpp_rvv_i32 :     4.7     4.2
> vp9_avg_8tap_smooth_8h_8bpp_c       :    48.7    42.2
> vp9_avg_8tap_smooth_8h_8bpp_rvv_i32 :     9.5     8.5
> vp9_avg_8tap_smooth_8v_8bpp_c       :    49.7    45.5
> vp9_avg_8tap_smooth_8v_8bpp_rvv_i32 :     9.5     8.5
> vp9_avg_8tap_smooth_16h_8bpp_c      :   192.0   166.5
> vp9_avg_8tap_smooth_16h_8bpp_rvv_i32:    21.7    19.5
> vp9_avg_8tap_smooth_16v_8bpp_c      :   191.2   175.2
> vp9_avg_8tap_smooth_16v_8bpp_rvv_i32:    21.2    19.0
> vp9_avg_8tap_smooth_32h_8bpp_c      :   780.2   663.2
> vp9_avg_8tap_smooth_32h_8bpp_rvv_i32:    68.2    60.5
> vp9_avg_8tap_smooth_32v_8bpp_c      :   770.0   685.7
> vp9_avg_8tap_smooth_32v_8bpp_rvv_i32:    67.0    59.5
> vp9_avg_8tap_smooth_64h_8bpp_c      :  3116.2  2648.2
> vp9_avg_8tap_smooth_64h_8bpp_rvv_i32:   270.7   120.7
> vp9_avg_8tap_smooth_64v_8bpp_c      :  3058.5  2731.7
> vp9_avg_8tap_smooth_64v_8bpp_rvv_i32:   266.5   119.0
> vp9_put_8tap_smooth_4h_8bpp_c       :    11.0     9.7
> vp9_put_8tap_smooth_4h_8bpp_rvv_i32 :     4.2     3.7
> vp9_put_8tap_smooth_4v_8bpp_c       :    11.7    10.5
> vp9_put_8tap_smooth_4v_8bpp_rvv_i32 :     4.0     3.7
> vp9_put_8tap_smooth_8h_8bpp_c       :    42.0    37.5
> vp9_put_8tap_smooth_8h_8bpp_rvv_i32 :     8.5     7.7
> vp9_put_8tap_smooth_8v_8bpp_c       :    43.5    38.5
> vp9_put_8tap_smooth_8v_8bpp_rvv_i32 :     8.7     7.7
> vp9_put_8tap_smooth_16h_8bpp_c      :   181.7   147.2
> vp9_put_8tap_smooth_16h_8bpp_rvv_i32:    20.0    18.0
> vp9_put_8tap_smooth_16v_8bpp_c      :   168.5   149.7
> vp9_put_8tap_smooth_16v_8bpp_rvv_i32:    19.7    17.5
> vp9_put_8tap_smooth_32h_8bpp_c      :   675.0   586.5
> vp9_put_8tap_smooth_32h_8bpp_rvv_i32:    65.2    58.0
> vp9_put_8tap_smooth_32v_8bpp_c      :   664.7   591.2
> vp9_put_8tap_smooth_32v_8bpp_rvv_i32:    64.0    57.0
> vp9_put_8tap_smooth_64h_8bpp_c      :  2696.2  2339.0
> vp9_put_8tap_smooth_64h_8bpp_rvv_i32:   259.7   115.7
> vp9_put_8tap_smooth_64v_8bpp_c      :  2691.0  2348.5
> vp9_put_8tap_smooth_64v_8bpp_rvv_i32:   255.5   114.0
> ---
>  libavcodec/riscv/vp9_mc_rvv.S  | 200 +
>  libavcodec/riscv/vp9dsp.h  |  72 
>  libavcodec/riscv/vp9dsp_init.c |  38 ++-
>  3 files changed, 285 insertions(+), 25 deletions(-)
>
> diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
> index 5241562531..5e81301aa5 100644
> --- a/libavcodec/riscv/vp9_mc_rvv.S
> +++ b/libavcodec/riscv/vp9_mc_rvv.S
> @@ -36,6 +36,18 @@
>  .endif
>  .endm
>
> +.macro vsetvlstatic16 len
> +.ifc \len,4
> +vsetvli zero, zero, e16, mf2, ta, ma
> +.elseif \len == 8
> +vsetvli zero, zero, e16, m1, ta, ma
> +.elseif \len == 16
> +vsetvli zero, zero, e16, m2, ta, ma
> +.else
> +vsetvli zero, zero, e16, m4, ta, ma
> +.endif
> +.endm
> +
>  .macro copy_avg len
>  func ff_vp9_avg\len\()_rvv, zve32x
>  csrwi   vxrm, 0
> @@ -181,8 +193,196 @@ func ff_\op\()_vp9_bilin_64hv_rvv, zve32x
>  endfunc
>  .endm
>
> +.equ ff_vp9_subpel_filters_smooth, ff_vp9_subpel_filters
> +.equ ff_vp9_subpel_filters_regular, ff_vp9_subpel_filters + 16*8*2
> +.equ ff_vp9_subpel_filters_sharp, ff_vp9_subpel_filters + 16*8*2*2
> +
> +.macro epel_filter name, type, regtype
> +lla \regtype\()2, ff_vp9_subpel_filters_\name
> +
> +.ifc \type,v
> +slli\regtype\()0, a6, 4
> +.else
> +slli\regtype\()0, a5, 4
> +.endif
> +add \regtype\()0, \regtype\()0, \r

Re: [FFmpeg-devel] [PATCH] lavc/vvc: Invalidate PPSs which refer to a changed SPS

2024-06-15 Thread Nuo Mi

On Sat, Jun 15, 2024 at 2:35 PM Christophe Gisquet <
christophe.gisq...@gmail.com> wrote:

> On Fri, Jun 14, 2024, 11:39, Frank Plowman  wrote:
>
> > When the SPS associated with a particular SPS ID changes, invalidate all
> > the PPSs which use that SPS ID.  Fixes crashes with illegal bitstreams.
> > This is done in the CBS, rather than in libavcodec/vvc/ps.c like the SPS
> > ID reuse validation, as parts of the CBS parsing process for PPSs
> > depend on the SPS being referred to.
> >
>
> I am uncertain about this. I have no definite knowledge nor proof, but I
> would have thought these are persistent, IE it's legal to update some of
> them, their validity depending on something else.
>

> Wondering if the tested streams are thus conformant.
>
> But I don't know the actual rule. Maybe finding an EOB/EOS NUT? Related to
> some particular shape of a clean random access point, that would require
> retransmitting VPS/SPS/PPS/APS/... ?
>
> Asking Benjamin Bross might be a better option here.
>
Hi Chris,
The spec says the SPS should not change within a CVS. Frank has some patches
that fix a similar issue:
https://github.com/FFmpeg/FFmpeg/commit/2d79ae3f8a3306d24afe43ba505693a8dbefd21b


Hi Frank,
Did it crash before your error-handling code in ps.c?
Could you send me the clip?

Thank you



Re: [FFmpeg-devel] [PATCH v5 1/1] avcodec: add external enc libvvenc for H266/VVC

2024-06-15 Thread Nuo Mi

On Wed, Jun 12, 2024 at 9:36 PM Nuo Mi  wrote:

> Hi all,
> If there are no objections, I will push it in 3 days.
>
> Thank you,  Christian
>
>
> On Thu, Jun 6, 2024 at 3:52 AM Christian Bartnik 
> wrote:
>
>> From: Thomas Siedel 
>>
>> Add external encoder VVenC for H266/VVC encoding.
>> Register new encoder libvvenc.
>> Add libvvenc to wrap the vvenc interface.
>> libvvenc implements encoder option: preset,qp,qpa,period,
>> passlogfile,stats,vvenc-params,level,tier.
>> Enable encoder by adding --enable-libvvenc in configure step.
>
> Applied,
Thank you, Christian, Thomas, Andreas, Anton, and everyone who contributed
to this.


>
>
>> Co-authored-by: Christian Bartnik chris1031...@gmail.com
>> Signed-off-by: Thomas Siedel 
>> ---
>>  configure |   4 +
>>  doc/encoders.texi |  64 +
>>  fftools/ffmpeg_mux_init.c |   2 +-
>>  libavcodec/Makefile   |   1 +
>>  libavcodec/allcodecs.c|   1 +
>>  libavcodec/libvvenc.c | 492 ++
>>  6 files changed, 563 insertions(+), 1 deletion(-)
>>  create mode 100644 libavcodec/libvvenc.c
>>
>> diff --git a/configure b/configure
>> index 6c5b8aab9a..37ece23376 100755
>> --- a/configure
>> +++ b/configure
>> @@ -293,6 +293,7 @@ External library support:
>>--enable-libvorbis   enable Vorbis en/decoding via libvorbis,
>> native implementation exists [no]
>>--enable-libvpx  enable VP8 and VP9 de/encoding via libvpx [no]
>> +  --enable-libvvencenable H.266/VVC encoding via vvenc [no]
>>--enable-libwebp enable WebP encoding via libwebp [no]
>>--enable-libx264 enable H.264 encoding via x264 [no]
>>--enable-libx265 enable HEVC encoding via x265 [no]
>> @@ -1966,6 +1967,7 @@ EXTERNAL_LIBRARY_LIST="
>>  libvmaf
>>  libvorbis
>>  libvpx
>> +libvvenc
>>  libwebp
>>  libxevd
>>  libxeve
>> @@ -3560,6 +3562,7 @@ libvpx_vp8_decoder_deps="libvpx"
>>  libvpx_vp8_encoder_deps="libvpx"
>>  libvpx_vp9_decoder_deps="libvpx"
>>  libvpx_vp9_encoder_deps="libvpx"
>> +libvvenc_encoder_deps="libvvenc"
>>  libwebp_encoder_deps="libwebp"
>>  libwebp_anim_encoder_deps="libwebp"
>>  libx262_encoder_deps="libx262"
>> @@ -7030,6 +7033,7 @@ enabled libvpx&& {
>>  fi
>>  }
>>
>> +enabled libvvenc  && require_pkg_config libvvenc "libvvenc >=
>> 1.6.1" "vvenc/vvenc.h" vvenc_get_version
>>  enabled libwebp   && {
>>  enabled libwebp_encoder  && require_pkg_config libwebp "libwebp
>> >= 0.2.0" webp/encode.h WebPGetEncoderVersion
>>  enabled libwebp_anim_encoder && check_pkg_config
>> libwebp_anim_encoder "libwebpmux >= 0.4.0" webp/mux.h
>> WebPAnimEncoderOptionsInit; }
>> diff --git a/doc/encoders.texi b/doc/encoders.texi
>> index c82f316f94..496852faeb 100644
>> --- a/doc/encoders.texi
>> +++ b/doc/encoders.texi
>> @@ -2378,6 +2378,70 @@ Indicates frame duration
>>  For more information about libvpx see:
>>  @url{http://www.webmproject.org/}
>>
>> +@section libvvenc
>> +
>> +VVenC H.266/VVC encoder wrapper.
>> +
>> +This encoder requires the presence of the libvvenc headers and library
>> +during configuration. You need to explicitly configure the build with
>> +@option{--enable-libvvenc}.
>> +
>> +The VVenC project website is at
>> +@url{https://github.com/fraunhoferhhi/vvenc}.
>> +
>> +@subsection Supported Pixel Formats
>> +
>> +VVenC supports only 10-bit color spaces as input. But the internal
>> (encoded)
>> +bit depth can be set to 8-bit or 10-bit at runtime.
>> +
>> +@subsection Options
>> +
>> +@table @option
>> +@item b
>> +Sets target video bitrate.
>> +
>> +@item g
>> +Set the GOP size. Currently support for g=1 (Intra only) or default.
>> +
>> +@item preset
>> +Set the VVenC preset.
>> +
>> +@item levelidc
>> +Set level idc.
>> +
>> +@item tier
>> +Set vvc tier.
>> +
>> +@item qp
>> +Set constant quantization parameter.
>> +
>> +@item subopt @var{boolean}
>> +Set subjective (perceptually motivated) optimization. Default is 1 (on).
>> +
>> +@item bitdepth8 @var{boolean}
>> +Set 8bit coding mode instead of using 10bit. Default is 0 (off).
>> +
>> +@item period
>> +set (intra) refresh period in seconds.
>> +
>> +@item vvenc-params
>> +Set vvenc options using a list of @var{key}=@var{value} couples separated
>> +by ":". See @command{vvencapp --fullhelp} or @command{vvencFFapp
>> --fullhelp} for a list of options.
>> +
>> +For example, the options might be provided as:
>> +
>> +@example
>> +intraperiod=64:decodingrefreshtype=idr:poc0idr=1:internalbitdepth=8
>> +@end example
>> +
>> +For example the encoding options might be provided with
>> @option{-vvenc-params}:
>> +
>> +@example
>> +ffmpeg -i input -c:v libvvenc -b 1M -vvenc-params
>> intraperiod=64:decodingrefreshtype=idr:poc0idr=1:internalbitdepth=8
>> output.mp4
>> +@end example
>> +
>> +@end table
>> +
>>  @section libwebp
>>
>>  libwebp WebP Image encoder wrapper
>> diff --git a/fftools/ffmpeg_mux_init.c b/fftoo

Re: [FFmpeg-devel] [PATCH] rtp enc/dec update for vvc

2024-06-15 Thread Nuo Mi

On Sat, Jun 15, 2024 at 2:38 AM Frank Plowman  wrote:

> Hi,
>
> Thanks for the patch.  Unfortunately it looks to be corrupted and does
> not apply.  Also, it looks as though you submitted five near-identical
> patches.  I would suggest you try directing patches to your own mailbox
> and re-applying them while debugging the formatting issues, rather than
> sending lots of corrupted patches to the ML.
>
Hi ftaft,
Thank you for the patch. You can refer to
https://ffmpeg.org/developer.html#Introduction for the checklist.


> > \ No newline at end of file
> > diff --git a/configure b/configure
> > index 83284427df..d331688eb4 100755
> > --- a/configure
> > +++ b/configure
> > @@ -296,6 +296,7 @@ External library support:
> >--enable-libwebp enable WebP encoding via libwebp [no]
> >--enable-libx264 enable H.264 encoding via x264 [no]
> >--enable-libx265 enable HEVC encoding via x265 [no]
> > +  --enable-libvvencenable H.266/VVC encoding via vvenc [no]
>
> This looks like you had the VVenC patchset applied when you created this
> commit.  If your patch depends on the VVenC patchset, it will have to
> wait until that is applied (which could well be in the next day or two).
>
Hi ftaft,
I have pushed the libvvenc patch. You can rebase and send your patch again.


>
> >--enable-libxeve enable EVC encoding via libxeve [no]
> >--enable-libxevd enable EVC decoding via libxevd [no]
> >--enable-libxavs enable AVS encoding via xavs [no]
> > @@ -1867,6 +1868,7 @@ EXTERNAL_LIBRARY_GPL_LIST="
> >  libvidstab
> >  libx264
>

Re: [FFmpeg-devel] [PATCH v13 06/15] avcodec/vaapi_encode: move the dpb logic from VAAPI to base layer

2024-06-15 Thread Tong Wu

> >> From: ffmpeg-devel  On Behalf Of
> >> Lynne via ffmpeg-devel
> >> Sent: Monday, June 10, 2024 10:01 AM
> >> To: FFmpeg development discussions and patches  >> de...@ffmpeg.org>
> >> Cc: Lynne 
> >> Subject: Re: [FFmpeg-devel] [PATCH v13 06/15] avcodec/vaapi_encode:
> >> move the dpb logic from VAAPI to base layer
> >>
> >> On 07/06/2024 18:48, Lynne wrote:
> >>> On 07/06/2024 17:22, Wu, Tong1 wrote:
> > From: ffmpeg-devel  On Behalf Of
> >> Lynne
> > via ffmpeg-devel
> > Sent: Friday, June 7, 2024 11:10 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Lynne 
> > Subject: Re: [FFmpeg-devel] [PATCH v13 06/15]
> avcodec/vaapi_encode:
> >> move
> > the dpb logic from VAAPI to base layer
> >
> > On 03/06/2024 11:18, tong1.wu-at-intel@ffmpeg.org wrote:
> >> From: Tong Wu 
> >>
> >> Move receive_packet function to base. This requires adding
> >> *alloc, *issue, *output, *free as hardware callbacks.
> >> HWBaseEncodePicture is introduced as the base layer structure.
> >> The related parameters in VAAPIEncodeContext are also extracted
> >> to HWBaseEncodeContext. Then
> >> DPB
> >> management logic can be fully extracted to base layer as-is.
> >>
> >> Signed-off-by: Tong Wu 
> >> ---
> >>     libavcodec/Makefile |   2 +-
> >>     libavcodec/hw_base_encode.c | 594
> >> 
> >>     libavcodec/hw_base_encode.h | 124 +
> >>     libavcodec/vaapi_encode.c   | 793 +
> >> ---
> >>     libavcodec/vaapi_encode.h   | 102 +---
> >>     libavcodec/vaapi_encode_av1.c   |  35 +-
> >>     libavcodec/vaapi_encode_h264.c  |  84 ++--
> >>     libavcodec/vaapi_encode_h265.c  |  53 ++-
> >>     libavcodec/vaapi_encode_mjpeg.c |  13 +-
> >>     libavcodec/vaapi_encode_mpeg2.c |  33 +-
> >>     libavcodec/vaapi_encode_vp8.c   |  18 +-
> >>     libavcodec/vaapi_encode_vp9.c   |  24 +-
> >>     12 files changed, 985 insertions(+), 890 deletions(-)
> >>     create mode 100644 libavcodec/hw_base_encode.c
> >
> > This patch doesn't apply,
> >
> > error: sha1 information is lacking or useless (libavcodec/
> > hw_base_encode.c).
> > error: could not build fake ancestor
> >
> > Could you resend the patchset or link me a repo so I can work with it?
> 
>  https://github.com/intel-media-ci/ffmpeg/pull/689 This is the same as
>  v13; please give it a try.
> >>>
> >>> That worked, thanks.
> >>
> >> I don't think the behaviour is correct when the encoding length is
> >> less than the decode delay. In my old Vulkan code, I had this piece
> >> of code in the initialization function:
> >>
> >>> if (!src) {
> >>>  ctx->end_of_stream = 1;
> >>>  /* Fix timestamps if we hit end-of-stream before the initial
> >>>   * decode delay has elapsed. */
> >>>  if (ctx->input_order < ctx->decode_delay)
> >>>  ctx->dts_pts_diff = ctx->pic_end->pts - ctx->first_pts;
> >>>  return AVERROR_EOF;
> >>> }
> >>
> >> I think a flush function should be added, to be called by each
> >> encoder, to make sure the timestamps remain correct.
> >>
> >
> > For the current patch set, this piece is in hw_base_encode_send_frame and
> > works well for vaapi and d3d12 except when the encoding length is equal to
> > the decode delay, for which I'll send a fix later. Do you mean Vulkan cannot
> > integrate into this part and we have to make a callback for it?
> 
> No, I was just curious. Fair enough, it can be implemented in a later patch.
> 
> >
> >> Also, the D3D12VA structures need an FF prefix, e.g.
> >> D3D12VAEncodeContext -> FFD3D12VAEncodeContext.
> >
> > The current VAAPIEncodeContext has existed for a long time. Is there
> > any difference for D3D12VAEncodeContext? I mean both
> > VAAPIEncodeContext and D3D12VAEncodeContext are parallel and only
> > referenced in vaapi_encode_*.c (d3d12va_encode_*.c).
> >
> > Thanks,
> > Tong
> 
> I'm finishing up on the Vulkan test implementation, I'll see to pushing this
> patch over the weekend.

Sure. Thank you.

-Tong

Re: [FFmpeg-devel] [PATCH] lavc/vvc: Invalidate PPSs which refer to a changed SPS

2024-06-15 Thread Frank Plowman

On 15/06/2024 13:24, Nuo Mi wrote:
> On Sat, Jun 15, 2024 at 2:35 PM Christophe Gisquet <
> christophe.gisq...@gmail.com> wrote:
> 
>> On Fri, Jun 14, 2024, 11:39, Frank Plowman  wrote:
>>
>>> When the SPS associated with a particular SPS ID changes, invalidate all
>>> the PPSs which use that SPS ID.  Fixes crashes with illegal bitstreams.
>>> This is done in the CBS, rather than in libavcodec/vvc/ps.c like the SPS
>>> ID reuse validation, as parts of the CBS parsing process for PPSs
>>> depend on the SPS being referred to.
>>>
>>
>> I am uncertain about this. I have no definite knowledge nor proof, but I
>> would have thought these are persistent, IE it's legal to update some of
>> them, their validity depending on something else.
>>
> 
>> Wondering if the tested streams are thus conformant.
>>
>> But I don't know the actual rule. Maybe finding an EOB/EOS NUT? Related to
>> some particular shape of a clean random access point, that would require
>> retransmitting VPS/SPS/PPS/APS/... ?
>>
>> Asking Benjamin Bross might be a better option here.
>>
> Hi Chris,
> The spec says the SPS should not change within a CVS. Frank has some patches
> that fix a similar issue:
> https://github.com/FFmpeg/FFmpeg/commit/2d79ae3f8a3306d24afe43ba505693a8dbefd21b
> 
> 
> Hi Frank,
> Did it crash before your error-handling code in ps.c?
> Could you send me the clip?
> 
> Thank you
> 

Hi both,

Thank you for your reviews.

An example of a crashing bitstream which is fixed by this patch is ID
295 available here: https://github.com/ffvvc/tests/pull/43.  The
relevant part of the bitstream is a sequence of NAL units

AU (decode_order=5)
    18. SPS
        sps_seq_parameter_set_id = 0
        sps_ctb_log2_size_y = 5
    19. PPS
        pps_pic_parameter_set_id = 0
        pps_seq_parameter_set_id = 0
    20. IDR_N_LP
        ph_pic_order_cnt_lsb = 0
        NoOutputBeforeRecoveryFlag = 1
        ph_pic_parameter_set_id = 0

AU (decode_order=6)
    21. AUD
    22. VPS
    23. SPS
        sps_seq_parameter_set_id = 0
        sps_ctb_log2_size_y = 7
    24. PREFIX_APS
    25. IDR_N_LP
        ph_pic_order_cnt_lsb = 0
        NoOutputBeforeRecoveryFlag = 1
        ph_pic_parameter_set_id = 0

The layout of SPSs alone is legal (not covered by the checks introduced
in 2d79ae3f8a3306d24afe43ba505693a8dbefd21b) as the second AU is a CLVSS
AU.  As a result, the bitstream crashes both before and after
2d79ae3f8a3306d24afe43ba505693a8dbefd21b.  What this patch does is
produce an error when the VCL NAL unit in the second AU (25.) tries to
use PPS ID 0, as the SPS NAL unit that PPS was defined with reference to
(18.) is no longer available.
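
To make the intended behaviour concrete, here is a minimal C sketch of the
idea. It is not the CBS code from the patch: the ParamSets, Sps and Pps
types, the array bounds and the three helper functions are all hypothetical,
and a single field stands in for "the SPS content changed". IDs are assumed
to be in range.

```c
#include <stdbool.h>

#define MAX_SPS 16   /* illustrative bounds, not taken from the patch */
#define MAX_PPS 64

typedef struct { bool valid; int ctb_log2_size_y; } Sps;
typedef struct { bool valid; int sps_id; } Pps;
typedef struct { Sps sps[MAX_SPS]; Pps pps[MAX_PPS]; } ParamSets;

/* Store an SPS. If it differs from the SPS already held under this ID,
 * invalidate every PPS that was parsed against the old one. */
static void store_sps(ParamSets *ps, int id, int ctb_log2_size_y)
{
    Sps *slot = &ps->sps[id];
    bool changed = slot->valid && slot->ctb_log2_size_y != ctb_log2_size_y;

    if (changed)
        for (int i = 0; i < MAX_PPS; i++)
            if (ps->pps[i].valid && ps->pps[i].sps_id == id)
                ps->pps[i].valid = false;

    slot->valid = true;
    slot->ctb_log2_size_y = ctb_log2_size_y;
}

/* Store a PPS; fails if the SPS it refers to is not available. */
static int store_pps(ParamSets *ps, int pps_id, int sps_id)
{
    if (!ps->sps[sps_id].valid)
        return -1;
    ps->pps[pps_id] = (Pps){ .valid = true, .sps_id = sps_id };
    return 0;
}

/* Called when a VCL NAL unit refers to a PPS; returns an error if the PPS
 * has been invalidated or was never received. */
static int activate_pps(const ParamSets *ps, int pps_id)
{
    return ps->pps[pps_id].valid ? 0 : -1;
}
```

With the sequence above, NAL units 18 to 20 succeed, unit 23 invalidates
PPS 0, and unit 25 then gets an error from activate_pps instead of using a
PPS parsed against the old SPS.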

Christophe, is my interpretation of your point correct when I say you
are suggesting that the above sequence may be legal, so long as the PPS
still satisfies the new bounds etc. derived from the second SPS?  I did
consider this, and I think it may be possible to implement by delaying
CBS element validation and inference until libavcodec/vvc/ps.c.
However, there are no bitstreams in the conformance suite which contain
such a structure and this is different to how the native HEVC decoder
behaves (see libavcodec/hevc/ps.c:72).

All the best,
-- 
Frank

[FFmpeg-devel] [PATCH 68/73] avcodec/h261enc: Inline constants

2024-06-15 Thread Andreas Rheinhardt

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/h261enc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libavcodec/h261enc.c b/libavcodec/h261enc.c
index dd4419ec8c..8e08c749d1 100644
--- a/libavcodec/h261enc.c
+++ b/libavcodec/h261enc.c
@@ -133,8 +133,8 @@ static void h261_encode_motion(PutBitContext *pb, int val)
 {
 int sign, code;
 if (val == 0) {
-code = 0;
-put_bits(pb, ff_h261_mv_tab[code][1], ff_h261_mv_tab[code][0]);
+// Corresponds to ff_h261_mv_tab[0]
+put_bits(pb, 1, 1);
 } else {
 if (val > 15)
 val -= 32;
@@ -227,7 +227,7 @@ static void h261_encode_block(H261EncContext *h, int16_t *block, int n)
 }
 }
 if (last_index > -1)
-put_bits(&s->pb, rl->table_vlc[0][1], rl->table_vlc[0][0]); // EOB
+put_bits(&s->pb, 2, 0x2); // EOB
 }
 
 void ff_h261_encode_mb(MpegEncContext *s, int16_t block[6][64],
-- 
2.40.1


[FFmpeg-devel] [PATCH 69/73] avcodec/motion_est: Optimize dead code away

2024-06-15 Thread Andreas Rheinhardt

H.261 does not have B-frames.

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/motion_est.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/libavcodec/motion_est.c b/libavcodec/motion_est.c
index e783e79a94..554fc9780e 100644
--- a/libavcodec/motion_est.c
+++ b/libavcodec/motion_est.c
@@ -537,7 +537,7 @@ static inline void set_p_mv_tables(MpegEncContext * s, int mx, int my, int mv4)
 /**
  * get fullpel ME search limits.
  */
-static inline void get_limits(MpegEncContext *s, int x, int y)
+static inline void get_limits(MpegEncContext *s, int x, int y, int bframe)
 {
 MotionEstContext * const c= &s->me;
 int range= c->avctx->me_range >> (1 + !!(c->flags&FLAG_QPEL));
@@ -551,7 +551,7 @@ static inline void get_limits(MpegEncContext *s, int x, int y)
 c->ymin = - y - 16;
 c->xmax = - x + s->width;
 c->ymax = - y + s->height;
-} else if (s->out_format == FMT_H261){
+} else if (!(av_builtin_constant_p(bframe) && bframe) && s->out_format == FMT_H261){
 // Search range of H.261 is different from other codec standards
 c->xmin = (x > 15) ? - 15 : 0;
 c->ymin = (y > 15) ? - 15 : 0;
@@ -921,7 +921,7 @@ void ff_estimate_p_frame_motion(MpegEncContext * s,
 c->mb_penalty_factor = get_penalty_factor(s->lambda, s->lambda2, c->avctx->mb_cmp);
 c->current_mv_penalty= c->mv_penalty[s->f_code] + MAX_DMV;
 
-get_limits(s, 16*mb_x, 16*mb_y);
+get_limits(s, 16*mb_x, 16*mb_y, 0);
 c->skip=0;
 
 /* intra / predictive decision */
@@ -1088,7 +1088,7 @@ int ff_pre_estimate_p_frame_motion(MpegEncContext * s,
 c->pre_penalty_factor= get_penalty_factor(s->lambda, s->lambda2, c->avctx->me_pre_cmp);
 c->current_mv_penalty= c->mv_penalty[s->f_code] + MAX_DMV;
 
-get_limits(s, 16*mb_x, 16*mb_y);
+get_limits(s, 16*mb_x, 16*mb_y, 0);
 c->skip=0;
 
 P_LEFT[0]   = s->p_mv_table[xy + 1][0];
@@ -1140,7 +1140,7 @@ static int estimate_motion_b(MpegEncContext *s, int mb_x, int mb_y,
 
 c->current_mv_penalty= mv_penalty;
 
-get_limits(s, 16*mb_x, 16*mb_y);
+get_limits(s, 16*mb_x, 16*mb_y, 1);
 
 if (s->motion_est != FF_ME_ZERO) {
 P_LEFT[0] = mv_table[mot_xy - 1][0];
@@ -1489,7 +1489,7 @@ static inline int direct_search(MpegEncContext * s, int mb_x, int mb_y)
 if(c->avctx->me_sub_cmp != c->avctx->mb_cmp && !c->skip)
 dmin= get_mb_score(s, mx, my, 0, 0, 0, 16, 1);
 
-get_limits(s, 16*mb_x, 16*mb_y); //restore c->?min/max, maybe not needed
+get_limits(s, 16*mb_x, 16*mb_y, 1); //restore c->?min/max, maybe not needed
 
 mv_table[mot_xy][0]= mx;
 mv_table[mot_xy][1]= my;
@@ -1509,7 +1509,7 @@ void ff_estimate_b_frame_motion(MpegEncContext * s,
 init_ref(c, s->new_pic->data, s->last_pic.data,
  s->next_pic.data, 16 * mb_x, 16 * mb_y, 2);
 
-get_limits(s, 16*mb_x, 16*mb_y);
+get_limits(s, 16*mb_x, 16*mb_y, 1);
 
 c->skip=0;
 
-- 
2.40.1


[FFmpeg-devel] [PATCH 70/73] avcodec/mpegvideo_enc: Constify pointers to static storage

2024-06-15 Thread Andreas Rheinhardt

These must not be modified (even when they are initialized at runtime
and therefore modifiable).

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/mpegvideo_enc.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/libavcodec/mpegvideo_enc.c b/libavcodec/mpegvideo_enc.c
index abeb235277..620ca08869 100644
--- a/libavcodec/mpegvideo_enc.c
+++ b/libavcodec/mpegvideo_enc.c
@@ -3914,8 +3914,7 @@ static int dct_quantize_trellis_c(MpegEncContext *s,
 int coeff_count[64];
 int qmul, qadd, start_i, last_non_zero, i, dc;
 const int esc_length= s->ac_esc_length;
-uint8_t * length;
-uint8_t * last_length;
+const uint8_t *length, *last_length;
 const int lambda= s->lambda2 >> (FF_LAMBDA_SHIFT - 6);
 int mpeg2_qscale;
 
-- 
2.40.1


[FFmpeg-devel] [PATCH 71/73] avcodec/h261data: Make some tables non-static

2024-06-15 Thread Andreas Rheinhardt

This will make it possible to avoid the indirection via
ff_h261_rl_tcoeff in future commits.

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/h261.h |  4 
 libavcodec/h261data.c | 12 ++--
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/libavcodec/h261.h b/libavcodec/h261.h
index 11a8a8685a..4279a12677 100644
--- a/libavcodec/h261.h
+++ b/libavcodec/h261.h
@@ -50,6 +50,10 @@ extern const uint8_t ff_h261_mv_tab[17][2];
 extern const uint8_t ff_h261_cbp_tab[63][2];
 extern RLTable ff_h261_rl_tcoeff;
 
+extern const uint16_t ff_h261_tcoeff_vlc[65][2];
+extern const int8_t ff_h261_tcoeff_level[64];
+extern const int8_t ff_h261_tcoeff_run[64];
+
 void ff_h261_loop_filter(MpegEncContext *s);
 
 #endif /* AVCODEC_H261_H */
diff --git a/libavcodec/h261data.c b/libavcodec/h261data.c
index bccd9e5f56..3ee750f98c 100644
--- a/libavcodec/h261data.c
+++ b/libavcodec/h261data.c
@@ -104,7 +104,7 @@ const uint8_t ff_h261_cbp_tab[63][2] = {
 };
 
 // H.261 VLC table for transform coefficients
-static const uint16_t h261_tcoeff_vlc[65][2] = {
+const uint16_t ff_h261_tcoeff_vlc[65][2] = {
 {  0x2,  2 }, {  0x3,  2 }, {  0x4,  4 }, {  0x5,  5 },
 {  0x6,  7 }, { 0x26,  8 }, { 0x21,  8 }, {  0xa, 10 },
 { 0x1d, 12 }, { 0x18, 12 }, { 0x13, 12 }, { 0x10, 12 },
@@ -124,7 +124,7 @@ static const uint16_t h261_tcoeff_vlc[65][2] = {
 {  0x1,  6 }  // escape
 };
 
-static const int8_t h261_tcoeff_level[64] = {
+const int8_t ff_h261_tcoeff_level[64] = {
 0, 1,  2,  3,  4,  5,  6,  7,
 8, 9, 10, 11, 12, 13, 14, 15,
 1, 2,  3,  4,  5,  6,  7,  1,
@@ -135,7 +135,7 @@ static const int8_t h261_tcoeff_level[64] = {
 1, 1,  1,  1,  1,  1,  1,  1
 };
 
-static const int8_t h261_tcoeff_run[64] = {
+const int8_t ff_h261_tcoeff_run[64] = {
  0,
  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  1,
@@ -150,7 +150,7 @@ static const int8_t h261_tcoeff_run[64] = {
 RLTable ff_h261_rl_tcoeff = {
 64,
 64,
-h261_tcoeff_vlc,
-h261_tcoeff_run,
-h261_tcoeff_level,
+ff_h261_tcoeff_vlc,
+ff_h261_tcoeff_run,
+ff_h261_tcoeff_level,
 };
-- 
2.40.1


[FFmpeg-devel] [PATCH 72/73] avcodec/h261enc: Avoid RLTable when writing macroblock

2024-06-15 Thread Andreas Rheinhardt

The RLTable API in rl.c is not well designed for codecs with
an explicit end-of-block code. ff_h261_rl_tcoeff's vlc has
the EOB code as first element (presumably so that the decoder
can check for it via "if (level == 0)") and this implies
that the indices returned by get_rl_index() are off by one
for run == 0 which is therefore explicitly checked.

This commit changes this by adding a simple LUT for the
values not requiring escaping. It is easy to directly
include the sign bit into this, so this has also been done.
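
The LUT idea can be sketched in isolation. The following is a minimal, self-contained illustration, not the patch's code: the `toy_*` names and the three-entry table are made up for the example, although the three (code, length) pairs match the first ordinary entries of ff_h261_tcoeff_vlc as shown in patch 71. The sign bit is appended as the lowest bit of the code, so one table lookup yields the complete codeword for a signed level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_RUN   0
#define TOY_MAX_LEVEL 3

struct VLCEntry {
    uint8_t  len;  /* total length in bits, including the sign bit */
    uint16_t code; /* codeword with the sign bit as lowest bit */
};

/* Toy (run, level) -> (length, code) table without sign bit. */
static const struct {
    uint8_t run, level, len;
    uint16_t code;
} toy_tab[] = {
    { 0, 1, 2, 0x3 },
    { 0, 2, 4, 0x4 },
    { 0, 3, 5, 0x5 },
};

/* Indexed by [run][level + TOY_MAX_LEVEL]; len == 0 means "no entry". */
static struct VLCEntry toy_lut[TOY_MAX_RUN + 1][2 * TOY_MAX_LEVEL + 1];

static void toy_lut_init(void)
{
    for (size_t i = 0; i < sizeof(toy_tab) / sizeof(toy_tab[0]); i++) {
        unsigned run   = toy_tab[i].run;
        unsigned level = toy_tab[i].level;
        uint8_t  len   = toy_tab[i].len + 1; /* one extra bit for the sign */
        uint16_t code  = toy_tab[i].code;

        /* Positive level: sign bit 0; negative level: sign bit 1. */
        toy_lut[run][TOY_MAX_LEVEL + level] =
            (struct VLCEntry){ len, (uint16_t)(code << 1) };
        toy_lut[run][TOY_MAX_LEVEL - level] =
            (struct VLCEntry){ len, (uint16_t)((code << 1) | 1) };
    }
}

/* Returns the LUT entry, or NULL if the pair needs the escape code. */
static const struct VLCEntry *toy_lookup(unsigned run, int level)
{
    if (run <= TOY_MAX_RUN &&
        (unsigned)(level + TOY_MAX_LEVEL) <= 2 * TOY_MAX_LEVEL &&
        toy_lut[run][level + TOY_MAX_LEVEL].len)
        return &toy_lut[run][level + TOY_MAX_LEVEL];
    return NULL;
}
```

The single unsigned comparison on `level + TOY_MAX_LEVEL` rejects both too-small and too-large levels, and the zero-length check covers level 0 and any (run, level) pair without a code of its own.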

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/h261enc.c | 51 +++-
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/libavcodec/h261enc.c b/libavcodec/h261enc.c
index 8e08c749d1..b19830d578 100644
--- a/libavcodec/h261enc.c
+++ b/libavcodec/h261enc.c
@@ -36,6 +36,14 @@
 #include "h261enc.h"
 #include "mpegvideoenc.h"
 
+#define H261_MAX_RUN   26
+#define H261_MAX_LEVEL 15
+
+static struct VLCLUT {
+uint8_t len;
+uint16_t code;
+} vlc_lut[H261_MAX_RUN + 1][32 /* 0..2 * H261_MAX_LEN are used */];
+
 static uint8_t uni_h261_rl_len [64*64*2*2];
 #define UNI_ENC_INDEX(last,run,level) ((last)*128*64 + (run)*128 + (level))
 
@@ -165,10 +173,8 @@ static inline int get_cbp(MpegEncContext *s, int16_t block[6][64])
 static void h261_encode_block(H261EncContext *h, int16_t *block, int n)
 {
 MpegEncContext *const s = &h->s;
-int level, run, i, j, last_index, last_non_zero, sign, slevel, code;
-const RLTable *rl;
+int level, run, i, j, last_index, last_non_zero;
 
-rl = &ff_h261_rl_tcoeff;
 if (s->mb_intra) {
 /* DC coef */
 level = block[0];
@@ -204,24 +210,18 @@ static void h261_encode_block(H261EncContext *h, int16_t *block, int n)
 level = block[j];
 if (level) {
 run= i - last_non_zero - 1;
-sign   = 0;
-slevel = level;
-if (level < 0) {
-sign  = 1;
-level = -level;
-}
-code = get_rl_index(rl, 0 /*no last in H.261, EOB is used*/,
-run, level);
-if (run == 0 && level < 16)
-code += 1;
-put_bits(&s->pb, rl->table_vlc[code][1], rl->table_vlc[code][0]);
-if (code == rl->n) {
-put_bits(&s->pb, 6, run);
-av_assert1(slevel != 0);
-av_assert1(level <= 127);
-put_sbits(&s->pb, 8, slevel);
+
+if (run <= H261_MAX_RUN &&
+(unsigned)(level + H261_MAX_LEVEL) <= 2 * H261_MAX_LEVEL &&
+vlc_lut[run][level + H261_MAX_LEVEL].len) {
+put_bits(&s->pb, vlc_lut[run][level + H261_MAX_LEVEL].len,
+ vlc_lut[run][level + H261_MAX_LEVEL].code);
 } else {
-put_bits(&s->pb, 1, sign);
+/* Escape */
+put_bits(&s->pb, 6 + 6, (1 << 6) | run);
+av_assert1(level != 0);
+av_assert1(FFABS(level) <= 127);
+put_sbits(&s->pb, 8, level);
 }
 last_non_zero = i;
 }
@@ -365,6 +365,17 @@ static av_cold void h261_encode_init_static(void)
 
 ff_rl_init(&ff_h261_rl_tcoeff, h261_rl_table_store);
 init_uni_h261_rl_tab(&ff_h261_rl_tcoeff, uni_h261_rl_len);
+
+// The following loop is over the ordinary elements, not EOB or escape.
+for (size_t i = 1; i < FF_ARRAY_ELEMS(ff_h261_tcoeff_vlc) - 1; i++) {
+unsigned run   = ff_h261_tcoeff_run[i];
+unsigned level = ff_h261_tcoeff_level[i];
+unsigned len   = ff_h261_tcoeff_vlc[i][1] + 1 /* sign */;
+unsigned code  = ff_h261_tcoeff_vlc[i][0];
+
+vlc_lut[run][H261_MAX_LEVEL + level] = (struct VLCLUT){ len, code << 1 };
+vlc_lut[run][H261_MAX_LEVEL - level] = (struct VLCLUT){ len, (code << 1) | 1 };
+}
 }
 
 av_cold int ff_h261_encode_init(MpegEncContext *s)
-- 
2.40.1


[FFmpeg-devel] [PATCH 73/73] avcodec/h261enc: Fix ac_vlc_length tables

2024-06-15 Thread Andreas Rheinhardt

These tables are supposed to contain the number of bits needed
to encode a given (run, level) pair. Yet the number of bits
for pairs needing the escape code was wrong (it only contained
the escape code and not the bits needed for run and level).

Furthermore, H.261 (a format with explicit end-of-block codes)
does not work well together with the RLTable API from rl.c:
The EOB code is the first one in ff_h261_rl_tcoeff's VLC table
and has a run value of zero. Therefore the result of get_rl_index()
is off by one for run == 0 and level values with explicit
(run, level) pair.

Fixing this necessitated changing the ref files of the
vsynth*-h261-trellis tests. Both filesizes as well as PSNR
decreased. If one used a qscale value of 11 for this test,
one would have received files with about the same size as
before this patch (with qscale 12), but with better PSNR.
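
As a rough sketch of the length accounting described above (an illustration under stated assumptions, not the patch itself): every table slot defaults to the full escape length of 6 + 6 + 8 = 20 bits, the "last" table adds 2 bits for the EOB code, and pairs with a code of their own overwrite the default. `LEN_INDEX`, `len_tab`, `len_last_tab` and `len_tabs_init` are names invented for the example, and only one regular pair (run 0, level 1, code length 2 plus sign bit) is filled in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ESC_LEN (6 + 6 + 8) /* escape code + run + level, in bits */
#define EOB_LEN 2           /* end-of-block code, in bits */

/* Hypothetical index helper: 128 level slots per run, level biased by 64. */
#define LEN_INDEX(run, level) ((run) * 128 + (level))

static uint8_t len_tab[64 * 128];      /* coefficient is not the last one */
static uint8_t len_last_tab[64 * 128]; /* coefficient is followed by EOB */

static void len_tabs_init(void)
{
    /* Default: every (run, level) pair costs a full escape sequence. */
    memset(len_tab,      ESC_LEN,           sizeof(len_tab));
    memset(len_last_tab, ESC_LEN + EOB_LEN, sizeof(len_last_tab));

    /* One regular pair for illustration: 2 code bits + 1 sign bit. */
    const int run = 0, level = 1, len = 2 + 1;

    len_tab     [LEN_INDEX(run, 64 + level)] = len;
    len_tab     [LEN_INDEX(run, 64 - level)] = len;
    len_last_tab[LEN_INDEX(run, 64 + level)] = len + EOB_LEN;
    len_last_tab[LEN_INDEX(run, 64 - level)] = len + EOB_LEN;
}
```

With tables like these, a rate-distortion search that previously saw escaped coefficients as costing only the escape prefix would see them at their real cost of 20 bits, which is consistent with the changed trellis results mentioned above.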

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/h261enc.c  | 59 +--
 tests/ref/vsynth/vsynth1-h261-trellis |  8 +--
 tests/ref/vsynth/vsynth2-h261-trellis |  8 +--
 tests/ref/vsynth/vsynth_lena-h261-trellis |  8 +--
 4 files changed, 24 insertions(+), 59 deletions(-)

diff --git a/libavcodec/h261enc.c b/libavcodec/h261enc.c
index b19830d578..a901c32e42 100644
--- a/libavcodec/h261enc.c
+++ b/libavcodec/h261enc.c
@@ -38,14 +38,15 @@
 
 #define H261_MAX_RUN   26
 #define H261_MAX_LEVEL 15
+#define H261_ESC_LEN   (6 + 6 + 8)
 
 static struct VLCLUT {
 uint8_t len;
 uint16_t code;
 } vlc_lut[H261_MAX_RUN + 1][32 /* 0..2 * H261_MAX_LEN are used */];
 
-static uint8_t uni_h261_rl_len [64*64*2*2];
-#define UNI_ENC_INDEX(last,run,level) ((last)*128*64 + (run)*128 + (level))
+static uint8_t uni_h261_rl_len [64 * 128];
+static uint8_t uni_h261_rl_len_last[64 * 128];
 
 typedef struct H261EncContext {
 MpegEncContext s;
@@ -320,51 +321,10 @@ void ff_h261_encode_mb(MpegEncContext *s, int16_t block[6][64],
 }
 }
 
-static av_cold void init_uni_h261_rl_tab(const RLTable *rl, uint8_t *len_tab)
-{
-int slevel, run, last;
-
-av_assert0(MAX_LEVEL >= 64);
-av_assert0(MAX_RUN   >= 63);
-
-for(slevel=-64; slevel<64; slevel++){
-if(slevel==0) continue;
-for(run=0; run<64; run++){
-for(last=0; last<=1; last++){
-const int index= UNI_ENC_INDEX(last, run, slevel+64);
-int level= slevel < 0 ? -slevel : slevel;
-int len, code;
-
-len_tab[index]= 100;
-
-/* ESC0 */
-code= get_rl_index(rl, 0, run, level);
-len=  rl->table_vlc[code][1] + 1;
-if(last)
-len += 2;
-
-if(code!=rl->n && len < len_tab[index]){
-len_tab [index]= len;
-}
-/* ESC */
-len = rl->table_vlc[rl->n][1];
-if(last)
-len += 2;
-
-if(len < len_tab[index]){
-len_tab [index]= len;
-}
-}
-}
-}
-}
-
 static av_cold void h261_encode_init_static(void)
 {
-static uint8_t h261_rl_table_store[2][2 * MAX_RUN + MAX_LEVEL + 3];
-
-ff_rl_init(&ff_h261_rl_tcoeff, h261_rl_table_store);
-init_uni_h261_rl_tab(&ff_h261_rl_tcoeff, uni_h261_rl_len);
+memset(uni_h261_rl_len,  H261_ESC_LEN, sizeof(uni_h261_rl_len));
+memset(uni_h261_rl_len_last, H261_ESC_LEN + 2 /* EOB */, sizeof(uni_h261_rl_len_last));
 
 // The following loop is over the ordinary elements, not EOB or escape.
 for (size_t i = 1; i < FF_ARRAY_ELEMS(ff_h261_tcoeff_vlc) - 1; i++) {
@@ -375,6 +335,11 @@ static av_cold void h261_encode_init_static(void)
 
 vlc_lut[run][H261_MAX_LEVEL + level] = (struct VLCLUT){ len, code << 1 };
 vlc_lut[run][H261_MAX_LEVEL - level] = (struct VLCLUT){ len, (code << 1) | 1 };
+
+uni_h261_rl_len [UNI_AC_ENC_INDEX(run, 64 + level)] = len;
+uni_h261_rl_len [UNI_AC_ENC_INDEX(run, 64 - level)] = len;
+uni_h261_rl_len_last[UNI_AC_ENC_INDEX(run, 64 + level)] = len + 2;
+uni_h261_rl_len_last[UNI_AC_ENC_INDEX(run, 64 - level)] = len + 2;
 }
 }
 
@@ -398,10 +363,10 @@ av_cold int ff_h261_encode_init(MpegEncContext *s)
 
 s->min_qcoeff   = -127;
 s->max_qcoeff   = 127;
-s->ac_esc_length= 6+6+8;
+s->ac_esc_length= H261_ESC_LEN;
 
 s->intra_ac_vlc_length  = s->inter_ac_vlc_length  = uni_h261_rl_len;
-s->intra_ac_vlc_last_length = s->inter_ac_vlc_last_length = uni_h261_rl_len + 128*64;
+s->intra_ac_vlc_last_length = s->inter_ac_vlc_last_length = uni_h261_rl_len_last;
 ff_thread_once(&init_static_once, h261_encode_init_static);
 
 return 0;
diff --git a/tests/ref/vsynth/vsynth1-h261-trellis b/tests/ref/vsynth/vsynth1-h261-trellis
index 87b078b0d5..0cbb9b0e18 100644
--- a/tests/ref/vsynth/vsynth1-h261-trellis
+++ b/tests/ref/vsynth/vsynth1-h261-trellis
@

[FFmpeg-devel] [PATCH] avcodec/loongarch/Makefile: Fix vc1dsp_lasx.o build criterion

2024-06-15 Thread Andreas Rheinhardt

Fixes ticket #11057.

Signed-off-by: Andreas Rheinhardt 
---
 libavcodec/loongarch/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/loongarch/Makefile b/libavcodec/loongarch/Makefile
index 07da2964e4..92c8b35906 100644
--- a/libavcodec/loongarch/Makefile
+++ b/libavcodec/loongarch/Makefile
@@ -12,7 +12,7 @@ OBJS-$(CONFIG_HEVC_DECODER)   += loongarch/hevcdsp_init_loongarch.o
 LASX-OBJS-$(CONFIG_H264QPEL)  += loongarch/h264qpel_lasx.o
 LASX-OBJS-$(CONFIG_H264DSP)   += loongarch/h264dsp_lasx.o \
  loongarch/h264_deblock_lasx.o
-LASX-OBJS-$(CONFIG_VC1_DECODER)   += loongarch/vc1dsp_lasx.o
+LASX-OBJS-$(CONFIG_VC1DSP)+= loongarch/vc1dsp_lasx.o
 LASX-OBJS-$(CONFIG_HPELDSP)   += loongarch/hpeldsp_lasx.o
 LASX-OBJS-$(CONFIG_IDCTDSP)   += loongarch/simple_idct_lasx.o  \
  loongarch/idctdsp_lasx.o
-- 
2.40.1


Re: [FFmpeg-devel] [PATCH 1/2] avcodec/jpeg2000dec: Add support for placeholder passes, CAP, and CPF markers

2024-06-15 Thread Michael Niedermayer

On Sat, Jun 15, 2024 at 12:15:16PM +0900, Osamu Watanabe wrote:
> Signed-off-by: Osamu Watanabe 
> ---
>  libavcodec/jpeg2000.h  |  10 +
>  libavcodec/jpeg2000dec.c   | 454 ++---
>  libavcodec/jpeg2000dec.h   |   7 +
>  libavcodec/jpeg2000htdec.c | 225 ++
>  libavcodec/jpeg2000htdec.h |   2 +-
>  5 files changed, 518 insertions(+), 180 deletions(-)

this breaks decoding of

tickets/4631/199.jp2
https://trac.ffmpeg.org/attachment/ticket/4631/199.jp2

and others

thx

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you drop bombs on a foreign country and kill a hundred thousand
innocent people, expect your government to call the consequence
"unprovoked inhuman terrorist attacks" and use it to justify dropping
more bombs and killing more people. The technology changed, the idea is old.


[FFmpeg-devel] libavformat / vapoursynth update

2024-06-15 Thread Stefan Oltmanns via ffmpeg-devel


Hello,

I updated the VapourSynth input module from the old API to the "new" API
(the "new" API was introduced 4 years ago).

The greatest advantage of the new API is that it requires only a single
import from the VapourSynth library, which makes it well suited to
loading at runtime instead of linking against the library at build time.

Currently almost no one builds ffmpeg with VapourSynth enabled. That is
understandable, because VapourSynth is not just an external library but
an entire application with its own dependencies. Making ffmpeg load the
VapourSynth library at runtime, only when a VapourSynth script is opened,
would allow building ffmpeg binaries with VapourSynth enabled that do
not require VapourSynth to be installed on the user's system (unless the
user wants to open a VapourSynth script).
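
A minimal sketch of such runtime loading on POSIX systems, assuming `dlopen`/`dlsym` are available. This is only an illustration: `RuntimeLib`, `runtime_lib_open` and the `get_api_fn` signature are hypothetical, and the actual library file name and entry-point symbol are deliberately left to the caller rather than guessed here.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical signature of the single entry point that gets imported. */
typedef const void *(*get_api_fn)(int version);

struct RuntimeLib {
    void      *handle;  /* handle returned by dlopen(), NULL if not loaded */
    get_api_fn get_api; /* resolved entry point, NULL if not loaded */
};

/* Tries to load the library at `path` and resolve `symbol`.
 * Returns 0 on success and -1 on failure; on failure nothing stays loaded,
 * so the caller can simply report that the optional component is missing. */
static int runtime_lib_open(struct RuntimeLib *lib, const char *path,
                            const char *symbol)
{
    lib->get_api = NULL;
    lib->handle  = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!lib->handle)
        return -1;

    /* POSIX-recommended way to store a dlsym() result in a function pointer. */
    *(void **)&lib->get_api = dlsym(lib->handle, symbol);
    if (!lib->get_api) {
        dlclose(lib->handle);
        lib->handle = NULL;
        return -1;
    }
    return 0;
}
```

The point of the design is that a missing library only becomes an error at the moment a script is actually opened; a binary built this way has no load-time dependency on it. On Windows the same structure would use LoadLibrary and GetProcAddress instead.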

I have already successfully tested this approach on different platforms
(Linux, macOS and partly Windows).

There are a few options on how to do it exactly:

-Remove linking at build time altogether and disable VapourSynth on
platforms that support neither LoadLibrary nor dlopen (VapourSynth is
based on plug-ins, so on platforms without these functions it won't
work at all), or make this selectable at build time.

-Include the header files directly in ffmpeg (VapourSynth is LGPL 2.1 or
later, just like ffmpeg, so there is no license issue). This would allow
building ffmpeg with VapourSynth support without needing VapourSynth
installed on the build machine. The alternative would be to use the
header files installed on the system, if available.
(If additional files should be avoided, the header files could of
course be copied into vapoursynth.c, or at least merged into a
single header file.)

What do you think about this? Should I prepare a patch for this? What
option should I choose?

Best regards,
Stefan

[FFmpeg-devel] [PATCH] fate/jpeg2000dec: add support for p0_10.j2k

2024-06-15 Thread pal

From: Pierre-Anthony Lemieux 

p0_10.j2k is one of the reference codestreams included in
Rec. ITU-T T.803 | ISO/IEC 15444-4.
---
 tests/fate/jpeg2000.mak  | 3 +++
 tests/ref/fate/jpeg2000dec-p0_10 | 6 ++
 2 files changed, 9 insertions(+)
 create mode 100644 tests/ref/fate/jpeg2000dec-p0_10

diff --git a/tests/fate/jpeg2000.mak b/tests/fate/jpeg2000.mak
index 2969d2cf0a..a99b0c4e0c 100644
--- a/tests/fate/jpeg2000.mak
+++ b/tests/fate/jpeg2000.mak
@@ -42,6 +42,9 @@ fate-jpeg2000dec-p0_08: CMD = framecrc -flags +bitexact -auto_conversion_filters
 FATE_JPEG2000DEC += fate-jpeg2000dec-p0_09
 fate-jpeg2000dec-p0_09: CMD = framecrc -flags +bitexact -i $(TARGET_SAMPLES)/jpeg2000/itu-iso/codestreams_profile0/p0_09.j2k
 
+FATE_JPEG2000DEC += fate-jpeg2000dec-p0_10
+fate-jpeg2000dec-p0_10: CMD = framecrc -flags +bitexact -i $(TARGET_SAMPLES)/jpeg2000/itu-iso/codestreams_profile0/p0_10.j2k
+
 FATE_JPEG2000DEC += fate-jpeg2000dec-p0_11
 fate-jpeg2000dec-p0_11: CMD = framecrc -flags +bitexact -i $(TARGET_SAMPLES)/jpeg2000/itu-iso/codestreams_profile0/p0_11.j2k
 
diff --git a/tests/ref/fate/jpeg2000dec-p0_10 b/tests/ref/fate/jpeg2000dec-p0_10
new file mode 100644
index 00..16c4e5e39d
--- /dev/null
+++ b/tests/ref/fate/jpeg2000dec-p0_10
@@ -0,0 +1,6 @@
+#tb 0: 1/25
+#media_type 0: video
+#codec_id 0: rawvideo
+#dimensions 0: 64x64
+#sar 0: 0/1
+0,  0,  0,1,12288, 0x68638483
-- 
2.25.1


Re: [FFmpeg-devel] [PATCH v2] aarch64: Use cntvct_el0 as timer register on Android and macOS

2024-06-15 Thread Zhao Zhili



> On Jun 14, 2024, at 19:09, Martin Storsjö  wrote:
> 
> The default timer register pmccntr_el0 usually requires enabling
> access with e.g. a kernel module.
> 
> On macOS, using cntvct_el0 gives measurements with the same
> magnitude as mach_absolute_time (which is used currently), but
> possibly with a little less overhead/noise.
> ---
> cntvct_el0 should have less noise than mach_absolute_time or
> clock_gettime.
> 
> In one tested case, the cntvct_el0 timer has a frequency of 25 MHz
> (readable via the register cntfrq_el0).
> ---
> libavutil/aarch64/timer.h | 11 ++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/libavutil/aarch64/timer.h b/libavutil/aarch64/timer.h
> index fadc9568f8..922b0c5598 100644
> --- a/libavutil/aarch64/timer.h
> +++ b/libavutil/aarch64/timer.h
> @@ -24,7 +24,7 @@
> #include 
> #include "config.h"
> 
> -#if HAVE_INLINE_ASM && !defined(__APPLE__)
> +#if HAVE_INLINE_ASM
> 
> #define AV_READ_TIME read_time
> 
> @@ -33,7 +33,16 @@ static inline uint64_t read_time(void)
> uint64_t cycle_counter;
> __asm__ volatile(
> "isb   \t\n"
> +#if defined(__ANDROID__) || defined(__APPLE__)
> +// cntvct_el0 has lower resolution than pmccntr_el0, but is usually
> +// accessible from user space by default.
> +"mrs %0, cntvct_el0"
> +#else
> +// pmccntr_el0 has higher resolution, but is usually not accessible
> +// from user space by default (but access can be enabled with a custom
> +// kernel module).
> "mrs %0, pmccntr_el0   "
> +#endif
> : "=r"(cycle_counter) :: "memory" );
> 
> return cycle_counter;

LGTM, thanks!
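
To put the 25 MHz figure quoted above into perspective: one tick of such a counter corresponds to 1 / 25 MHz = 40 ns. A small, portable sketch of the conversion follows; `ticks_to_ns` is a hypothetical helper, not something in libavutil, and the frequency is passed in by the caller instead of being read from cntfrq_el0.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a tick count to nanoseconds for a counter running at freq_hz.
 * Whole seconds and the remainder are converted separately so that the
 * intermediate product stays within 64 bits for realistic frequencies. */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
    uint64_t sec = ticks / freq_hz;
    uint64_t rem = ticks % freq_hz;

    return sec * 1000000000ULL + rem * 1000000000ULL / freq_hz;
}
```

For example, `ticks_to_ns(3, 25000000)` gives 120, so differences shorter than 40 ns cannot be resolved with a 25 MHz counter.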


Re: [FFmpeg-devel] [PATCH 1/2] avutil/timer: define macos kperf as AV_READ_TIME

2024-06-15 Thread Zhao Zhili




> On Jun 12, 2024, at 23:22, Zhao Zhili  wrote:
> 
> From: Zhao Zhili 
> 
> Firstly, making ff_kperf_cycles an implementation of AV_READ_TIME
> avoids code duplication.
> 
> Secondly, this fixes a compilation error present since 6a18c0bc87e when
> macos-kperf is enabled. mach_time.h is included only when
> CONFIG_MACOS_KPERF is 0, so the build failed because mach_absolute_time
> was defined as AV_READ_TIME without including mach_time.h. Defining
> macos kperf as AV_READ_TIME fixes the issue.

Ping.

> ---
> libavutil/macos_kperf.c   |  8 +++-
> libavutil/macos_kperf.h   |  3 ++-
> libavutil/timer.h | 10 --
> tests/checkasm/checkasm.c |  8 
> 4 files changed, 5 insertions(+), 24 deletions(-)
> 
> diff --git a/libavutil/macos_kperf.c b/libavutil/macos_kperf.c
> index 9fb047..a0bc845fd3 100644
> --- a/libavutil/macos_kperf.c
> +++ b/libavutil/macos_kperf.c
> @@ -96,15 +96,13 @@ static void kperf_init(void)
> av_assert0(kpc_set_thread_counting(KPC_MASK) == 0);
> }
> 
> -void ff_kperf_init(void)
> +uint64_t ff_kperf_cycles(void)
> {
> static AVOnce init_static_once = AV_ONCE_INIT;
> +uint64_t counters[COUNTERS_COUNT];
> +
> ff_thread_once(&init_static_once, kperf_init);
> -}
> 
> -uint64_t ff_kperf_cycles(void)
> -{
> -uint64_t counters[COUNTERS_COUNT];
> if (kpc_get_thread_counters(0, COUNTERS_COUNT, counters)) {
> return -1;
> }
> diff --git a/libavutil/macos_kperf.h b/libavutil/macos_kperf.h
> index d039691340..40bbc616df 100644
> --- a/libavutil/macos_kperf.h
> +++ b/libavutil/macos_kperf.h
> @@ -21,7 +21,8 @@
> 
> #include 
> 
> -void ff_kperf_init(void);
> uint64_t ff_kperf_cycles(void);
> 
> +#define AV_READ_TIME ff_kperf_cycles
> +
> #endif /* AVUTIL_MACOS_KPERF_H */
> diff --git a/libavutil/timer.h b/libavutil/timer.h
> index 6bd6a0c645..16f2b1a96c 100644
> --- a/libavutil/timer.h
> +++ b/libavutil/timer.h
> @@ -142,16 +142,6 @@
> read(linux_perf_fd, &tperf, sizeof(tperf)); \
> TIMER_REPORT(id, tperf)
> 
> -#elif CONFIG_MACOS_KPERF
> -
> -#define START_TIMER \
> -uint64_t tperf; \
> -ff_kperf_init();\
> -tperf = ff_kperf_cycles();
> -
> -#define STOP_TIMER(id)  \
> -TIMER_REPORT(id, ff_kperf_cycles() - tperf);
> -
> #elif defined(AV_READ_TIME)
> #define START_TIMER \
> uint64_t tend;  \
> diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
> index 2329e2e1bc..28237b4d25 100644
> --- a/tests/checkasm/checkasm.c
> +++ b/tests/checkasm/checkasm.c
> @@ -775,12 +775,6 @@ static int bench_init_linux(void)
> }
> return 0;
> }
> -#elif CONFIG_MACOS_KPERF
> -static int bench_init_kperf(void)
> -{
> -ff_kperf_init();
> -return 0;
> -}
> #else
> static int bench_init_ffmpeg(void)
> {
> @@ -806,8 +800,6 @@ static int bench_init(void)
> {
> #if CONFIG_LINUX_PERF
> int ret = bench_init_linux();
> -#elif CONFIG_MACOS_KPERF
> -int ret = bench_init_kperf();
> #else
> int ret = bench_init_ffmpeg();
> #endif
> -- 
> 2.42.0
> 


