se PSHUFB
addition since SSSE3.
Now that we are trying to optimize with AVX, AVX2 and AVX512, I suggest we use the proposed
algorithm to get more performance.
Regards,
Min Chen
At 2021-09-28 13:34:03, "Wu Jianhua" wrote:
>With the accelerating by means of AVX2, the uyvytoyuv422 can be fast
Hello,
Excuse me, how about using FMADD on AVX2 platforms?
For example:
+mulps m7, m7, m14
+addps m0, m0, m7
==>
fmadd231ps m0,m7,m14
Regards,
Min Chen
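A note on the mulps/addps versus fmadd231ps suggestion above: the fused form rounds once instead of twice, so results may differ in the last bits from the two-instruction sequence. The sketch below only illustrates that rounding difference. It uses Python doubles rather than the packed single-precision floats in the patch, and the function names are made up for this illustration. Also note that FMA is reported as its own CPU feature, so a runtime check separate from AVX2 is probably needed.

```python
from fractions import Fraction


def mul_then_add(a: float, b: float, c: float) -> float:
    """Model of mulps followed by addps: the product is rounded, then the sum is rounded."""
    return a * b + c


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Model of a fused multiply-add: a*b + c is formed exactly and rounded once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # one rounding to the nearest double


# Inputs chosen so that the exact product 1 - 2**-60 rounds to 1.0.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

unfused = mul_then_add(a, b, c)      # the small term is lost in the first rounding
fused = fused_multiply_add(a, b, c)  # the small term survives
print(unfused, fused)
```

If a bit-exact match with the C reference is required, this difference is worth keeping in mind before switching instructions.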
2021-09-29 09:18:05,mindm...@gmail.com
>From: Mark Reid
>
>Only supports float and 16bit planer formats at the momom
Hello,
>+pb_shuffle_low: times 4 db 1, 3, 5, 7, 9, 11, 13, 15, -1, -1, -1, -1, -1, -1,
>-1, -1
Why do we need times 4?
AVX2 provides the VPBROADCASTQ instruction to load these constants into a SIMD
register.
Moreover, the U/V planes can also apply the same algorithm to improve.
Regards,
Min Chen
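For readers unfamiliar with the shuffle constant quoted above, here is a minimal model of what PSHUFB does with pb_shuffle_low inside one 128-bit lane. It is a sketch for illustration only: the function name pshufb and the test input are assumptions, not code from the patch. It relies on the UYVY byte order (U Y V Y), in which the odd bytes are luma.

```python
def pshufb(src: bytes, mask) -> bytes:
    """Model one 128-bit lane of PSHUFB.

    For each destination byte, a mask byte with its high bit set yields 0;
    otherwise the low four bits of the mask byte select a source byte.
    """
    assert len(src) == 16 and len(mask) == 16
    out = bytearray(16)
    for i, m in enumerate(mask):
        m &= 0xFF  # "db -1" is stored as the byte 0xFF
        out[i] = 0 if m & 0x80 else src[m & 0x0F]
    return bytes(out)


# The 16-byte pattern from the quoted patch line.
PB_SHUFFLE_LOW = [1, 3, 5, 7, 9, 11, 13, 15] + [-1] * 8

# 16 bytes of UYVY hold 8 pixels; bytes at odd offsets are the Y samples.
uyvy = bytes(range(16))
luma = pshufb(uyvy, PB_SHUFFLE_LOW)
print(list(luma))
```

The eight selected bytes land in the low half of the lane and the high half is zeroed, which is what the -1 entries are for.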
At 2021
At 2021-09-30 15:23:08, "Wu, Jianhua" wrote:
>Min Chen wrote:
>> Sent: Thursday, September 30, 2021 10:29 AM
>> To: FFmpeg development discussions and patches > de...@ffmpeg.org>
>> Subject: Re: [FFmpeg-devel] [PATCH v2 3/4] libswscale/x86/rgb2rgb:
Hi,
Glad to hear there is some optimized code for loongarch.
In my view, a remotely debuggable machine may help more people focus on loongarch
assembly code. For generic C/C++ code, qemu is enough.
Regards,
Min Chen
At 2021-11-02 20:51:43, wrote:
>Hello
>
>I am trying to add su
I inlined a few comments for ff_pred16x16_top_dc_neon_10; the others are similar.
At 2021-04-14 20:35:44, "Martin Storsjö" wrote:
>On Tue, 13 Apr 2021, Mikhail Nitenko wrote:
>
>> Benchmarks:
>> pred16x16_dc_10_c: 124.0
>> pred16x16_dc_10_neon: 97.2
>> pred16x16_horizontal_10_c: 71.7
>> pred16x16_horizo
> Begin forwarded message:
>
> From: chen
> Subject: [FFmpeg-devel] [PATCH] avfilter/vf_convolution: add 16-column operation
> for filter_column() to prepare for x86 SIMD.
> Date: December 2, 2019, GMT+8 11:36:50
> To: xuju...@sjtu.edu.cn
>
> In this case, modify in filter_slice(…) is unnec
I have a small suggestion on filter_column16(..) [the function].
First, the function name is easily confused with filter16_column(..).
Second, the function's algorithm works in the row direction, which means fewer
address calculation operations but worse cache performance; their cost may be more
than calculate cos
This is only a toy; it depends on the compiler.
On my PC, it helps my old compiler version generate movaps rather than movups.
At 2019-12-02 17:21:58, "Carl Eugen Hoyos" wrote:
>Am Mo., 2. Dez. 2019 um 08:33 Uhr schrieb chen :
>
>> +#define __assume(cond) do { if (!(cond))
comments inline in code
At 2019-12-03 15:52:07, xuju...@sjtu.edu.cn wrote:
>From: Xu Jun
>
>+; void filter_column(uint8_t *dst, int height,
>+; float rdiv, float bias, const int *const matrix,
>+; const uint8_t *c[], int length, int radius,
>+;
At 2019-12-04 08:59:08, "Song, Ruiling" wrote:
>> -Original Message-
>> From: ffmpeg-devel On Behalf Of
>> chen
>> Sent: Tuesday, December 3, 2019 4:59 PM
>> To: FFmpeg development discussions and patches > de...@ffmpeg.org>
>&
At 2019-12-04 16:51:52, "Paul B Mahol" wrote:
>On 12/4/19, Song, Ruiling wrote:
>>> -Original Message-
>>> From: ffmpeg-devel On Behalf Of
>>> chen
>>> >> At 2019-12-03 15:52:07, xuju...@sjtu.edu.cn wrote:
>>> >>
comments inlined
At 2019-12-22 16:37:03, xuju...@sjtu.edu.cn wrote:
>From: Xu Jun
>
>Performance improves about 10% compared to v1.
>
>Tested using this command:
>./ffmpeg_g -s 1280*720 -pix_fmt yuv420p -i test.yuv -vf convolution="1 2 3 4 5
>6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5
inline comment with prefix [MC]
At 2021-04-29 03:50:26, "Josh Dekker" wrote:
>From: Rafal Dabrowa
>
>Benchmarked on Apple M1:
>
>put_hevc_epel_bi_h4_8_c: 69.9
>put_hevc_epel_bi_h4_8_neon: 15.4
>put_hevc_epel_bi_h6_8_c: 137.1
>put_hevc_epel_bi_h6_8_neon: 31.9
>put_hevc_epel_bi_h8_8_c: 124.6
>put_
A small update:
Only 163 sequences passed because they had not updated their CMakeLists.txt.
I have updated CMakeLists.txt and these patches in my GitHub tree as well;
I now find that 81.93% (195/238) pass.
Moreover, there are two field video clips; the decoder works, but output as
sep
At 2020-12-23 23:38:18, "Nuo Mi" wrote:
>On Wed, Dec 23, 2020 at 10:00 PM Lynne wrote:
>
>> Dec 23, 2020, 14:07 by nuomi2...@gmail.com:
>>
>> > Hi Lynne & James,
>> > Do not worry about the dav1d things that happened on vvcdec. It just a
>> > reference code like libaom.
>> >
>>
>> libaom does
In my evaluation, RISC-V code density is 60% of ARM's; with the
C extension it rises to 80%.
That may be a big problem for running a large ffmpeg on real products, but it leaves us more
room to improve ffmpeg on it.
At 2021-01-11 04:21:07, "Kieran Kunhya" wrote:
>Hello,
>
>Lynne has suggested on IRC th
From 4067c58be8e719a55d89e68aaa9d3db19b88b32f Mon Sep 17 00:00:00 2001
From: Chen
Date: Fri, 8 Nov 2024 22:21:19 -0800
Subject: [PATCH] Fix memory leak in the libx265
---
libavcodec/libx265.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/libavcodec/libx265.c
On Sat, May 18, 2024 at 9:04 AM Ronald S. Bultje wrote:
> Hi,
>
> On Tue, May 14, 2024 at 4:40 PM Stone Chen
> wrote:
>
>> Implements AVX2 DMVR (decoder-side motion vector refinement) SAD
>> functions. DMVR SAD is only calculated if w >= 8, h >= 8, and w * h &g
On Sat, May 18, 2024 at 11:33 AM Ronald S. Bultje
wrote:
> Hi,
>
> On Tue, May 14, 2024 at 4:40 PM Stone Chen
> wrote:
>
>> +vvc_sad_8:
>> +.loop_height:
>> +movu xm0, [src1q]
>> +movu xm1, [src2q]
>
codec/x86/vvc/vvc_sad.asm
new file mode 100644
index 00..58a24635d2
--- /dev/null
+++ b/libavcodec/x86/vvc/vvc_sad.asm
@@ -0,0 +1,138 @@
+; /*
+; * Provide SIMD DMVR SAD functions for VVC decoding
+; *
+; * Copyright (c) 2024 Stone Chen
+; *
+; * This file is part of FFmpeg.
+; *
+; * FFmpe
Adds checkasm for DMVR SAD AVX2 implementation.
Benchmarks ( AMD 7940HS )
vvc_sad_8x8_c: 70.0
vvc_sad_8x8_avx2: 10.0
vvc_sad_16x16_c: 280.0
vvc_sad_16x16_avx2: 20.0
vvc_sad_32x32_c: 1020.0
vvc_sad_32x32_avx2: 70.0
vvc_sad_64x64_c: 3560.0
vvc_sad_64x64_avx2: 270.0
vvc_sad_128x128_c: 13760.0
vvc_sad
codec/x86/vvc/vvc_sad.asm
new file mode 100644
index 00..9766446b11
--- /dev/null
+++ b/libavcodec/x86/vvc/vvc_sad.asm
@@ -0,0 +1,130 @@
+; /*
+; * Provide SIMD DMVR SAD functions for VVC decoding
+; *
+; * Copyright (c) 2024 Stone Chen
+; *
+; * This file is part of FFmpeg.
+; *
+; * FFmpeg is
Adds checkasm for DMVR SAD AVX2 implementation.
Benchmarks ( AMD 7940HS )
vvc_sad_8x8_c: 50.3
vvc_sad_8x8_avx2: 0.3
vvc_sad_16x16_c: 250.3
vvc_sad_16x16_avx2: 10.3
vvc_sad_32x32_c: 1020.3
vvc_sad_32x32_avx2: 60.3
vvc_sad_64x64_c: 3850.3
vvc_sad_64x64_avx2: 220.3
vvc_sad_128x128_c: 14100.3
vvc_sad_
On Mon, May 20, 2024 at 7:23 AM Ronald S. Bultje wrote:
> Hi,
>
> This is mostly good, the following is tiny nitpicks.
>
> On Sun, May 19, 2024 at 8:46 PM Stone Chen
> wrote:
>
>> +%macro INIT_OFFSET 6 ; src1, src2, dxq, dyq, off1, off2
>>
>
> The macro is
On Thu, May 23, 2024 at 9:18 AM Nuo Mi wrote:
> On Thu, May 23, 2024 at 7:38 AM James Almer wrote:
>
> > On 5/21/2024 10:01 PM, Ronald S. Bultje wrote:
> > > Hi,
> > >
> > > On Tue, May 21, 2024 at 8:01 PM Stone Chen
> > wrote:
> > >
&
According to the VVC specification (section 8.5.1), the maximum width/height of
a subblock passed for DMVR SAD is 16. This along with previous constraint
requiring width * height >= 128 means that 8x16, 16x8, and 16x16 are the only
allowed sizes. This re-labels vvc_sad_16_128 to vvc_sad_16 to r
According to the VVC specification (section 8.5.1), the maximum width/height of
a subblock passed for DMVR SAD is 16. This along with previous constraint
requiring width * height >= 128 means that 8x16, 16x8, and 16x16 are the only
allowed sizes.
This changes check_vvc_sad() to only test and b
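The size constraint described above can be checked by enumeration. This is a minimal sketch, assuming block dimensions are powers of two and that the minimum dimension is 8 (the w >= 8, h >= 8 condition quoted earlier in the thread). The function names are hypothetical, and sad() is the plain textbook sum of absolute differences, not the FFmpeg kernel.

```python
def allowed_dmvr_sad_sizes(min_dim=8, max_dim=16, min_area=128):
    """List (width, height) pairs that satisfy the DMVR SAD constraints."""
    sizes = []
    w = min_dim
    while w <= max_dim:
        h = min_dim
        while h <= max_dim:
            if w * h >= min_area:
                sizes.append((w, h))
            h *= 2  # assumption: dimensions are powers of two
        w *= 2
    return sizes


def sad(block1, block2):
    """Textbook sum of absolute differences over two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row1, row2 in zip(block1, block2)
        for a, b in zip(row1, row2)
    )


sizes = allowed_dmvr_sad_sizes()
print(sizes)
```

Under these assumptions 8x8 drops out because its area is 64, leaving exactly the three sizes named in the commit message.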
rames, so we can send several frames to HW at once to increase
performance. I have now changed them to be called in an
asynchronous way, which will make better use of hardware.
1080p transcoding increases about 17% fps on my environment.
Signed-off-by: Wenbin Chen
---
libavcodec/vaapi_encode.c
"wait=1" means wait until the operation is ready. "wait=0" means
query the operation's status: if it is ready, return 0; if it is still in progress,
return EAGAIN.
Signed-off-by: Wenbin Chen
---
libavcodec/vaapi_encode.c | 47 +--
1 file changed, 40 insertions(+), 7
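The wait=1 / wait=0 behaviour described in this commit message can be modelled roughly as follows. This is a sketch under stated assumptions, not the code in vaapi_encode.c: FakeOperation and get_status are hypothetical names, the hardware job is simulated by a counter, and EAGAIN is returned as a negative errno value in the style of FFmpeg's AVERROR() on POSIX systems.

```python
import errno


class FakeOperation:
    """Hypothetical stand-in for a hardware job that becomes ready after N polls."""

    def __init__(self, polls_until_ready):
        self.remaining = polls_until_ready

    def poll(self):
        """Advance the simulated job by one step and report whether it is ready."""
        if self.remaining > 0:
            self.remaining -= 1
        return self.remaining == 0


def get_status(op, wait):
    """wait=1: block until the operation is ready, then return 0.
    wait=0: query once; return 0 if ready, otherwise -EAGAIN."""
    if wait:
        while not op.poll():
            pass  # a real implementation would block in the driver here
        return 0
    return 0 if op.poll() else -errno.EAGAIN


op = FakeOperation(polls_until_ready=3)
print(get_status(op, wait=0))  # still in progress
print(get_status(FakeOperation(5), wait=1))  # blocks until ready
```

A caller using wait=0 is expected to retry later when it sees the EAGAIN result, which is what lets several operations stay in flight.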
es) with -async_depth=4 can increase 20%
performance on my environment.
The async increases performance but also introduces frame delay.
Signed-off-by: Wenbin Chen
---
libavcodec/vaapi_encode.c | 20 +++-
libavcodec/vaapi_encode.h | 12 ++--
2 files changed, 25 insertio
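The trade-off stated above (more throughput, but added frame delay) follows from keeping several frames in flight. The toy model below is only an illustration under assumptions: AsyncEncoderModel is a hypothetical class, not FFmpeg's encoder, and it simply holds up to async_depth frames before handing back the oldest one.

```python
from collections import deque


class AsyncEncoderModel:
    """Toy FIFO model: up to async_depth frames are in flight at once."""

    def __init__(self, async_depth):
        self.async_depth = async_depth
        self.in_flight = deque()

    def send_frame(self, frame):
        """Submit a frame; return the oldest finished frame once the queue is full."""
        self.in_flight.append(frame)
        if len(self.in_flight) < self.async_depth:
            return None  # nothing is waited on yet, so no output
        return self.in_flight.popleft()

    def flush(self):
        """Drain whatever is still in flight at end of stream."""
        while self.in_flight:
            yield self.in_flight.popleft()


enc = AsyncEncoderModel(async_depth=4)
outputs = [enc.send_frame(i) for i in range(6)]
tail = list(enc.flush())
print(outputs, tail)
```

In this model the first output only appears with the fourth submitted frame, so the delay is async_depth - 1 frames, and a depth of 1 degenerates to the synchronous case.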
ffer to
> reorder frames, so we can send several frames to HW at once to increase
> performance. I have now changed them to be called in an
> asynchronous way, which will make better use of hardware.
> 1080p transcoding increases about 17% fps on my environment.
>
> Signed-off-b
or the async_depth
> option. "wait=1" means wait until operation ready. "wait=0" means
> query operation's status. If ready return 0, if still in progress
> return EAGAIN.
>
> Signed-off-by: Wenbin Chen
> ---
> libavcodec/vaapi_encode.c | 47
> 1080p transcoding (no B frames) with -async_depth=4 can increase 20%
> performance on my environment.
> The async increases performance but also introduces frame delay.
>
> Signed-off-by: Wenbin Chen
> ---
> libavcodec/vaapi_encode.c | 20 +++-
> libavco
Add nb_surfaces to AVD3D11VAFramesContext at the end of the structure
to support a flexible size for this array and to align with
AVDXVA2FramesContext and AVVAAPIFramesContext.
Signed-off-by Wenbin Chen
---
libavutil/hwcontext_d3d11va.c | 3 +--
libavutil/hwcontext_d3d11va.h | 2 ++
2 files changed
ternal
but handle_pairs_internal is allocated with the size of init_pool_size.
This leads to access to an illegal address.
Now change it to use nb_surfaces to allocate handle_pairs_internal, and the
core dump error no longer occurs. Also change D3D11VA to use nb_surfaces
to align with VAAPI and DXVA2.
Signed-off-by: W
-i input.264 \
-vf "hwmap=derive_device=qsv,format=qsv" -c:v h264_qsv output.264
Signed-off-by: nyanmisaka
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_qsv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavutil/hwcontext_qsv.c b/libavutil/hwcontex
qsv -hwaccel_output_format qsv -hwaccel_device qs -c:v h264_qsv \
-i input.264 -vf "hwmap=derive_device=opencl,format=opencl,avgblur_opencl, \
hwmap=derive_device=qsv:reverse=1:extra_hw_frames=32,format=qsv" \
-c:v h264_qsv output.264
Signed-off-by: nyanmisaka
Signed-off-by: Wenbin Chen
-
; i < kernel.length(); i++) {
> > )
> > +C(2, sum += texture(input_image[index], pos + vec2(0.0, i)) *
> kernel[i]; )
> > +C(2, sum += texture(input_image[index], pos - vec2(0.0, i)) *
> kernel[i]; )
> > +
From: Bas Nieuwenhuizen
This way we can pass explicit modifiers in. Sometimes the
modifier matters for the number of memory planes that
libva accepts, in particular when dealing with
driver-compressed textures. Furthermore the driver might
not actually be able to determine the implicit modifier
i
this change will not affect current vulkan behaviour.
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_vulkan.c | 12 +++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index 6041580117..ccf3e58f49 100644
--- a
Vulkan will map nv12 to R8 and GR88, so add this map to vaapi to support
vulkan frame.
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_vaapi.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/libavutil/hwcontext_vaapi.c b/libavutil/hwcontext_vaapi.c
index 75acc851d6..994b744e4d 100644
sem_sig_val is wrongly assigned to pWaitSemaphoreValues when exporting to DRM. Fix
it.
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_vulkan.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index b857d1a9ed
offset of each plane.
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_vulkan.c | 46 +++-
libavutil/hwcontext_vulkan.h | 1 +
2 files changed, 46 insertions(+), 1 deletion(-)
diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index
Add hwupload and hwdownload support to vulkan when frames are allocated
in one memory
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_vulkan.c | 10 --
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index
.264 -vf "hwmap=derive_device=vulkan,format=vulkan, \
scale_vulkan=1920:1080,hwmap=derive_device=vaapi,format=vaapi" -c:v h264_vaapi
output.264
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_vulkan.c | 76 +---
libavutil/hwcontext_vulkan.h | 5 ++
> Add nb_surfaces to AVD3D11VAFramesContext at the end of the structure
> to support a flexible size for this array and to align with
> AVDXVA2FramesContext and AVVAAPIFramesContext.
>
> Signed-off-by Wenbin Chen
> ---
> libavutil/hwcontext_d3d11va.c | 3 +--
> libavutil
use nb_surfaces
> to align to VAAPI and DXVA2.
>
> Signed-off-by: Wenbin Chen
> ---
> libavutil/hwcontext_qsv.c | 13 ++---
> 1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/libavutil/hwcontext_qsv.c b/libavutil/hwcontext_qsv.c
> index c18747f7
vice /dev/dri/renderD128 \
> -hwaccel_output_format vaapi -i input.264 \
> -vf "hwmap=derive_device=qsv,format=qsv" -c:v h264_qsv output.264
>
> Signed-off-by: nyanmisaka
> Signed-off-by: Wenbin Chen
> ---
> libavutil/hwcontext_qsv.c | 2 +-
> 1 file changed, 1 insertion(+), 1
put.264
>
> Signed-off-by: nyanmisaka
> Signed-off-by: Wenbin Chen
> ---
> libavutil/hwcontext_opencl.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/libavutil/hwcontext_opencl.c b/libavutil/hwcontext_opencl.c
> index 26a3a24593..4b6e74ff6
> 9 Nov 2021, 10:18 by wenbin.c...@intel.com:
>
> > sem_sig_val is wrongly assigned to pWaitSemaphoreValues when export
> drm. Now fix
> > it.
> >
> > Signed-off-by: Wenbin Chen <> wenbin.c...@intel.com> >
> >
>
> Thanks for spotting this
> > -Original Message-
> > From: ffmpeg-devel On Behalf Of
> > Chen, Wenbin
> > Sent: Wednesday, November 10, 2021 4:03 AM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: Re: [FFmpeg-devel] [PATCH 2/4] libavutil/hwcontext_qsv: fix
> > a bug when m
> > -Original Message-
> > From: ffmpeg-devel On Behalf Of
> > Chen, Wenbin
> > Sent: Wednesday, November 10, 2021 4:03 AM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: Re: [FFmpeg-devel] [PATCH 1/4] libavutil/hwcontext_d3d11va:
> >
o allocate memory according to one_memory flag.
> > A new variable is added to AVVKFrame to store the offset of each plane.
> >
> > Signed-off-by: Wenbin Chen
> > ---
> > libavutil/hwcontext_vulkan.c | 46
> +++-
> > libavuti
ne works:
> >
> > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 -
> hwaccel_output_format \
> > vaapi -i input_1080p.264 -vf "hwmap=derive_device=vulkan,format=vulkan,
> \
> > scale_vulkan=1920:1080,hwmap=derive_device=vaapi,format=vaapi" -c:v
>
ces). Now add code to make sure
init_pool_size is only set once. Now the following commandline works:
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format vaapi -i input.264 \
-vf "hwmap=derive_device=qsv,format=qsv" \
-c:v h264_qsv output.264
Signed-off-by
-i input.264 \
-vf "hwmap=derive_device=qsv,format=qsv" -c:v h264_qsv output.264
Signed-off-by: nyanmisaka
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_qsv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavutil/hwcontext_qsv.c b/libavutil/hwcontex
qsv -hwaccel_output_format qsv -hwaccel_device qs -c:v h264_qsv \
-i input.264 -vf "hwmap=derive_device=opencl,format=opencl,avgblur_opencl, \
hwmap=derive_device=qsv:reverse=1:extra_hw_frames=32,format=qsv" \
-c:v h264_qsv output.264
Signed-off-by: nyanmisaka
Signed-off-by: Wenbin Chen
-
ording to one_memory flag.
> >>>> > A new variable is added to AVVKFrame to store the offset of each
> plane.
> >>>> >
> >>>> > Signed-off-by: Wenbin Chen
> >>>> > ---
> >>>> > libavutil/hwcontext_vulka
> > -Original Message-
> > From: ffmpeg-devel On Behalf Of
> > Xiang, Haihao
> > Sent: Monday, October 18, 2021 6:48 AM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: Re: [FFmpeg-devel] [PATCH v3 1/1] avutils/hwcontext: When
> > deriving a hwdevice, search for existing device in both directi
evice=qsv,format=qsv" \
> -c:v h264_qsv output.264
>
> Signed-off-by: Wenbin Chen
> ---
> libavcodec/vaapi_decode.c | 34 ++
> 1 file changed, 18 insertions(+), 16 deletions(-)
>
> diff --git a/libavcodec/vaapi_decode.c b/lib
Vulkan will map nv12 to R8 and GR88, so add this map to vaapi to support
vulkan frame.
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_vaapi.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/libavutil/hwcontext_vaapi.c b/libavutil/hwcontext_vaapi.c
index 75acc851d6..994b744e4d 100644
A new flag frame_flag is also added to AVVulkanFramesContext. User
can use this flag to force enable or disable this behaviour.
A new variable "offset" is added to AVVKFrame. It describes the
offset from the memory currently bound to the VkImage.
Signed-off-by: Wenbin Chen
---
libavuti
From: Bas Nieuwenhuizen
This way we can pass explicit modifiers in. Sometimes the
modifier matters for the number of memory planes that
libva accepts, in particular when dealing with
driver-compressed textures. Furthermore the driver might
not actually be able to determine the implicit modifier
i
Add hwupload and hwdownload support to vulkan when frames are allocated
in one memory
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_vulkan.c | 12 ++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index
.264 -vf "hwmap=derive_device=vulkan,format=vulkan, \
scale_vulkan=1920:1080,hwmap=derive_device=vaapi,format=vaapi" -c:v h264_vaapi
output.264
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_vulkan.c | 130 +--
1 file changed, 124 insertions(+), 6
> 24 Nov 2021, 06:28 by wenbin.c...@intel.com:
>
> > From: Bas Nieuwenhuizen
> >
> > This way we can pass explicit modifiers in. Sometimes the
> > modifier matters for the number of memory planes that
> > libva accepts, in particular when dealing with
> > driver-compressed textures. Furthermore t
VVKFrame. It describes the
> > offset from the memory currently bound to the VkImage.
> >
> > Signed-off-by: Wenbin Chen
> > ---
> > libavutil/hwcontext_vulkan.c | 62
> ++--
> > libavutil/hwcontext_vulkan.h | 22
From: Bas Nieuwenhuizen
This way we can pass explicit modifiers in. Sometimes the
modifier matters for the number of memory planes that
libva accepts, in particular when dealing with
driver-compressed textures. Furthermore the driver might
not actually be able to determine the implicit modifier
i
Vulkan will map nv12 to R8 and GR88, so add this map to vaapi to support
vulkan frame.
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_vaapi.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/libavutil/hwcontext_vaapi.c b/libavutil/hwcontext_vaapi.c
index 75acc851d6..994b744e4d 100644
A new flag frame_flag is also added to AVVulkanFramesContext. User
can use this flag to force enable or disable this behaviour.
A new variable "offset" is added to AVVKFrame. It describes the
offset from the memory currently bound to the VkImage.
Signed-off-by: Wenbin Chen
---
libavuti
Add support to map vulkan frames to software frames when
using contiguous_planes flag.
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_vulkan.c | 11 +--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index
.264 -vf "hwmap=derive_device=vulkan,format=vulkan, \
scale_vulkan=1920:1080,hwmap=derive_device=vaapi,format=vaapi" -c:v h264_vaapi
output.264
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_vulkan.c | 127 +--
1 file changed, 121 insertions(+), 6
VVKFrame. It describes the
> > offset from the memory currently bound to the VkImage.
> >
> > Signed-off-by: Wenbin Chen
> >
>
> Why is a new offset variable needed?
> vkGetImageSubresourceLayout is valid for DRM tiled images.
> According to the spe
> Quoting Wenbin Chen (2021-11-16 09:16:23)
> > From: nyanmisaka
> >
> > mfxHDLPair was added to qsv, so modify qsv->opencl map function as well.
> > Now the following commandline works:
> >
> > ffmpeg -v verbose -init_hw_device vaapi=va:/dev/dri/
ll be enabled.
> A new flag frame_flag is also added to AVVulkanFramesContext. User
> can use this flag to force enable or disable this behaviour.
> A new variable "offset" is added to AVVKFrame. It describes the
> offset from the memory currently bound to
> Quoting Wenbin Chen (2021-11-30 07:28:13)
> > diff --git a/libavutil/hwcontext_vulkan.h b/libavutil/hwcontext_vulkan.h
> > index fdf2a60156..c485ee7437 100644
> > --- a/libavutil/hwcontext_vulkan.h
> > +++ b/libavutil/hwcontext_vulkan.h
> > @@ -35,6 +35,17 @@
>
area with border pixels to fix this
run-to-run problem, and also move the new AVFrame to a global structure
to reduce redundant allocation operations and increase performance.
Signed-off-by: Wenbin Chen
---
libavutil/hwcontext_qsv.c | 96 +--
1 file changed, 83 insert
"offset" is added to AVVKFrame. It describes the
> >> offset from the memory currently bound to the VkImage.
> >>
> >> Signed-off-by: Wenbin Chen
> >> ---
> >> libavutil/hwcontext_vulkan.c | 68
> >>
Thanks for reviewing this patch.
Do you mean this should be merged with the change to the vf_vpp_qsv file
and sent as a single patch?
On Mon, Oct 16, 2023 at 3:51 PM Xiang, Haihao wrote:
>
> On Sa, 2023-09-23 at 23:36 +0800, Chen Yufei wrote:
> > Signed-off-by
occur multiple times?
On Mon, Oct 16, 2023 at 4:05 PM Xiang, Haihao wrote:
>
> On Sa, 2023-09-23 at 23:36 +0800, Chen Yufei wrote:
> > Usage: "vpp_qsv=lut3d_file="
> >
> > Only enabled with VAAPI because using VASurface to store 3D LUT.
> >
> > Signed-off-by