Signed-off-by: Ruiling Song
---
This filter runs about 2x faster on the integrated GPU than nlmeans does on my
Skylake CPU.
Would anybody like to comment?
Ruiling
configure | 1 +
doc/filters.texi| 4 +
libavfilter/Makefile| 1 +
libavfilter
Signed-off-by: Ruiling Song
---
libavfilter/opencl.h | 38 ++
1 file changed, 38 insertions(+)
diff --git a/libavfilter/opencl.h b/libavfilter/opencl.h
index 0b06232ade..0fa5b49d3f 100644
--- a/libavfilter/opencl.h
+++ b/libavfilter/opencl.h
@@ -73,6 +73,44
Signed-off-by: Ruiling Song
---
configure | 1 +
doc/filters.texi| 4 +
libavfilter/Makefile| 1 +
libavfilter/allfilters.c| 1 +
libavfilter/opencl/nlmeans.cl | 115 +
libavfilter/opencl_source.h | 1
Processing several columns together, instead of one column at a time,
gives about 30% better performance.
Signed-off-by: Ruiling Song
---
Below are some performance numbers (fps) on my i7-6770HQ (decode + gblur):
resolution:480p | 720p | 1080p | 4k
without patch: 393
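A minimal sketch of the "several columns together" idea, assuming a simplified first-order recursive filter rather than the actual gblur code. The names vertical_pass_columns(), vertical_pass() and COLUMNS_PER_BATCH are hypothetical. Walking a batch of adjacent columns row by row presumably helps because the columns share the memory touched in each row.

```c
#include <assert.h>
#include <stddef.h>

#define COLUMNS_PER_BATCH 8 /* hypothetical batch width, not taken from the patch */

/* Run a first-order recursive filter down and back up `ncols` adjacent
 * columns at once. Each row is visited once per direction for the whole batch. */
static void vertical_pass_columns(float *buf, int height, ptrdiff_t stride,
                                  int ncols, float nu, float bscale)
{
    int x, y;

    assert(height > 0 && ncols > 0);

    /* Filter downwards. */
    for (x = 0; x < ncols; x++)
        buf[x] *= bscale;
    for (y = 1; y < height; y++)
        for (x = 0; x < ncols; x++)
            buf[y * stride + x] += nu * buf[(y - 1) * stride + x];

    /* Filter upwards. */
    for (x = 0; x < ncols; x++)
        buf[(height - 1) * stride + x] *= bscale;
    for (y = height - 1; y > 0; y--)
        for (x = 0; x < ncols; x++)
            buf[(y - 1) * stride + x] += nu * buf[y * stride + x];
}

/* Split the plane into batches of adjacent columns. */
static void vertical_pass(float *buf, int width, int height, ptrdiff_t stride,
                          float nu, float bscale)
{
    int x;

    for (x = 0; x < width; x += COLUMNS_PER_BATCH) {
        int ncols = width - x < COLUMNS_PER_BATCH ? width - x : COLUMNS_PER_BATCH;
        vertical_pass_columns(buf + x, height, stride, ncols, nu, bscale);
    }
}
```

Calling vertical_pass_columns() with ncols set to 1 gives the column-by-column variant, so the two approaches can be compared on the same data.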
Signed-off-by: Ruiling Song
---
libavfilter/unsharp.h| 4 +-
libavfilter/vf_unsharp.c | 98 ++--
2 files changed, 78 insertions(+), 24 deletions(-)
diff --git a/libavfilter/unsharp.h b/libavfilter/unsharp.h
index caff986fc1..a60b30f31a 100644
--- a
ctx is a pointer to pointer here.
Signed-off-by: Ruiling Song
---
libavutil/tx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavutil/tx.c b/libavutil/tx.c
index 934ef27c81..2bf4aa1c28 100644
--- a/libavutil/tx.c
+++ b/libavutil/tx.c
@@ -697,7 +697,7 @@ static int
ctx is a pointer to pointer here.
Signed-off-by: Ruiling Song
---
libavutil/tx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavutil/tx.c b/libavutil/tx.c
index 934ef27c81..1690604040 100644
--- a/libavutil/tx.c
+++ b/libavutil/tx.c
@@ -697,7 +697,7 @@ static int
Benchmarked with a simple command:
ffmpeg -i 1080p.mp4 -vf unsharp=la=3:ca=3 -an -f null /dev/null
With the patch, the fps increases from 50 to 120 on my local machine (i7-6770HQ).
v2:
Make av_image_copy_plane() copy only the per-slice content.
Signed-off-by: Ruiling Song
---
libavfilter/unsharp.h
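A minimal sketch of copying only per-slice content, assuming the usual way a plane is split into row ranges per job. The helper copy_plane_slice() is hypothetical and uses plain memcpy() in place of av_image_copy_plane(); it is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy only the rows that belong to job `jobnr` out of `nb_jobs`,
 * so that concurrent jobs never touch each other's rows. */
static void copy_plane_slice(uint8_t *dst, ptrdiff_t dst_linesize,
                             const uint8_t *src, ptrdiff_t src_linesize,
                             int bytewidth, int height,
                             int jobnr, int nb_jobs)
{
    int slice_start, slice_end, y;

    assert(nb_jobs > 0 && jobnr >= 0 && jobnr < nb_jobs);

    /* Row range [slice_start, slice_end) handled by this job. */
    slice_start = (height * jobnr) / nb_jobs;
    slice_end   = (height * (jobnr + 1)) / nb_jobs;

    for (y = slice_start; y < slice_end; y++)
        memcpy(dst + y * dst_linesize, src + y * src_linesize, bytewidth);
}
```

With a height of 5 and two jobs, job 0 copies rows 0 to 1 and job 1 copies rows 2 to 4, so together the jobs cover the whole plane exactly once.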
1080p.mp4 -vf gblur=threads=1 -f null /dev/null
For single thread, the fps improves from 43 to 60, about 40%.
For multi-thread, the fps improves from 110 to 130, about 20%.
Signed-off-by: Ruiling Song
---
libavfilter/gblur.h | 54 ++
libavfilter/vf_gblur.c | 66
, about 40%.
For multi-thread, the fps improves from 110 to 130, about 20%.
v2:
Fix the bug when steps is not one.
Signed-off-by: Ruiling Song
---
libavfilter/gblur.h | 55 ++
libavfilter/vf_gblur.c | 71 ++---
libavfilter/x86/Makefile| 2
Signed-off-by: Ruiling Song
---
tests/checkasm/Makefile | 1 +
tests/checkasm/checkasm.c | 3 ++
tests/checkasm/checkasm.h | 1 +
tests/checkasm/vf_gblur.c | 67 +++
tests/fate/checkasm.mak | 1 +
5 files changed, 73 insertions(+)
create mode 100644
, about 40%.
For multi-thread, the fps improves from 110 to 130, about 20%.
v2:
Fix the bug when steps is not one.
v3:
Fix the bug where the upper half of a 64-bit register used to pass an 'int'
argument may contain garbage.
Signed-off-by: Ruiling Song
---
libavfilter/gblur.h
rease from 151 to 270 on my local machine.
Signed-off-by: Ruiling Song
---
libavfilter/convolution.h | 64 +++
libavfilter/vf_convolution.c | 41 +--
libavfilter/x86/Makefile | 2 +
libavfilter/x86/vf_convolution.as
rease from 151 to 270 on my local machine.
Signed-off-by: Ruiling Song
---
v2:
fix a bug in scalar code path.
Use macro PROCESS_V/S for the first tap to simplify code.
libavfilter/convolution.h | 64 +++
libavfilter/vf_convolution.c | 41 +--
libavfilter/x8
Signed-off-by: Danil Iashchenko
Signed-off-by: Ruiling Song
---
It seems Danil has not been working on this recently,
so I am re-submitting this patch to address the comments on overlay_opencl.
Thanks!
Ruiling
doc/filters.texi | 486 +++
1 file changed
Signed-off-by: Ruiling Song
---
doc/filters.texi | 96
1 file changed, 96 insertions(+)
diff --git a/doc/filters.texi b/doc/filters.texi
index 83df460..f884ba4 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -16387,6 +16387,7
The main input may have an alpha channel; we just ignore it.
Also add some checks for incompatible input formats.
Signed-off-by: Ruiling Song
---
libavfilter/vf_overlay_opencl.c | 58 -
1 file changed, 46 insertions(+), 12 deletions(-)
diff --git a
Since the filter auto-calculates the peak value,
the option does not work as expected, so remove it.
Signed-off-by: Ruiling Song
---
libavfilter/vf_tonemap_opencl.c | 7 ++-
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/libavfilter/vf_tonemap_opencl.c b/libavfilter
Signed-off-by: Ruiling Song
---
configure | 1 +
libavfilter/Makefile | 1 +
libavfilter/allfilters.c | 1 +
libavfilter/opencl/transpose.cl | 35 +
libavfilter/opencl_source.h | 1 +
libavfilter/transpose.h | 34
As these functions are moved to a shared file, other colorspace-related
filters can also use the code.
Signed-off-by: Ruiling Song
---
libavfilter/colorspace.c| 71 +
libavfilter/colorspace.h| 4 ++
libavfilter/opencl
s_ctx same as the destination hw_frames_ctx. But I
think that if we are trying to map to the same device as the original
device_ctx, then we can just do the unmap.
Signed-off-by: Ruiling Song
---
I am not sure whether there are any concerns or side effects of doing it this way.
The first idea came up to fix
This patch fixes the issue with the second hwmap filter in a chain like:
[vaapi_frame] hwmap [software filters] hwmap [vaapi_frame]
In such a case, we also need to allocate the hardware frame
and map it back to software.
Signed-off-by: Ruiling Song
---
libavfilter/vf_hwmap.c | 125
Signed-off-by: Ruiling Song
---
libswscale/swscale.c | 16 +---
libswscale/swscale_internal.h | 5 +
libswscale/x86/swscale.c | 3 +--
3 files changed, 3 insertions(+), 21 deletions(-)
diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index 8436f056d4
vaapi \
-i INPUT -vf 'tonemap_vaapi=format=p010' -c:v hevc_vaapi -profile 2 OUTPUT
Signed-off-by: Xinpeng Sun
Signed-off-by: Zachary Zhou
Signed-off-by: Ruiling Song
---
When I rethought the documentation part, I found it is not necessary to repeat
how to set up vaapi device in this f
Signed-off-by: Ruiling Song
---
configure| 2 +-
libavutil/hwcontext_opencl.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/configure b/configure
index c2b8fac..48fdc8e 100755
--- a/configure
+++ b/configure
@@ -6427,7 +6427,7 @@ fi
if enabled_all op
These functions can be reused by other colorspace filters,
so move them to a common file. No functional changes.
Signed-off-by: Ruiling Song
---
libavfilter/colorspace.c| 71
libavfilter/colorspace.h| 4 +++
libavfilter/vf_colorspace.c | 80
Signed-off-by: Ruiling Song
---
libavfilter/opencl/colorspace_common.cl | 25 -
libavfilter/vf_tonemap_opencl.c | 64 +++--
2 files changed, 29 insertions(+), 60 deletions(-)
diff --git a/libavfilter/opencl/colorspace_common.cl
b/libavfilter
Some filters do not need linearize/delinearize and thus
will not even define them. Add an ifdef check so they can easily
reuse the .cl file.
Signed-off-by: Ruiling Song
---
libavfilter/opencl/colorspace_common.cl | 14 --
1 file changed, 12 insertions(+), 2 deletions(-)
diff
This is used to print a 3x3 matrix into a part of OpenCL
source code.
Signed-off-by: Ruiling Song
---
libavfilter/opencl.c | 13 +
libavfilter/opencl.h | 8
2 files changed, 21 insertions(+)
diff --git a/libavfilter/opencl.c b/libavfilter/opencl.c
index ac5eec6..95f0bfc
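A minimal sketch of printing a 3x3 matrix as a fragment of OpenCL source, assuming a plain character buffer and snprintf(). The function print_const_matrix_3x3() and the output layout are hypothetical and do not reproduce the helper added by the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write `mat` into `buf` as an OpenCL __constant array declaration.
 * Returns 0 on success, -1 if the buffer is too small. */
static int print_const_matrix_3x3(char *buf, size_t size,
                                  const char *name, double mat[3][3])
{
    size_t pos;
    int i, n;

    assert(buf && name && mat);

    n = snprintf(buf, size, "__constant float %s[9] = {\n", name);
    if (n < 0 || (size_t)n >= size)
        return -1;
    pos = (size_t)n;

    /* One source line per matrix row. */
    for (i = 0; i < 3; i++) {
        n = snprintf(buf + pos, size - pos, "    %.5ff, %.5ff, %.5ff,\n",
                     mat[i][0], mat[i][1], mat[i][2]);
        if (n < 0 || (size_t)n >= size - pos)
            return -1;
        pos += (size_t)n;
    }

    n = snprintf(buf + pos, size - pos, "};\n");
    if (n < 0 || (size_t)n >= size - pos)
        return -1;
    return 0;
}
```

The resulting string can be prepended to a kernel source before it is compiled, so the kernel sees the matrix as a compile-time constant.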
Signed-off-by: Ruiling Song
---
This patch depends on the colorspace patchset I sent before
(https://patchwork.ffmpeg.org/patch/11820/)
Although I am still working on some minor functionality,
I hope somebody can comment on the overall design.
Ruiling
configure
This is just code refinement. No functional change.
Signed-off-by: Ruiling Song
---
libavfilter/vf_hwmap.c | 83 --
1 file changed, 39 insertions(+), 44 deletions(-)
diff --git a/libavfilter/vf_hwmap.c b/libavfilter/vf_hwmap.c
index 290559a..14276ce
Signed-off-by: Ruiling Song
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 7ac2d22..412a739 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -362,6 +362,7 @@ Filters:
vf_ssim.c Paul B Mahol
vf_stereo3d.c
Signed-off-by: Ruiling Song
---
libavcodec/vaapi_encode_h265.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavcodec/vaapi_encode_h265.c b/libavcodec/vaapi_encode_h265.c
index 3ae92a7..32b8bc6 100644
--- a/libavcodec/vaapi_encode_h265.c
+++ b/libavcodec/vaapi_encode_h265
.
Signed-off-by: Ruiling Song
---
libavfilter/Makefile | 2 +-
libavfilter/vf_overlay_qsv.c | 212 +++
2 files changed, 75 insertions(+), 139 deletions(-)
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index fc16512..e642b8d 100644
--- a
debugging seems that it
is nonsense. So just skip it entirely, without bothering to
return an EAGAIN error to the caller.
Signed-off-by: Ruiling Song
---
libavfilter/qsvvpp.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/libavfilter/qsvvpp.c b/libavfilter/qsvvpp.c
index f32b46
rk are put in a separate patch.
v2:
Add .preinit field to initialize framesync options.
Export more options, like vf_overlay.c.
Signed-off-by: Ruiling Song
---
libavfilter/Makefile | 2 +-
libavfilter/vf_overlay_qsv.c | 213 ---
2 files changed, 78
. That's why I made
this v2 to fix the side-effect on normal filters.
v2:
and one av_frame_free() in vf_vpp_qsv.c
Signed-off-by: Ruiling Song
---
libavfilter/qsvvpp.c | 4 ++--
libavfilter/vf_vpp_qsv.c | 5 -
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/libavfilter/qsvvp
It basically does hdr to sdr conversion with tonemapping.
Signed-off-by: Ruiling Song
---
This patch adds a filter to do HDR to SDR conversion with tonemapping.
The filter does the whole tonemapping job in one pass, which is quite
different from vf_tonemap.c.
I choose this way
MediaSDK may fail to decode some frames; just skip them.
Otherwise, it will keep decoding the failing packet repeatedly
without processing any packets afterwards.
Signed-off-by: Ruiling, Song
---
libavcodec/qsvdec_h2645.c | 8 ++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a
From: "Ruiling, Song"
MediaSDK may fail to decode some frames; just skip them.
Otherwise, it will keep decoding the failing packet repeatedly
without processing any packets afterwards.
v2:
switch to using av_packet_unref().
Signed-off-by: Ruiling Song
---
libavcodec/qsvdec_h2645.c | 6
The common way to use libVA is to destroy the buffer first, then the
context. I am not sure whether libVA has a clear statement on this.
This patch just makes things simple. It fixes a segmentation
fault seen with the iHD open source driver.
Signed-off-by: Ruiling Song
---
libavcodec
-filter_hw_device ocl -filter_complex \
'[0:v]hwmap,tonemap_opencl=t=bt2020:tonemap=linear:format=p010[x1]; \
[x1]hwmap=derive_device=vaapi:reverse=1' -c:v hevc_vaapi -profile 2 OUTPUT
Signed-off-by: Ruiling Song
---
configure | 1 +
libavfilter/Makefile
If the transfer was SMPTE2084, use the peak of 1 even if not tagged.
Otherwise, we would assume it is HLG with a peak of 1200.
Based on suggestion by Niklas Haas.
Signed-off-by: Ruiling Song
---
libavfilter/vf_tonemap.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git
-filter_hw_device ocl -filter_complex \
'[0:v]hwmap,tonemap_opencl=t=bt2020:tonemap=linear:format=p010[x1]; \
[x1]hwmap=derive_device=vaapi:reverse=1' -c:v hevc_vaapi -profile 2 OUTPUT
v2:
add peak detection.
Signed-off-by: Ruiling Song
---
configure | 1 +
libavfilte
-filter_hw_device ocl -filter_complex \
'[0:v]hwmap,tonemap_opencl=t=bt2020:tonemap=linear:format=p010[x1]; \
[x1]hwmap=derive_device=vaapi:reverse=1' -c:v hevc_vaapi -profile 2 OUTPUT
Signed-off-by: Ruiling Song
---
This version mainly addresses Mark's comments on v2.
Thanks!
Ruil
These functions are shared among colorspace related filters.
Signed-off-by: Ruiling Song
---
libavfilter/Makefile| 2 +-
libavfilter/vf_colorspace.c | 118 +---
2 files changed, 23 insertions(+), 97 deletions(-)
diff --git a/libavfilter
This fixes a build error on Windows:
C2440: cannot convert from 'void (__cdecl *) (...)' to 'void (__stdcall
*)(...)'.
Signed-off-by: Ruiling Song
---
libavutil/hwcontext_opencl.c | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/libavutil/hwcontext_
Signed-off-by: Ruiling Song
---
I am not sure whether you think this would be useful.
The main purpose is to make the OpenCL error-check code simpler.
If we think this is good, I can convert the current
OpenCL filters to use this macro.
For example:
if (cle != CL_SUCCESS
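A minimal sketch of such an error-check macro, written without the OpenCL headers so it stands alone. DEMO_SUCCESS stands in for CL_SUCCESS, and DEMO_FAIL_ON_ERROR(), fake_cl_call() and run_two_steps() are hypothetical names. The assumption that the macro relies on local variables `cle` and `err` and on a `fail` label is not confirmed by the text above.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_SUCCESS 0 /* stands in for CL_SUCCESS */

/* Expects `cle` and `err` variables and a `fail:` label in the caller. */
#define DEMO_FAIL_ON_ERROR(errcode, ...) do {  \
        if (cle != DEMO_SUCCESS) {             \
            fprintf(stderr, __VA_ARGS__);      \
            err = (errcode);                   \
            goto fail;                         \
        }                                      \
    } while (0)

/* Hypothetical stand-in for an OpenCL call that returns a status code. */
static int fake_cl_call(int should_fail)
{
    return should_fail ? -5 : DEMO_SUCCESS;
}

static int run_two_steps(int fail_first, int fail_second)
{
    int cle, err = 0;

    cle = fake_cl_call(fail_first);
    DEMO_FAIL_ON_ERROR(-1, "Step 1 failed: %d.\n", cle);

    cle = fake_cl_call(fail_second);
    DEMO_FAIL_ON_ERROR(-2, "Step 2 failed: %d.\n", cle);

    return 0;

fail:
    /* Resources would be released here; reaching this label implies
     * the macro has stored an error code. */
    assert(err != 0);
    return err;
}
```

Each call site shrinks from a multi-line if block to a single line, while the cleanup path stays in one place under the label.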
-filter_hw_device ocl -filter_complex \
'[0:v]hwmap,tonemap_opencl=t=bt2020:tonemap=linear:format=p010[x1]; \
[x1]hwmap=derive_device=vaapi:reverse=1' -c:v hevc_vaapi -profile 2 OUTPUT
Signed-off-by: Ruiling Song
---
As I didn't receive any other comments on v3, this version only fixes the commen
Signed-off-by: Ruiling Song
---
libavfilter/opencl.h| 4 ++--
libavfilter/vf_avgblur_opencl.c | 45 +--
libavfilter/vf_overlay_opencl.c | 29 +--
libavfilter/vf_program_opencl.c | 14 ++-
libavfilter/vf_tonemap_opencl.c
The very last clFinish() should be ok.
Signed-off-by: Ruiling Song
---
libavfilter/vf_avgblur_opencl.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/libavfilter/vf_avgblur_opencl.c b/libavfilter/vf_avgblur_opencl.c
index bc6bcab..99ed1ca 100644
--- a/libavfilter/vf_avgblur_opencl.c
+++ b
Signed-off-by: Ruiling Song
---
libavfilter/opencl.h| 11 +
libavfilter/vf_avgblur_opencl.c | 45 +--
libavfilter/vf_overlay_opencl.c | 29 +--
libavfilter/vf_program_opencl.c | 14 ++-
libavfilter
Signed-off-by: Ruiling Song
---
Sorry, I have not verified this patch; I don't know how to reproduce the gcc
warning.
Thanks!
Ruiling
libavfilter/vf_colorspace.c | 16
libavfilter/vf_tonemap_opencl.c | 4 ++--
2 files changed, 10 insertions(+), 10 deletions(-)
diff
Signed-off-by: Ruiling Song
---
doc/filters.texi | 158 +++
1 file changed, 158 insertions(+)
diff --git a/doc/filters.texi b/doc/filters.texi
index 6695999c84..f622d03226 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -17776,6 +17776,164