vp9_diag_downleft_16x16_12bpp_ssse3: 32.0
vp9_diag_downleft_16x16_12bpp_avx: 32.4
vp9_diag_downleft_16x16_12bpp_avx2: 25.5
Benchmarked with 1 run
Signed-off-by: Ilia
---
libavcodec/x86/vp9dsp_init_16bpp.c | 2 ++
libavcodec/x86/vp9intrapred_16bpp.asm | 39 +++
2 files
vp9_diag_downleft_32x32_8bpp_c: 580.2
vp9_diag_downleft_32x32_8bpp_sse2: 75.6
vp9_diag_downleft_32x32_8bpp_ssse3: 73.7
vp9_diag_downleft_32x32_8bpp_avx: 72.7
vp9_diag_downleft_32x32_10bpp_c: 1101.2
vp9_diag_downleft_32x32_10bpp_sse2: 145.4
vp9_diag_downleft_32x32_10bpp_ssse3: 137.5
vp9_diag_downlef
vp9_diag_downright_16x16_12bpp_c: 149.0
vp9_diag_downright_16x16_12bpp_sse2: 67.8
vp9_diag_downright_16x16_12bpp_ssse3: 45.6
vp9_diag_downright_16x16_12bpp_avx: 36.6
vp9_diag_downright_16x16_12bpp_avx2: 25.5
~30% faster than avx
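The "faster than avx" figures quoted in these results appear to be the relative drop in the reported timing from the avx to the avx2 version, not a throughput ratio. A minimal sketch of that arithmetic, using timings copied from the benchmark lines on this page; the helper name is hypothetical and not part of FFmpeg or checkasm:

```python
def timing_reduction_percent(baseline: float, optimized: float) -> float:
    """Return how much lower `optimized` is than `baseline`, in percent.

    Hypothetical helper for illustration only; it assumes the benchmark
    numbers are timings where lower is better.
    """
    if baseline <= 0:
        raise ValueError("baseline timing must be positive")
    return (baseline - optimized) / baseline * 100.0


# (avx, avx2) timings copied from the benchmark lines on this page.
results = {
    "vp9_diag_downright_16x16_12bpp": (36.6, 25.5),
    "vp9_diag_downright_32x32_12bpp": (141.0, 73.8),
    "vp9_vert_left_16x16_12bpp": (34.6, 22.4),
}

for name, (avx, avx2) in results.items():
    pct = timing_reduction_percent(avx, avx2)
    print(f"{name}: avx2 timing is {pct:.1f}% lower than avx")
```

Read this way, the three claims (~30%, almost 50%, ~35%) match the listed numbers. Expressed as a speedup ratio (avx / avx2) the same data would give larger-looking figures, so it helps to say which convention is meant.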
Signed-off-by: Ilia Valiakhmetov
---
libavcodec/x86
Signed-off-by: Ilia Valiakhmetov
---
libavcodec/x86/vp9dsp_init_16bpp.c | 2 ++
libavcodec/x86/vp9intrapred_16bpp.asm | 56 +++
2 files changed, 58 insertions(+)
diff --git a/libavcodec/x86/vp9dsp_init_16bpp.c
b/libavcodec/x86/vp9dsp_init_16bpp.c
index
Signed-off-by: Ilia Valiakhmetov
---
libavcodec/avcodec.h | 7 ++-
libavcodec/options.c | 1 +
libavcodec/pthread_slice.c | 26 --
libavcodec/utils.c | 14 ++
4 files changed, 45 insertions(+), 3 deletions(-)
diff --git a/libavcodec
---
libavcodec/internal.h | 4
libavcodec/pthread_slice.c | 33 ++---
libavcodec/thread.h | 1 +
libavutil/slicethread.h | 18 ++
4 files changed, 37 insertions(+), 19 deletions(-)
diff --git a/libavcodec/internal.h b/libavcodec/in
Signed-off-by: Ilia Valiakhmetov
v2:
---
libavcodec/internal.h | 4
libavcodec/pthread_slice.c | 22 --
libavcodec/thread.h | 4 +++-
3 files changed, 27 insertions(+), 3 deletions(-)
diff --git a/libavcodec/internal.h b/libavcodec/internal.h
index
Signed-off-by: Ilia Valiakhmetov
v8:
---
libavcodec/vp9.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/libavcodec/vp9.c b/libavcodec/vp9.c
index b780262..a71045e 100644
--- a/libavcodec/vp9.c
+++ b/libavcodec/vp9.c
@@ -1628,7 +1628,7 @@ FF_ENABLE_DEPRECATION_WARNINGS
Signed-off-by: Ilia Valiakhmetov
---
Changelog | 1 +
1 file changed, 1 insertion(+)
diff --git a/Changelog b/Changelog
index cae5254..8a4818a 100644
--- a/Changelog
+++ b/Changelog
@@ -43,6 +43,7 @@ version :
- add --disable-autodetect build switch
- drop deprecated qtkit input device (use
Signed-off-by: Ilia Valiakhmetov
---
Changelog | 1 +
1 file changed, 1 insertion(+)
diff --git a/Changelog b/Changelog
index 22928de..ca0758a 100644
--- a/Changelog
+++ b/Changelog
@@ -46,6 +46,7 @@ version :
- haas audio filter
- SUP/PGS subtitle muxer
- convolve video filter
+- VP9 tile
vp9_diag_downright_32x32_12bpp_c: 429.7
vp9_diag_downright_32x32_12bpp_sse2: 158.9
vp9_diag_downright_32x32_12bpp_ssse3: 144.6
vp9_diag_downright_32x32_12bpp_avx: 141.0
vp9_diag_downright_32x32_12bpp_avx2: 73.8
Almost 50% faster than avx implementation
---
libavcodec/x86/vp9dsp_init_16bpp.c|
avx
Signed-off-by: Ilia Valiakhmetov
---
libavcodec/x86/vp9intrapred_16bpp.asm | 47 ---
1 file changed, 33 insertions(+), 14 deletions(-)
diff --git a/libavcodec/x86/vp9intrapred_16bpp.asm
b/libavcodec/x86/vp9intrapred_16bpp.asm
index 8d8d65e..33a8a7f 100644
vp9_vert_left_16x16_12bpp_c: 273.8
vp9_vert_left_16x16_12bpp_sse2: 69.4
vp9_vert_left_16x16_12bpp_ssse3: 35.3
vp9_vert_left_16x16_12bpp_avx: 34.6
vp9_vert_left_16x16_12bpp_avx2: 22.4
~35% faster than avx
Signed-off-by: Ilia Valiakhmetov
---
libavcodec/x86/vp9dsp_init_16bpp.c | 2
Signed-off-by: Ilia Valiakhmetov
---
libavcodec/avcodec.h | 7 ++-
libavcodec/options.c | 1 +
libavcodec/pthread_slice.c | 27 +--
libavcodec/utils.c | 13 +
4 files changed, 45 insertions(+), 3 deletions(-)
diff --git a/libavcodec
argument - the main function for avpriv_slicethread_create(); it is used
for the loopfilter.
Ilia Valiakhmetov (2):
avcodec: add execute3() api to utilize the main function of
avpriv_slicethread_create().
avcodec/vp9: Add tile threading support
libavcodec/avcodec.h | 7 +-
libavcodec