[FFmpeg-devel] [PATCH 1/2] lavu/checkasm: add (private) kperf timing for macOS

2021-04-28 Thread Josh Dekker
Signed-off-by: Josh Dekker --- configure | 2 + libavutil/Makefile| 1 + libavutil/macos_kperf.c | 140 ++ libavutil/macos_kperf.h | 23 +++ libavutil/timer.h | 17 - tests/checkasm/checkasm.c | 14 +++- tests

[FFmpeg-devel] [PATCH 0/2] ARM64 HEVC QPEL/EPEL

2021-04-28 Thread Josh Dekker
This is a patch originally submitted in 2017 (author/date info left intact). At the time, it didn't get much attention, I assume due to its sheer size. I have split out only the QPEL/EPEL parts, rebased them, and cleaned up the patches as much as is reasonable for a 9001 line

Re: [FFmpeg-devel] [PATCH v4 1/2] lavc/aarch64: change h264pred_init structure

2021-04-19 Thread Josh Dekker
Set applied. -- Josh ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

[FFmpeg-devel] [PATCH] checkasm: add (private) kperf timing for macOS

2021-04-12 Thread Josh Dekker
Signed-off-by: Josh Dekker --- configure| 2 + tests/checkasm/Makefile | 1 + tests/checkasm/checkasm.c| 19 - tests/checkasm/checkasm.h| 10 ++- tests/checkasm/macos_kperf.c | 143 +++ tests/checkasm/macos_kperf.h | 23

Re: [FFmpeg-devel] [PATCH v2 0/4] avcodec/aarch64/hevcdsp

2021-02-18 Thread Josh Dekker
Set pushed with all Martin's changes implemented. More NEON & updates soon. -- Josh On 2021-02-04 12:32, Josh Dekker wrote: Hi, Rebases the unpushed part of my patches on top of Reimar's set. Also implements Martin's suggestions except 'unrolling the loop' for S

[FFmpeg-devel] [PATCH v2 4/4] avcodec/aarch64/hevcdsp: add sao_band NEON

2021-02-04 Thread Josh Dekker
Only works for 8x8. Signed-off-by: Josh Dekker --- libavcodec/aarch64/Makefile | 3 +- libavcodec/aarch64/hevcdsp_init_aarch64.c | 7 ++ libavcodec/aarch64/hevcdsp_sao_neon.S | 87 +++ 3 files changed, 96 insertions(+), 1 deletion(-) create mode 100644

[FFmpeg-devel] [PATCH v2 1/4] avcodec/aarch64/hevcdsp: port SIMD idct functions

2021-02-04 Thread Josh Dekker
the first 300 frames of "LG 4K HDR Demo - New York.ts", running on Apple M1. Signed-off-by: Josh Dekker --- libavcodec/aarch64/Makefile | 2 + libavcodec/aarch64/hevcdsp_idct_neon.S| 380 ++ libavcodec/aarch64/hevcdsp_init_aarch64.c | 45 +++

[FFmpeg-devel] [PATCH v2 2/4] avcodec/aarch64/hevcdsp: port add_residual functions

2021-02-04 Thread Josh Dekker
From: Reimar Döffinger Speedup is fairly small, around 1.5%, but these are fairly simple. Signed-off-by: Josh Dekker --- libavcodec/aarch64/hevcdsp_idct_neon.S| 190 ++ libavcodec/aarch64/hevcdsp_init_aarch64.c | 24 +++ 2 files changed, 214 insertions(+) diff --git

[FFmpeg-devel] [PATCH v2 3/4] avcodec/aarch64/hevcdsp: add idct_dc NEON

2021-02-04 Thread Josh Dekker
Signed-off-by: Josh Dekker --- libavcodec/aarch64/hevcdsp_idct_neon.S| 54 +++ libavcodec/aarch64/hevcdsp_init_aarch64.c | 16 +++ 2 files changed, 70 insertions(+) diff --git a/libavcodec/aarch64/hevcdsp_idct_neon.S b/libavcodec/aarch64/hevcdsp_idct_neon.S index

[FFmpeg-devel] [PATCH v2 0/4] avcodec/aarch64/hevcdsp

2021-02-04 Thread Josh Dekker
Hi, Rebases the unpushed part of my patches on top of Reimar's set. Also implements Martin's suggestions except 'unrolling the loop' for SAO band function, will update the band function when I fix non 8x8 cases. -- Josh

Re: [FFmpeg-devel] Patch for FFmpeg

2021-01-25 Thread Josh Dekker
On 2021-01-13 17:06, Robin Cooksey wrote: I’ve attached a patch which makes avformat handle the 308 Permanent Redirect HTTP status code – which is more recently defined in https://tools.ietf.org/html/rfc7538 The change just treats 308 in the same way as the other 30x status codes. Thanks. Ap
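As a rough illustration of the change described above, treating 308 like the other redirect codes could look something like the sketch below. The helper name `is_http_redirect` is hypothetical and this is not FFmpeg's actual code; the set of "other 30x" codes is assumed here to be 301, 302, 303 and 307.

```c
#include <stdbool.h>

/* Hypothetical helper, not taken from FFmpeg: reports whether an HTTP
 * status code should be followed as a redirect. The 301/302/303/307 set
 * is an assumption; 308 is the code the patch adds. */
static bool is_http_redirect(int status)
{
    switch (status) {
    case 301: /* Moved Permanently */
    case 302: /* Found */
    case 303: /* See Other */
    case 307: /* Temporary Redirect */
    case 308: /* Permanent Redirect (RFC 7538) */
        return true;
    default:
        return false;
    }
}
```

With this sketch, a 308 response takes the same path as a 301, while codes such as 200 or 404 are left alone.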

Re: [FFmpeg-devel] [PATCH 4/4] checkasm: add hevc_pel tests

2021-01-25 Thread Josh Dekker
On 2021-01-07 13:10, Josh Dekker wrote: Co-authored-by: Niklas Haas Signed-off-by: Josh Dekker --- tests/checkasm/Makefile | 2 +- tests/checkasm/checkasm.c | 10 + tests/checkasm/checkasm.h | 10 + tests/checkasm/hevc_pel.c | 523 ++ 4 files

Re: [FFmpeg-devel] [PATCH] configure: add fallback to $arch in msvc assembler check.

2021-01-25 Thread Josh Dekker
On 2021-01-23 14:14, Martin Storsjö wrote: On Sat, 23 Jan 2021, Reimar Döffinger wrote: Setting the defaults for $arch happens only later, so the current code would not set AS correctly if --arch was not specified on the command-line. Fix it by adding an explicit fallback to $arch_default. ---

Re: [FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

2021-01-12 Thread Josh Dekker
Hi, On 2021-01-08 21:36, reimar.doeffin...@gmx.de wrote: From: Reimar Döffinger Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth available on aarch64. For a UHD HDR (10 bit) sample video these were consuming the most time and this optimization reduced overall decode time from 19

[FFmpeg-devel] [PATCH 4/4] checkasm: add hevc_pel tests

2021-01-07 Thread Josh Dekker
Co-authored-by: Niklas Haas Signed-off-by: Josh Dekker --- tests/checkasm/Makefile | 2 +- tests/checkasm/checkasm.c | 10 + tests/checkasm/checkasm.h | 10 + tests/checkasm/hevc_pel.c | 523 ++ 4 files changed, 544 insertions(+), 1 deletion(-) create

[FFmpeg-devel] [PATCH 3/4] lavc/aarch64: add HEVC sao_band NEON

2021-01-07 Thread Josh Dekker
Only works for 8x8. Signed-off-by: Josh Dekker --- libavcodec/aarch64/Makefile | 3 +- libavcodec/aarch64/hevcdsp_init.c | 7 +++ libavcodec/aarch64/hevcdsp_sao_neon.S | 87 +++ 3 files changed, 96 insertions(+), 1 deletion(-) create mode 100644

[FFmpeg-devel] [PATCH 2/4] lavc/aarch64: add HEVC idct_dc NEON

2021-01-07 Thread Josh Dekker
Signed-off-by: Josh Dekker --- libavcodec/aarch64/Makefile| 3 +- libavcodec/aarch64/hevcdsp_idct_neon.S | 74 ++ libavcodec/aarch64/hevcdsp_init.c | 19 +++ 3 files changed, 95 insertions(+), 1 deletion(-) create mode 100644 libavcodec/aarch64

[FFmpeg-devel] [PATCH 1/4] lavc/aarch64: add HEVC add_residual NEON

2021-01-07 Thread Josh Dekker
Signed-off-by: Josh Dekker --- libavcodec/aarch64/Makefile | 2 + libavcodec/aarch64/hevcdsp_add_res_neon.S | 298 ++ libavcodec/aarch64/hevcdsp_init.c | 59 + libavcodec/hevcdsp.c | 2 + libavcodec/hevcdsp.h

[FFmpeg-devel] [PATCH 0/4] AArch64 NEON for HEVC

2021-01-07 Thread Josh Dekker
checkasm: all 657 tests passed
hevc_add_res_4x4_8_c: 49.7
hevc_add_res_4x4_8_neon: 20.5
hevc_add_res_4x4_10_c: 45.7
hevc_add_res_4x4_10_neon: 18.7
hevc_add_res_8x8_8_c: 211.0
hevc_add_res_8x8_8_neon: 24.5
hevc_add_res_8x8_10_c: 195.7
hevc_add_res_8x8_10_neon: 24.0
hevc_add_res_16x16_8_c: 787.2
hevc
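To read these figures as speedups, one can divide each `_c` timing by the matching `_neon` timing. A minimal sketch, assuming the numbers are comparable timings for the same function (the helper name `speedup` is made up for illustration and is not part of checkasm):

```c
/* Illustrative helper (hypothetical name): ratio of the C reference
 * timing to the SIMD timing. Returns 0.0 if the SIMD timing is not
 * positive, so the caller never divides by zero. */
static double speedup(double c_time, double simd_time)
{
    if (simd_time <= 0.0)
        return 0.0;
    return c_time / simd_time;
}
```

For the figures quoted above, `speedup(49.7, 20.5)` comes to roughly 2.4 for the 4x4 8-bit case and `speedup(211.0, 24.5)` to roughly 8.6 for the 8x8 8-bit case.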

[FFmpeg-devel] [PATCH] lavc/aarch64: add HEVC add_residual NEON

2021-01-07 Thread Josh Dekker
Signed-off-by: Josh Dekker
---
checkasm: all 648 tests passed
hevc_add_res_4x4_8_c: 49.7
hevc_add_res_4x4_8_neon: 20.5
hevc_add_res_4x4_10_c: 46.0
hevc_add_res_4x4_10_neon: 19.0
hevc_add_res_8x8_8_c: 209.0
hevc_add_res_8x8_8_neon: 24.5
hevc_add_res_8x8_10_c: 192.7
hevc_add_res_8x8_10_neon: 27.0

Re: [FFmpeg-devel] FFmpeg buying an Apple M1 Mac Mini

2021-01-03 Thread Josh Dekker
On 2021/01/03 20:18, Michael Niedermayer wrote: On Sun, Jan 03, 2021 at 06:32:11PM +0100, Kieran Kunhya wrote: Hello, As it's 2021 I would like to propose FFmpeg purchase one or more (e.g. FATE + development) Apple M1 Mac Minis and provide access to developers. This is something I have done a fe

Re: [FFmpeg-devel] [PATCH] Moves yuv2yuvX_sse3 to yasm, unrolls main loop and other small optimizations for ~20% speedup.

2020-12-10 Thread Josh Dekker
On 2020/12/09 11:19, Alan Kelly wrote: --- Activates avx2 version of yuv2yuvX Adds checkasm for yuv2yuvX Modifies ff_yuv2yuvX_* signature to match yuv2yuvX_* Replaces non-temporal stores with temporal stores libswscale/x86/Makefile | 1 + libswscale/x86/swscale.c| 106 +++

[FFmpeg-devel] [RFC] Machines & Platforms of interest for testing

2020-12-08 Thread Josh Dekker
Hi, As discussed in the meeting, I'm starting an RFC for Machines & Platforms of interest for testing, developer access and FATE. These would be funded by SPI. The two platforms mentioned were a Mac Mini (M1 Apple Silicon platform) and a TALOS II (POWER9 platform). My personal suggestion would be

[FFmpeg-devel] [IMPORTANT] Meeting Notes - December 2020

2020-12-07 Thread Josh Dekker
vio)
- Deprecating libpostproc
- Writing down development rules
- Switching to a merge request-like system
- Propose FFmpeg/SPI purchase a Mac Mini ARM machine for development (and FATE if required).
## People Present (15)
- Jean-Baptiste Kempf
- Josh Dekker (Illya)
- Jan Ekström
- Michael Niederma