from:"xujunzz"

[FFmpeg-devel] [PATCH] dnn_backend_native_layer_conv2d.c: fix bug of loop boundary in single thread mode.

2020-09-19 Thread xujunzz

From: Xu Jun Before the patch, the FATE test for dnn could fail in some Windows environments while succeeding on my Linux machine. The bug was caused by a wrong loop boundary, which Win10 and Linux seem to interpret differently. After the patch, the FATE test succeeds in my Windows mingw 64-bit environment. Signed-off-by: Xu Jun
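The excerpt does not show the faulty boundary itself. As a generic illustration of the invariant involved, the sketch below splits the output rows of a padded image among worker threads so that the single-thread case covers exactly the same range as the multithread case. `RowRange` and `row_range` are hypothetical names chosen for this example and are not taken from the patch.

```c
#include <assert.h>

/* Hypothetical helper type: half-open row range [start, end) for one worker. */
typedef struct RowRange {
    int start;
    int end;
} RowRange;

/*
 * Split the rows [pad, height - pad) among nthreads workers.
 * The last worker also takes the remainder rows, so the ranges
 * always tile the whole interval, including when nthreads == 1.
 */
static RowRange row_range(int height, int pad, int nthreads, int idx)
{
    RowRange r;
    int rows, stride;

    assert(nthreads > 0 && idx >= 0 && idx < nthreads);
    assert(pad >= 0 && height >= 2 * pad);

    rows   = height - 2 * pad;   /* rows that actually produce output */
    stride = rows / nthreads;    /* rows per worker, rounded down */

    r.start = pad + stride * idx;
    r.end   = (idx == nthreads - 1) ? height - pad : r.start + stride;
    return r;
}
```

With this shape, the single-thread path is simply `row_range(height, pad, 1, 0)`, so both modes share one boundary computation instead of maintaining two.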

[FFmpeg-devel] [PATCH v3 2/2] dnn_backend_native_layer_conv2d.c: refine code.

2020-09-16 Thread xujunzz

From: Xu Jun Move the thread area allocation out of the thread function and into the main thread. Signed-off-by: Xu Jun --- .../dnn/dnn_backend_native_layer_conv2d.c | 30 +-- 1 file changed, 14 insertions(+), 16 deletions(-) diff --git a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c b

[FFmpeg-devel] [PATCH v3 1/2] dnn_backend_native_layer_conv2d.c: fix memory allocation bug in multithread function.

2020-09-16 Thread xujunzz

From: Xu Jun Before the patch, memory was allocated in each thread function, which could cause the memory to be allocated more than once and lead to a crash. After the patch, memory is allocated once in the main thread, and an index is passed into the thread functions. Bug fixed. Signed-off-by: Xu Jun --- v3: fix b
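The pattern described above can be sketched as follows. This is a minimal illustration, not the patch's code: the main thread allocates the shared output buffer and the per-thread parameter array once, and each worker receives only its own index plus pointers to the shared data. `WorkerParam`, `worker`, and `double_all` are hypothetical names, and the "work" is just doubling integers.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread parameters, filled in by the main thread. */
typedef struct WorkerParam {
    int index;       /* which slice this worker owns */
    int nworkers;    /* total number of workers */
    const int *in;   /* shared input, read-only */
    int *out;        /* shared output, allocated once by the main thread */
    int len;         /* number of elements in in/out */
} WorkerParam;

/* Each worker only writes its own slice; it never allocates. */
static void *worker(void *arg)
{
    const WorkerParam *p = arg;
    int chunk = p->len / p->nworkers;
    int start = chunk * p->index;
    int end   = (p->index == p->nworkers - 1) ? p->len : start + chunk;

    for (int i = start; i < end; i++)
        p->out[i] = 2 * p->in[i];
    return NULL;
}

/* Returns a malloc'ed array of len doubled values, or NULL on failure. */
static int *double_all(const int *in, int len, int nworkers)
{
    pthread_t *tid;
    WorkerParam *params;
    int *out;
    int started = 0, failed = 0;

    if (len <= 0 || nworkers <= 0)
        return NULL;

    /* All allocation happens here, once, in the main thread. */
    out    = malloc(len * sizeof(*out));
    tid    = malloc(nworkers * sizeof(*tid));
    params = malloc(nworkers * sizeof(*params));
    if (!out || !tid || !params) {
        failed = 1;
    } else {
        for (int i = 0; i < nworkers; i++) {
            params[i] = (WorkerParam){ i, nworkers, in, out, len };
            if (pthread_create(&tid[i], NULL, worker, &params[i]) != 0) {
                failed = 1;
                break;
            }
            started++;
        }
        /* Join every thread that was actually started. */
        for (int i = 0; i < started; i++)
            pthread_join(tid[i], NULL);
    }

    free(tid);
    free(params);
    if (failed) {
        free(out);
        return NULL;
    }
    return out;
}
```

Because no worker calls an allocator on the shared buffer, there is no window in which two threads can resize or replace it concurrently.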

[FFmpeg-devel] [PATCH v2 2/2] dnn_backend_native_layer_conv2d.c: refine code.

2020-09-15 Thread xujunzz

From: Xu Jun Move the thread area allocation out of the thread function and into the main thread. Signed-off-by: Xu Jun --- v2: fix build warnings .../dnn/dnn_backend_native_layer_conv2d.c | 44 +-- 1 file changed, 20 insertions(+), 24 deletions(-) diff --git a/libavfilter/dnn/dnn_backend_

[FFmpeg-devel] [PATCH v2 1/2] dnn_backend_native_layer_conv2d.c: fix memory allocation bug in multithread function.

2020-09-15 Thread xujunzz

From: Xu Jun Before the patch, memory was allocated in each thread function, which could cause the memory to be allocated more than once and lead to a crash. After the patch, memory is allocated once in the main thread, and an index is passed into the thread functions. Bug fixed. Signed-off-by: Xu Jun --- .../dnn/

[FFmpeg-devel] [PATCH 2/2] dnn_backend_native_layer_conv2d.c: refine code.

2020-09-14 Thread xujunzz

From: Xu Jun Move the thread area allocation out of the thread function and into the main thread. Signed-off-by: Xu Jun --- .../dnn/dnn_backend_native_layer_conv2d.c | 29 +-- 1 file changed, 13 insertions(+), 16 deletions(-) diff --git a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c b

[FFmpeg-devel] [PATCH 1/2] dnn_backend_native_layer_conv2d.c: fix memory allocation bug in multithread function.

2020-09-14 Thread xujunzz

From: Xu Jun Before the patch, memory was allocated in each thread function, which could cause the memory to be allocated more than once and lead to a crash. After the patch, memory is allocated once in the main thread, and an index is passed into the thread functions. Bug fixed. Signed-off-by: Xu Jun --- .../dnn/

[FFmpeg-devel] [PATCH v5 2/2] dnn_backend_native_layer_conv2d.c:Add mutithread function

2020-09-06 Thread xujunzz

From: Xu Jun Use pthread to multithread dnn_execute_layer_conv2d. Can be tested with command "./ffmpeg_g -i input.png -vf \ format=yuvj420p,dnn_processing=dnn_backend=native:model= \ espcn.model:input=x:output=y:options=conv2d_threads=23 \ -y sr_native.jpg -benchmark" before patch: utime=11.238

[FFmpeg-devel] [PATCH v5 1/2] dnn_backend_native.c: parse options in native backend

2020-09-06 Thread xujunzz

From: Xu Jun Signed-off-by: Xu Jun --- v2: use av_opt_set_from_string instead of function dnn_parse_option(). v3: make all the options supported, not just conv2d_threads v4: move dnn_native_options and dnn_native_class from .h to .c. libavfilter/dnn/dnn_backend_native.c | 22 +++

[FFmpeg-devel] [PATCH v4 2/2] dnn_backend_native_layer_conv2d.c:Add mutithread function

2020-09-04 Thread xujunzz

From: Xu Jun Use pthread to multithread dnn_execute_layer_conv2d. Can be tested with command "./ffmpeg_g -i input.png -vf \ format=yuvj420p,dnn_processing=dnn_backend=native:model= \ espcn.model:input=x:output=y:options=conv2d_threads=23 \ -y sr_native.jpg -benchmark" before patch: utime=11.238

[FFmpeg-devel] [PATCH v4 1/2] dnn_backend_native.c: parse options in native backend

2020-09-04 Thread xujunzz

From: Xu Jun Signed-off-by: Xu Jun --- v2: use av_opt_set_from_string instead of function dnn_parse_option(). v3: make all the options supported, not just conv2d_threads v4: move dnn_native_options and dnn_native_class from .h to .c. libavfilter/dnn/dnn_backend_native.c | 22 +++

[FFmpeg-devel] [PATCH v3 1/2] dnn_backend_native.c: parse options in native backend

2020-09-04 Thread xujunzz

From: Xu Jun Signed-off-by: Xu Jun --- v2: use av_opt_set_from_string instead of function dnn_parse_option(). v3: make all the options supported, not just conv2d_threads libavfilter/dnn/dnn_backend_native.c | 19 ++- libavfilter/dnn/dnn_backend_native.h | 21 +++

[FFmpeg-devel] [PATCH v2 1/2] dnn_backend_native.c: parse options in native backend

2020-09-04 Thread xujunzz

From: Xu Jun v2: use av_opt_set_from_string instead of function dnn_parse_option(). Signed-off-by: Xu Jun --- libavfilter/dnn/dnn_backend_native.c | 19 ++- libavfilter/dnn/dnn_backend_native.h | 21 + 2 files changed, 31 insertions(+), 9 deletions(-) diff

[FFmpeg-devel] [PATCH v2 2/2] dnn_backend_native_layer_conv2d.c:Add mutithread function

2020-09-04 Thread xujunzz

From: Xu Jun v2: add check for HAVE_PTHREAD_CANCEL and modify FATE test dnn-layer-conv2d-test.c Use pthread to multithread dnn_execute_layer_conv2d. Can be tested with command "./ffmpeg_g -i input.png -vf \ format=yuvj420p,dnn_processing=dnn_backend=native:model= \ espcn.model:input=x:output=y:o

[FFmpeg-devel] [PATCH 1/2] dnn_backend_native.c: parse options in native backend

2020-09-03 Thread xujunzz

From: Xu Jun Signed-off-by: Xu Jun --- libavfilter/dnn/dnn_backend_native.c | 22 -- libavfilter/dnn/dnn_backend_native.h | 13 + 2 files changed, 33 insertions(+), 2 deletions(-) diff --git a/libavfilter/dnn/dnn_backend_native.c b/libavfilter/dnn/dnn_backend_n

[FFmpeg-devel] [PATCH 2/2] Add mutithread function for dnn_backend_native_layer_conv2d.c

2020-09-03 Thread xujunzz

From: Xu Jun Use pthread to multithread dnn_execute_layer_conv2d. Can be tested with command "./ffmpeg_g -i input.png -vf \ format=yuvj420p,dnn_processing=dnn_backend=native:model= \ espcn.model:input=x:output=y:options=conv2d_threads=23 \ -y sr_native.jpg -benchmark" before patch: utime=11.238

[FFmpeg-devel] [PATCH 3/3][GSoC] Add x86-avx2 optimization for dnn_execute_layer_conv2d

2020-08-31 Thread xujunzz

From: Xu Jun Can be tested with command "./ffmpeg_g -i test_1s.mp4 -vf \ format=yuvj420p,dnn_processing=dnn_backend=native:model= \ espcn.model:input=x:output=y -y sr_native.mp4 -benchmark" before patch: utime=826.044s stime=0.550s rtime=39.680s after patch: utime=545.137s stime=0.467s rtime=27

[FFmpeg-devel] [PATCH 2/3][GSoC] Add x86-sse4 optimization for dnn_execute_layer_conv2d

2020-08-31 Thread xujunzz

From: Xu Jun Can be tested with command "./ffmpeg_g -i input.png -vf \ format=yuvj420p,dnn_processing=dnn_backend=native:model= \ espcn.model:input=x:output=y -y sr_native.jpg -benchmark"\ -cpuflags 0x100 before patch: utime=20.817s stime=0.047s rtime=1.051s after patch: utime=3.744s stime=0.03

[FFmpeg-devel] [PATCH 1/3][GSoC] Add mutithread function for dnn_backend_native_layer_conv2d.c

2020-08-31 Thread xujunzz

From: Xu Jun Use pthread to multithread dnn_execute_layer_conv2d. Can be tested with command "./ffmpeg_g -i input.png -vf \ format=yuvj420p,dnn_processing=dnn_backend=native:model= \ espcn.model:input=x:output=y -y sr_native.jpg -benchmark" before patch: utime=11.238s stime=0.005s rtime=11.248s

[FFmpeg-devel] [PATCH v2 3/3] avfilter/vf_convolution: Add X86 SIMD optimizations for filter_column()

2019-12-22 Thread xujunzz

From: Xu Jun Performance improves about 10% compared to v1. Tested using this command: ./ffmpeg_g -s 1280*720 -pix_fmt yuv420p -i test.yuv -vf convolution="1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1/45:1/45:1/45:1/45:1:2:3:4:column:column:column:column" -an -vfra

[FFmpeg-devel] [PATCH v2 1/3] avfilter/vf_convolution: add 16-column operation for filter_column() and modify filter_slice().

2019-12-22 Thread xujunzz

From: chen Replace the existing C code for filter_column() with chen's code. Modify filter_slice() to be compatible with this change. Tested using the command: ./ffmpeg_g -s 1280*720 -pix_fmt yuv420p -i test.yuv -vf convolution="1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6

[FFmpeg-devel] [PATCH v2 2/3] avfilter/vf_convolution: Add x86 SIMD optimizations for filter_row()

2019-12-22 Thread xujunzz

From: Xu Jun Read 16 elements from memory, shuffle them and compute 4 rows at a time in parallel, then shuffle and write the 16 results to memory in parallel. Performance improves about 15% compared to v1. Tested using this command: ./ffmpeg_g -s 1280*720 -pix_fmt yuv420p -i test.yuv -vf convolution="1 2 3 4 5

[FFmpeg-devel] [PATCH 2/3] avfilter/vf_convolution: Add x86 SIMD optimizations for filter_row()

2019-12-02 Thread xujunzz

From: Xu Jun Tested using this command: ./ffmpeg_g -s 1280*720 -pix_fmt yuv420p -i test.yuv -vf convolution="1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1/45:1/45:1/45:1/45:1:2:3:4:row:row:row:row" -an -vframes 5000 -f null /dev/null -benchmark after patch: frame=

[FFmpeg-devel] [PATCH 1/3] avfilter/vf_convolution: add 16-column operation for filter_column() and modify filter_slice().

2019-12-02 Thread xujunzz

From: chen Replace the existing C code for filter_column() with chen's code. Modify filter_slice() to be compatible with this change. Tested using the command: ./ffmpeg_g -s 1280*720 -pix_fmt yuv420p -i test.yuv -vf convolution="1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6

[FFmpeg-devel] [PATCH 3/3] avfilter/vf_convolution: add X86 SIMD for filter_column()

2019-12-02 Thread xujunzz

From: Xu Jun Tested using this command: ./ffmpeg_g -s 1280*720 -pix_fmt yuv420p -i test.yuv -vf convolution="1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1/45:1/45:1/45:1/45:1:2:3:4:column:column:column:column" -an -vframes 5000 -f null /dev/null -benchmark after pa

[FFmpeg-devel] [PATCH] avfilter/vf_convolution: add x86 SIMD for filter_column()

2019-11-27 Thread xujunzz

From: Xu Jun Tested using a simple command: ./ffmpeg_g -s 1280*720 -pix_fmt yuv420p -i test.yuv -vf convolution="1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1/45:1/45:1/45:1/45:1:2:3:4:column:column:column:column" -an -vframes 1000 -f null /dev/null The fps increas

[FFmpeg-devel] [PATCH] avfilter/vf_convolution: add 16-column operation for filter_column() to prepare for x86 SIMD.

2019-11-27 Thread xujunzz

From: Xu Jun In order to add x86 SIMD for filter_column(), I wrote a C function which processes 16 columns at a time. Signed-off-by: Xu Jun --- libavfilter/vf_convolution.c | 56 +++ libavfilter/x86/vf_convolution_init.c | 23 +++ 2 files changed, 79 i
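To make the idea of processing 16 columns at a time concrete, here is a minimal sketch under stated assumptions. It is not the patch's filter_column(): the function names, the argument layout (`rows[i]` pointing at the source row for kernel tap `i`), and the rounding are all hypothetical. It shows a scalar vertical filter next to a variant that accumulates 16 adjacent columns per iteration, which is the loop shape a 16-lane SIMD routine can later replace.

```c
#include <stdint.h>

/* Clamp an int to the 0..255 range of an 8-bit pixel. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Scalar reference: one output column per iteration. */
static void column_ref(uint8_t *dst, const uint8_t *const rows[],
                       const int *matrix, int taps, int width,
                       float rdiv, float bias)
{
    for (int x = 0; x < width; x++) {
        int sum = 0;
        for (int i = 0; i < taps; i++)
            sum += rows[i][x] * matrix[i];
        dst[x] = clip_u8((int)(sum * rdiv + bias + 0.5f));
    }
}

/* Blocked variant: 16 independent accumulators per iteration. */
static void column_block16(uint8_t *dst, const uint8_t *const rows[],
                           const int *matrix, int taps, int width,
                           float rdiv, float bias)
{
    int x = 0;

    for (; x + 16 <= width; x += 16) {
        int sum[16] = { 0 };

        /* Each tap contributes to all 16 columns before moving on. */
        for (int i = 0; i < taps; i++)
            for (int k = 0; k < 16; k++)
                sum[k] += rows[i][x + k] * matrix[i];

        for (int k = 0; k < 16; k++)
            dst[x + k] = clip_u8((int)(sum[k] * rdiv + bias + 0.5f));
    }

    /* Tail: widths that are not a multiple of 16 fall back to scalar. */
    for (; x < width; x++) {
        int sum = 0;
        for (int i = 0; i < taps; i++)
            sum += rows[i][x] * matrix[i];
        dst[x] = clip_u8((int)(sum * rdiv + bias + 0.5f));
    }
}
```

Both functions perform the same arithmetic per pixel, so their outputs should match exactly; keeping the scalar version around gives a reference to check a SIMD replacement against.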

[FFmpeg-devel] [PATCH] avfilter/vf_convolution:Add x86 SIMD optimizations for filter_row()

2019-11-27 Thread xujunzz

From: Xu Jun Tested using the following command: ./ffmpeg_g -s 1280*720 -pix_fmt yuv420p -i test.yuv -vf convolution="1 2 3 4 5\ 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1 2 3 4 5 6 7 8 9:1/45:1/45:1/45\ :1/45:1:2:3:4:row:row:row:row" -an -vframes 1000 -f null /dev/null The fps increases fro

28 matches
