From: Tucker DiNapoli
The only changes were formatting and moving code.
---
libpostproc/postprocess.c | 436 ++--
libpostproc/postprocess_c.c| 1328
libpostproc/postprocess_internal.h | 30 +-
libpostproc/postprocess_template.c
}
-}
-
-src += stride;
-}
-/*if(step==16){
-STOP_TIMER("step16")
-}else{
-STOP_TIMER("stepX")
-}*/
-}
+#include "postprocess_c.c"
//Note: we have C, MMX, MMX2, 3DNOW version there is no 3DNOW+MMX2 one
//Plain C versions
dif
Currently different versions of the postprocessing routines are
generated from a template. Ultimately I intend to remove this by
replacing the inline assembly with separate yasm files. The C routines
will still be needed, so they need to be moved to a separate file.
The routines were added to the f
+++ b/libpostproc/x86/PPUtil.asm
@@ -0,0 +1,116 @@
+;**
+;*
+;* Copyright (c) 2015 Tucker DiNapoli
+;*
+;* Utility code/macros used in asm files for libpostproc
+;*
+;* This file is part of FFmpeg.
+;*
+;* FFmpeg is free
src[4*step]+= d;
-}
-}
-
-src += stride;
-}
-/*if(step==16){
-STOP_TIMER("step16")
-}else{
-STOP_TIMER("stepX")
-}*/
-}
+#include "postprocess_c.c"
//Note: we have C, MMX, MMX2, 3DNOW version there is no 3DNOW+MMX2 one
//Pl
This change is to allow support for different sized blocks, which will
be necessary for sse and avx. My plan is for the code to still act on
8x8 blocks, but to process multiple 8x8 blocks in parallel when using
sse/avx.
---
libpostproc/postprocess.c | 3 ---
libpostproc/postprocess_c.c
/PPContext.asm
diff --git a/libpostproc/x86/PPContext.asm b/libpostproc/x86/PPContext.asm
new file mode 100644
index 000..022dddb
--- /dev/null
+++ b/libpostproc/x86/PPContext.asm
@@ -0,0 +1,70 @@
+;*
+;* Definition of the PPContext and PPMode structs in assembly
+;* Copyright (C) 2015 Tucker
Currently different versions of the postprocessing routines are
generated from a template. Ultimately I intend to remove this by
replacing the inline assembly with separate yasm files. The C routines
will still be needed, so they need to be moved to a separate file.
The routines were added to the f
-* Copyright (C) 2001-2002 Michael Niedermayer (michae...@gmx.at)
-* Copyright (c) 2015 Tucker DiNapoli
-*
-* This file is part of FFmpeg.
-*
-* FFmpeg is free software; you can redistribute it and/or
-* modify it under the terms of the GNU Lesser General Public
-* License as published by the Free
= x86/deinterlace.o
diff --git a/libpostproc/x86/deinterlace.asm b/libpostproc/x86/deinterlace.asm
new file mode 100644
index 000..6e669bb
--- /dev/null
+++ b/libpostproc/x86/deinterlace.asm
@@ -0,0 +1,167 @@
+;*
+;* DeInterlacing filters written using SIMD extensions
+;* Copyright (C) 2015 T
nd those are more along the lines of what I want to do.
Right now, this code is more of an idea for work to do over the summer.
Tucker DiNapoli
---
libpostproc/postprocess_main.c | 606 +
1 file changed, 606 insertions(+)
diff --git a/libpostproc/postproces
From: Tucker DiNapoli
This patch set contains implementations of various filters from libpostproc,
translated from inline asm (in postprocess_template.c) into separate yasm files.
In addition, support for SSE2 and AVX2 has been added via the SIMD
abstraction layer from x86inc.asm.
From: Tucker DiNapoli
diff --git a/libpostproc/x86/PPUtil.asm b/libpostproc/x86/PPUtil.asm
new file mode 100644
index 000..090ee18
--- /dev/null
+++ b/libpostproc/x86/PPUtil.asm
@@ -0,0 +1,116
From: Tucker DiNapoli
I also added a Makefile which assembles this file into libpostproc. I
haven't modified the C code to use these functions yet.
diff --git a/libpostproc/x86/Makefile b/libpostproc/x86/Makefile
new file mode 100644
index 000..06838ca
--- /dev/null
+++ b/libpos
From: Tucker DiNapoli
I can't actually test it, since there's a lot of work left to do in
interfacing the asm code with the C code, changing block sizes, changing
the way QPs are dealt with, etc. But it assembles; there are warnings
for section redeclarations, and I'm not su
From: Tucker DiNapoli
This series of patches makes some changes to libpostproc in preparation for
adding SSE2 and AVX2 SIMD versions of some functions. None of the changes should
affect the library in any major way, but they are necessary for future changes.
I've tested all the patches
From: Tucker DiNapoli
There's still an if, as QP needs to be modified if isColor=0, but it
still removes an unnecessary branch.
---
libpostproc/postprocess_template.c | 12
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/libpostproc/postprocess_template.c
b/libpos
---
libpostproc/postprocess_template.c | 189 +++--
1 file changed, 95 insertions(+), 94 deletions(-)
diff --git a/libpostproc/postprocess_template.c
b/libpostproc/postprocess_template.c
index 344152e..584cb4c 100644
--- a/libpostproc/postprocess_template.c
+++ b/
---
libpostproc/postprocess_template.c | 20 +---
1 file changed, 1 insertion(+), 19 deletions(-)
diff --git a/libpostproc/postprocess_template.c
b/libpostproc/postprocess_template.c
index 584cb4c..9096586 100644
--- a/libpostproc/postprocess_template.c
+++ b/libpostproc/postproc
From: Tucker DiNapoli
---
libpostproc/postprocess.c | 2 +-
libpostproc/postprocess_template.c | 41 --
2 files changed, 27 insertions(+), 16 deletions(-)
diff --git a/libpostproc/postprocess.c b/libpostproc/postprocess.c
index 9d89782..b8740db
Now instead of 3 loops of 4 blocks there's only one.
Also removed some variables that became unused because of this.
---
libpostproc/postprocess_template.c | 29 +
1 file changed, 9 insertions(+), 20 deletions(-)
diff --git a/libpostproc/postprocess_template.c
b/libpos
From: Tucker DiNapoli
Also pulled QP initialization out of inner loop.
Added some dummy fields to PPContext to allow current code to work while
changing QP stuff.
---
libpostproc/postprocess_internal.h | 6 ++
libpostproc/postprocess_template.c | 138 ++---
2
Also pulled QP initialization out of inner loop.
Added some dummy fields to PPContext to allow current code to work while
changing QP stuff.
---
libpostproc/postprocess_internal.h | 10 -
libpostproc/postprocess_template.c | 82 ++
2 files changed, 47 inser
The structure of the postprocess function is to loop over x from 0 to
width, and in that loop to process 4 blocks at a time. This inner loop
was previously split into 3 separate loops, i.e.:
outer_loop over x
save current x location
loop over 4 blocks
restore x location
lo
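A minimal sketch of the restructuring described above, in C. The stage
functions (stage_a, stage_b, stage_c), the constants, and both
process_row_* functions are hypothetical stand-ins, not libpostproc names.
The sketch assumes each stage touches only its own block, which is what
makes collapsing the three inner loops into one safe here; the real filters
may have dependencies this toy version ignores.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE      8   /* width of one block in bytes */
#define BLOCKS_PER_PASS 4   /* blocks handled per outer iteration */

/* Hypothetical per-block stages; each one touches only its own block. */
static void stage_a(uint8_t *blk)
{
    for (int i = 0; i < BLOCK_SIZE; i++)
        blk[i] = (uint8_t)(blk[i] + 1);
}

static void stage_b(uint8_t *blk)
{
    for (int i = 0; i < BLOCK_SIZE; i++)
        blk[i] = (uint8_t)(blk[i] * 2);
}

static void stage_c(uint8_t *blk)
{
    for (int i = 0; i < BLOCK_SIZE; i++)
        blk[i] = (uint8_t)(blk[i] - 3);
}

/* Old shape: three inner loops, with x saved and restored between them. */
void process_row_split(uint8_t *row, int width)
{
    assert(width % (BLOCK_SIZE * BLOCKS_PER_PASS) == 0);
    int x = 0;
    while (x < width) {
        const int start_x = x;          /* save current x location */
        for (int b = 0; b < BLOCKS_PER_PASS; b++, x += BLOCK_SIZE)
            stage_a(row + x);
        x = start_x;                    /* restore x location */
        for (int b = 0; b < BLOCKS_PER_PASS; b++, x += BLOCK_SIZE)
            stage_b(row + x);
        x = start_x;                    /* restore x location */
        for (int b = 0; b < BLOCKS_PER_PASS; b++, x += BLOCK_SIZE)
            stage_c(row + x);
    }
}

/* New shape: one inner loop over the 4 blocks, no save/restore needed. */
void process_row_fused(uint8_t *row, int width)
{
    assert(width % (BLOCK_SIZE * BLOCKS_PER_PASS) == 0);
    for (int x = 0; x < width; ) {
        for (int b = 0; b < BLOCKS_PER_PASS; b++, x += BLOCK_SIZE) {
            stage_a(row + x);
            stage_b(row + x);
            stage_c(row + x);
        }
    }
}
```

Under the independence assumption both functions leave the row in the same
state; the fused version simply drops the bookkeeping for the saved x
position.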
These patches are updates to patches previously posted to the mailing lists,
with some bugs fixed and the reasoning behind some changes expanded on.
This adds macros in postprocess.c that use inline asm for x86,
__builtin_prefetch if using a recent enough GCC-compatible compiler, and
that does n
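A rough sketch of how such a three-way prefetch macro could be laid out,
in C. The macro name PP_PREFETCH and the helper sum_with_prefetch are
assumptions made for illustration; the macros in the actual patch are not
visible in this excerpt and may be named and guarded differently (for
example via FFmpeg's configure-time macros instead of compiler
predefines).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PP_PREFETCH is a hypothetical name, not taken from the patch. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* x86 with GCC-style inline asm: issue a non-temporal prefetch hint. */
#define PP_PREFETCH(p) \
    __asm__ volatile("prefetchnta %0" : : "m"(*(const char *)(p)))
#elif defined(__GNUC__)
/* Other targets with a GCC-compatible compiler:
 * second argument 0 = prefetch for read, third 0 = low temporal locality. */
#define PP_PREFETCH(p) __builtin_prefetch((p), 0, 0)
#else
/* Anything else: expand to a no-op that still evaluates the argument. */
#define PP_PREFETCH(p) ((void)(p))
#endif

/* Sum a buffer, hinting the cache one 64-byte line ahead.
 * The prefetch address is kept inside the buffer on purpose. */
uint32_t sum_with_prefetch(const uint8_t *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i % 64 == 0 && i + 64 < n)
            PP_PREFETCH(buf + i + 64);
        sum += buf[i];
    }
    return sum;
}
```

A prefetch is only a hint, so the result of sum_with_prefetch is the same
whichever branch of the macro is compiled in; only the timing can differ.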
---
libpostproc/postprocess_template.c | 296 +++--
1 file changed, 152 insertions(+), 144 deletions(-)
diff --git a/libpostproc/postprocess_template.c
b/libpostproc/postprocess_template.c
index 8220d36..866ba8f 100644
--- a/libpostproc/postprocess_template.c
+++
Also removed some variables that became unused (startx, srcBlockStart,
and dstBlockStart) due to this change.
---
libpostproc/postprocess_template.c | 32 ++--
1 file changed, 10 insertions(+), 22 deletions(-)
diff --git a/libpostproc/postprocess_template.c
b/libpostp
This set of patches is what I am submitting as qualification for the Google
Summer of Code.
I wrote sse2/avx2 versions of several of the postprocessing filters
(namely the accurate deblock filter and all the deinterlace filters), and made
several changes to the structure of the postprocess_temp
From: Tucker DiNapoli
Also pulled QP initialization out of inner loop, which removed some redundant
code.
Added some dummy fields to PPContext to allow current code to work while
changing the rest of the postprocessing code to support the arrays.
I also increased alignment requirements for
rcStride, uint8_t dst[
#undef TEMPLATE_PP_MMXEXT
#undef TEMPLATE_PP_3DNOW
#undef TEMPLATE_PP_SSE2
+#undef TEMPLATE_PP_AVX2
diff --git a/libpostproc/x86/Makefile b/libpostproc/x86/Makefile
new file mode 100644
index 000..8a7503b
--- /dev/null
+++ b/libpostproc/x86/Makefile
@@ -0,0 +1,2 @@
+YA
I added a new file with the sse2/avx2 code for do_a_deblock.
I also moved the code for running vertical deblock filters into its own
function, both to clean up the postprocess function and to make it
easier to integrate the new sse2/avx2 versions of these filters.
---
libpostproc/postprocess_temp
I did my best to make as few changes as possible to the formatting when
adding new code, so this commit is just a means of making the format
changes that go along with the new code.
Mostly these are just changes in indentation, but I also re-formatted a
few assignment statements (from x= y -> x = y
This patch contains the code for the avx2/sse2 versions of the new
function, but they are deliberately ignored, since the support for
avx2/sse2 isn't yet present (the next commit fixes this).
This is a temporary measure until full sse2/avx2 implementation is
complete, but it works with sse2/avx2 a
_t src[], int
srcStride, uint8_t dst[
#undef TEMPLATE_PP_MMXEXT
#undef TEMPLATE_PP_3DNOW
#undef TEMPLATE_PP_SSE2
+#undef TEMPLATE_PP_AVX2
diff --git a/libpostproc/x86/Makefile b/libpostproc/x86/Makefile
new file mode 100644
index 000..8a7503b
--- /dev/null
+++ b/libpostproc/x86/Makefile
I added a new file with the sse2/avx2 code for do_a_deblock.
I also moved the code for running vertical deblock filters into its own
function, both to clean up the postprocess function and to make it
easier to integrate the new sse2/avx2 versions of these filters.
---
libpostproc/postprocess_temp
The sse2/avx2 deblock functions now actually get called,
I added a new file with the sse2/avx2 code for do_a_deblock.
I also moved the code for running vertical deblock filters into its own
function, both to clean up the postprocess function and to make it
easier to integrate the new sse2/avx2 ve
def RENAME
+#undef RENAME_SCALAR
#undef TEMPLATE_PP_C
#undef TEMPLATE_PP_ALTIVEC
#undef TEMPLATE_PP_MMX
#undef TEMPLATE_PP_MMXEXT
#undef TEMPLATE_PP_3DNOW
#undef TEMPLATE_PP_SSE2
+#undef TEMPLATE_PP_AVX2
diff --git a/libpostproc/x86/Makefile b/libpostproc/x86/Makefile
new file mode 100