On Wed, 2016-07-06 at 09:07 +0200, Hendrik Leppkes wrote:
> On Wed, Jul 6, 2016 at 4:37 AM, Dan Parrot wrote:
> > Finish providing SIMD versions for POWER8 VSX of functions in
> > libswscale/input.c That should allow trac ticket #5570 to be closed.
> > The speedups obtained
Finish providing SIMD versions for POWER8 VSX of functions in
libswscale/input.c That should allow trac ticket #5570 to be closed.
The speedups obtained for the functions are:
abgrToA_c          1.19
bgr24ToUV_c        1.23
bgr24ToUV_half_c   1.37
bgr24ToY_c_vsx     1.43
nv12T
On Tue, 2016-07-05 at 15:45 +0000, Carl Eugen Hoyos wrote:
> Dan Parrot writes:
>
> > These results for START_TIMER/STOP_TIMER are with ffmpeg
> > compiled using GCC 6.1.1
>
> I believe your results indicate that -cpuflags 0 has no
> effect on vsx.
>
On Mon, 2016-07-04 at 06:22 +0000, Carl Eugen Hoyos wrote:
> Dan Parrot writes:
>
> > Finish providing SIMD versions for POWER8 VSX of functions
> > in libswscale/input.c
> > That should allow trac ticket #5570 to be closed.
>
> Please add some numbe
On Mon, 2016-07-04 at 23:31 -0500, Dan Parrot wrote:
> On Mon, 2016-07-04 at 09:20 +0000, Carl Eugen Hoyos wrote:
> > Dan Parrot writes:
> >
> > > The dataset used was the entire FATE regression suite.
> >
> > I don't think this is a par
On Mon, 2016-07-04 at 09:20 +0000, Carl Eugen Hoyos wrote:
> Dan Parrot writes:
>
> > The dataset used was the entire FATE regression suite.
>
> I don't think this is a particularly useful testcase:
> It takes very long but mostly tests other things.
>
> > > > Can you confirm with START_TIMER / STOP_TIMER that there is no
> > > > gain?
> > >
> > > SystemTap probes provide identical functionality by measuring
> > > deltas between function entry and function return.
> >
> > Sorry, I don't understand:
> > Did you test with both methods to verify
On Mon, 2016-07-04 at 16:30 +0000, Carl Eugen Hoyos wrote:
> Dan Parrot writes:
>
> > > Did you test if using ffmpeg -benchmark -f rawvideo -i /dev/zero...
> > > showed different results?
> > > I believe this should be both easier and faster to tes
On Mon, 2016-07-04 at 20:55 +0200, Hendrik Leppkes wrote:
> On Mon, Jul 4, 2016 at 5:20 PM, Dan Parrot wrote:
> >> Why is this not faster?
> > Surprisingly, gcc is producing some badly suboptimal assembly. I need to
> > follow up with IBM's Linux Technology Ce
> > Just to make sure I don't misunderstand:
> > Does this mean intrinsics are suboptimal to write assembly
> > code?
> Here's what I mean: All variables below are of type "vector int"
>
> 1. v0 = v2 * v3
> 2. v0 = v4 * v5 + v6 * v7 + v8 * v9
>
> The first statement produces 1 multiply, 1 multi
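The two statements quoted above can be reproduced outside FFmpeg to inspect what a compiler generates for them. The following is only a minimal sketch: it assumes the GCC/Clang vector_size extension as a portable stand-in for the AltiVec/VSX "vector int" type, and the names v4si, mul_one and mul_sum_three are hypothetical, not taken from the patch.

```c
#include <assert.h>

/* Four 32-bit ints per vector, using the GCC/Clang vector extension.
 * Assumption: this stands in for "vector int" on POWER8. */
typedef int v4si __attribute__((vector_size(16)));

/* Statement 1: v0 = v2 * v3 (one element-wise multiply). */
static v4si mul_one(v4si v2, v4si v3)
{
    return v2 * v3;
}

/* Statement 2: v0 = v4 * v5 + v6 * v7 + v8 * v9
 * (three element-wise multiplies, summed element-wise). */
static v4si mul_sum_three(v4si v4, v4si v5, v4si v6,
                          v4si v7, v4si v8, v4si v9)
{
    return v4 * v5 + v6 * v7 + v8 * v9;
}
```

Compiling such a file with -O2 -S and comparing the instructions emitted for the two functions is one way to check how the multiplies are scheduled on a given compiler version.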
Finish providing SIMD versions for POWER8 VSX of functions in libswscale/input.c
That should allow trac ticket #5570 to be closed.
---
libswscale/ppc/input_vsx.c | 1018 +++-
1 file changed, 1014 insertions(+), 4 deletions(-)
diff --git a/libswscale/ppc/inp
Here are execution times of SIMD and non-SIMD functions. The times were
obtained using SystemTap probes at functions' entry and return points.
The dataset used was fate-filter-pixfmts-scale.
SIMD versions have suffix _vsx:
yuy2ToY_c_vsx.
no. of calls: 864. min: 1880 ns. avg: 2014 ns. max:
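For readers without SystemTap, the same kind of call count / min / avg / max summary can be collected by timestamping a function's entry and return by hand. This is a rough sketch under stated assumptions: it uses POSIX clock_gettime rather than SystemTap or FFmpeg's START_TIMER/STOP_TIMER macros, and timing_stats, now_ns, stats_add and stats_avg are hypothetical helper names.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Running statistics for one measured function (hypothetical helper type). */
typedef struct {
    uint64_t calls;
    uint64_t min_ns;
    uint64_t max_ns;
    uint64_t total_ns;
} timing_stats;

/* Monotonic timestamp in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Fold one entry-to-return delta into the statistics. */
static void stats_add(timing_stats *s, uint64_t delta_ns)
{
    if (s->calls == 0 || delta_ns < s->min_ns)
        s->min_ns = delta_ns;
    if (delta_ns > s->max_ns)
        s->max_ns = delta_ns;
    s->total_ns += delta_ns;
    s->calls++;
}

/* Integer average in nanoseconds; 0 if nothing was recorded. */
static uint64_t stats_avg(const timing_stats *s)
{
    return s->calls ? s->total_ns / s->calls : 0;
}
```

A caller would take t0 = now_ns() just before the function under test, call it, and then pass now_ns() - t0 to stats_add; the timer overhead is included in each delta, so very short functions are measured less accurately.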
\
ppc/yuv2yuv_altivec.o \
diff --git a/libswscale/ppc/input_vsx.c b/libswscale/ppc/input_vsx.c
new file mode 100644
index 000..d977a32
--- /dev/null
+++ b/libswscale/ppc/input_vsx.c
@@ -0,0 +1,437 @@
+/*
+ * Copyright (C) 2016 Dan Parrot
On Wed, 2016-06-22 at 20:33 -0300, James Almer wrote:
> On 6/22/2016 8:15 PM, Dan Parrot wrote:
> > On Thu, 2016-06-23 at 01:03 +0200, Michael Niedermayer wrote:
> >> On Tue, Jun 21, 2016 at 12:04:42AM -0500, Dan Parrot wrote:
> >>> On Tue, 2016-06-21 at 02:22 +0
On Thu, 2016-06-23 at 01:03 +0200, Michael Niedermayer wrote:
> On Tue, Jun 21, 2016 at 12:04:42AM -0500, Dan Parrot wrote:
> > On Tue, 2016-06-21 at 02:22 +0200, Michael Niedermayer wrote:
> > > On Mon, Jun 20, 2016 at 06:38:18PM -0500, Dan Parrot wrote:
> > > > On
On Wed, 2016-06-22 at 22:36 +0000, Carl Eugen Hoyos wrote:
> Dan Parrot writes:
>
> [...]
>
> Did you already test the TIMER macros?
No I did not test with the TIMER macros. I don't see what that has to do
with Trac ticket #5570.
> I don't know if th
On Wed, 2016-06-22 at 21:02 +0000, Carl Eugen Hoyos wrote:
> Dan Parrot writes:
>
> > Could I get a yes or no answer on whether the patch will be applied?
>
> Please comment on my email: time make fate can be used to show
> large performance changes (altho
Could I get a yes or no answer on whether the patch will be applied?
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
The MAINTAINERS file lists Luca Barbato for Linux/PowerPC. You can see
from his response below how he feels about that.
Forwarded Message
> From: Luca Barbato
> To: Dan Parrot
> Subject: Re: [FFmpeg-devel] [PATCH] PPC64: Add IBM POWER8 SIMD
> Implementation
> Da
On Tue, 2016-06-21 at 00:04 -0500, Dan Parrot wrote:
> On Tue, 2016-06-21 at 02:22 +0200, Michael Niedermayer wrote:
> > On Mon, Jun 20, 2016 at 06:38:18PM -0500, Dan Parrot wrote:
> > > On Tue, 2016-06-21 at 01:06 +0200, Michael Niedermayer wrote:
> > > > On Mon, Ju
On Tue, 2016-06-21 at 02:22 +0200, Michael Niedermayer wrote:
> On Mon, Jun 20, 2016 at 06:38:18PM -0500, Dan Parrot wrote:
> > On Tue, 2016-06-21 at 01:06 +0200, Michael Niedermayer wrote:
> > > On Mon, Jun 20, 2016 at 05:55:47PM -0500, Dan Parrot wrote:
> > > > On
On Tue, 2016-06-21 at 01:06 +0200, Michael Niedermayer wrote:
> On Mon, Jun 20, 2016 at 05:55:47PM -0500, Dan Parrot wrote:
> > On Tue, 2016-06-21 at 00:45 +0200, Michael Niedermayer wrote:
> > > On Sun, Jun 19, 2016 at 09:57:42PM +0000, Dan Parrot wrote:
> > > >
On Tue, 2016-06-21 at 00:45 +0200, Michael Niedermayer wrote:
> On Sun, Jun 19, 2016 at 09:57:42PM +0000, Dan Parrot wrote:
> > First commit addressing Trac ticket #5570. Functions defined in
> > libswscale/input.c
> > have corresponding SIMD definitions in libsw
21:57 +0000, Dan Parrot wrote:
> First commit addressing Trac ticket #5570. Functions defined in
> libswscale/input.c
> have corresponding SIMD definitions in libswscale/ppc/input_vsx.c
> ---
> libswscale/ppc/Makefile |    1 +
> libswscale/pp
\
diff --git a/libswscale/ppc/input_vsx.c b/libswscale/ppc/input_vsx.c
new file mode 100644
index 000..adb0e38
--- /dev/null
+++ b/libswscale/ppc/input_vsx.c
@@ -0,0 +1,1070 @@
+/*
+ * Copyright (C) 2016 Dan Parrot
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software
On Wed, 2016-06-15 at 16:51 +0200, Hendrik Leppkes wrote:
> On Wed, Jun 15, 2016 at 6:25 AM, Dan Parrot wrote:
> > This is the first commit addressing Trac ticket #5570. Functions defined in
> > libswscale/input.c have corresponding definitions in
> > libswscale/pp
On Wed, 2016-06-15 at 11:19 +0000, Carl Eugen Hoyos wrote:
> Dan Parrot writes:
>
> [...]
>
> I know this isn't completely related but do you have time
> to look at ticket #5508?
> https://trac.ffmpeg.org/ticket/5508
> No active developer has har
On Wed, 2016-06-15 at 10:15 +0200, Michael Niedermayer wrote:
> On Wed, Jun 15, 2016 at 04:25:11AM +0000, Dan Parrot wrote:
> > This is the first commit addressing Trac ticket #5570. Functions defined in
> > libswscale/input.c have corresponding definitions in
> > libsw
ndex 000..09fe8c1
--- /dev/null
+++ b/libswscale/ppc/input_vsx.h
@@ -0,0 +1,831 @@
+/*
+ * Copyright (C) 2016 Dan Parrot
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * Lice
On Tue, 2016-06-14 at 18:56 -0500, Dan Parrot wrote:
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Please disregard this attempted patch. Made wrong choice of using email
client
From e38eb7af05be27d8f36058373557d86e5a481db8 Mon Sep 17 00:00:00 2001
From: Dan Parrot
Date: Tue, 14 Jun 2016 23:19:21 +0000
Subject: [PATCH] PPC64: IBM POWER8 SIMD Implementation
This is the first commit addressing Trac ticket #5570. Functions defined in
libswscale/input.c have corresponding
On Thu, 2016-06-09 at 17:01 -0400, Ronald S. Bultje wrote:
> Hi,
>
> On Thu, Jun 9, 2016 at 4:02 PM, Dan Parrot wrote:
>
> > Line 72 of libswscale/input.c is:
> > dstU[i] = (ru*r + gu*g + bu*b + (0x10001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
Line 72 of libswscale/input.c is:
dstU[i] = (ru*r + gu*g + bu*b + (0x10001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
The definition of macro RGB2YUV_SHIFT in libswscale/swscale_internal.h
is on line 417:
#define RGB2YUV_SHIFT 15
By examining the result of executing line 72 in input.c it appears that
ere a preferred method to implement such a change?
Thanks.
Dan Parrot.
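The arithmetic of the expression quoted from line 72 can be worked through in isolation. With RGB2YUV_SHIFT equal to 15, the constant 0x10001 << 14 is 0x40004000: the 0x40000000 part becomes an offset of 32768 after the right shift, and the 0x4000 part is half of 1 << 15, so the shift rounds to nearest. The sketch below is an assumption-laden illustration, not the code in input.c: chroma_u is a hypothetical name, and the intermediate sum is widened to 64 bits here to sidestep overflow, which the original expression does not do.

```c
#include <assert.h>
#include <stdint.h>

#define RGB2YUV_SHIFT 15

/* Hypothetical stand-alone version of the quoted expression.
 * ru, gu, bu are fixed-point coefficients; r, g, b are the samples. */
static int chroma_u(int ru, int gu, int bu, int r, int g, int b)
{
    /* 0x10001 << 14 == 0x40004000
     *   0x40000000 >> 15 == 32768  (offset added to every result)
     *   0x00004000       == half of 1 << 15  (round to nearest) */
    int64_t acc = (int64_t)ru * r + (int64_t)gu * g + (int64_t)bu * b
                + ((int64_t)0x10001 << (RGB2YUV_SHIFT - 1));
    return (int)(acc >> RGB2YUV_SHIFT);
}
```

With any coefficients that sum to zero, a gray input (r == g == b) makes the weighted sum vanish, so chroma_u returns exactly 32768; the coefficient values used to try this out are up to the reader and are not taken from FFmpeg.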