> The remaining 1.1 projects include:
>
> * Autovectorization Enhancements (some parts)
>
1.2 Incrementally preserve loop-closed form when vectorizing
Submitted today:
http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01318.html
1.3 Improvements to peeling for alignment
Submitted today:
http://gcc
> Steve Ellcey wrote:
> > Most of the gcc.dg/vect/* tests contain something like:
> >
> >typedef float afloat __attribute__ ((__aligned__(16)));
> >afloat a[N];
>
> It looks like what is really intended here is to apply the alignment to
> the array type. The point is that the entire ar
> Hi!
>
> On mainline we now use loop versioning and peeling for alignment
> for the following loop (-march=pentium4):
>
We don't yet use loop versioning in the vectorizer in mainline (we do in
autovect). We do apply peeling.
> void foo3(float * __restrict__ a, float * __restrict__ b,
>
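For readers following the thread, peeling for alignment can be sketched roughly as below. This is only an illustration in plain C, not the vectorizer's actual output: `VEC_BYTES`, `peel_count` and `add_one_peeled` are hypothetical names, a 16-byte vector is assumed, and the inner `j` loop merely stands in for one vector operation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_BYTES 16 /* assumed vector size in bytes */

/* Scalar iterations to peel so that p + peel is VEC_BYTES-aligned.
   Assumes p is at least aligned to sizeof(float). */
static size_t peel_count(const float *p, size_t n)
{
    uintptr_t misalign = (uintptr_t)p % VEC_BYTES;
    size_t peel = misalign ? (VEC_BYTES - misalign) / sizeof(float) : 0;
    return peel < n ? peel : n;
}

/* a[i] = b[i] + 1.0f, laid out as prolog / main loop / epilog. */
void add_one_peeled(float *restrict a, const float *restrict b, size_t n)
{
    const size_t vf = VEC_BYTES / sizeof(float); /* elements per vector */
    size_t peel = peel_count(a, n);
    size_t i = 0;

    assert(n == 0 || (a && b));

    /* Prolog: peeled scalar iterations until a + i is aligned. */
    for (; i < peel; i++)
        a[i] = b[i] + 1.0f;

    /* Main loop: each outer iteration models one aligned vector store. */
    for (; i + vf <= n; i += vf)
        for (size_t j = 0; j < vf; j++)
            a[i + j] = b[i + j] + 1.0f;

    /* Epilog: leftover scalar iterations. */
    for (; i < n; i++)
        a[i] = b[i] + 1.0f;
}
```

Only the store to `a` is aligned by the peel here; `b` may still be misaligned in the main loop, which is where loop versioning or unaligned loads would come in.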
> On Mon, 21 Mar 2005 13:45:19 +0100 (CET), Richard Guenther
> <[EMAIL PROTECTED]> wrote:
> ...
>
> Uh, and with -funroll-loops we seem to be lost completely, as we
> produce peeling/loops for an eight times four rolling loop! Where is
> the information about the loop counter gone??
>
the thin
> GCC 4.1 is going rather well thus far.
>
> Technically, Stage 1 ended on April 25th, though I failed to announce
> that. There are a few stage 1 tasks that have not made it in yet,
> according to the Wiki:
>
> # Autovectorization Enhancements
> Items 1.4, 2.1, 2.3 (1.3)
Items 1.4 and 2.3 ar
produce executable
>
> Andreas
> --
> Andreas Jaeger, [EMAIL PROTECTED], http://www.suse.de/~aj
> SUSE Linux Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
>GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126
> [attachment "attdhcxd.dat" deleted by Dorit Naishlos/Haifa/IBM]
Andreas Jaeger <[EMAIL PROTECTED]> wrote on 22/05/2005 17:29:24:
> On Sun, May 22, 2005 at 05:25:13PM +0300, Dorit Naishlos wrote:
> > I also see these failures on powerpc-apple-darwin, but they are all solved
> > when I apply Keith's patch:
> > http://gc
"Giovanni Bajo" <[EMAIL PROTECTED]> wrote on 09/06/2005 20:37:43:
> Janis Johnson <[EMAIL PROTECTED]> wrote:
>
> > It sounds as if there should be a check in target-supports.exp for
> > SSE2 support that determines whether the default test action is 'run'
> > or 'compile' for i686 targets.
>
>
Devang Patel <[EMAIL PROTECTED]> wrote on 14/06/2005 00:24:27:
>
> On Jun 10, 2005, at 2:01 PM, Dorit Naishlos wrote:
>
> > Devang, is vect-dv-2.c a duplicate of vect-ifcvt-1.c or are they
> > both there
> > on purpose?
>
> It is duplicate. I'l
I'm preparing the third part of the reduction support for mainline,
introducing vector shifts (see
http://gcc.gnu.org/ml/gcc-patches/2005-06/msg01317.html). The vectorizer
generates the following epilog code:
vect_var_.53_60 = vect_var_.50_59 v>> 64;
vect_var_.53_61 = vect_var_.50_59 + vect_va
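The shift-and-add epilog quoted above can be modelled in scalar C roughly as follows. This is a sketch under assumptions: a 128-bit vector of four 32-bit elements, an element order in which shifting right by 64 bits moves elements 2 and 3 into positions 0 and 1, and hypothetical helper names (`v4u32`, `vec_shift_elts`, `vec_add`, `reduce_add`). Which lane ends up holding the result on a real target depends on endianness.

```c
#include <assert.h>
#include <stdint.h>

#define NELTS 4 /* assumed: four 32-bit elements per vector */

typedef struct { uint32_t e[NELTS]; } v4u32;

/* Whole-vector shift by k elements: element i receives element i + k,
   vacated elements become zero (models "v>> (32 * k)"). */
static v4u32 vec_shift_elts(v4u32 v, int k)
{
    v4u32 r;
    assert(k >= 0 && k <= NELTS);
    for (int i = 0; i < NELTS; i++)
        r.e[i] = (i + k < NELTS) ? v.e[i + k] : 0;
    return r;
}

/* Element-wise addition. */
static v4u32 vec_add(v4u32 a, v4u32 b)
{
    v4u32 r;
    for (int i = 0; i < NELTS; i++)
        r.e[i] = a.e[i] + b.e[i];
    return r;
}

/* Sum all elements with log2(NELTS) shift/add steps. */
uint32_t reduce_add(v4u32 v)
{
    v = vec_add(v, vec_shift_elts(v, 2)); /* v + (v v>> 64) */
    v = vec_add(v, vec_shift_elts(v, 1)); /* v + (v v>> 32) */
    return v.e[0];                        /* total is in element 0 */
}
```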
> why??
>
The problem is that in 'expand_vector_operations_1()' in
tree-vect-generic.c we call 'optab_for_tree_code()' to get an optab for
VEC_RSHIFT_EXPR; 'optab_for_tree_code' does not have a case for
VEC_RSHIFT_EXPR, so the vector-lowering function concludes that this
tree-code is not suppo
Richard Henderson <[EMAIL PROTECTED]> wrote on 19/06/2005 19:49:46:
> On Sun, Jun 19, 2005 at 07:36:15PM +0300, Dorit Naishlos wrote:
> > ... because at least for the vector-shift case I need to
> > check that the shift operand is constant, and only then return
> >
Richard Henderson <[EMAIL PROTECTED]> wrote on 19/06/2005 20:33:02:
> On Sun, Jun 19, 2005 at 08:00:22PM +0300, Dorit Naishlos wrote:
> > Altivec does support non immediate shift amount (even if less efficiently -
> > I have to put the shift amount in a vector register
Richard Henderson <[EMAIL PROTECTED]> wrote on 20/06/2005 01:13:11:
> On Sun, Jun 19, 2005 at 11:46:52PM +0300, Dorit Naishlos wrote:
> > The thought was to supply an API that would let the vectorizer ask for the
> > minimal capability it needs - if all we need i
> ...
>
> The problem seems to be that analyze_offset_expr calls the scev
> analyzer explicitly asking for recomputation (third parameter is
> true):
>
> ...
>
> Why should we start the analysis from scratch in this case? The same
> question could be asked for all the uses of analyze_scalar_e
> I was wondering if any additional work had been completed toward pragma
> support for the autovectorization branch (see
> http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01560.html)?
>
I think Devang was planning to continue this work - I'm not sure where it
stands
dorit
> Thanks..
>
> Chad
Hi,
PR22543 has a testcase in which we fail with:
error: definition in block 1 does not dominate use in block 3
for SSA_NAME: SFT.3_39 in statement:
# VUSE ;
lsm_tmp.35_36 = D.2625.j;
In this testcase block 3 is a loop exit block, and block 1 is a loop header
block. During vectorization th
Daniel Berlin <[EMAIL PROTECTED]> wrote on 12/08/2005 17:56:11:
> > comments/ideas?
>
> I would start by figuring out why update_ssa + rewrite_into_loop_closed
> isn't putting SFT.3 into loop closed ssa form.
>
> Even if we do put virtual vars back into loop closed, that's still a
> bug.
>
I f
Planned vectorization enhancements for 4.2:
1. Recognize reduction patterns (Dorit).
Some computations have specialized target support and can be
vectorized more efficiently if the computation idiom is recognized and
vectorized as a whole. This is especially true of idioms that involve
We've had the testcase below in autovect-branch for a while, testing that
the 3 loops get vectorized. On mainline the third loop now gets eliminated
by DCE (.t44.dce3). Not sure I understand why... isn't the print loop
enough to keep it alive?
==
subroutine foo(a,b)
real a,
Andrew Pinski <[EMAIL PROTECTED]> wrote on 20/09/2005 18:09:20:
>
> On Sep 20, 2005, at 3:01 AM, Dorit Naishlos wrote:
>
> > We've had the testcase below in autovect-branch for a while, testing
> > that
> > the 3 loops get vectorized. On mainline the thi
Hi Toon,
Thanks for the testcases.
This one does get vectorized with autovect-branch:
~/autovect_cvs/bin/gfortran -O3 -ftree-vectorize -maltivec
-ftree-vectorizer-verbose=4 -S hilaram1.f90
hilaram1.f90:5: note: dependence distance = 0.
hilaram1.f90:5: note: accesses have the same alignmen
>
> On Oct 21, 2005, at 9:19 AM, Toon Moene wrote:
>
> > L.S.,
> >
> > This code:
> >
> > SUBROUTINE S(A, B, N)
> > DIMENSION A(N), B(N)
> > READ*,Z,B
> > DO I = 1, N
> > A(I) = Z * B(I)
> > ENDDO
> > PRINT*,A
> > END
> >
> > when compiled thus
[EMAIL PROTECTED] wrote on 21/10/2005 03:19:57 PM:
> L.S.,
>
> This code:
>
> SUBROUTINE S(A, B, N)
> DIMENSION A(N), B(N)
> READ*,Z,B
> DO I = 1, N
> A(I) = Z * B(I)
> ENDDO
> PRINT*,A
> END
>
> when compiled thusly:
>
> $ gfortran -g -S -O
This one gets vectorized for me, on powerpc-linux:
~/mainline_cvs/bin/gfortran -O3 -ftree-vectorize -maltivec
-ftree-vectorizer-verbose=4 -S hilaram4.f90
hilaram4.f90:4: note: Alignment of access forced using peeling.
hilaram4.f90:4: note: Vectorizing an unaligned access.
hilaram4.f90:4: note
Like HIRLAM 6, this is also an aliasing problem:
hilaram5.f90:4: note: not vectorized: can't determine dependence between
com.b[D.909_22] and (*a_8)[D.909_22]
hilaram5.f90:7: note: not vectorized: unhandled data-ref
hilaram5.f90:7: note: vectorized 0 loops in function.
dorit
> L.S.,
>
> This
Yes, we don't vectorize complex types yet.
dorit
> L.S.,
>
> The following code:
>
> SUBROUTINE S(N)
> DOUBLE COMPLEX A(N), B(N)
> READ*,B
> DO I = 1, N
> A(I) = B(I)
> ENDDO
> PRINT*,A
> END
>
> when compiled thusly:
>
> $ gfortran -g -S -O
It looks like maybe a 64bit scalar-evolution issue - when I compile on
powerpc-linux with -m64, I also get the
"vect4.f:4: note: not consecutive access"
message.
This problem looks very similar to PR18403 which has been resolved a while
ago:
When compiling for 32bit, we get the following repre
We're going to commit to autovect-branch vectorization support for
non-unit-stride accesses.
We'd like to suggest a few new tree-codes/optabs in order to express the
extraction and merging of elements from/to vectors.
Here are the suggested tree-codes/optabs; an example on how they are going
t
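To illustrate what extracting elements from vectors means for a non-unit-stride access, here is a scalar C sketch of a stride-2 load split into even and odd elements. The names `extract_even`, `extract_odd` and `split_pairs` and the factor `VF` are hypothetical and chosen only for this illustration; they are not the tree-codes or optabs proposed in the message.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4 /* assumed vectorization factor: elements per vector */

/* From two consecutive "vectors" lo and hi, keep the even-indexed elements. */
static void extract_even(const int *lo, const int *hi, int *out)
{
    for (size_t k = 0; k < VF / 2; k++) {
        out[k] = lo[2 * k];
        out[VF / 2 + k] = hi[2 * k];
    }
}

/* Same, keeping the odd-indexed elements. */
static void extract_odd(const int *lo, const int *hi, int *out)
{
    for (size_t k = 0; k < VF / 2; k++) {
        out[k] = lo[2 * k + 1];
        out[VF / 2 + k] = hi[2 * k + 1];
    }
}

/* even[i] = in[2*i], odd[i] = in[2*i + 1] for n pairs. */
void split_pairs(const int *in, int *even, int *odd, size_t n)
{
    size_t i = 0;

    assert(n == 0 || (in && even && odd));

    /* Main loop: two unit-stride "vector loads" feed each extraction. */
    for (; i + VF <= n; i += VF) {
        extract_even(in + 2 * i, in + 2 * i + VF, even + i);
        extract_odd(in + 2 * i, in + 2 * i + VF, odd + i);
    }

    /* Scalar epilog for the remaining pairs. */
    for (; i < n; i++) {
        even[i] = in[2 * i];
        odd[i] = in[2 * i + 1];
    }
}
```

The merging direction (interleaving two vectors back into a stride-2 store) would be the mirror image of this.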
Steven Bosscher <[EMAIL PROTECTED]> wrote on 11/16/2005 10:39:24 PM:
> On Wednesday 16 November 2005 15:35, Dorit Naishlos wrote:
> > We'd like to suggest a few new tree-codes/optabs in order to express the
> > extraction and merging of elements from/to vectors.
Paul Brook <[EMAIL PROTECTED]> wrote on 11/16/2005 05:03:47 PM:
> On Wednesday 16 November 2005 14:35, Dorit Naishlos wrote:
> > We're going to commit to autovect-branch vectorization support for
> > non-unit-stride accesses.
> > We'd like to suggest a
> Hello Everyone,
> I am interested in knowing more about the vectorizer in GCC. Does
> anyone have or know of any statistics about the percentage of loops
> that can be vectorized in some benchmarks like MediaBench, SPEC2K and
> so forth?
>
I have some old Spec2000 statistics, from around
>
> The used compile options are:
>
> ...
> -ftree-vectorize
I don't know how much vectorization takes place in your code, but
vectorization would currently greedily increase code size (i.e. without
trying to estimate when it is profitable). One way to avoid some of the
code size growth during vec
Given a pointer to type T - when can we assume that the data pointed to is
naturally aligned (aligned on the size of the type T)?
The vectorizer currently works under the assumption that all data is
naturally aligned. At least one place where this may result in generation
of wrong code by t
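A small C sketch of what "naturally aligned" means in practice, and of one way to read a value whose alignment is unknown. The helper names `is_naturally_aligned` and `load_float_any_alignment` are hypothetical; the sketch only assumes that `uintptr_t` reflects the address in the usual way, which holds on common targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True if address p is a multiple of size, i.e. naturally aligned
   for an object of that size. */
static int is_naturally_aligned(const void *p, size_t size)
{
    assert(size != 0);
    return ((uintptr_t)p % size) == 0;
}

/* Read a float from an address of unknown alignment. memcpy avoids
   dereferencing a possibly misaligned float pointer. */
static float load_float_any_alignment(const void *p)
{
    float f;
    memcpy(&f, p, sizeof f);
    return f;
}
```

For example, given a 16-byte aligned byte buffer `buf`, the address `buf + 1` is not naturally aligned for `float`, yet `load_float_any_alignment(buf + 1)` is still well defined.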
Richard Guenther <[EMAIL PROTECTED]> wrote on 15/12/2005 14:52:27:
> On 12/15/05, Dorit Naishlos <[EMAIL PROTECTED]> wrote:
>
> > So, in short - when can we assume that pointer types have the minimum
> > alignment required by their underlying type?
>
> I th
send me the following information:
>
* Project Title: Autovectorization enhancements
* Project Contributors:
Ira Rosen, Devang Patel, Dorit Naishlos, Keith Besaw.
* Dependencies: None.
* Description and Delivery Dates
1- Cleanups and improvement of existing functionality
1.
36 matches