with unknown alignment that
you have here.
Ira
>
> Cheers,
> Bingfeng
>
> > -----Original Message-----
> > From: Ira Rosen [mailto:i...@il.ibm.com]
> > Sent: 01 November 2011 11:13
> > To: Bingfeng Mei
> > Cc: gcc@gcc.gnu.org
> > Subject: Re: SLP ve
gcc-ow...@gcc.gnu.org wrote on 01/11/2011 12:41:32 PM:
> Hello,
> I have one example with two very similar loops. cunrolli pass
> unrolls one loop completely
> but not the other based on slightly different cost estimations. The
> not-unrolled loop
> gets SLP-vectorized, then unrolled by "cunroll"
Frederic Riss wrote on 31/05/2011 12:34:35 PM:
> Hi Ira,
>
> thanks for your answer, however:
>
> On 31 May 2011 08:06, Ira Rosen wrote:
> >> This test fails for me because I get 4 vectorized loops instead of 3.
> >> There are multiple other tests that
gcc-ow...@gcc.gnu.org wrote on 30/05/2011 06:36:36 PM:
>
> Hi,
>
> I've been playing with the vectorizer for my port, and of course I use
> the testsuite to check the generated code. I fail to understand some
> of the FAILs I get. For example, in slp-3.c, the test contains:
>
> /* { dg-final { s
>> ...Ira would know best, but I don't think it would be used for this
>> kind of loop. It would be more something like:
>>
>> for (i=0; i> X[i] = Y[i].red + Y[i].blue + Y[i].green;
>>
>> (not a realistic example). You'd then have:
>>
>> compoundY = __builtin_load_lanes (Y);
>> red =
Hi,
"Bingfeng Mei" wrote on 10/02/2011 05:35:45 PM:
>
> Hi,
> I noticed that vector permutation gets more use in GCC
> 4.6, which is great. It is used to handle negative step
> by reversing vector elements now.
>
> However, after reading the related code, I understood
> that it only works when t
Hi,
gcc-ow...@gcc.gnu.org wrote on 24/01/2011 03:21:51 PM:
> Hello,
> Some of our target processors support complete hardware misaligned
> memory access. I implemented movmisalignm patterns, and found
> TARGET_SUPPORT_VECTOR_MISALIGNMENT
> (TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
> On 4.6) h
Richard Guenther wrote on 03/06/2010 02:00:00 PM:
> >> tree-vectorizer.h:#ifndef TARG_COND_TAKEN_BRANCH_COST
> >> tree-vectorizer.h:#ifndef TARG_COND_NOT_TAKEN_BRANCH_COST
> >> tree-vectorizer.h:#ifndef TARG_SCALAR_STMT_COST
> >> tree-vectorizer.h:#ifndef TARG_SCALAR_LOAD_COST
> >> tree-vectori
Steven Bosscher wrote on 02/06/2010 06:13:36 PM:
>
> On Wed, May 26, 2010 at 7:16 PM, Mark Mitchell wrote:
> > Ulrich Weigand wrote:
> >
> >>> So the question is: The goal is to have hooks, not macros, right? If
> >>> so, can reviewers please take care to reject patches that introduce
> >>> ne
gcc-ow...@gcc.gnu.org wrote on 28/05/2010 03:52:30 PM:
> Hi,
>
> I just noticed today that (implicit) loops of the kind
>
> xmin = minval(nodes(1,inductor_number(1:number_of_nodes)))
>
> (lines 5057 to 5062 of the polyhedron test induct.f90) are no longer
> vectorized (the change occurre
signed cast. I think it is related to PR 26128.
Ira
>
> If it is the case, then is there a good reason for it, or can I fix it
> myself by adding additional vectorizable operations?
>
> I've attached both the test case and the full output of
> -ftree-vectorizer-verbose=9
>
> Best regards
> Allan
>
> [attachment "vectorizesign.cpp" deleted by Ira Rosen/Haifa/IBM]
> [attachment "vectorizesign-debug.txt" deleted by Ira Rosen/Haifa/IBM]
> > I can imagine having some sort of target hook that computed a cost
> > metric for a given constant permutation pattern. For instance, I'd
> > imagine that the interleave patterns are half as expensive as a full
> > permute for altivec, due to not having to load a mask. This hook would
> > be
Richard Henderson wrote on 17/11/2009 03:39:42:
> Richard Henderson
> 17/11/2009 03:39
>
> To
>
> Ira Rosen/Haifa/i...@ibmil
>
> cc
>
> gcc@gcc.gnu.org
>
> Subject
>
> targetm.vectorize.builtin_vec_perm
>
> What is this hook supposed to do? Th
ectorization.
The memory accesses are consecutive in the inner loop and strided in the
outer loop. Therefore, inner loop vectorization is preferable in this case
(and also strided accesses are not yet supported in outer loop
vectorization).
Ira
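To make the access pattern concrete, here is a minimal C sketch of such a loop nest. The names add_rows, ROWS and COLS are illustrative assumptions, not taken from the original test case.

```c
#include <stddef.h>

#define ROWS 4   /* hypothetical sizes, chosen only for illustration */
#define COLS 8

/* For a fixed i, consecutive j values touch adjacent elements of a[i]
   (unit stride). For a fixed j, consecutive i values are COLS elements
   apart (strided). */
void add_rows(int a[ROWS][COLS], int b[ROWS][COLS])
{
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            a[i][j] = a[i][j] + b[i][j];
}
```

Vectorizing over j packs neighbouring elements of one row into a vector, whereas vectorizing over i would have to gather elements that lie COLS apart in memory.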
>
> On Tue, May 26, 2009 at 5:57 PM
gcc-ow...@gcc.gnu.org wrote on 25/05/2009 21:53:41:
> for a loop like
>
> 1 for(i=0;i 2 for(j=0;j 3 a[i][j] = a[i][j]+b[i][j];
>
> GCC 4.3.* is unable to get the information for the inner loop that
> array reference 'a' is alias of each other and generates code f
Richard Guenther wrote on 29/03/2009 13:05:56:
> On Sun, 29 Mar 2009, Ira Rosen wrote:
>
> >
> > > I will announce the time I am doing the last trunk -> alias-improvements
> > > branch merge and freeze the trunk for that.
> > >
> > > Thus,
> I will announce the time I am doing the last trunk -> alias-improvements
> branch merge and freeze the trunk for that.
>
> Thus, this is a heads-up - if I collide with your planned merge schedule
> just tell me and we can sort it out.
I was planning to commit the vectorizer reorganization patch
> for(int i=0; i data[i]+=sum;
> }
> }
>
>
> Is there a fundamental problem in using the vectorizer in C++?
>
> Regards!
>Georg
> [attachment "signature.asc" deleted by Ira Rosen/Haifa/IBM]
[EMAIL PROTECTED] wrote on 19/03/2008 06:01:19:
> The web page
>
> http://gcc.gnu.org/gcc-4.3/changes.html
>
> states that "The -ftree-vectorize option is now on by default under -
> O3.", but on
>
> http://gcc.gnu.org/onlinedocs/gcc-4.3.0/gcc/Optimize-Options.html
>
> -ftree-vectorize is not li
[EMAIL PROTECTED] wrote on 17/03/2008 21:08:43:
> It might be nice to think about an option that automatically aligns large
> arrays without having to do the declaration (or even have the vectorizer
> override the alignment for statics/auto).
The vectorizer is already doing this.
Ira
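For reference, the manual declaration that the quoted suggestion wants to avoid looks roughly like the sketch below. The array name buf, its size, the 16-byte figure and the helper functions are assumptions for illustration; __attribute__((aligned)) is the GNU C attribute.

```c
#include <stdint.h>

/* Hypothetical example: explicitly request 16-byte alignment for a
   static array, as one would do by hand for vector loads and stores. */
static float buf[1024] __attribute__((aligned(16)));

/* Returns nonzero if p sits on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Accessor so that code outside this file can inspect the array. */
float *get_buf(void)
{
    return buf;
}
```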
>
> --
>
e special routines tuned for the different CPU families and I
> > recommend the use of the standard intrinsics headers (*mmintr.h) for
> > this. Of course this comes at a high cost of maintainance (and initial
> > work), so autovectorization might prove good enough. Often tuning the
> > source for a given compiler has a similar effect than producing
vectorized
> > code manually. Looking at GCC tree dumps and knowing a bit about
> > GCC internals helps you here ;)
> >
> > > A roadmap or a GCC developer sharing his thoughts would be very helpful.
> >
> > Thanks,
> > Richard.
>
>
> [attachment "signature.asc" deleted by Ira Rosen/Haifa/IBM]
Hi Andi,
[EMAIL PROTECTED] wrote on 10/03/2008 18:32:35:
>
> I noticed the gcc 4.3.0 changes document on the website does not
> mention that the vectorizer is now on by default in -O3.
> Perhaps that should be added? It seems like an important noteworthy
> change to me.
Thanks for pointing this
Dorit Nuzman/Haifa/IBM wrote on 18/02/2008 09:40:37:
> Thanks a lot for tracking down / opening the relevant PRs.
>
> about:
>
> > > (6) loop distribution is required to break a dependence. This may
> > > already be handled by Sebastian's loop-distribution pass that will
> > > be incorporated in
Hi,
Dorit Nuzman/Haifa/IBM wrote on 14/02/2008 17:02:45:
> This is an old debt: A while back Tim had sent me a detailed report
> offline showing which C++ tests (originally from the Dongarra loops
> suite) were vectorized by current g++ or icpc, or both, as well as
> when the vectorization by icp
(I am resending this, since some of the addresses got corrupted. My
apologies.)
Hi,
[EMAIL PROTECTED] wrote on 16/01/2008 15:20:00:
> > When a loop is vectorized, some statements are removed from the basic
> > blocks, but the vectorizer information attached to these BBs is never
> > freed.
>
>
Hi,
[EMAIL PROTECTED] wrote on 16/01/2008 15:20:00:
> > When a loop is vectorized, some statements are removed from the basic
> > blocks, but the vectorizer information attached to these BBs is never
> > freed.
>
> Sebastian, thanks for bringing this to our attention. I'll look into this.
> I ho
Dorit Nuzman/Haifa/IBM wrote on 23/01/2008 21:49:51:
> There are however a couple of small cost-model changes that were
> going to be submitted this week for the Cell SPU - it's unfortunate
> if these cannot get into 4.3.
It's indeed unfortunate. However, those changes are not crucial and there
Hi,
[EMAIL PROTECTED] wrote on 01/01/2008 22:00:11:
> some time ago I heard that GCC supports vectorization,
> but I still can't find anything about how I can use it in my programs.
Here is the link to the vectorizer's documentation:
http://gcc.gnu.org/projects/tree-ssa/vectorization.html
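As a quick start, a loop like the one below is a typical candidate. This is only a minimal sketch: the function name saxpy and the file name in the comment are assumptions, and whether the loop is actually vectorized depends on the target and the GCC version.

```c
/* Compile with, for example:
     gcc -O2 -ftree-vectorize -c saxpy.c
   (-ftree-vectorize enables the tree vectorizer). */
void saxpy(int n, float alpha, const float *restrict x, float *restrict y)
{
    /* restrict tells the compiler that x and y do not overlap, which
       removes one common obstacle to vectorization. */
    for (int i = 0; i < n; i++)
        y[i] = alpha * x[i] + y[i];
}
```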
"Daniel Berlin" <[EMAIL PROTECTED]> wrote on 16/06/2007:
> On 6/16/07, Dorit Nuzman <[EMAIL PROTECTED]> wrote:
>
> > Do you have specific examples where SLP helps performance out of loops?
>
> hash calculations.
>
> For md5, you can get a 2x performance improvement by straight-line
> vectorizing
"Richard Guenther" <[EMAIL PROTECTED]> wrote on 06/05/2007
16:17:05:
> On 5/6/07, Ira Rosen <[EMAIL PROTECTED]> wrote:
> >
> > Yes, this should get vectorized. The problem is in data dependencies
> > analysis. We fail to prove that s_5->a[i_1
Toon Moene <[EMAIL PROTECTED]> wrote on 06/05/2007 15:33:38:
> I'd be willing to test out your solution privately, if you prefer such a
> round first ...
>
Thanks. I'll send you a patch when it's ready.
Ira
Yes, this should get vectorized. The problem is in data dependencies
analysis. We fail to prove that s_5->a[i_16] and s_5->a[i_16] access the
same memory location. I think it happens because, when we compare the bases
of the data references (s_5->a and s_5->a) in base_object_differ_p(), we do
that b
Hi,
We were looking at the implementation of vcond for altivec and we have a
couple of questions.
vcond has 6 operands, rs6000_emit_vector_cond_expr is called from
define_expand for "vcond". It gets those operands in their original
order, as in vcond, and emits op0 = (op4 cond op5 ? op1 : op2),
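
The operand order described above can be modelled element by element in plain C. This is only a scalar sketch of the semantics, not the rs6000 implementation; the function name vcond_gt_model is hypothetical, and the comparison is fixed to ">" purely as an example.

```c
/* Scalar model of the operand order: for each element,
   op0 = (op4 cond op5) ? op1 : op2.
   The comparison is hard-coded as ">" here for illustration only;
   in the real pattern the comparison is itself an operand. */
void vcond_gt_model(int n, int *op0, const int *op1, const int *op2,
                    const int *op4, const int *op5)
{
    for (int i = 0; i < n; i++)
        op0[i] = (op4[i] > op5[i]) ? op1[i] : op2[i];
}
```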
Dorit Nuzman/Haifa/IBM wrote on 05/02/2007 21:13:40:
> Richard Guenther <[EMAIL PROTECTED]> wrote on 05/02/2007 17:59:00:
>
> > On Mon, 5 Feb 2007, Paolo Bonzini wrote:
> >
> > >
> > > > As we also only vectorize innermost loops I believe doing a
> > > > complete unrolling pass early will help i
Hi Diego,
In the example of dynamic partitioning below (Figure 6), I don't understand
why MEM7 is not killed in line 13 and is killed in line 20 later. As far as
I understand, in line 13 'c' is in the alias set, and its currdef is MEM7,
so it must be killed by the store in line 14. What am I mis
Zdenek Dvorak <[EMAIL PROTECTED]> wrote on 28/09/2006 15:04:07:
>
> I have committed the documentation, including the parts from Daniel and
> Sebastian (but not yours) now.
>
> Zdenek
I've committed my part.
Ira
Sebastian Pop <[EMAIL PROTECTED]> wrote on 26/09/2006 21:24:18:
> It is probably better to include the loop indexes in the example, and
> modify the syntax of the scev for making it more explicit, like:
>
> @smallexample
> for1 i
> for2 j
> *((int *)p + i + j) = a[i][j];
Sebastian Pop <[EMAIL PROTECTED]> wrote on 08/09/2006 18:04:01:
> Ira Rosen wrote:
> >
> > > Here is the documentation for the data dependence analysis.
> >
> > I can add a description of data-refs creation/analysis if it is useful.
> >
>
> That&
"Erich Plondke" <[EMAIL PROTECTED]> wrote on 20/09/2006 04:09:14:
> On 9/19/06, Erich Plondke <[EMAIL PROTECTED]> wrote:
> > On 9/19/06, Ira Rosen <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > Does this patch fix the pr
Hi,
Does this patch fix the problem?
Ira
Index: tree-vect-transform.c
===
--- tree-vect-transform.c (revision 117002)
+++ tree-vect-transform.c (working copy)
@@ -1916,10 +1916,10 @@ vectorizable_load (tree stmt, block_s
> Here is the documentation for the data dependence analysis.
I can add a description of data-refs creation/analysis if it is useful.
Ira
*** Volker Reichelt
[EMAIL PROTECTED]
*** 287,292
--- 287,293
Tom Rix [EMAIL PROTECTED]
Craig Rodrigues [EMAIL PROTECTED]
Gavin Romig-Koch [EMAIL PROTECTED]
+ Ira Rosen