Jakub Jelinek wrote on 15/12/2011 03:51:25 PM:
> On Thu, Dec 15, 2011 at 03:35:34PM +0200, Ira Rosen wrote:
> > > This patch also fixes
> > > a problem where vect_determine_vectorization_factor would iterate the same
> > > stmt twice - for some reason both the original stmt and pattern stmt
On 12/14/2011 04:25 AM, Jakub Jelinek wrote:
> * config/i386/sse.md (vcond,
> vcond, vcondv2di):
> Use general_operand instead of nonimmediate_operand for
> operand 5 and no predicate for operands 1 and 2.
> * config/i386/i386.c (ix86_expand_int_vcond): Optimize
>
On Thu, Dec 15, 2011 at 03:35:34PM +0200, Ira Rosen wrote:
> > This patch also fixes
> > a problem where vect_determine_vectorization_factor would iterate the same
> > stmt twice - for some reason both the original stmt and pattern stmt (and
> > def stmt) are marked as relevant,
>
> Do you have
Jakub Jelinek wrote on 15/12/2011 12:54:29 PM:
> Perhaps it would be even cleaner to get rid of the pattern stmt and def stmt
> seq distinction and just have pattern as whole be represented as gimple_seq,
> but perhaps that cleanup can be deferred for later.
Sounds good.
> This patch also fix
On Thu, 15 Dec 2011, Ira Rosen wrote:
>
>
> Jakub Jelinek wrote on 15/12/2011 09:02:57 AM:
>
> > On Thu, Dec 15, 2011 at 08:32:26AM +0200, Ira Rosen wrote:
> > > > + cond = build2 (LT_EXPR, boolean_type_node, oprnd0, build_int_cst
> > > > (itype, 0));
> > > > + gsi = gsi_for_stmt (last_stmt)
Jakub Jelinek wrote on 15/12/2011 09:02:57 AM:
> On Thu, Dec 15, 2011 at 08:32:26AM +0200, Ira Rosen wrote:
> > > + cond = build2 (LT_EXPR, boolean_type_node, oprnd0, build_int_cst
> > > (itype, 0));
> > > + gsi = gsi_for_stmt (last_stmt);
> > > + if (rhs_code == TRUNC_DIV_EXPR)
> > > +{
On Thu, Dec 15, 2011 at 08:32:26AM +0200, Ira Rosen wrote:
> > + cond = build2 (LT_EXPR, boolean_type_node, oprnd0, build_int_cst
> > (itype, 0));
> > + gsi = gsi_for_stmt (last_stmt);
> > + if (rhs_code == TRUNC_DIV_EXPR)
> > +{
> > + tree var = vect_recog_temp_ssa_var (itype, NULL);
>
Jakub Jelinek wrote on 14/12/2011 02:25:13 PM:
>
> @@ -1573,6 +1576,211 @@ vect_recog_vector_vector_shift_pattern (
> return pattern_stmt;
> }
>
> +/* Detect a signed division by power of two constant that wouldn't be
> + otherwise vectorized:
> +
> + type a_t, b_t;
> +
> + S1 a_t =
On Wed, Dec 14, 2011 at 01:25:13PM +0100, Jakub Jelinek wrote:
> On Tue, Dec 13, 2011 at 05:57:40PM +0400, Kirill Yukhin wrote:
> > > Let me hack up a quick pattern recognizer for this...
>
> Here it is, untested so far.
> On the testcase doing 200 f1+f2+f3+f4 calls in the loop with -O3 -mavx
Hi Jakub,
For 401.bzip2 it looks perfect. This loop is vectorized:
.L6:
vmovdqa (%rsi,%rax), %ymm0
addl $1, %ecx
vpsrad $8, %ymm0, %ymm0
vpsrld $31, %ymm0, %ymm1
vpaddd %ymm1, %ymm0, %ymm0
vpsrad $1, %ymm0, %ymm0
vpaddd %ymm2, %ymm0
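For readers following the thread: the vpsrld $31 / vpaddd / vpsrad $1 sequence in the loop above looks like the usual shift-based expansion of a signed division by 2, which is the kind of code the pattern recognizer discussed here is meant to enable. The scalar C below is only a minimal sketch of that idea for a general power of two. The helper name div_pow2 and its parameters are hypothetical and do not come from the patch, and the code assumes that >> on a negative int32_t is an arithmetic shift (GCC behaves this way; ISO C leaves it implementation-defined).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: truncating signed
   division of x by 2^k using only shifts and one add.
   Assumption: >> on a negative int32_t is an arithmetic shift.  */
static int32_t
div_pow2 (int32_t x, unsigned k)
{
  /* Keep the shift counts below in range.  */
  assert (k >= 1 && k <= 30);

  /* All-ones if x is negative, zero otherwise.  */
  uint32_t sign = (uint32_t) (x >> 31);

  /* Logical shift leaves 2^k - 1 for negative x and 0 otherwise.
     For k == 1 this is a plain logical shift right by 31.  */
  int32_t bias = (int32_t) (sign >> (32 - k));

  /* The bias turns the floor rounding of the arithmetic shift into
     the round-toward-zero behaviour that C division requires.  */
  return (x + bias) >> k;
}
```

With k == 1 the three operations line up with a logical shift by 31, an add, and an arithmetic shift by 1. Without the bias, a bare arithmetic shift would round negative odd values toward minus infinity instead of toward zero.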