Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-12-21 Thread Yuri Rumyantsev
Sorry, I put wrong test - fix it here. 2016-12-21 13:12 GMT+03:00 Yuri Rumyantsev : > Hi Richard, > > I occasionally found out a bug in my patch related to epilogue > vectorization without masking : need to put label before > initialization. > > Could you please review and

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-12-21 Thread Yuri Rumyantsev
Hi Richard, I occasionally found out a bug in my patch related to epilogue vectorization without masking : need to put label before initialization. Could you please review and integrate it to trunk. Test-case is also attached. Thanks ahead. Yuri. ChangeLog: 2016-12-21 Yuri Rumyantsev

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-12-01 Thread Yuri Rumyantsev
. Best regards. Yuri 2016-12-01 14:33 GMT+03:00 Richard Biener : > On Mon, 28 Nov 2016, Yuri Rumyantsev wrote: > >> Richard! >> >> I attached vect dump for the part of attached test-case which >> illustrated how vectorization of epilogues works through masking: >> #

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-28 Thread Yuri Rumyantsev
ct-short-loops", + "Enable vectorization of low trip count loops using masking.", + 0, 0, 1) I assume that this ability can be included very quickly but it requires cost model enhancements also. Best regards. Yuri. 2016-11-28 17:39 GMT+03:00 Richard Biener : > On Thu, 24 Nov 2

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-24 Thread Yuri Rumyantsev
ppreciated. ChangeLog: 2016-11-24 Yuri Rumyantsev * params.def (PARAM_VECT_EPILOGUES_MASK): New. * tree-vect-data-refs.c (vect_get_new_ssa_name): Support vect_mask_var. * tree-vect-loop.c: Include insn-config.h, recog.h and alias.h. (new_loop_vec_info): Add zeroing can_be_masked, mas

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-18 Thread Yuri Rumyantsev
=16\\)" 2 "vect" { target avx2_runtime } } } */ Could you please clarify what is the reason of the failure? Thanks. 2016-11-18 16:20 GMT+03:00 Christophe Lyon : > On 15 November 2016 at 15:41, Yuri Rumyantsev wrote: >> Hi All, >> >> Here is patch for non-m

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-15 Thread Yuri Rumyantsev
Hi All, Here is the patch for non-masked epilogue vectorization. Bootstrap and regression testing did not show any new failures. Is it OK for trunk? Thanks. Changelog: 2016-11-15 Yuri Rumyantsev * params.def (PARAM_VECT_EPILOGUES_NOMASK): New. * tree-if-conv.c (tree_if_conversion): Make

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-14 Thread Yuri Rumyantsev
required parts are removed > (and you'd add the testcases covering non-masked tail vect). > > Thus, can you please produce a single complete patch containing only > non-masked epilogue vectorization? > > Thanks, > Richard. > >> Thanks. >> Yuri. >>

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-14 Thread Yuri Rumyantsev
Richard, In my previous patch I forgot to remove couple lines related to aux field. Here is the correct updated patch. Thanks. Yuri. 2016-11-14 15:51 GMT+03:00 Richard Biener : > On Fri, 11 Nov 2016, Yuri Rumyantsev wrote: > >> Richard, >> >> I prepare updated 3 patch

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-11 Thread Yuri Rumyantsev
Richard, Here is fixed version of updated patch 3. Any comments will be appreciated. Thanks. Yuri. 2016-11-11 17:15 GMT+03:00 Yuri Rumyantsev : > Richard, > > Sorry for confusion but my updated patch does not work properly, so I > need to fix it. > > Yuri. > > 2016-11

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-11 Thread Yuri Rumyantsev
Richard, Sorry for confusion but my updated patch does not work properly, so I need to fix it. Yuri. 2016-11-11 14:15 GMT+03:00 Yuri Rumyantsev : > Richard, > > I prepare updated 3 patch with passing additional argument to > vect_analyze_loop as you proposed (untested). > >

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-11 Thread Yuri Rumyantsev
-10 15:36 GMT+03:00 Richard Biener : > On Thu, 10 Nov 2016, Richard Biener wrote: > >> On Tue, 8 Nov 2016, Yuri Rumyantsev wrote: >> >> > Richard, >> > >> > Here is updated 3 patch. >> > >> > I checked that all new tests related to epilo

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-09 Thread Yuri Rumyantsev
st of required changes to our implementation to try to remove them. Thanks. Yuri. 2016-11-09 14:46 GMT+03:00 Bin.Cheng : > On Wed, Nov 9, 2016 at 11:28 AM, Yuri Rumyantsev wrote: >> Thanks Richard for your comments. >> You proposed to handle epilogue loop just like normal short-trip

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-09 Thread Yuri Rumyantsev
-09 13:37 GMT+03:00 Bin.Cheng : > On Tue, Nov 1, 2016 at 12:38 PM, Yuri Rumyantsev wrote: >> Hi All, >> >> I re-send all patches sent by Ilya earlier for review which support >> vectorization of loop epilogues and loops with low trip count. We >> assume that the

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-08 Thread Yuri Rumyantsev
Richard, Here is updated 3 patch. I checked that all new tests related to epilogue vectorization passed with it. Your comments will be appreciated. 2016-11-08 15:38 GMT+03:00 Richard Biener : > On Thu, 3 Nov 2016, Yuri Rumyantsev wrote: > >> Hi Richard, >> >> I did

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-03 Thread Yuri Rumyantsev
e there). > > That said, if we can get in non-masked epilogue vectorization > separately that would be great. Could you please clarify your proposal. Thanks. Yuri. 2016-11-02 15:27 GMT+03:00 Richard Biener : > On Tue, 1 Nov 2016, Yuri Rumyantsev wrote: > >> Hi All, >>

[PATCH, vec-tails] Support loop epilogue vectorization

2016-11-01 Thread Yuri Rumyantsev
Hi All, I re-send all patches sent by Ilya earlier for review which support vectorization of loop epilogues and loops with low trip count. We assume that the only patch - vec-tails-07-combine-tail.patch - was not approved by Jeff. I did re-base of all patches and performed bootstrapping and regre

Re: Compile-time improvement for if conversion.

2016-10-14 Thread Yuri Rumyantsev
Richard, Here is updated patch with the changes proposed by you. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-10-14 Yuri Rumyantsev * dominance.c (dom_info::dom_info): Add new constructor for region presented by vector of basic

Re: Compile-time improvement for if conversion.

2016-10-12 Thread Yuri Rumyantsev
he still > "large" complexity of zeroing arrays in the constructor). > > And if we use that DFS walk directly we should be able to avoid > creating those fake entry/exit > blocks by using entry/exit edges instead... (somehow). > > Richard. > > > >> Thanks.

Re: Compile-time improvement for if conversion.

2016-10-11 Thread Yuri Rumyantsev
Richard, I implemented this by passing callback function in_region which returns true if block belongs to region. I am testing it now I attach modified patch for your quick review. Thanks. 2016-10-11 13:33 GMT+03:00 Richard Biener : > On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev wr

Re: Compile-time improvement for if conversion.

2016-10-10 Thread Yuri Rumyantsev
Richard, If a "fake" exit or entry block is created in dominance, how can we determine what its only predecessor or successor is without using a notion of loop? 2016-10-10 15:00 GMT+03:00 Richard Biener : > On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev wrote: >> Than

Re: Compile-time improvement for if conversion.

2016-10-10 Thread Yuri Rumyantsev
ating ssa form. Other changes look reasonable and will fix them. 2016-10-10 12:52 GMT+03:00 Richard Biener : > On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev wrote: >> Hi All, >> >> Here is implementation of Richard proposal: >> >> < For general infrastr

Compile-time improvement for if conversion.

2016-10-05 Thread Yuri Rumyantsev
orporate this change to if conversion pass. SESE region is built by adding loop pre-header and possibly fake post-header blocks to loop body. Fake post-header is deleted after predication completion. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2

Re: [PATCH, vec-tails 07/10] Support loop epilogue combining

2016-09-02 Thread Yuri Rumyantsev
Hi Jeff, I am trying to reduce cost of repeated call of if-conversion for epilogue vectorization. I'd like to clarify your recommendation - should I design additional support for versioning in vect_do_peeling_for_loop_bound or lightweight version of if-conversion is sufficient? Any help in clarifi

Re: [PATCH] Restrict jump threading statement simplifier to scalar types (PR71077)

2016-08-19 Thread Yuri Rumyantsev
Hi, Here is a simple test-case to reproduce 176.gcc failure (I run it on Haswell machine). Using 20160819 compiler build we get: gcc -O3 -m32 -mavx2 test.c -o test.ref.exe /users/ysrumyan/isse_6866$ ./test.ref.exe Aborted (core dumped) If I apply patch proposed by Patrick test runs properly Inste

[PATCH PR71956] Add missed check on MASK_STORE builtin.

2016-08-11 Thread Yuri Rumyantsev
Hi All, Jakub introduced regression after r235764 and we got RF for spec2000/176.gcc on HSW if loop vectorization is on (32-bit only). Here is a simple fix which cures the issue. Is it OK for trunk? ChangeLog: 2016-08-11 Yuri Rumyantsev PR rtl-optimization/71956 * ipa-pure-const.c

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-08-09 Thread Yuri Rumyantsev
Richard, I checked that this move helps. Does it mean that I've got approval to integrate it to trunk? 2016-08-09 14:33 GMT+03:00 Richard Biener : > On Tue, Aug 9, 2016 at 1:26 PM, Yuri Rumyantsev wrote: >> Richard, >> >> The patch proposed by you does not work p

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-08-09 Thread Yuri Rumyantsev
CTORIZED' pr70729-nest.cc.149t.vect You missed additional check I added before check on cached dependence. 2016-08-09 13:00 GMT+03:00 Richard Biener : > On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev wrote: >> Yes it is impossible since all basic blocks are handled from outer >&

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-08-05 Thread Yuri Rumyantsev
as you pointed out. Regression testing did not show any new failures and both failed tests from libgomp.fortran suite now passed. Is it OK for trunk? ChangeLog: 2016-08-05 Yuri Rumyantsev PR tree-optimization/71734 * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_L

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-08-02 Thread Yuri Rumyantsev
Hi Richard, Did you have a chance to look at this patch? Thanks. 2016-07-29 17:00 GMT+03:00 Yuri Rumyantsev : > Hi Richard. > > It turned out that the fix proposed by you does not work for libgomp > tests simd3 and simd4. > The reason is that we can't change safelen valu

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-07-29 Thread Yuri Rumyantsev
Hi Richard. It turned out that the fix proposed by you does not work for libgomp tests simd3 and simd4. The reason is that we can't change safelen value for references not defined inside loop. So I added the missed check on it to the patch. Is it OK for trunk? ChangeLog: 2016-07-29 Yuri Rumyantsev

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-07-28 Thread Yuri Rumyantsev
Richard, I prepare a patch which is based on yours. New test is also included. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? Thanks. ChangeLog: 2016-07-28 Yuri Rumyantsev PR tree-optimization/71734 * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-07-26 Thread Yuri Rumyantsev
> file or directory >> >> Before the change here, it's gated by vect_simd_clones target selector, >> which limit it to i?86/x86_64 platform only. >> >> Regards, >> Renlin Li >> >> >> >> >> On 08/07/16 15:07, Yuri Rumyantsev wrote:

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-07-20 Thread Yuri Rumyantsev
> file or directory > > Before the change here, it's gated by vect_simd_clones target selector, > which limit it to i?86/x86_64 platform only. > > Regards, > Renlin Li > > > > > On 08/07/16 15:07, Yuri Rumyantsev wrote: >> >> Hi Richard, >>

Re: [PATCH] Fix test problem for pr70729.

2016-07-19 Thread Yuri Rumyantsev
Thanks Jakub for your comments. I changed the test as you proposed. Yuri. 2016-07-19 15:50 GMT+03:00 Jakub Jelinek : > On Tue, Jul 19, 2016 at 03:40:47PM +0300, Yuri Rumyantsev wrote: >> Hi All, >> >> I was informed that the test pr70729.cc from g++.dg/vect is failed on >

[PATCH] Fix test problem for pr70729.

2016-07-19 Thread Yuri Rumyantsev
Hi All, I was informed that the test pr70729.cc from g++.dg/vect fails on non-x86 targets. I made minor changes to delete target-specific stuff like xmmintrin.h. Is it OK for trunk? Changelog: 2016-07-19 Yuri Rumyantsev PR tree-optimization/71734 gcc/testsuite/ChangeLog: * g++.dg

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-07-15 Thread Yuri Rumyantsev
Richard! Did you have a chance to look at this patch? Thanks. Yuri. 2016-07-08 17:07 GMT+03:00 Yuri Rumyantsev : > Hi Richard, > > Thanks for your help - your patch looks much better. > Here is new patch in which additional argument was added to determine > source lo

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-07-08 Thread Yuri Rumyantsev
Hi Richard, Thanks for your help - your patch looks much better. Here is new patch in which additional argument was added to determine source loop of reference. Bootstrap and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-07-08 Yuri Rumyantsev PR tree

[PATCH PR71518] Adjust misalign for outer loops also.

2016-07-06 Thread Yuri Rumyantsev
Hi All, Here is a simple patch which adds the missed misalign adjustment for outer loop. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-07-06 Yuri Rumyantsev PR tree-optimization/71518 * tree-vect-data-refs.c

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-07-06 Thread Yuri Rumyantsev
: > On Tue, Jul 5, 2016 at 4:56 PM, Yuri Rumyantsev wrote: >> Hi All, >> >> Here is a simple fix to cure regressions introduced by my fix for >> 70729. Patch also contains minor changes in test found by Jakub. >> >> Bootstrapping and regression testing did no

[PATCH PR71734] Add missed check that reference defined inside loop.

2016-07-05 Thread Yuri Rumyantsev
Hi All, Here is a simple fix to cure regressions introduced by my fix for 70729. Patch also contains minor changes in test found by Jakub. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-07-05 Yuri Rumyantsev PR tree-optimization

Re: [PATCH PR70729] The second part of patch.

2016-06-30 Thread Yuri Rumyantsev
f (bb->loop_father && bb->loop_father->safelen > 0) +bb->loop_father->safelen = 0; if (htab) { p = htab->find (&data); ChangeLog: 2016-06-30 Yuri Rumyantsev PR tree-optimization/70729 * tree-vectorizer.c (adjust_simduid_builtins): Nullify safelen field o

Re: [PATCH PR70729] The second part of patch.

2016-06-30 Thread Yuri Rumyantsev
0 15:21 GMT+03:00 Jakub Jelinek : > On Thu, Jun 30, 2016 at 03:16:51PM +0300, Yuri Rumyantsev wrote: >> Hi Grüße. >> >> Could you please tell me how to reproduce your regression - did not >> see any new failures in my local area: >> >> PASS: libgomp.fortran/e

Re: [PATCH PR70729] The second part of patch.

2016-06-30 Thread Yuri Rumyantsev
. 2016-06-29 19:19 GMT+03:00 Thomas Schwinge : > Hi! > > On Tue, 28 Jun 2016 18:50:37 +0300, Yuri Rumyantsev > wrote: >> Here is the second part of patch to improve loop invariant code motion >> for loop marked with pragma omp simd. >> >> Bootstrapping and reg

[PATCH] Generate more effective one-operand permutation instruction for knl.

2016-06-29 Thread Yuri Rumyantsev
Hi All, Here is a simple patch which generates one-operand vperm instructions introduced in knl. Using this patch we got +5% speed-up on one important benchmark. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-06-29 Yuri Rumyantsev

[PATCH PR70729] The second part of patch.

2016-06-28 Thread Yuri Rumyantsev
Hi All! Here is the second part of patch to improve loop invariant code motion for loop marked with pragma omp simd. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-06-28 Yuri Rumyantsev PR tree-optimization/70729 * tree-ssa-loop-im.c

[PATCH][RFC] Initial patch for better performance of 64-bit math instructions in 32-bit mode on x86-64

2016-05-31 Thread Yuri Rumyantsev
Hi Uros, Here is initial patch to improve performance of 64-bit integer arithmetic in 32-bit mode. We discovered that gcc is significantly behind icc and clang on rsa benchmark from eembc2.0 suite. The problem function looks like typedef unsigned long long ull; typedef unsigned long ul; ul mul_add(

Re: [PATCH][RFC] Remove ifcvt_repair_bool_pattern, re-do bool patterns

2016-05-31 Thread Yuri Rumyantsev
Richard, I built compiler with your patch and did not find any issues with vectorization of loops marked with pragma simd. I also noticed that the size of the vectorized loop looks smaller (I can't tell you exact numbers since the fresh compiler performs full unroll even if "-funroll-loops" op

[PATCH PR70935, Regression 6,7]

2016-05-05 Thread Yuri Rumyantsev
Hi All, Here is a simple patch which cures the problem with illegal transformation of endless loop. The fix is simply to check that guard edge destination is not loop latch block. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-05-05 Yuri

Re: [PATCH PR69652, Regression]

2016-02-29 Thread Yuri Rumyantsev
Jakub! Here is patch and ChangeLog to move pr69652.c to /vect directory. Is it OK for trunk? Thanks. Yuri. ChangeLog: 2016-02-29 Yuri Rumyantsev PR tree-optimization/69652 gcc/testsuite/ChangeLog: * gcc.dg/torture/pr69652.c: Delete test. * gcc.dg/vect/pr69652.c: New test. 2016-02-29 16

Re: [PATCH PR69652, Regression]

2016-02-29 Thread Yuri Rumyantsev
This test simply checks that an ICE does not occur, not any vectorization issues. Best regards. Yuri. 2016-02-28 20:29 GMT+03:00 H.J. Lu : > On Wed, Feb 10, 2016 at 2:26 AM, Yuri Rumyantsev wrote: >> Thanks Richard for your comments. >> I changed algorithm to remove dead scala

[PATCH PR69942] Fix test problem

2016-02-29 Thread Yuri Rumyantsev
Hi All, Here is a simple patch for gcc.dg/ifcvt5.c test - detect "6 basic blocks" string in rtl dump also to accept speculative motion of else-part of if-stmt before test-part aka IF-CASE-2. Is it OK for trunk? ChangeLog: 2016-02-29 Yuri Rumyantsev PR rtl-optimization/69942 gcc

Re: [PATCH PR69652, Regression]

2016-02-10 Thread Yuri Rumyantsev
Thanks Richard for your comments. I changed the algorithm to remove dead scalar statements as you proposed. Bootstrap and regression testing did not show any new failures on x86-64. Is it OK for trunk? Changelog: 2016-02-10 Yuri Rumyantsev PR tree-optimization/69652 * tree-vect-loop.c

Re: [Patch] Gate vect-mask-store-move-1.c correctly, and actually output the dump

2016-02-08 Thread Yuri Rumyantsev
Sorry for troubles. One line must be excluded from test: -/* { dg-options "-O3" } */ Here is updated patch. Best regards. Yuri. 2016-02-08 16:40 GMT+03:00 James Greenhalgh : > On Mon, Feb 08, 2016 at 04:29:31PM +0300, Yuri Rumyantsev wrote: >> Hi James, >> >>

Re: [Patch] Gate vect-mask-store-move-1.c correctly, and actually output the dump

2016-02-08 Thread Yuri Rumyantsev
Hi James, Thanks for reporting this issue. I prepared slightly different patch since we don't need to add tree-vect dump option - it is on by default for all tests in /vect directory. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-mask-store-move-1.c: Gate dump with x86 target. 2016-02-08 16:07 GM

Re: [PATCH PR69652, Regression]

2016-02-05 Thread Yuri Rumyantsev
all vector statements in semi-hammock including SQRT. Bootstrap and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-02-05 Yuri Rumyantsev PR tree-optimization/69652 * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 to nested loop, introd

[PATCH PR69652, Regression]

2016-02-04 Thread Yuri Rumyantsev
. Bootstrapping and regression testing on x86-64 did not show any new failures. Is it OK for trunk? ChangeLog: 2016-02-04 Yuri Rumyantsev PR tree-optimization/69652 * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all

Re: [off-list] Re: [PATCH PR68542]

2016-01-29 Thread Yuri Rumyantsev
Uros, Here is updated patch which includes (1) a couple of changes proposed by Richard in tree-vect-loop.c and (2) the changes in back-end proposed by you. Is it OK for trunk? Bootstrap and regression testing did not show any new failures. ChangeLog: 2016-01-29 Yuri Rumyantsev PR middle-end

Re: [PATCH PR68542]

2016-01-28 Thread Yuri Rumyantsev
Thanks Richard. Uros, Could you please review back-end part of this patch? Thanks. Yuri. 2016-01-28 16:26 GMT+03:00 Richard Biener : > On Fri, Jan 22, 2016 at 3:29 PM, Yuri Rumyantsev wrote: >> Richard, >> >> I fixed all remarks pointed by you in vectorizer part of patch

Re: [PATCH PR68542]

2016-01-22 Thread Yuri Rumyantsev
. Is it OK for trunk? Thanks. Yuri. ChangeLog: 2016-01-22 Yuri Rumyantsev PR middle-end/68542 * config/i386/i386.c (ix86_expand_branch): Add support for conditional branch with vector comparison. * config/i386/sse.md (define_expand "cbranch4): Add define-expand for vector comparison with

Re: [PATCH PR68542]

2016-01-18 Thread Yuri Rumyantsev
Richard, Here is the second part of patch which really moves mask stores and all statements related to them to a new basic block guarded by test on zero mask. New test is also added. Is it OK for trunk? Thanks. Yuri. 2016-01-18 Yuri Rumyantsev PR middle-end/68542 * config/i386/i386.c

Re: [PATCH PR68542]

2016-01-18 Thread Yuri Rumyantsev
Thanks Richard. I changed the check on type as you proposed. What about the second back-end part of patch (it has been sent 08.12.15). Thanks. Yuri. 2016-01-18 15:44 GMT+03:00 Richard Biener : > On Mon, Jan 11, 2016 at 11:06 AM, Yuri Rumyantsev wrote: >> Hi Richard, >> >

Re: [Patch ifcvt] Add a new parameter to limit if-conversion

2016-01-13 Thread Yuri Rumyantsev
XECUTE + || c > limit) goto done; /* Try to emit the conditional moves. First do the then block, ChangeLog: 2016-01-13 Yuri Rumyantsev PR rtl-optimization/68920 * ifcvt.c (cond_move_process_if_block): Limit number of conditional moves. 2016-01-13 4:52 GMT+03:00 Bernd Schmidt : &

Re: [Patch ifcvt] Add a new parameter to limit if-conversion

2016-01-12 Thread Yuri Rumyantsev
Hi All, Here is a simple fix to exclude dg/ifcvt-5.c test from ia64 testing. Is it OK for trunk? testsuite/ChangeLog: 2016-01-12 Yuri Rumyantsev PR rtl-optimization/68920 gcc/testsuite/ChangeLog * gcc.dg/ifcvt-5.c: Exclude it from ia64 testing. 2016-01-12 17:01 GMT+03:00 Andreas Schwab

Re: [Patch ifcvt] Add a new parameter to limit if-conversion

2016-01-12 Thread Yuri Rumyantsev
Andreas, Is it OK for you if we exclude dg/ifcvt-5.c from ia64 testing since predication must be used instead of conditional moves? 2016-01-12 13:07 GMT+03:00 Andreas Schwab : > gcc.dg/ifcvt-5.c fails on ia64: > > From ifcvt-5.c.223r.ce1: > > == Pass 2 == > > > == no more

Re: [RFC] Combine vectorized loops with its scalar remainder.

2016-01-11 Thread Yuri Rumyantsev
Hi Richard, Did you have a chance to look at this updated patch? Thanks. Yuri. 2015-12-15 19:41 GMT+03:00 Yuri Rumyantsev : > Hi Richard, > > I re-designed the patch to determine ability of loop masking on fly of > vectorization analysis and invoke it after loop transformation. &g

Re: [PATCH PR68542]

2016-01-11 Thread Yuri Rumyantsev
Hi Richard, Did you have any chance to look at updated patch? Thanks. Yuri. 2015-12-18 13:20 GMT+03:00 Yuri Rumyantsev : > Hi Richard, > > Here is updated patch for middle-end part of the whole patch which > fixes all your remarks I hope. > > Regression testing and bootstra

Re: [Patch ifcvt] Add a new parameter to limit if-conversion

2015-12-31 Thread Yuri Rumyantsev
(for some targets). This fix did not show any performance regressions on different x86 platforms in comparison with James patch. Bootstrap and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2015-12-31 Yuri Rumyantsev PR rtl-optimization/68920 * config/i386/i386

Re: [Patch ifcvt] Add a new parameter to limit if-conversion

2015-12-18 Thread Yuri Rumyantsev
James, We implemented slightly different patch - we restrict number of SET instructions for if-conversion through new parameter and add check in bb_ok_for_noce_convert_multiple_sets: + unsigned limit = MIN (ii->branch_cost, + (unsigned) PARAM_VALUE (PARAM_MAX_IF_CONV_SET_INSNS)); .. - if (count

Re: [PATCH PR68542]

2015-12-18 Thread Yuri Rumyantsev
Hi Richard, Here is updated patch for middle-end part of the whole patch which fixes all your remarks I hope. Regression testing and bootstrapping did not show any new failures. Is it OK for trunk? Yuri. ChangeLog: 2015-12-18 Yuri Rumyantsev PR middle-end/68542 * fold-const.c

Re: [PATCH PR68906]

2015-12-17 Thread Yuri Rumyantsev
Richard, Here is modified patch which checks only that exit block belongs to loop. Bootstrapping and regression testing were successful. Is it OK for trunk? ChangeLog: 2015-12-17 Yuri Rumyantsev PR tree-optimization/68906 * tree-ssa-loop-unswitch.c (tree_unswitch_outer_loop): Add a check

Re: [PATCH PR68906]

2015-12-16 Thread Yuri Rumyantsev
OK for trunk. ChangeLog: 2015-12-16 Yuri Rumyantsev PR tree-optimization/68021 PR tree-optimization/68906 * tree-ssa-loop-unswitch.c : Include couple header files. (tree_unswitch_outer_loop): Add check that an exit is not inside inner loop, use number_of_latch_executions to detect non-iterated

[PATCH PR68906]

2015-12-16 Thread Yuri Rumyantsev
Hi All, Here is simple patch which cures the issue with outer-loop unswitching - added invocation of number_of_latch_executions() to reject unswitching for non-iterated loops. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2015-12-16 Yuri

Re: [RFC] Combine vectorized loops with its scalar remainder.

2015-12-15 Thread Yuri Rumyantsev
Hi Richard, I re-designed the patch to determine ability of loop masking on the fly during vectorization analysis and invoke it after loop transformation. Test-case is also provided. What is your opinion? Thanks. Yuri. ChangeLog: 2015-12-15 Yuri Rumyantsev * config/i386/i386.c

Re: [PATCH PR68542]

2015-12-11 Thread Yuri Rumyantsev
second patch related to back-end patch which I sent earlier (12-08). Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2015-12-11 Yuri Rumyantsev PR middle-end/68542 * fold-const.c (fold_binary_op_with_conditional_arg): Add checks on vector

Re: [PATCH PR68542]

2015-12-08 Thread Yuri Rumyantsev
Hi Richard, Here is the second part of patch. Is it OK for trunk? I assume that it should fix huge degradation on 481.wrf for -march=bdver4 also. ChangeLog: 2015-12-08 Yuri Rumyantsev PR middle-end/68542 * config/i386/i386.c (ix86_expand_branch): Implement integral vector comparison with

Re: [PATCH PR68542]

2015-12-07 Thread Yuri Rumyantsev
Richard! Here is middle-end part of patch with changes proposed by you. Is it OK for trunk? Thanks. Yuri. ChangeLog: 2015-12-07 Yuri Rumyantsev PR middle-end/68542 * fold-const.c (fold_relational_const): Add handling of vector comparison with boolean result. * tree-cfg.c

Re: [PATCH PR68542]

2015-12-04 Thread Yuri Rumyantsev
this transformation works for "-march=bdver4" option and regression for 481.wrf must disappear too. Thanks. Yuri. 2015-12-04 15:18 GMT+03:00 Richard Biener : > On Mon, Nov 30, 2015 at 2:11 PM, Yuri Rumyantsev wrote: >> Hi All, >> >> Here is a patch for 481.wrf pref

Re: [RFC] Combine vectorized loops with its scalar remainder.

2015-11-30 Thread Yuri Rumyantsev
15-11-27 16:45 GMT+03:00 Richard Biener : > On Fri, Nov 13, 2015 at 11:35 AM, Yuri Rumyantsev wrote: >> Hi Richard, >> >> Here is updated version of the patch which (1) is in sync with trunk >> compiler and (2) contains simple cost model to estimate profitability >&g

[PATCH PR68542]

2015-11-30 Thread Yuri Rumyantsev
Hi All, Here is a patch for 481.wrf performance regression for avx2 which is a slightly modified mask store optimization. This transformation allows performing unpredication for semi-hammock containing masked stores, in other words if we have a loop like for (i=0; i PR middle-end/68542 * config/i386/i386

Re: [RFC] Combine vectorized loops with its scalar remainder.

2015-11-23 Thread Yuri Rumyantsev
Hi Richard, Did you have a chance to look at this? Thanks. Yuri. 2015-11-13 13:35 GMT+03:00 Yuri Rumyantsev : > Hi Richard, > > Here is updated version of the patch which (1) is in sync with trunk > compiler and (2) contains simple cost model to estimate profitability > of

Re: [PATCH] Simple optimization for MASK_STORE.

2015-11-19 Thread Yuri Rumyantsev
is is applied to very specialized context. My answers are below. 2015-11-12 16:58 GMT+03:00 Richard Biener : > On Wed, Nov 11, 2015 at 2:13 PM, Yuri Rumyantsev wrote: >> Richard, >> >> What we should do to cope with this problem (structure size increasing)? >> Shoul

Re: [RFC] Combine vectorized loops with its scalar remainder.

2015-11-13 Thread Yuri Rumyantsev
n Tue, Nov 3, 2015 at 1:08 PM, Yuri Rumyantsev wrote: >>>> Richard, >>>> >>>> It looks like misunderstanding - we assume that for GCCv6 the simple >>>> scheme of remainder will be used through introducing new IV : >>>> https://gcc.

Re: [PATCH] Simple optimization for MASK_STORE.

2015-11-11 Thread Yuri Rumyantsev
rd Biener : >>> On Tue, Nov 10, 2015 at 1:48 PM, Ilya Enkovich >>> wrote: >>>> 2015-11-10 15:33 GMT+03:00 Richard Biener : >>>>> On Fri, Nov 6, 2015 at 2:28 PM, Yuri Rumyantsev >>>>> wrote: >>>>>> Richard, >>>>&

Re: [PATCH] Simple optimization for MASK_STORE.

2015-11-06 Thread Yuri Rumyantsev
Richard, I tried it but 256-bit precision integer type is not yet supported. Yuri. 2015-11-06 15:56 GMT+03:00 Richard Biener : > On Mon, Nov 2, 2015 at 4:24 PM, Yuri Rumyantsev wrote: >> Hi Richard, >> >> I've come back to this optimization and try to implement your

Re: [PATCH] Simple optimization for MASK_STORE.

2015-11-05 Thread Yuri Rumyantsev
not successful and I returned to vector comparison with scalar Boolean result. ChangeLog: 2015-11-05 Yuri Rumyantsev * config/i386/i386.c: Add conditional initialization of PARAM_ZERO_TEST_FOR_MASK_STORE. (ix86_expand_branch): Implement vector comparison with boolean result. * config/i386/i386

Re: [RFC] Combine vectorized loops with its scalar remainder.

2015-11-03 Thread Yuri Rumyantsev
-constant trip count. Yuri. 2015-11-03 14:47 GMT+03:00 Richard Biener : > On Wed, Oct 28, 2015 at 11:45 AM, Yuri Rumyantsev wrote: >> Hi All, >> >> Here is a preliminary patch to combine vectorized loop with its scalar >> remainder, draft of which was proposed by Kirill

Re: [RFC] Combine vectorized loops with its scalar remainder.

2015-11-03 Thread Yuri Rumyantsev
+03:00 Richard Henderson : > On 10/28/2015 11:45 AM, Yuri Rumyantsev wrote: >> >> Hi All, >> >> Here is a preliminary patch to combine vectorized loop with its scalar >> remainder, draft of which was proposed by Kirill Yukhin month ago: >> https://gcc.gnu.org/ml/

Re: [PATCH] Simple optimization for MASK_STORE.

2015-11-02 Thread Yuri Rumyantsev
pe)), MODE_INT, 0); ext_type = lang_hooks.types.type_for_mode (ext_mode , 1); but I've got zero type for it. Am I missing something? Any help will be appreciated. Yuri. 2015-08-13 14:40 GMT+03:00 Richard Biener : > On Thu, Aug 13, 2015 at 1:32 PM, Yuri Rumyantsev wrote: >> Hi

[RFC] Combine vectorized loops with its scalar remainder.

2015-10-28 Thread Yuri Rumyantsev
Hi All, Here is a preliminary patch to combine vectorized loop with its scalar remainder, draft of which was proposed by Kirill Yukhin a month ago: https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01435.html It was tested with '-mavx2' option to run on Haswell processor. The main goal of it is to impr

Re: [PATCH PR67909 PR67947]

2015-10-13 Thread Yuri Rumyantsev
Here is the updated patch with the long line split. The patch is attached. Yuri. 2015-10-13 15:38 GMT+03:00 H.J. Lu : > On Tue, Oct 13, 2015 at 4:57 AM, Yuri Rumyantsev wrote: >> Hi All, >> >> Here is a simple patch for unswitching outer loop through guard-edge >> hoi

[PATCH PR67909 PR67947]

2015-10-13 Thread Yuri Rumyantsev
Hi All, Here is a simple patch for unswitching outer loop through guard-edge hoisting. The check that guard-edge is around the inner loop was missed. Bootstrapping and regression testing did not show new failures. Is it OK for trunk? ChangeLog: 2014-10-13 Yuri Rumyantsev PR tree

[PATCH] Simple 2-lines fix for outer-loop vectorization.

2015-10-08 Thread Yuri Rumyantsev
Yuri Rumyantsev * tree-vect-loop.c (vect_analyze_loop_operations): Skip virtual phi in the tail of outer-loop. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-outer-simd-3.c: New test. patch.outer-vec Description: Binary data

Re: [PATCH] Unswitching outer loops.

2015-10-07 Thread Yuri Rumyantsev
Richard, I noticed that the 'gimple' type was changed, so I am sending you an updated patch. Thanks. Yuri. 2015-10-07 12:53 GMT+03:00 Yuri Rumyantsev : > Richard, > > I've fixed adding virtual phi argument and add check on irreducible basic > block. > New patch is attached. >

Re: [PATCH] Unswitching outer loops.

2015-10-07 Thread Yuri Rumyantsev
Richard, I've fixed adding virtual phi argument and added a check on irreducible basic block. New patch is attached. I checked it for bootstrap and regression testing, no new failures. ChangeLog: 2015-10-07 Yuri Rumyantsev * tree-ssa-loop-unswitch.c: Include "gimple-iterator.h"

Re: [PATCH] Unswitching outer loops.

2015-10-06 Thread Yuri Rumyantsev
: 2015-10-06 Yuri Rumyantsev * tree-ssa-loop-unswitch.c: Include "gimple-iterator.h" and "cfghooks.h", add prototypes for introduced new functions. (tree_ssa_unswitch_loops): Use from innermost loop iterator, move all checks on ability of loop unswitching to tree_unswitch

Re: [PATCH] Unswitching outer loops.

2015-10-05 Thread Yuri Rumyantsev
-10-05 13:57 GMT+03:00 Richard Biener : > On Wed, Sep 30, 2015 at 12:46 PM, Yuri Rumyantsev wrote: >> Hi Richard, >> >> I re-designed outer loop unswitching using basic idea of 23855 patch - >> hoist invariant guard if loop is empty without guard. Note that this

Re: [PATCH] Unswitching outer loops.

2015-09-30 Thread Yuri Rumyantsev
any new failures. What is your opinion? Thanks. ChangeLog: 2015-09-30 Yuri Rumyantsev * tree-ssa-loop-unswitch.c: Include "gimple-iterator.h" and "cfghooks.h", add prototypes for introduced new functions. (tree_ssa_unswitch_loops): Use from innermost loop iterator, move all

[x86 PATCH] Improve performance for Haswell family.

2015-09-10 Thread Yuri Rumyantsev
Hi All, Here is the updated patch introducing a new .md file describing the Haswell pipeline. Reassociation width for floating-point instructions was increased to 4 for the Haswell family. Regression testing did not show any new failures. Is it OK for trunk? ChangeLog 2015-09-10 Yuri Rumyantsev * config

Re: [PATCH] Improve performance for Haswell family.

2015-08-14 Thread Yuri Rumyantsev
:44 AM, Yuri Rumyantsev wrote: >> Hi All, >> >> Here is patch which contains >> (1) modifying of core2.md to conform Haswell pipeline and adding of >> missed instruction reservation for instructions with vector operands. >> (2) increase reassociation width fo
