All:
Given a Data Dependency Graph (DDG), the unrolling degree proposed by Monica Lam
et al. is calculated as follows.
Unrolling degree = Length of longest live range / Number of cycles in the kernel
(Initiation Interval). The unrolling degree based on the
above leads to more re
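The ratio quoted above can be sketched in C as follows. This is only an illustration: the function name `unroll_degree` and its parameters are hypothetical, and rounding the quotient up (so that the longest live range fits in a whole number of kernel copies) is an assumption, since the message only gives the plain ratio.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: unrolling degree of a software-pipelined loop,
   computed as longest live range / initiation interval (II).
   Rounding up is an assumption made for this sketch.  */
unsigned
unroll_degree (const unsigned *live_range_len, size_t n, unsigned ii)
{
  unsigned longest = 0;

  assert (ii > 0);

  /* Find the longest live range, measured in cycles.  */
  for (size_t i = 0; i < n; i++)
    if (live_range_len[i] > longest)
      longest = live_range_len[i];

  /* ceil (longest / ii), with at least one copy of the kernel.  */
  unsigned degree = (longest + ii - 1) / ii;
  return degree ? degree : 1;
}
```

For example, live ranges of 3, 7 and 5 cycles with an initiation interval of 2 give a degree of 4 under this rounding.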
-Original Message-
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Jeff Law
Sent: Wednesday, June 24, 2015 10:36 AM
To: gcc@gcc.gnu.org
Subject: Re: set_src_cost lying comment
On 06/21/2015 11:57 PM, Alan Modra wrote:
> set_src_cost says it is supposed to
> /* Ret
-Original Message-
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Richard
Kenner
Sent: Wednesday, June 24, 2015 9:28 PM
To: l...@redhat.com
Cc: gcc@gcc.gnu.org
Subject: Re: set_src_cost lying comment
> These are good examples of things the costing model simply w
All:
The presence of aliases disables many optimizations such as CCP (conditional
constant propagation), PRE (partial redundancy elimination), and
scalar replacement for conditional IF-THEN-ELSE. The presence of aliasing
also disables IF-conversion.
I am proposing the Multi-version IF-THEN-ELSE w
All:
Single Entry and Multiple Exits disables traditional loop optimizations. The
presence of short-circuit evaluation also makes the CFG
Single Entry and Multiple Exits. The transformation from SEME (Single Entry and
Multiple Exits) to SESE (Single Entry and
Single Exit) enables many loop optimizations.
All:
The cost calculation for a spill candidate in the Integrated Register
Allocator (IRA) considers only the SESE regions.
The cost calculation in the IRA should also take the SEME regions into consideration
for spilling decisions.
The cost associated with the path that has premature exits shou
Ajit
-Original Message-
From: Ajit Kumar Agarwal
Sent: Thursday, July 02, 2015 3:33 PM
To: vmaka...@redhat.com; l...@redhat.com; gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Consideration of Cost associated with SEME regions.
All:
The
All:
Design and Analysis of Profile-Based Optimization in Compaq's
Compilation Tools for Alpha; Journal of Instruction-Level
Parallelism 3 (2000) 1-25
The above paper based on this paper the existing tracer pass (This pass
performs the tail duplication needed for superblock formation.)
Sorry for the typo.
Below is the corrected Fig (1).
while (a[i] != key)
  i = i + 1;
return i;
Fig (1).
Thanks & Regards
Ajit
-Original Message-
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Ajit
Kumar Agarwal
Sent: Saturday, July 04, 20
All:
I am wondering whether allocating hot data structures closer to the top of the stack
increases the performance of the application.
The data structures are identified as hot or cold, and all the
data structures are sorted in decreasing order of
hotness and the hot data structure
All:
The scalar and array reduction patterns can be identified if the result of
commutative updates
is applied to the same scalar or array variables on the LHS with +, *, min or
max. Thus the reduction pattern identified with
the commutative update helps in vectorization or parallelization.
Fo
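As an illustration of the patterns described above, here is a minimal C sketch of a scalar sum reduction, a scalar max reduction and an array reduction. The function names are hypothetical and not taken from GCC; whether a given compiler recognizes each loop as a reduction depends on its own analysis.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reduction with '+': the same scalar 's' is read and written
   by a commutative update in every iteration.  */
long
sum_reduce (const int *a, size_t n)
{
  long s = 0;

  assert (a != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    s = s + a[i];
  return s;
}

/* Scalar reduction with max: 'm' is only updated through a max.  */
int
max_reduce (const int *a, size_t n, int init)
{
  int m = init;

  for (size_t i = 0; i < n; i++)
    m = a[i] > m ? a[i] : m;
  return m;
}

/* Array reduction: the same array element appears on the LHS and RHS
   of a '+' update (histogram style).  */
void
hist_reduce (unsigned *hist, const unsigned char *idx, size_t n)
{
  for (size_t i = 0; i < n; i++)
    hist[idx[i]] = hist[idx[i]] + 1;
}
```

Integer types are used here on purpose: reordering a floating-point sum can change the rounded result, so an FP reduction usually needs extra permission from the user before it can be reassociated.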
-Original Message-
From: Bin.Cheng [mailto:amker.ch...@gmail.com]
Sent: Monday, July 06, 2015 7:04 AM
To: Steven Bosscher
Cc: Ajit Kumar Agarwal; l...@redhat.com; Richard Biener; gcc@gcc.gnu.org; Vinod
Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Live
-Original Message-
From: Bin.Cheng [mailto:amker.ch...@gmail.com]
Sent: Monday, July 06, 2015 10:26 AM
To: Ajit Kumar Agarwal
Cc: Steven Bosscher; l...@redhat.com; Richard Biener; gcc@gcc.gnu.org; Vinod
Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re
All:
While/For ( condition1)
{
Some code here.
If(condition2 )
continue;
Some code here.
}
Fig(1)
For the above loop in Fig(1) there will be two backedges and multiple latches.
The above code can be transformed as shown below in order to
have a single backedge.
While/For (condition
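The transformed loop of the original message is cut off in this listing. The sketch below is therefore not the author's figure; it only shows one common way to remove the extra backedge at source level, by guarding the tail of the body with the negated condition. The function names and the concrete loop body are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Shape of Fig(1): the 'continue' adds a second edge back to the
   loop header.  */
long
sum_continue (const int *a, size_t n)
{
  long sum = 0;
  size_t i = 0;

  assert (a != NULL || n == 0);
  while (i < n)
    {
      int v = a[i];      /* "Some code here."  */
      i++;
      if (v < 0)         /* condition2  */
        continue;
      sum += v;          /* "Some code here."  */
    }
  return sum;
}

/* Same computation with the tail guarded by !condition2, so control
   returns to the header from the end of the body only.  */
long
sum_guarded (const int *a, size_t n)
{
  long sum = 0;
  size_t i = 0;

  assert (a != NULL || n == 0);
  while (i < n)
    {
      int v = a[i];
      i++;
      if (!(v < 0))
        sum += v;
    }
  return sum;
}
```

Both functions compute the same result; only the control-flow shape differs. A compiler may already normalize the first form internally, so the benefit at source level is not guaranteed.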
-Original Message-
From: Jeff Law [mailto:l...@redhat.com]
Sent: Friday, July 10, 2015 4:04 AM
To: Ajit Kumar Agarwal; Richard Biener; gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: [RFC] Design and Implementation for Path
All:
I am trying to place the following analysis in the vectorizer of GCC, which
helps improve the vectorizer to a great extent
for unit-stride, zero-stride and non-stride memory accesses.
For the Data Dependency Graph, a topological sort is performed. The
All:
I am wondering how useful it is to form traces on the Data Dependency Graph. On top
of the traces in the control flow graph,
I was thinking of forming traces on the Data Dependency Graph (DDG).
Would this help in finding further vectorization and parallelization candidates?
Thoughts?
Thanks & Regar
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Tuesday, July 14, 2015 6:35 PM
To: Ajit Kumar Agarwal
Cc: Jeff Law; Jan Hubicka; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta;
Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Traces on Data
All:
From the description of the macro
RETURN_ADDRESS_POINTER_REGNUM, it is derived that this macro is used to
define a register that helps in getting the return address
from the stack or frame pointer.
I could see many of the architectures supported by
All:
The following macro determines the statement cost that is
added to the vectorization cost.
#define TARGET_VECTORIZE_ADD_STMT_COST.
In the implementation of the above macro the following is done for many
architectures that support vectorization, like i386 and ARM.
if (where == vect
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Monday, August 03, 2015 2:59 PM
To: Ajit Kumar Agarwal
Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli
Hunsigida; Nagaraju Mekala
Subject: Re: vectorization cost macro
All:
For the Loop given in Fig(1), there is no possibility of loop distribution
because of the dependency of S1 and S2 on the outerloop index k.
Due to the dependency the Loop cannot be distributed.
The Loop can be distributed with the transformation given in Fig(2) where the
loop given in Fi
All:
Loop distribution considers the DDG to decide on distributing the loops. Loops
with control statements like IF-THEN-ELSE can also be
distributed. Instead of the Data Dependency Graph, the Control Dependence Graph
should be considered in order to distribute the loops
in the presence of control Stat
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Thursday, August 13, 2015 3:23 PM
To: Ajit Kumar Agarwal
Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli
Hunsigida; Nagaraju Mekala
Subject: Re: More of a Loop distribution
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Friday, August 14, 2015 11:30 AM
To: Ajit Kumar Agarwal
Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli
Hunsigida; Nagaraju Mekala
Subject: RE: More of a Loop distribution
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Monday, August 03, 2015 2:59 PM
To: Ajit Kumar Agarwal
Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli
Hunsigida; Nagaraju Mekala
Subject: Re: vectorization cost macro
All:
Loop fusion is an important optimization that fuses a set of loops if the
following conditions hold.
1. Loops are conformant (i.e. they have the same iteration count).
2. Loops are control equivalent. The control equivalence of the loops can be
identified with the dominator and post dom
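To make the first condition concrete, here is a minimal C sketch of two conformant, adjacent loops and their fused form. The function names are hypothetical. The sketch assumes the arrays do not overlap, and that the only dependence between the loops is from `b[i]` to `c[i]` in the same iteration, which does not prevent fusion.

```c
#include <assert.h>
#include <stddef.h>

/* Two conformant loops: same iteration count, adjacent, and executed
   under the same control conditions.  */
void
scale_offset_separate (int *b, int *c, const int *a, size_t n)
{
  assert (n == 0 || (a != NULL && b != NULL && c != NULL));
  for (size_t i = 0; i < n; i++)
    b[i] = a[i] * 2;
  for (size_t i = 0; i < n; i++)
    c[i] = b[i] + 1;
}

/* Fused form: c[i] reads the b[i] produced in the same iteration,
   so merging the bodies preserves the results.  */
void
scale_offset_fused (int *b, int *c, const int *a, size_t n)
{
  assert (n == 0 || (a != NULL && b != NULL && c != NULL));
  for (size_t i = 0; i < n; i++)
    {
      b[i] = a[i] * 2;
      c[i] = b[i] + 1;
    }
}
```

If the second loop read `b[i + 1]` instead, the fused loop would read a value not yet written, so fusion would no longer be legal without further transformation.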
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Friday, August 14, 2015 9:59 PM
To: Ajit Kumar Agarwal
Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli
Hunsigida; Nagaraju Mekala
Subject: RE: vectorization cost macro
-Original Message-
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Ajit
Kumar Agarwal
Sent: Monday, August 17, 2015 4:03 PM
To: Richard Biener
Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli
Hunsigida; Nagaraju Mekala
Subject: RE
All:
I have done the vectorization cost changes as given below. I have considered
only the cost associated with the inside of the loop instead of the outside.
The inside scalar and vector costs are considered because the inner costs
are more significant than the outside costs.
min_profitable_i
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Friday, August 21, 2015 2:03 PM
To: Ajit Kumar Agarwal
Cc: Jeff Law; GCC Patches; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta;
Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: [RFC
All:
Global code motion is an important optimization that has an impact on
register spills and fetches. Thus
global code motion takes into account the increase or decrease of register
pressure.
Strength reduction is an important optimization that has an impact on register
pressure. T
All:
The Data Dependency graph augmented with control dependence can be common out
based on the dominator info.
The instruction I1 dominates all the uses say instruction I2 and I3. Then I2
and I3 depends on I1. Thus the Graph can be
Formed from the dominator tree of all the instructions and the
All:
The live range info on the tree SSA representation is an important step towards
SSA-based code motion optimizations,
as code motion optimization based on the SSA representation affects the
register pressure and is a reason for performance
bottlenecks.
I am proposing the Live range Analysis ba
-Original Message-
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Ajit
Kumar Agarwal
Sent: Wednesday, August 19, 2015 2:53 PM
To: Richard Biener
Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli
Hunsigida; Nagaraju Mekala
Subject: RE
-Original Message-
From: Aaron Sawdey [mailto:acsaw...@linux.vnet.ibm.com]
Sent: Wednesday, September 02, 2015 8:23 PM
To: Ajit Kumar Agarwal
Cc: Jeff Law; vmaka...@redhat.com; Richard Biener; gcc@gcc.gnu.org; Vinod
Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
All:
The cost and benefit associated with moving a given expression above a conditional
are important factors for the performance boost.
Considering the above, the cost and benefit calculation can be derived as
below.
For a given conditional entry point 'n', the benefit path 'p' are t
All:
Inlining decisions that avoid the creation of the callee's stack frame and
include the callee in the caller context increase
performance.
The priority function for inlining decisions can be calculated
considering the following.
1. Level nest of the callee.
2. code size
All:
The decision on the unrolling factor is an important
criterion for the loop unrolling optimization.
Deciding the unrolling factor for loops based on the criteria below
improves the performance of unrolled loops.
1. Number of operations.
2. Number of operands.
3. Numb
-Original Message-
From: Aaron Sawdey [mailto:acsaw...@linux.vnet.ibm.com]
Sent: Friday, September 04, 2015 11:51 PM
To: Ajit Kumar Agarwal
Cc: Jeff Law; vmaka...@redhat.com; Richard Biener; gcc@gcc.gnu.org; Vinod
Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
All:
The replacement of malloc with alloca can be done based on the following analysis.
If the lifetime of an object does not stretch beyond the immediate scope,
the malloc can be replaced with alloca.
This increases the performance to a great extent.
Inlining helps to a great extent th
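The replacement described above can be sketched in C as below. This is an illustration under assumptions, not a description of any GCC pass: the function names and the `MAX_SCRATCH` bound are hypothetical, `alloca` is a non-standard extension (declared in `<alloca.h>` on glibc and similar systems), and the size has to be bounded because stack space is limited.

```c
#include <alloca.h>
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bound for the sketch: keep stack allocations small.  */
#define MAX_SCRATCH 1024

/* Heap version: 'tmp' does not escape and dies before the return.  */
long
sum_doubled_malloc (const int *a, size_t n)
{
  long sum = 0;

  if (n == 0)
    return 0;
  int *tmp = malloc (n * sizeof *tmp);
  if (tmp == NULL)
    return -1;
  for (size_t i = 0; i < n; i++)
    tmp[i] = 2 * a[i];
  for (size_t i = 0; i < n; i++)
    sum += tmp[i];
  free (tmp);
  return sum;
}

/* Stack version: same lifetime, so the buffer can live in the frame
   and is released automatically when the function returns.  */
long
sum_doubled_alloca (const int *a, size_t n)
{
  long sum = 0;

  assert (n <= MAX_SCRATCH);
  if (n == 0)
    return 0;
  int *tmp = alloca (n * sizeof *tmp);
  for (size_t i = 0; i < n; i++)
    tmp[i] = 2 * a[i];
  for (size_t i = 0; i < n; i++)
    sum += tmp[i];
  return sum;
}
```

The transformation is only valid because the pointer never escapes the function; if `tmp` were returned or stored in a global, the stack version would leave a dangling pointer.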
Sorry for resending as plain text; my earlier mail was sent with HTML
enabled, which made it impossible to send it to gcc@gcc.gnu.org.
Sorry once again.
Thanks & Regards
Ajit
From: Ajit Kumar Agarwal
Sent: Wednesday, May 14, 2014 10:43 PM
To: 'gcc@gcc.gnu.org'; 'vmak
On 2014-05-14, 1:33 PM, Ajit Kumar Agarwal wrote:
>
> Hello All:
>
> I am planning to implement the Live range splitting based on the following
> cases in the Integrated Register Allocator.
>
> For a given Live range that spans from outer region to inner region of
Thanks Vladimir for the clarification.
Thanks & Regards
Ajit
-Original Message-
From: Vladimir Makarov [mailto:vmaka...@redhat.com]
Sent: Thursday, May 15, 2014 8:39 PM
To: Ajit Kumar Agarwal; gcc@gcc.gnu.org
Cc: Michael Eager; Vinod Kathail; Vidhumouli Hunsigida; Nagaraju Me
Is it the case of code speculation where the negative latencies are used?
Thanks & Regards
Ajit
-Original Message-
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of shmeel
gutl
Sent: Monday, May 19, 2014 12:23 PM
To: Andrew Pinski
Cc: gcc@gcc.gnu.org; Vladimir Makar
Hello All:
Simpson does live range shrinking and reduction of register pressure by
using computations that are not loads and stores but arithmetic
computations. The computation
where the operands and registers are live at the entry and exit of the basic
block but not touched inside the
On Friday, May 23, 2014 1:46 AM Vladimir Makarov wrote:
On 05/21/2014 12:25 AM, Ajit Kumar Agarwal wrote:
> Hello All:
>
> Simpson does the Live range shrinking and reduction of register
> pressure by using the computation that are not load and store but the
> arithmetic c
Hello All:
I was looking further into reducing register pressure based on
register allocation and instruction scheduling, and the
following observation was made from the
existing papers on reducing register pressure
based on the scheduling approach.
Does
Hello All:
There has been work done on load rematerialization. Instead of storing and loading
variables, they are kept in registers for the live range. Till now we are doing
the rematerialization of scalar loads.
Is it feasible to have rematerialization for the vector Loads? This will be
helpful
Hello All:
I have worked on the Open64 compiler where the Register Pressure Guided Unroll
and Jam gave a good amount of performance improvement for the C and C++ Spec
Benchmark and also Fortran benchmarks.
The Unroll and Jam increases the register pressure in the Unrolled Loop leading
to inc
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Monday, June 16, 2014 7:55 PM
To: Ajit Kumar Agarwal
Cc: gcc@gcc.gnu.org; Vladimir Makarov; Michael Eager; Vinod Kathail; Shail
Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Register
The xmalloc calls seen at the times given below in the register allocator are
not caused only by the structure and by changing the S passed as template
argument.
It depends on how the structures below are referenced or used. From the stack
trace I can see that live range creation is based on how
-Original Message-
From: Daniel Gutson [mailto:daniel.gut...@tallertechnologies.com]
Sent: Wednesday, August 27, 2014 8:53 PM
To: Ajit Kumar Agarwal
Cc: gcc Mailing List
Subject: Re: Possible LRA issue?
On Wed, Aug 27, 2014 at 12:16 PM, Ajit Kumar Agarwal
wrote:
> The cause
Hello All:
Please find below the different Global Value Numbering techniques on SSA
representation. I am proposing for GCC a Global Value Numbering on SSA
representation based on Redundancy Class. Can this be proposed?
SSA representation with control graph can be formulated with Global Value
Numbering A
Hello All:
Memset and memcpy calls are extensively used in many benchmarks. Inlining or
expanding
the memcpy and memset calls improves the performance of many
benchmarks.
I have implemented the expansion of strcmp to an optimized sequence of
instructions
in the Open64 compiler for AMD
Hello All:
I was looking at the optimized usage and allocation to argument registers.
There are two aspects to it as follows.
1. We need to specify the argument registers as followed by ABI in the target
specific code. Based on the function
argument registers defined in the target dependent co
-Original Message-
From: Vladimir Makarov [mailto:vmaka...@redhat.com]
Sent: Tuesday, November 18, 2014 1:57 AM
To: Ajit Kumar Agarwal; gcc Mailing List
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Optimized Allocation of Argument registers
-Original Message-
From: Jeff Law [mailto:l...@redhat.com]
Sent: Monday, November 17, 2014 9:27 PM
To: Ajit Kumar Agarwal; Vladimir Makarov; gcc Mailing List
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Optimized Allocation of Argument
From: Ajit Kumar Agarwal
Sent: Tuesday, November 18, 2014 7:01 PM
To: 'Vladimir Makarov'; gcc Mailing List
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: RE: Optimized Allocation of Argument registers
-Original Message-
From: Vl
-Original Message-
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Jeff Law
Sent: Tuesday, December 09, 2014 11:26 PM
To: Vladimir Makarov; lin zuojian; gcc@gcc.gnu.org
Subject: Re: A Question About LRA/reload
On 12/09/14 10:10, Vladimir Makarov wrote:
> generate
Hello All:
Since prefetch instructions have no direct consumers in the code stream,
they provide considerable freedom to the
instruction scheduler. They are typically assigned lower priorities than most
of the instructions in the code stream.
This tends to cause all the prefetch instruction
-Original Message-
From: paul_kon...@dell.com [mailto:paul_kon...@dell.com]
Sent: Saturday, December 13, 2014 9:46 PM
To: Ajit Kumar Agarwal
Cc: vmaka...@redhat.com; l...@redhat.com; richard.guent...@gmail.com;
gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida
Hello All:
I was going through the following article
"Register Allocation with instruction scheduling: a new approach" by Pinter
et al.
The phase ordering of register allocation and instruction scheduling is
an important topic. The scheduling before register allocator
increases the register pr
The following fig (1) shows an implementation of the SSQ kernel from the BLAS
Library in ATLAS.
Fig(2) shows the conversions of the IF-THEN-ELSE in Fig(1) to vectorized code.
Normally in the automatic vectorization the IF-THEN-ELSE is
vectorized only after the IF-CONVERSION that converts con
s are given below.
From 758ee2227e9dde946ac35b772bee99279b1bf996 Mon Sep 17 00:00:00 2001
From: Ajit Kumar Agarwal
Date: Tue, 6 Jan 2015 19:42:16 +0530
Subject: [PATCH] IRA : Changes in the cost of putting allocno into memory.
Changes are made to not consider the back edge frequency for
-Original Message-
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Joel
Sherrill
Sent: Thursday, January 08, 2015 8:59 PM
To: Eric Botcazou; Claudiu Zissulescu
Cc: gcc@gcc.gnu.org; David Kang
Subject: Re: Support for architectures without hardware interlocks
On
I was thinking of some of the opportunities with respect to reducing spills
inside the loop. If the live range (allocno)
spans the loop and is live out at the exit of the loop, and it is
not referenced
inside the loop, assign the allocno to memory. This incre
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Sunday, January 11, 2015 8:05 PM
To: Ajit Kumar Agarwal; vmaka...@redhat.com; l...@redhat.com; gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Monday, January 12, 2015 2:33 PM
To: Ajit Kumar Agarwal
Cc: vmaka...@redhat.com; l...@redhat.com; gcc@gcc.gnu.org; Vinod Kathail; Shail
Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re
Register allocation with the two-phase approach does optimal coalescing after
spilling. Sometimes live range splitting makes
the coalescing non-optimal. The split live ranges are connected by move
instructions. Thus live range splitting, and more
specifically aggressive live range splitting
Hello All:
It looks like live range splitting and rematerialization are connected to each
other. If the boundary of live range
splitting is in a high-frequency region, then the moves connecting the
split live ranges are inside the
high-frequency region, which is the performance bottleneck f
Thanks Vladimir for the inputs. It is quite helpful.
Thanks & Regards
Ajit
-Original Message-
From: Vladimir Makarov [mailto:vmaka...@redhat.com]
Sent: Tuesday, January 27, 2015 1:10 AM
To: Ajit Kumar Agarwal; l...@redhat.com; gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya G
Hello All:
Loop unrolling without good unrolling-factor heuristics becomes a
performance bottleneck. Unrolling-factor heuristics based on the minimum
initiation interval are quite useful with respect to better ILP. The minimum
Initiation interval based on recurrence and resource calculati
Hello All:
Large functions are an important part of high-performance applications.
They contribute to performance bottlenecks in many
respects. Some of the large hot functions are frequently executed, but many
regions inside the functions are cold regions. The large
function blocks the functi
Hello All:
Unaligned array accesses are a blocking factor for vectorization. This
is because unaligned loads and stores with
SIMD instructions are costly operations.
To enable vectorization for unaligned array accesses, loop peeling is
done to make the multiversioning of
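As a rough illustration of peeling for alignment, the sketch below computes how many scalar iterations would have to be peeled before an array access reaches a vector-aligned address. The helper `peel_count` and its parameters are hypothetical. It assumes the alignment is a power of two and a multiple of the element size, and that the base address is itself element-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of leading scalar iterations to peel so
   that the next element starts at an address aligned to 'align' bytes.  */
size_t
peel_count (uintptr_t addr, size_t elem_size, size_t align)
{
  assert (elem_size > 0 && align > 0);
  assert ((align & (align - 1)) == 0);   /* power of two  */
  assert (align % elem_size == 0);
  assert (addr % elem_size == 0);

  /* Bytes past the previous aligned boundary.  */
  size_t misalign = (size_t) (addr & (align - 1));

  /* Elements needed to reach the next boundary; zero if aligned.  */
  return misalign ? (align - misalign) / elem_size : 0;
}
```

For 4-byte elements and 16-byte vectors, an array starting 4 bytes past a boundary needs 3 peeled iterations. When the base address is not known at compile time, this count has to be computed at run time or the loop has to be versioned.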
-Original Message-
From: Jan Hubicka [mailto:hubi...@ucw.cz]
Sent: Thursday, February 12, 2015 10:34 PM
To: Ajit Kumar Agarwal
Cc: hubi...@ucw.cz; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta;
Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Function outlining and partial
Hello All:
I can see the IF-combining (IF-merging) optimization pass on the tree-ssa form
of the intermediate representation.
IF-combining takes care of merging the IF-THEN-ELSE if the condition
expressions are found to be congruent or
similar.
The IF-combine happens if the two IF-THEN-ELSE are contiguo
Hello All:
I can see the loop invariant pass in GCC on RTL considering the register
pressure and the cost manipulation
with respect to the SET destination node in RTL.
The loop invariant pass takes care of only address arithmetic candidates of loop
invariance.
In the function get_inv_cost, I can se
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Tuesday, February 17, 2015 3:42 PM
To: Ajit Kumar Agarwal
Cc: gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida;
Nagaraju Mekala
Subject: Re: Tree SSA If-combine optimization pass
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Tuesday, February 17, 2015 5:49 PM
To: Ajit Kumar Agarwal
Cc: gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida;
Nagaraju Mekala
Subject: Re: Tree SSA If-combine optimization pass
Hello All:
I would like to propose an unrolling factor based on data reuse between
different iterations. This combines the data
reuse of different iterations into a single iteration. A
MaxFactor is used to decide on the calculation of the unroll
factor based on data reuse. The MaxFactor
om: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Ajit
Kumar Agarwal
Sent: Saturday, March 07, 2015 3:31 PM
To: Richard Biener; gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Proposal on Unrolling factor based on Data reuse.
Hello All:
I am proposing inter-procedural loop fusion. Generally, loops adjacent
to each other that are conformable
candidates for loop fusion are fused only intra-procedurally. The
whole program analysis needs to
be done with array section analysis across the procedure ca
Hello All:
Path splitting replicates code to make better data flow analysis
available. One of the properties
of path splitting is that it removes the joining nodes for forked paths like
IF-THEN-ELSE and loops.
The removal of joining nodes splits the path into two independent
path
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Sunday, March 08, 2015 9:05 PM
To: Ajit Kumar Agarwal; vmaka...@redhat.com; Jeff Law; gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Proposal for
-Original Message-
From: Jeff Law [mailto:l...@redhat.com]
Sent: Monday, March 09, 2015 11:01 PM
To: Richard Biener
Cc: Ajit Kumar Agarwal; vmaka...@redhat.com; gcc@gcc.gnu.org; Vinod Kathail;
Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Proposal for path
Hello All:
I am proposing a new approach to loop transformation, as given below in the
example, for loops with a
conditional expression inside the loop. The loop body should be a reducible
control flow graph. The iteration
space is partitioned into different spaces for which either the cond_exp
-Original Message-
From: Aditya K [mailto:hiradi...@msn.com]
Sent: Sunday, March 15, 2015 11:37 AM
To: Ajit Kumar Agarwal; Jeff Law; Richard Biener; gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: RE: Proposal for another approach
Hello All:
Short circuit compiler transformation for conditional branches. For conditional
branches based on conditional
expressions, one of the paths is always executed, thus short-circuiting the path.
Certain values of the conditional
expressions make the conditional expressions always true
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Sunday, March 15, 2015 3:05 PM
To: Ajit Kumar Agarwal; Jeff Law; gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Short Circuit compiler
-Original Message-
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Ajit
Kumar Agarwal
Sent: Sunday, March 15, 2015 3:35 PM
To: Richard Biener; Jeff Law; gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: RE
-Original Message-
From: Jan Hubicka [mailto:hubi...@ucw.cz]
Sent: Thursday, February 12, 2015 10:34 PM
To: Ajit Kumar Agarwal
Cc: hubi...@ucw.cz; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta;
Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Function outlining and partial
Hello All:
The examples below are the transformations for the given loop in Fig(1). Fig(2)
does the unroll and jam, and Fig(3) does the
code motion to bring two IFs adjacent to each other and two while loops adjacent
to each other.
Fig(4) does the IF-merging and the loop fusion on the transformed Loo
-Original Message-
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Sunday, March 15, 2015 9:30 PM
To: Ajit Kumar Agarwal; Jeff Law; gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: RE: Short Circuit compiler
-Original Message-
From: Jeff Law [mailto:l...@redhat.com]
Sent: Monday, March 16, 2015 11:45 PM
To: Ajit Kumar Agarwal; Richard Biener; gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Proposal for another approach for Loop
Hello All:
To reduce the register pressure, I am proposing the following methods of
reducing the registers.
1. Assigning same registers or sharing same register for the logical registers
having the same value.
To determine the logical registers having the same value is the real challenge.
Is t
-Original Message-
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Richard
Biener
Sent: Tuesday, April 28, 2015 4:12 PM
To: Jeff Law
Cc: Alan Lawrence; gcc@gcc.gnu.org
Subject: Re: dom1 prevents vectorization via partial loop peeling?
On Mon, Apr 27, 2015 at 7:06
I have designed and implemented the path
splitting of loops with conditional IF-THEN-ELSE using the following design.
The implementation has gone through bootstrap for the Microblaze target along
with DejaGnu regression tests and
running the MIBench/EEMBC benchmarks. There is no regression s
-Original Message-
From: Jeff Law [mailto:l...@redhat.com]
Sent: Friday, May 29, 2015 9:24 PM
To: Ajit Kumar Agarwal; Richard Biener; gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: [RFC] Design and Implementation for Path