Hi, I committed the attached update.
Ira
? yy
cvs diff: Diffing .
Index: vectorization.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/tree-ssa/vectorization.html,v
retrieving revision 1.27
diff -r1.27 vectorization.html
9,12c9,10
< <p>The goal of this project is to develop a loop vectorizer in
< GCC, based on the <a href="./">tree-ssa</a> framework. This
< work is taking place in the autovect-branch, and is merged periodically
< to mainline.</p>
---
> <p>The goal of this project is to develop a loop and basic block
> vectorizer in
> GCC, based on the <a href="./">tree-ssa</a> framework.</p>
36a35,69
> <dt>2011-10-23</dt>
> <dd>
> <ol>
> <li>Vectorization of reduction in loop SLP.
> Both <a href="#slp-reduc-2"> multiple reduction cycles</a> and
> <a href="#slp-reduc-1"> reduction chains</a> are supported. </li>
> <li>Various <a href="#slp"> basic block vectorization (SLP)</a>
> improvements, such as
> better data dependence analysis, support of misaligned accesses
> and multiple types, cost model.</li>
> <li>Detection of vector size:
> <a href=
> "http://gcc.gnu.org/ml/gcc-patches/2010-10/msg00441.html">
> http://gcc.gnu.org/ml/gcc-patches/2010-10/msg00441.html</a>.</li>
> <li>Vectorization of loads with <a href="#negative"> negative
> step</a>.</li>
> <li>Improved realignment scheme:
> <a href=
> "http://gcc.gnu.org/ml/gcc-patches/2010-06/msg02301.html">
> http://gcc.gnu.org/ml/gcc-patches/2010-06/msg02301.html</a>.</li>
> <li>A new built-in, <a href="#assume-aligned">
> <code>__builtin_assume_aligned</code></a>, has been added,
> through which the compiler can be hinted about pointer
> alignment.</li>
> <li>Support of <a href="#strided"> strided accesses</a> using
> memory instructions that have
> the interleaving "built in", such as NEON's vldN and vstN.</li>
> <li>The vectorizer now attempts to reduce over-promotion of operands
> in some vector
> operations: <a href=
> "http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01472.html">
> http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01472.html</a>.</li>
> <li><a href="#widen-shift"> Widening shifts</a> are now detected and
> vectorized
> if supported by the target.</li>
> <li>Vectorization of conditions with <a href="#cond-mix"> mixed
> types</a>.</li>
> <li>Support of loops with <a href="#bool"> bool</a>.</li>
> </ol>
> </dd>
> </dl>
>
> <dl>
44c77
< other then reduction cycles in nested loops) (2009-06-16)</li>
---
> other than reduction cycles in nested loops) (2009-06-16)</li>
82c115,116
< to this project include Revital Eres, Richard Guenther, and Ira Rosen.
---
> to this project include Revital Eres, Richard Guenther, Jakub Jelinek,
> Michael Matz,
> Richard Sandiford, and Ira Rosen.
279c313
< <strong>example11</strong>:
---
> <a name="strided"><strong>example11</strong>:</a>
323d356
<
341d373
<
356d387
<
361d391
<
371a402,498
> </pre>
>
> <a name="slp-reduc-2"><strong>example18</strong>: Simple reduction in SLP:</a>
> <pre>
> int sum1;
> int sum2;
> int a[128];
> void foo (void)
> {
>   int i;
>
>   for (i = 0; i < 64; i++)
>     {
>       sum1 += a[2*i];
>       sum2 += a[2*i+1];
>     }
> }
> </pre>
>
> <a name="slp-reduc-1"><strong>example19</strong>: Reduction chain in SLP:</a>
> <pre>
> int sum;
> int a[128];
> void foo (void)
> {
>   int i;
>
>   for (i = 0; i < 64; i++)
>     {
>       sum += a[2*i];
>       sum += a[2*i+1];
>     }
> }
> </pre>
>
> <a name="slp"><strong>example20</strong>: Basic block SLP with
> multiple types, loads with different offsets, misaligned load,
> and not-affine accesses:</a>
> <pre>
> void foo (int * __restrict__ dst, short * __restrict__ src,
>           int h, int stride, short A, short B)
> {
>   int i;
>   for (i = 0; i < h; i++)
>     {
>       dst[0] += A*src[0] + B*src[1];
>       dst[1] += A*src[1] + B*src[2];
>       dst[2] += A*src[2] + B*src[3];
>       dst[3] += A*src[3] + B*src[4];
>       dst[4] += A*src[4] + B*src[5];
>       dst[5] += A*src[5] + B*src[6];
>       dst[6] += A*src[6] + B*src[7];
>       dst[7] += A*src[7] + B*src[8];
>       dst += stride;
>       src += stride;
>     }
> }
> </pre>
>
> <a name="negative"><strong>example21</strong>: Backward access:</a>
> <pre>
> int foo (int *b, int n)
> {
>   int i, a = 0;
>
>   for (i = n-1; i >= 0; i--)
>     a += b[i];
>
>   return a;
> }
> </pre>
>
> <a name="assume-aligned"><strong>example22</strong>: Alignment hints:</a>
> <pre>
> void foo (int *out1, int *in1, int *in2, int n)
> {
>   int i;
>
>   out1 = __builtin_assume_aligned (out1, 32, 16);
>   in1 = __builtin_assume_aligned (in1, 32, 16);
>   in2 = __builtin_assume_aligned (in2, 32, 0);
>
>   for (i = 0; i < n; i++)
>     out1[i] = in1[i] * in2[i];
> }
> </pre>
>
> <a name="widen-shift"><strong>example23</strong>: Widening shift:</a>
> <pre>
> void foo (unsigned short *src, unsigned int *dst)
> {
>   int i;
>
>   for (i = 0; i < 256; i++)
>     *dst++ = *src++ << 7;
> }
> </pre>
372a500,530
> <a name="cond-mix"><strong>example24</strong>: Condition with mixed types:</a>
> <pre>
> #define N 1024
> float a[N], b[N];
> int c[N];
>
> void foo (short x, short y)
> {
>   int i;
>   for (i = 0; i < N; i++)
>     c[i] = a[i] < b[i] ? x : y;
> }
> </pre>
>
> <a name="bool"><strong>example25</strong>: Loop with bool:</a>
> <pre>
> #define N 1024
> float a[N], b[N], c[N], d[N];
> int j[N];
>
> void foo (void)
> {
>   int i;
>   _Bool x, y;
>   for (i = 0; i < N; i++)
>     {
>       x = (a[i] < b[i]);
>       y = (c[i] < d[i]);
>       j[i] = x & y;
>     }
> }
1355a1514,1517
> <li>"Vapor SIMD: Auto-vectorize once, run everywhere",
> Dorit Nuzman, Sergei Dyshel, Erven Rohou, Ira Rosen, Kevin Williams,
> David Yuste, Albert Cohen, Ayal Zaks, CGO 2011: 151-160</li>
>
1363c1525
< <li>"Loop-Aware SLP in GCC - two years later", Ira Rosen, Dorit Nuzman
---
> <li>"Loop-Aware SLP in GCC", Ira Rosen, Dorit Nuzman
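
A note on example22 in the diff above: the three-argument form __builtin_assume_aligned (p, 32, 16) tells the compiler that (char *) p - 16 is 32-byte aligned, so the hint is only safe when the caller really passes such pointers. The following is a minimal sketch of a caller that satisfies the hints. The buffers and the run_example22 driver are hypothetical and not part of the committed page; the aligned attribute is the GCC extension.

```c
#include <assert.h>
#include <stdint.h>

#define N 64

/* example22 from the diff above, unchanged.  */
void foo (int *out1, int *in1, int *in2, int n)
{
  int i;

  out1 = __builtin_assume_aligned (out1, 32, 16);
  in1 = __builtin_assume_aligned (in1, 32, 16);
  in2 = __builtin_assume_aligned (in2, 32, 0);

  for (i = 0; i < n; i++)
    out1[i] = in1[i] * in2[i];
}

/* Hypothetical test buffers (not from the patch), each 32-byte aligned.
   The extra elements leave room for the 16-byte offset used below.  */
static int out_buf[N + 16] __attribute__ ((aligned (32)));
static int in1_buf[N + 16] __attribute__ ((aligned (32)));
static int in2_buf[N] __attribute__ ((aligned (32)));

/* Hypothetical driver: calls foo with pointers that match the hints
   and returns the number of mismatching output elements.  */
int run_example22 (void)
{
  /* 16 bytes past a 32-byte boundary, matching (ptr, 32, 16).  */
  int *out1 = (int *) ((char *) out_buf + 16);
  int *in1 = (int *) ((char *) in1_buf + 16);
  /* Exactly on a 32-byte boundary, matching (ptr, 32, 0).  */
  int *in2 = in2_buf;
  int i, bad = 0;

  /* Check that the promises made to the compiler actually hold.  */
  assert (((uintptr_t) out1 - 16) % 32 == 0);
  assert (((uintptr_t) in1 - 16) % 32 == 0);
  assert ((uintptr_t) in2 % 32 == 0);

  for (i = 0; i < N; i++)
    {
      in1[i] = i + 1;
      in2[i] = 3 - i;
    }

  foo (out1, in1, in2, N);

  for (i = 0; i < N; i++)
    if (out1[i] != (i + 1) * (3 - i))
      bad++;

  return bad;
}
```

Calling run_example22 () from main and checking for a zero return is enough to exercise it. Passing pointers that do not meet the stated alignment makes the behaviour undefined, which is why the sketch asserts the alignment before calling foo.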