On Tue, Sep 23, 2014 at 2:51 PM, Jiong Wang
<wong.kwongyuan.to...@gmail.com> wrote:
> 2014-09-23 21:59 GMT+01:00 Sebastian Pop <seb...@gmail.com>:
>> Marcus Shawcroft wrote:
>>> On 4 September 2014 15:14, Jiong Wang <jiong.w...@arm.com> wrote:
>>> > this patch enabled stack shrink-wrap support on AArch64.
>>> >
>>> > no regression on aarch64-none-elf bare-metal.
>>> > aarch64 bootstrap OK.
>>> >
>>> > ok to install?
>>> >
>>> > 2014-09-04 Renlin Li<renlin...@arm.com>
>>> >
>>> > gcc/
>>> > * config/aarch64/aarch64.md (return): New expand.
>>> > (simple_return): Likewise.
>>> > * config/aarch64/aarch64.c (aarch64_use_return_insn_p): New function.
>>> > * config/aarch64/aarch64-protos.h (aarch64_use_return_insn_p): New
>>> > declaration.
>>> >
>>> > gcc/testsuite
>>> > * gcc.dg/ira-shrinkwrap-prep-1.c: Enable aarch64.
>>> > * gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.
>>> > * gcc.dg/pr10474.c: Likewise.
>>>
>>> OK, committed as 215508
>>> /Marcus
>>
>> This patch causes a regression when compiling this testcase with -O2 and
>> above:
>>
>> void mm(double** A, double** B, double** C, unsigned N, unsigned S)
>> {
>>   unsigned i, j, k;
>>
>>   for (i = 0; i < N; i += S) {
>>     for (j = 0; j < N; j += S) {
>>       for (k = 0; k < N; k += S) {
>>
>>         unsigned ib = (N) < ((i + S)) ? N : (i + S);
>>         unsigned jb = (N) < ((j + S)) ? N : (j + S);
>>         unsigned kb = (N) < ((k + S)) ? N : (k + S);
>>         unsigned i0, j0, k0;
>>
>>         for (i0 = i; i0 < ib; i0++) {
>>           for (j0 = j; j0 < jb; j0++) {
>>             double* a = A[i0];
>>             double* b = B[j0];
>>             double scratch = C[i0][j0];
>>             for (k0 = k; k0 < kb; k0++) {
>>               scratch += a[k0] * b[k0];
>>             }
>>             C[i0][j0] = scratch;
>>           }
>>         }
>>       }
>>     }
>>   }
>>   asm volatile ("bar:");
>> }
>>
>> $ aarch64-gcc -O2 mm.c -S -o - | grep 'bar:' | wc -l
>> 2
>> $ aarch64-gcc -O1 mm.c -S -o - | grep 'bar:' | wc -l
>> 1
>
> Interesting.
>
> I have done a quick investigation on x86/mips/arm32/aarch64 and found
>
> * x86/mips couldn't shrink wrap this function.
> * arm32/aarch64 could shrink wrap it.
> but both generate the same code layout, with redundant exit basic block.
>
> So, looks like it's caused by some hiding issue in generic code.

There is no bug here. GCC is allowed to duplicate inline asm statements
(I thought this was documented too), which is why the 'bar:' label shows
up twice at -O2.
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20468 also.

Thanks,
Andrew

>
>>
>> Thanks,
>> Sebastian
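
For anyone who needs a label inside inline asm despite the duplication, here is a minimal sketch, assuming GCC's extended asm syntax, where `%=` expands to a number unique to each asm instance. The function name `sum_with_marker` and the `marker_` label prefix are hypothetical and are not taken from the testcase above.

```c
#include <stddef.h>

/* Hypothetical helper, not from the thread: sums n doubles and places a
   marker label in the assembly output after the loop. */
double sum_with_marker(const double *v, size_t n)
{
    double total = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        total += v[i];

    /* Extended asm (note the colons) so that %= is expanded. Each emitted
       copy of this statement gets its own number, so a duplicated copy
       defines a differently named label instead of a second "bar:". */
    __asm__ volatile ("marker_%=:" : : : "memory");

    return total;
}
```

With a fixed name such as "bar:", any duplication of the statement can produce a duplicate symbol; with `%=` each copy defines a distinct label. A basic asm statement (no colons) does not expand `%=`, which is why the sketch uses the extended form with empty operand lists.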