Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-10 Thread Andreas Schwab
On May 10 2017, Thomas Koenig wrote: > ... on a 32-bit system, of course. http://gcc.gnu.org/ml/gcc-testresults/2017-05/msg01063.html FAIL: gfortran.dg/generic_20.f90 -O0 execution test FAIL: gfortran.dg/generic_20.f90 -O1 execution test FAIL: gfortran.dg/generic_20.f90 -O2 execution t

Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-10 Thread Andreas Schwab
On May 10 2017, Thomas Koenig wrote: > If you manage to come up with a legal Fortran testcase which > sets b_dim1 to 0xdeadbeef, I owe you a beer :-) grep is your friend. Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B

Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-10 Thread Thomas Koenig
On 10.05.2017 at 17:42, Thomas Koenig wrote: If you manage to come up with a legal Fortran testcase which sets b_dim1 to 0xdeadbeef, I owe you a beer :-) ... on a 32-bit system, of course.

Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-10 Thread Thomas Koenig
Hi Andreas, + index_type t1_dim; + t1_dim = (a_dim1-1) * 256 + b_dim1; + if (t1_dim > 65536) + t1_dim = 65536; + +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wvla" + 'rtype_name` t1[t1_dim]; /* was [256][256] */ That does the wrong thing if b_dim1 ==
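The sizing arithmetic quoted in this message can be tried in isolation. The following is a minimal sketch, not the libgfortran code: the helper name `clamped_t1_dim` is hypothetical, and `index_type` is modeled as `ptrdiff_t`, which is an assumption made only for illustration. It reproduces just the two quoted steps (compute `(a_dim1-1) * 256 + b_dim1`, then cap at 65536) and, like the quoted lines, applies no lower bound.

```c
#include <stddef.h>

/* Assumption: index_type is modeled as ptrdiff_t for this sketch. */
typedef ptrdiff_t index_type;

/* Hypothetical helper mirroring the quoted patch lines:
   size the temporary buffer t1 to what is needed, but never
   larger than the original 256 * 256 = 65536 elements. */
static index_type
clamped_t1_dim (index_type a_dim1, index_type b_dim1)
{
  index_type t1_dim = (a_dim1 - 1) * 256 + b_dim1;

  /* Upper cap only; a non-positive result is passed through unchanged. */
  if (t1_dim > 65536)
    t1_dim = 65536;

  return t1_dim;
}
```

With these assumptions, small inputs yield a small buffer (for example, `a_dim1 = 10` and `b_dim1 = 5` give 2309 elements), while inputs of 256 or more in both arguments hit the 65536 cap.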

Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-10 Thread Andreas Schwab
On May 05 2017, Thomas Koenig wrote: > @@ -227,6 +226,17 @@ sinclude(`matmul_asm_'rtype_code`.m4')dnl >if (m == 0 || n == 0 || k == 0) > return; > > + /* Adjust size of t1 to what is needed. */ > + index_type t1_dim; > + t1_dim = (a_dim1-1) * 256 + b_dim1; > +

Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-09 Thread Thomas Koenig
On 09.05.2017 at 12:43, Andreas Schwab wrote: On May 05 2017, Thomas Koenig wrote: @@ -227,6 +226,17 @@ sinclude(`matmul_asm_'rtype_code`.m4')dnl if (m == 0 || n == 0 || k == 0) return; + /* Adjust size of t1 to what is needed. */ + index_type t1_dim; + t1_

Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-09 Thread Christophe Lyon
Hi, On 8 May 2017 at 18:58, Jerry DeLisle wrote: > On 05/05/2017 01:31 PM, Thomas Koenig wrote: >> Hello world, >> >> the attached patch reduces the stack usage by the blocked >> version of matmul for cases where we don't need the full buffer. >> This should improve stack usage. >> >> Regression-

Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-09 Thread Andreas Schwab
On May 05 2017, Thomas Koenig wrote: > @@ -227,6 +226,17 @@ sinclude(`matmul_asm_'rtype_code`.m4')dnl >if (m == 0 || n == 0 || k == 0) > return; > > + /* Adjust size of t1 to what is needed. */ > + index_type t1_dim; > + t1_dim = (a_dim1-1) * 256 + b_dim1; > +

Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-08 Thread Jerry DeLisle
On 05/08/2017 12:29 PM, Thomas Koenig wrote: > On 08.05.2017 at 18:58, Jerry DeLisle wrote: > > the attached patch reduces the stack usage by the blocked >>> version of matmul for cases where we don't need the full buffer. >>> This should improve stack usage. >>> >>> OK for trunk? >>> >> >> OK, th

Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-08 Thread Thomas Koenig
On 08.05.2017 at 18:58, Jerry DeLisle wrote: the attached patch reduces the stack usage by the blocked version of matmul for cases where we don't need the full buffer. This should improve stack usage. OK for trunk? OK, thanks. Is this something we should consider for backporting to gcc-7?

Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-08 Thread Jerry DeLisle
On 05/05/2017 01:31 PM, Thomas Koenig wrote: > Hello world, > > the attached patch reduces the stack usage by the blocked > version of matmul for cases where we don't need the full buffer. > This should improve stack usage. > > Regression-tested. I also added a stress test (around 3 secs of > CP

[patch, fortran] Reduce stack use in blocked matmul

2017-05-05 Thread Thomas Koenig
Hello world, the attached patch reduces the stack usage by the blocked version of matmul for cases where we don't need the full buffer. This should improve stack usage. Regression-tested. I also added a stress test (around 3 secs of CPU time on my system), it will only run once due to the "dg-d