https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030
Thomas Koenig changed:
What|Removed |Added
Status|WAITING |RESOLVED
Resolution|---
--- Comment #41 from Thomas Koenig ---
Author: tkoenig
Date: Tue Jul 23 08:57:45 2019
New Revision: 273727
URL: https://gcc.gnu.org/viewcvs?rev=273727&root=gcc&view=rev
Log:
2019-07-23 Thomas König
Backport from trunk
PR libf
--- Comment #40 from Thomas Koenig ---
Author: tkoenig
Date: Sun Jul 21 15:55:49 2019
New Revision: 273643
URL: https://gcc.gnu.org/viewcvs?rev=273643&root=gcc&view=rev
Log:
2019-07-21 Thomas König
PR libfortran/91030
* gfort
--- Comment #39 from Janne Blomqvist ---
Now, with the fixed benchmark from the previous comment, on a Lustre (version
2.5) system I get:
Test using 25000 bytes
Block size of file system: 4096
bs = 1024, 53.27 MiB/s
bs = 2048, 73.99
--- Comment #38 from Janne Blomqvist ---
First, I think there's a bug in the benchmark in comment #20. It writes
blocksize * sizeof(double) bytes, but then advances by only blocksize on each
iteration of the loop. A fixed version, writing just bytes, is below:
--- Comment #37 from Janne Blomqvist ---
One thing we could do would be to switch to pread and pwrite instead of using
lseek. That would avoid a few syscalls when updating the record length marker.
Though I guess the issue with GPFS isn't directl
Janne Blomqvist changed:
What|Removed |Added
CC||jb at gcc dot gnu.org
--- Comment #36
--- Comment #35 from Jerry DeLisle ---
(In reply to Thomas Koenig from comment #34)
> There is another point to consider.
>
> I suppose not very many people use big-endian data formats
> these days. Little-endian dominates these days, and people
--- Comment #34 from Thomas Koenig ---
There is another point to consider.
I suppose not very many people use big-endian data formats
these days. Little-endian dominates these days, and people
who require that conversion on a regular basis (why
--- Comment #33 from Jerry DeLisle ---
Well, I am not opposed to it. What we do not want is to pessimize older,
smaller machines, where it does matter a lot. However, if Thomas' strategy
above is adjusted from 32768 to 65536, then out of the box it wi
--- Comment #32 from David Edelsohn ---
If the performance measured by Jerry is hitting limits of the 4 x 32KiB L1
D-Cache of the Ryzen 2500U, then the system has bigger problems than FORTRAN
I/O buffer size.
What is the target audience / market
--- Comment #31 from David Edelsohn ---
What is the PAGESIZE on the Ryzen system? On the POWER systems, the PAGESIZE
is 64K. Maybe the optimal buffer size (write size) allows the filesystem to
perform double-buffering at the PAGESIZE.
--- Comment #30 from Thomas Koenig ---
> Why are you opposed to the larger 65536 or 131072 as a default?
Please look at Jerry's numbers from comment #24.
They show a severe regression (for his system) for blocksizes > 32768.
--- Comment #29 from David Edelsohn ---
> For formatted files, chose the value that the user supplied
> via an environment variable. If the user supplied nothing, then
>
> - query the recommended block size via calling fstat and evaluating
> s
--- Comment #28 from Thomas Koenig ---
(In reply to Jerry DeLisle from comment #27)
> (In reply to Thomas Koenig from comment #26)
> > Jerry, you are working on a Linux box, right? What does
> >
> > stat -f -c %b .
> >
> > tell you?
>
> 13429
--- Comment #27 from Jerry DeLisle ---
(In reply to Thomas Koenig from comment #26)
> Jerry, you are working on a Linux box, right? What does
>
> stat -f -c %b .
>
> tell you?
13429330
Ryzen 2500U with M.2 SSD
Fedora 30, Kernel 5.1.15-300.fc
--- Comment #26 from Thomas Koenig ---
Jerry, you are working on a Linux box, right? What does
stat -f -c %b .
tell you?
--- Comment #25 from Thomas Koenig ---
(In reply to Jerry DeLisle from comment #24)
> On a different Ryzen machine:
>
> $ ./run.sh
> 1024 3.2604169845581055
> 2048 2.7804551124572754
> 4096 2.6416599750518799
> 8192 2.598
--- Comment #24 from Jerry DeLisle ---
On a different Ryzen machine:
$ ./run.sh
1024 3.2604169845581055
2048 2.7804551124572754
4096 2.6416599750518799
8192 2.5986809730529785
16384 2.5525100231170654
32768
--- Comment #23 from Thomas Koenig ---
Some numbers for the provisional patch, varying the size of the buffers.
With the patch, the original benchmark (minus some output; only
the elapsed time is shown) and the script
for a in 1024 2048 4096
--- Comment #22 from David Edelsohn ---
The following are unofficial results on an unspecified system running GPFS.
These should not be considered official anything and should not be referenced
for benchmarking.
Test using 2.50e+08 doubles
--- Comment #21 from Thomas Koenig ---
Created attachment 46537
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46537&action=edit
Something to benchmark.
--- Comment #20 from Thomas Koenig ---
(In reply to David Edelsohn from comment #18)
> For GPFS, the striping unit is 16M. The 8K buffer size chosen by GFortran
> is a huge performance sink. We have confirmed this with testing.
Could you share
--- Comment #19 from David Edelsohn ---
IBM XLF provides an XLFRTEOPTS environment variable, which includes control
over buffer size. The documentation makes it clear that XLF uses the block
size of the device by default:
buffer_size=size
Speci
--- Comment #18 from David Edelsohn ---
For GPFS, the striping unit is 16M. The 8K buffer size chosen by GFortran is a
huge performance sink. We have confirmed this with testing.
The recommendation from GPFS is that one should query the filesys
--- Comment #17 from Thomas Koenig ---
(In reply to David Edelsohn from comment #16)
> libgfortran unix.c:raw_write() will access the WRITE system call with up to
> 2GB of data, which the testcase is using for the native format.
>
> Should libgf
--- Comment #16 from David Edelsohn ---
libgfortran unix.c:raw_write() will access the WRITE system call with up to 2GB
of data, which the testcase is using for the native format.
Should libgfortran I/O buffer at least use sysconf(_SC_PAGESIZE)
--- Comment #15 from Thomas Koenig ---
(In reply to David Edelsohn from comment #13)
> Why should -fconvert affect the strategy for writing?
If we get passed a contiguous block of memory (like in
your test case) we can do this in a single write.
--- Comment #14 from Jerry DeLisle ---
(In reply to David Edelsohn from comment #13)
> Why should -fconvert affect the strategy for writing?
Hi David, very interesting bug report and a good question. I would like to
investigate further if I know
--- Comment #13 from David Edelsohn ---
Why should -fconvert affect the strategy for writing?
--- Comment #12 from Andrew Pinski ---
(In reply to David Edelsohn from comment #10)
> With EXT4: difference is 2x
> With SHM: difference is 4.5x
> With GPFS: difference is 10x
>
> Is libgfortran doing something unusual with the creation of file
--- Comment #11 from Thomas Koenig ---
(In reply to David Edelsohn from comment #10)
> With EXT4: difference is 2x
> With SHM: difference is 4.5x
> With GPFS: difference is 10x
>
> Is libgfortran doing something unusual with the creation of file
--- Comment #10 from David Edelsohn ---
With EXT4: difference is 2x
With SHM: difference is 4.5x
With GPFS: difference is 10x
Is libgfortran doing something unusual with the creation of files?
--- Comment #9 from Thomas Koenig ---
On powerpc64le-unknown-linux-gnu:
write time(sec) = 0.48150300979614258
done
real    0m0.889s
user    0m0.279s
sys     0m0.608s
vs.
write time(sec) = 1.4788339138031006
done
real    0m
--- Comment #8 from Andrew Pinski ---
(In reply to Thomas Koenig from comment #7)
> Also, which version of gfortran did you use?
>
> If it was before r195413, I can very well believe those
> numbers.
Note that revision made it into GCC 4.8.0.
Thomas Koenig changed:
What|Removed |Added
Status|NEW |WAITING
--- Comment #7 from Thomas Koeni
--- Comment #6 from Thomas Koenig ---
I cannot reproduce this on an AMD Ryzen 7 1700X (little-endian):
$ gfortran -fconvert=native wr.f90 walltime.c
cc1: Warning: command-line option '-fconvert=native' is valid for Fortran but
not for C
$ rm -f
--- Comment #5 from David Edelsohn ---
XL Fortran with -qufmt=be : 0.75 sec
XL Fortran native : 0.30 sec
--- Comment #4 from Thomas Koenig ---
(In reply to David Edelsohn from comment #3)
> Conversion carries an overhead, but the overhead need not be worse than
> necessary. The conversion overhead for libgfortran is significantly worse
> than for c
Thomas Koenig changed:
What|Removed |Added
Severity|normal |enhancement
David Edelsohn changed:
What|Removed |Added
Severity|enhancement |normal
--- Comment #3 from David Edelso
Thomas Koenig changed:
What|Removed |Added
Severity|normal |enhancement
Thomas Koenig changed:
What|Removed |Added
CC||tkoenig at gcc dot gnu.org
--- Comment #
David Edelsohn changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed|