About the GCC mirror on igor.onlinedirect.bg
Hello, I'm writing to inform you that, unfortunately, OnlineDirect (the sponsoring company) was acquired and the igor machine will be shut down in the coming weeks. Best regards, Igor team
New mirror
Hello, we decided to run a new GCC mirror in Bulgaria. Here are the details:

Country: Bulgaria
City: Sofia
Bandwidth: 2 Gbps aggregated link to the Bulgarian Peering, 500 Mbps international
Contact: i...@onlinedirect.bg
URL: http://gcc.igor.onlinedirect.bg/
FTP: ftp://gcc.igor.onlinedirect.bg/others/gcc/

1000-connection limit. Synced every 6 hours.

Best regards
return void from void function is allowed.
GCC 4.1.2 and 4.0.3 incorrectly accept the following program:

void f();
void g() { return f(); }

No warnings are issued on my Ubuntu Pentium-M box. Is this a known bug?

Regards, Igor
return void from void function is allowed.
-- Forwarded message --
From: Igor Bukanov <[EMAIL PROTECTED]>
Date: Oct 31, 2006 9:48 PM
Subject: Re: return void from void function is allowed.
To: Mike Stump <[EMAIL PROTECTED]>

On 10/31/06, Mike Stump <[EMAIL PROTECTED]> wrote:
> This is valid in C++.

My copy of the 1997 C++ public draft contains:

  6.6.3 The return statement
  ...
  2 A return statement without an expression can be used only in functions
  that do not return a value, that is, a function with the return value type
  void, a constructor (_class.ctor_), or a destructor (_class.dtor_). A
  return statement with an expression can be used only in functions
  returning a value; the value of the expression is returned to the caller
  of the function. If required, the expression is implicitly converted to
  the return type of the function in which it appears. A return statement
  can involve the construction and copy of a temporary object
  (_class.temporary_). Flowing off the end of a function is equivalent to a
  return with no value; this results in undefined behavior in a
  value-returning function.

My reading of that is that C++ does not allow returning a void expression from a void function. Was it changed later?

> And final thought, wrong mailing list... gcc-help would have been better.

I thought bugs in GCC could be discussed here. Sorry if that is a wrong assumption.

Regards, Igor
gcc-4.1.2 testsuite report MAC OS 10.3.9 Power PC G4 darwin7.9.0
Test Run By igor on Sat Apr 14 03:32:31 2007
Native configuration is powerpc-apple-darwin7.9.0

=== gcc tests ===

Schedule of variations: unix

FAIL: gcc.c-torture/compile/pr23237.c -O0 (test for excess errors)
FAIL: gcc.c-torture/compile/pr23237.c -O1 (test for excess errors)
FAIL: gcc.c-torture/compile/pr23237.c -O2 (test for excess errors)
FAIL: gcc.c-torture/compile/pr23237.c -O3 -fomit-frame-pointer (test for excess errors)
FAIL: gcc.c-torture/compile/pr23237.c -O3 -fomit-frame-pointer -funroll-loops (test for excess errors)
FAIL: gcc.c-torture/compile/pr23237.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions (test for excess errors)
FAIL: gcc.c-torture/compile/pr23237.c -O3 -g (test for excess errors)
FAIL: gcc.c-torture/compile/pr23237.c -Os (test for excess errors)
FAIL: tmpdir-gcc.dg-struct-layout-1/t001 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t024 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t025 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t026 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t027 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: gcc.dg/attr-weakref-1.c (test for excess errors)
FAIL: gcc.dg/builtins-18.c (test for excess errors)
FAIL: gcc.dg/builtins-20.c (test for excess errors)
FAIL: gcc.dg/builtins-55.c (test for excess errors)
FAIL: gcc.dg/darwin-version-1.c (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-1.c -O0 (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-1.c -O1 (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-1.c -O2 (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-1.c -O3 -fomit-frame-pointer (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-1.c -O3 -g (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-1.c -Os (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-2.c -O0 (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-2.c -O1 (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-2.c -O2 (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-2.c -O3 -fomit-frame-pointer (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-2.c -O3 -g (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-2.c -Os (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-3.c -O0 (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-3.c -O1 (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-3.c -O2 (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-3.c -O3 -fomit-frame-pointer (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-3.c -O3 -g (test for excess errors)
FAIL: gcc.dg/torture/builtin-convert-3.c -Os (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c -O0 (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c -O1 (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c -O2 (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c -O3 -fomit-frame-pointer (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c -O3 -g (test for excess errors)
FAIL: gcc.dg/torture/builtin-power-1.c -Os (test for excess errors)
FAIL: gcc.target/powerpc/darwin-longlong.c execution test
FAIL: gcc.target/powerpc/pr18096-1.c stack frame too large (test for warnings, line 11)
FAIL: gcc.target/powerpc/pr18096-1.c (test for excess errors)
FAIL: gcc.target/powerpc/stabs-attrib-vect-darwin.c scan-assembler .stabs.*vi:\\(0,16\\)[EMAIL PROTECTED]

=== gcc Summary ===

# of expected passes        39466
# of unexpected failures    47
# of expected failures      98
# of untested testcases     28
# of unsupported tests      382

=== g++ tests ===

Schedule of variations: unix

FAIL: g++.dg/abi/rtti3.C scan-assembler .weak[ \t]_?_ZTSPP1A
XPASS: g++.dg/tree-ssa/pr14814.C scan-tree-dump-times &this 0
FAIL: g++.dg/warn/huge-val1.C (test for excess errors)
FAIL: g++.dg/warn/weak1.C (test for excess errors)
FAIL: g++.dg/special/conpr-3.C execution test
XPASS: g++.old-deja/g++.eh/badalloc1.C execution test

=== g++ Summary ===

# of expected passes        12287
# of unexpected failures    4
# of unexpected successes   2
# of expected failures      67
# of unsupported tests      120

=== gfortran tests ===

Schedule of variations: unix

FAIL: gfortran.dg/large_real_kind_2.F90 -O0 execution test
FAIL: gfortran.dg/large_real_kind_2.F90 -O1 execution test
FAIL: gfortran.dg/large_real_kind_2.F90 -O2 execution test
FAIL: gfortran.dg/large_real_kind_2.F90 -O3 -fomit-frame-pointer execution test
FAIL: gfortran.dg/large_real_kind_2.F90 -O3 -fomit-f
Reorder/combine insns on superscalar arch
Guys,

I'm trying to make the compiler generate better code on a superscalar in-order machine, but can't find the right way to do it.

Imagine the following code:

long f(long* p, long a, long b)
{
   long a1 = a << 2;
   long a2 = a1 + b;
   return p[a1] + p[a2];
}

By default the compiler generates something like this in some pseudo-asm:

shl r3, r3, 2
add r4, r3, r4
ld8 r15, [r2 + r3 * 8]
ld8 r2, [r2 + r4 * 8]
{add r2, r2, r15 ; ret}

but it would be way better this way:

   { sh_add r4, r4, (r3 << 2) ; shl r3, r3, 2 }
   { ld8 r15, [r2 + r3 * 8] ; ld8 r2, [r2 + r4 * 8] }
   {add r2, r2, r15 ; ret}

The second sequence is two cycles shorter. The combiner pass even shows patterns like this, but fails to transform them because they are wrapped in a parallel:

Failed to match this instruction:
(parallel [
    (set (reg:DI 56)
        (plus:DI (mult:DI (reg:DI 3 r3 [ a ])
                (const_int 4 [0x4]))
            (reg:DI 4 r4 [ b ])))
    (set (reg/v:DI 40 [ a1 ])
        (ashift:DI (reg:DI 3 r3 [ a ])
            (const_int 2 [0x2])))
    ])

What would be a proper way to perform reorganizations like this in a general way?

The same goes for the pointer increment:

add r2, r2, 1
ld r3, [r2+0]

which would be much better off like this:

{ ld r3, [r2 + 1] ; add r2, r2, 1 }

Are these kinds of things overlooked, or did I fail to set something in the machine-dependent portion?

Thanks a lot for your thoughts
Re: Reorder/combine insns on superscalar arch
Thanks Jeff,

I really hoped that I had missed something and there was a better answer. But would it do any harm if the combiner tried to check every piece of a parallel like that and, if every component is matchable and the total cost is no worse, emitted the pieces separately? That would change nothing for single-issue machines beyond some reordering, but it would help many multi-issue ones... What are the pitfalls of this approach?

Thanks

On Thu, Jan 14, 2016 at 8:55 PM, Jeff Law wrote:
> On 01/14/2016 04:47 PM, Igor Shevlyakov wrote:
>>
>> Guys,
>>
>> I'm trying to make compiler to generate better code on superscalar
>> in-order machine but can't find the right way to do it.
>>
>> Imagine the following code:
>>
>> long f(long* p, long a, long b)
>> {
>>    long a1 = a << 2;
>>    long a2 = a1 + b;
>>    return p[a1] + p[a2];
>> }
>>
>> by default compiler generates something like this in some pseudo-asm:
>>
>> shl r3, r3, 2
>> add r4, r3, r4
>> ld8 r15, [r2 + r3 * 8]
>> ld8 r2, [r2 + r4 * 8]
>> {add r2, r2, r15 ; ret}
>>
>> but it would be way better this way:
>>
>>    { sh_add r4, r4, (r3 << 2) ; shl r3, r3, 2 }
>>    { ld8 r15, [r2 + r3 * 8] ; ld8 r2, [r2 + r4 * 8] }
>>    {add r2, r2, r15 ; ret}
>>
>> 2nd sequence is 2 cycles shorter. Combiner pass even shows patterns
>> like this but fail to transform this as it wrapped in parallel:
>>
>> Failed to match this instruction:
>> (parallel [
>>     (set (reg:DI 56)
>>         (plus:DI (mult:DI (reg:DI 3 r3 [ a ])
>>                 (const_int 4 [0x4]))
>>             (reg:DI 4 r4 [ b ])))
>>     (set (reg/v:DI 40 [ a1 ])
>>         (ashift:DI (reg:DI 3 r3 [ a ])
>>             (const_int 2 [0x2])))
>>     ])
>
> You can always write a pattern which matches the PARALLEL. You can then
> either arrange to emit the assembly code from that pattern or split the
> pattern (after reload/lra)
>
>> What would be a proper way to perform reorganizations like this in general
>> way?
>>
>> The same goes with the pointer increment:
>>
>> add r2, r2, 1
>> ld r3, [r2+0]
>>
>> would be much better off like this:
>>
>> { ld r3, [r2 + 1] ; add r2, r2, 1 }
>>
>> Are those kind of things overlooked or I failed to set something in
>> machine-dependent portion?
>
> Similarly. You may also get some mileage from LEGITIMIZE_ADDRESS, though it
> may not see the add/load together which would hinder its ability to generate
> the code you want.
>
> Note that using a define_split to match these things prior to reload likely
> won't work because combine will likely see the split pattern as being the
> same cost as the original insns.
>
> In general the combiner really isn't concerned with superscalar issues,
> though you can tackle some superscalar things with creative patterns that
> match parallels or which match something more complex, but then split it up
> later.
>
> Note that GCC supports a number of superscalar architectures -- they were
> very common for workstations and high end embedded processors for many
> years. MIPS, PPC, HPPA, even x86, etc all have variants which were tuned
> for superscalar code generation. I'm sure there's tricks you can exploit in
> every one of those architectures to help generate code with fewer data
> dependencies and thus more opportunities to exploit the superscalar nature
> of your processor.
>
> jeff
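Jeff's first suggestion can be sketched as a define_insn_and_split for a hypothetical target. Everything here is made up for illustration (the pattern name, the DImode assumption, the constraints); the idea is simply to match the two-set PARALLEL the combiner builds, emit nothing directly ("#"), and split it back into the two machine insns after reload so the scheduler can bundle them:

```
;; Hedged sketch, not from any real port: accept the combined
;; shifted-add + shift PARALLEL, then break it apart after reload.
;; The early-clobber on operand 0 keeps the split sequence correct
;; if operand 0 would otherwise overlap operand 2.
(define_insn_and_split "*shadd_keep_shift"
  [(set (match_operand:DI 0 "register_operand" "=&r")
        (plus:DI (mult:DI (match_operand:DI 2 "register_operand" "r")
                          (const_int 4))
                 (match_operand:DI 3 "register_operand" "r")))
   (set (match_operand:DI 1 "register_operand" "=r")
        (ashift:DI (match_dup 2) (const_int 2)))]
  ""
  "#"
  "&& reload_completed"
  [(set (match_dup 0)
        (plus:DI (mult:DI (match_dup 2) (const_int 4))
                 (match_dup 3)))
   (set (match_dup 1)
        (ashift:DI (match_dup 2) (const_int 2)))]
  "")
```

After the split the two insns have no data dependence on each other, so an accurate DFA scheduling description can issue them in the same cycle.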
Is this FE bug or am I missing something?
Guys,

Small sample below fails (at least on 6.1) for multiple targets. The difference between the two functions starts at the very first tree pass... Please confirm that I'm not crazy and it's not supposed to be like this...

Thanks

--
#include "limits.h"
#include "stdio.h"

int* __attribute__((noinline)) f1(int* p, int x)
{
  return &p[x + 1];
}

int* __attribute__((noinline)) f2(int* p, int x)
{
  return &p[1 + x];
}

int P[10];

int main()
{
  int x = INT_MAX;
  if (f1(P, x) != f2(P, x)) {
    printf("Error!\n");
    abort();
  }
}
--
Re: Is this FE bug or am I missing something?
Well, my concern is not what happens on overflow (which, in the second case, -fsanitize=undefined will address), but rather the consistency of the two cases. p[x+1] generates RTL that leads to better generated code at the expense of overflowing, while p[1+x] never overflows but leads to worse code. It would be beneficial to make the behaviour consistent between those two cases.

Thanks for your input

On Mon, Sep 12, 2016 at 12:51 AM, Marc Glisse wrote:
> On Sun, 11 Sep 2016, Igor Shevlyakov wrote:
>
>> Small sample below fails (at least on 6.1) for multiple targets. The
>> difference between the two functions starts at the very first tree pass...
>
> You are missing -fsanitize=undefined (and #include <stdlib.h>).
>
> Please use the mailing list gcc-h...@gcc.gnu.org next time.
>
> --
> Marc Glisse
RE: Listing a maintainer for libcilkrts, and GCC's Cilk Plus implementation generally?
> The original plan was for Balaji to take on this role; however, his
> assignment within Intel has changed and thus he's not going to have time
> to work on Cilk+ anymore.
>
> Igor Zamyatin has been doing a fair amount of Cilk+ maintenance/bugfixing
> and it might make sense for him to own it in the long term if he's
> interested.

That's right. Can I add 2 records (cilk plus and libcilkrts) to the Various Maintainers section?

Thanks, Igor

> jeff
RE: Listing a maintainer for libcilkrts, and GCC's Cilk Plus implementation generally?
> I apologize. They got caught up in other issues. They've been merged into
> our mainstream and I believe they were just posted to the cilkplus.org
> website and submitted to GCC.

I'm going to submit the latest Cilk runtime sources next week, so I will check the mentioned change.

Thanks, Igor

> - Barry
>
> -----Original Message-----
> From: Thomas Schwinge [mailto:tho...@codesourcery.com]
> Sent: Thursday, March 5, 2015 7:42 PM
> To: Jeff Law
> Cc: Zamyatin, Igor; Iyer, Balaji V; gcc@gcc.gnu.org; Tannenbaum, Barry M;
> H.J. Lu; Jakub Jelinek
> Subject: Re: Listing a maintainer for libcilkrts, and GCC's Cilk Plus
> implementation generally?
>
> Hi!
>
> On Thu, 5 Mar 2015 13:39:44 -0700, Jeff Law wrote:
> > On 02/23/15 14:41, H.J. Lu wrote:
> > > On Mon, Sep 29, 2014 at 4:00 AM, Jakub Jelinek wrote:
> > >> On Mon, Sep 29, 2014 at 12:56:06PM +0200, Thomas Schwinge wrote:
> > >>> On Tue, 23 Sep 2014 11:02:30 +, "Zamyatin, Igor" wrote:
> > >>>> Jeff Law wrote:
> > >>>>> The original plan was for Balaji to take on this role; however,
> > >>>>> his assignment within Intel has changed and thus he's not going
> > >>>>> to have time to work on Cilk+ anymore.
> > >>>>>
> > >>>>> Igor Zamyatin has been doing a fair amount of Cilk+
> > >>>>> maintenance/bugfixing and it might make sense for him to own it
> > >>>>> in the long term if he's interested.
> > >>>>
> > >>>> That's right.
> > >>>
> > >>> Thanks!
> > >>>
> > >>>> Can I add 2 records (cilk plus and libcilkrts) to Various
> > >>>> Maintainers section?
> > >>>
> > >>> I understand Jeff's email as a pre-approval of such a patch.
> > >>
> > >> I think only SC can appoint maintainers, and while Jeff is in the
> > >> SC, my reading of that mail wasn't that it was the SC that has
> > >> acked that, but rather a question if Igor is willing to take that
> > >> role, which then would need to be acked by SC.
> > >
> > > Where are we on this? Do we have a maintainer for Cilk Plus and its
> > > run-time library?
> > Not at this time. There was a bit of blockage on various things with
> > the steering committee (who approves maintainers). I've got a
> > half-dozen or so proposals queued (including Cilk maintainership).
>
> What's the process then, that I get my Cilk Plus (libcilkrts) portability
> patches committed to GCC? I was advised this must be routed through Intel
> (Barry M Tannenbaum CCed), which I have done months ago: I submitted the
> patches to Intel, and -- as I understood it -- Barry and I seemed to agree
> about them (at least I don't remember any requests for changes to be made
> on my side), but I have not seen a merge from Intel to update GCC's
> libcilkrts. Should I now commit to GCC the pending patches,
> <http://news.gmane.org/find-root.php?message_id=%3C8738bae1mp.fsf%40kepler.schwinge.homeip.net%3E>
> and following?
>
> Grüße,
> Thomas
Pta_flags enum overflow in i386.c
Hi All!

As you may see, the pta_flags enum in i386.c is almost full, so there is a risk of overflow in the quite near future. A comment in the source code advises to "widen struct pta flags", which is currently defined as unsigned, but that looks suboptimal. What would be the most proper solution for this problem?

Thanks in advance, Igor
Option to print word size, alignment on the target platform
Is there any option to ask GCC to print various size and alignment info for the target platform? This would be very nice during cross-compilation, when one cannot run executables to autoconfigure such parameters. Currently I'm considering a hack like compiling the following source:

#include <stddef.h>

union aligned_fields {
    double d;
    void (*f)();
    ...
};

struct align_test {
    union aligned_fields u1;
    char c;
};

const char DATA_POINTER_SIZE[sizeof(void *)];
const char FUNCTION_POINTER_SIZE[sizeof(void (*)())];
const char UNIVERSAL_ALIGN[offsetof(struct align_test, c)];
const char SHORT_SIZE[sizeof(short)];

and then running "nm --print-size" from binutils for the target on it to get:

0004 0004 C DATA_POINTER_SIZE
0004 0004 C FUNCTION_POINTER_SIZE
0002 0002 C SHORT_SIZE
0008 0008 C UNIVERSAL_ALIGN

But I doubt that this is reliable. So perhaps there is something like gcc -print-target-info?
Re: Option to print word size, alignment on the target platform
On 1/25/06, Paul Brook <[EMAIL PROTECTED]> wrote:
> Autoconf already has tests for things like this. Something along the lines of:
>
> const char D_P_S_4[sizeof(void *) == 4 ? 1 : -1];
> const char D_P_S_8[sizeof(void *) == 8 ? 1 : -1];
>
> Then see which compiles, or grep the error messages.

Right, but is there any way to learn the endianness of the platform or the direction of stack growth just from knowing whether a program compiles? GCC knows about these, and it would be nice if there were a way to expose them.

Regards, Igor
Re: Option to print word size, alignment on the target platform
On 1/25/06, Robert Dewar <[EMAIL PROTECTED]> wrote:
> A convenient way to get the endianness is to use
> the System.Bit_Order attribute in Ada.

But this requires running the program on the target, which is not possible with a cross-compiler. Or is there a trick to declare something in Ada that would force the program to miscompile depending on the target endianness?

Regards, Igor
GCC 4.1: too strict aliasing?
Consider the following code, which starting with GCC 4.1.0 generates a 'dereferencing type-punned pointer will break strict-aliasing rules' warning:

~> cat test.c
struct inner {
    struct inner *next;
};

struct outer {
    struct inner base;
    int value;
};

/* List of outer elements where all outer.base.next point to struct outer */
struct outer_list {
    struct outer *head;
};

struct outer *search_list(struct outer_list *list, int value)
{
    struct outer *elem, **pelem;
    pelem = &list->head;
    while ((elem = *pelem)) {
        if (elem->value == value) {
            /* Hit, move the element to the front of the list. */
            *pelem = (struct outer *)elem->base.next;
            elem->base.next = &list->head->base;
            list->head = elem;
            return elem;
        }
        /*** LINE GENERATING WARNING */
        pelem = (struct outer **)&elem->base.next;
    }
    return 0;
}

~> gcc -c -Wall -O2 test.c
test.c: In function 'search_list':
test.c:29: warning: dereferencing type-punned pointer will break strict-aliasing rules

But why is the warning generated? Isn't it guaranteed that offsetof(struct outer, base) == 0, and that one can always safely cast struct inner* to struct outer* if struct inner is the first member of struct outer, so that struct outer* can alias struct inner*?
Wrong code for i686 target with -O3 -flto
Hi All!

Unfortunately, the compiler now generates wrong code for the i686 target when the options -O3 and -flto are used. This started more than a month ago and is reflected in PR57602. This combination of options can be quite important, at least from the performance point of view. Since there has been almost no reaction to this PR, I'd like to ask either that someone look at it in some observable future or that the commit responsible for the issue be reverted.

Thanks, Igor
Intel® Memory Protection Extensions support in the GCC
Hi All!

This is to let you know that the enabling of Intel® MPX technology (see details in http://download-software.intel.com/sites/default/files/319433-015.pdf) in GCC has started. (The corresponding changes in binutils are here: http://sourceware.org/ml/binutils/2013-07/msg00233.html)

Currently the compiler changes for Intel® MPX have been put in the branch svn://gcc.gnu.org/svn/gcc/branches/mpx (this will soon be reflected in svn.html). Ilya Enkovich (in CC) will be the main person maintaining this branch and submitting changes to the trunk. Some implementation details can be found on the wiki.

Thanks, Igor
Compilation flags in libgfortran
Hi All!

Is there any particular reason that the matmul* modules from libgfortran are compiled with -O2 -ftree-vectorize? I see some regressions on the Atom processor after r202980 (http://gcc.gnu.org/ml/gcc-cvs/2013-09/msg00846.html). Why not just use -O3 for those modules?

Thanks, Igor
Re: Compilation flags in libgfortran
Thanks a lot for the explanation! I can take care of the benchmarking, but only on Intel hardware... Do you think that possible changes based on those results would be acceptable?

Thanks, Igor

On Tue, Oct 15, 2013 at 11:46 PM, Janne Blomqvist wrote:
> On Tue, Oct 15, 2013 at 4:58 PM, Igor Zamyatin wrote:
>> Hi All!
>>
>> Is there any particular reason that matmul* modules from libgfortran
>> are compiled with -O2 -ftree-vectorize?
>
> Yes, testing showed that it improved performance compared to the
> default options. See the thread starting at
>
> http://gcc.gnu.org/ml/fortran/2005-11/msg00366.html
>
> In the almost 8 years (!!) since the patch was merged, I believe the
> importance of vectorization for utilizing current processors has only
> increased.
>
> [snip]
>
>> Why not just use O3 for those modules?
>
> Back when the change was made, -ftree-vectorize wasn't enabled by -O3.
> IIRC I did some tests, and -O3 didn't really improve things beyond
> what "-O2 -funroll-loops -ftree-vectorize" already did. That was a
> while ago however, so if somebody (*wink*) would care to redo the
> benchmarks things might look different with today's GCC on today's
> hardware.
>
> Hope this helps,
>
> --
> Janne Blomqvist
Re: Compilation flags in libgfortran
Yeah, this is exactly my point. The Atom case just happened to trigger that fact.

On Wed, Oct 16, 2013 at 2:22 PM, Kyrill Tkachov wrote:
> On 16/10/13 10:37, pins...@gmail.com wrote:
>>> On Oct 15, 2013, at 6:58 AM, Igor Zamyatin wrote:
>>> Hi All!
>>>
>>> Is there any particular reason that matmul* modules from libgfortran
>>> are compiled with -O2 -ftree-vectorize?
>>>
>>> I see some regressions on Atom processor after r202980
>>> (http://gcc.gnu.org/ml/gcc-cvs/2013-09/msg00846.html)
>>>
>>> Why not just use O3 for those modules?
>>
>> -O3 and -O2 -ftree-vectorize won't give much performance difference. What
>> you are seeing is the cost model needs improvement; at least for atom.
>
> Hi all,
> I think http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01908.html introduced
> the new "cheap" vectoriser cost model that favors compilation time over
> runtime performance and is set as default for -O2. -O3 uses the "dynamic"
> model which potentially gives better runtime performance in exchange for
> longer compile times (if I understand the new rules correctly).
> Therefore, I'd expect -O3 to give a better vector performance than -O2...
>
> Kyrill
Re: Compilation flags in libgfortran
Yeah, I can try to benchmark with that option set instead of -O3.

Thanks, Igor

On Thu, Oct 17, 2013 at 2:19 PM, Richard Biener wrote:
> On Wed, Oct 16, 2013 at 12:22 PM, Kyrill Tkachov wrote:
>> On 16/10/13 10:37, pins...@gmail.com wrote:
>>>> On Oct 15, 2013, at 6:58 AM, Igor Zamyatin wrote:
>>>> Hi All!
>>>>
>>>> Is there any particular reason that matmul* modules from libgfortran
>>>> are compiled with -O2 -ftree-vectorize?
>>>>
>>>> I see some regressions on Atom processor after r202980
>>>> (http://gcc.gnu.org/ml/gcc-cvs/2013-09/msg00846.html)
>>>>
>>>> Why not just use O3 for those modules?
>>>
>>> -O3 and -O2 -ftree-vectorize won't give much performance difference. What
>>> you are seeing is the cost model needs improvement; at least for atom.
>>
>> Hi all,
>> I think http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01908.html introduced
>> the new "cheap" vectoriser cost model that favors compilation time over
>> runtime performance and is set as default for -O2. -O3 uses the "dynamic"
>> model which potentially gives better runtime performance in exchange for
>> longer compile times (if I understand the new rules correctly).
>> Therefore, I'd expect -O3 to give a better vector performance than -O2...
>
> But this suggests to compile with -O2 -ftree-vectorize
> -fvect-cost-model=dynamic, not building with -O3.
>
> Richard.
>
>> Kyrill
Re: GNU C extension: Function Error vs. Success
On 10.03.2014 18:27, Shahbaz Youssefi wrote:
> FILE *fin = fopen("filename", "r") !! goto exit_no_file;

Or maybe permission denied? ;-)
Tree loop if conversion at O2
Hi All!

Is there any particular reason why tree loop if-conversion (tree-if-conv.c) isn't enabled by default at -O2? As far as I can see, this is true for all platforms.

Thanks, Igor
Bug repositories
Hello,

I'm a master's student writing my thesis on bug triaging in open-source projects, and I'm wondering whether I can access a large part of the bug repository and, if so, how: by writing a crawler/parser for Bugzilla, or something else? I need 5 to 8 years of development history.

Thanks a lot, Igor K.
Request about adding a new micro support
Hi All,

Sorry for my bad English. How can I add GCC support for an 8-bit micro (Harvard architecture)? An RTFM link would be really appreciated. :-)

Thanks! Ciao, Alessio
lvx versus lxvd2x on power8
Hi all,

I recently checked this old discussion about when/why to use lxvd2x instead of lvsl/lvx/vperm/lvx to load elements from memory into a vector: https://gcc.gnu.org/ml/gcc/2015-03/msg00135.html

I had the same doubt and was also concerned about how these approaches affect performance. So I created the following project to check which one is faster and how memory alignment influences the results: https://github.com/PPC64/load_vec_cmp

It is a simple benchmark: many loads (using both approaches) are executed in a loop in order to measure which implementation is slower. The project also considers alignment. As can be seen in this plot (https://raw.githubusercontent.com/igorsnunes/load_vec_cmp/master/doc/LoadVecCompare.png), an unaligned load using lxvd2x takes more time.

The previous discussion (as far as I could see) suggests that lxvd2x performs better than lvsl/lvx/vperm/lvx in all cases. Is that correct? Is my analysis wrong? This concerns me, since lxvd2x is heavily used in compiled code.

Regards, Igor
GSoC 2023
Dear all,

I am a student of computer science and I was thinking about applying for Google Summer of Code 2023. Naturally, I wanted to reach out to you before applying for GCC projects.

From the selected topics you are interested in, several grabbed my attention:

1. Bypass assembler when generating LTO object file
2. Extend the static analysis pass
3. Rust Front-End: HIR Dump
4. Rust Front-End: Improving user errors

I have to admit that I feel a bit intimidated by the projects of "hard difficulty", because I have seen how hard it is to find your way around a large codebase (which GCC definitely is). Therefore, I would like to ask for your opinion on these topics and the level of theoretical/practical experience with compilers you expect. As for languages, I have advanced knowledge of C and intermediate knowledge of C++.

Thank you very much for your time.

Best regards, Igor Putovný
GSoC 2023
Dear all,

I am an undergraduate student of computer science and I am interested in GCC projects for Google Summer of Code 2023.

From the selected topics you are interested in, several grabbed my attention:

1. Bypass assembler when generating LTO object file
2. Rust Front-End: HIR Dump
3. Rust Front-End: Improving user errors

May I ask you for more information about these projects and the other knowledge or skills you expect?

Thank you very much for your time.

Best regards, Igor Putovný