Re: Optimization breaks inline asm code w/ptrs
On Tue, Aug 15, 2017 at 03:09:15PM +0800, Liu Hao wrote:
> On 2017/8/14 20:41, Alan Modra wrote:
> > On Sun, Aug 13, 2017 at 10:25:14PM +0930, Alan Modra wrote:
> > > On Sun, Aug 13, 2017 at 03:35:15AM -0700, David Wohlferd wrote:
> > > > Using "m"(*pStr) as an (unused) input parameter has no effect.
> > >
> > > Use "m" (*(const void *)pStr) and ignore the warning, or use
> > > "m" (*(const struct {char a; char x[];} *) pStr).
> >
> > or even better "m" (*(const char (*)[]) pStr).
>
> This should work in the sense that GCC now thinks bytes adjacent to `pStr`
> are subject to modification by the asm statement.
>
> But I just tried GCC 7.2 and it seems that even if such a "+m" constraint
> is the only output parameter of an asm statement and there is no
> `volatile` or the "memory" clobber, the GCC optimizer will not optimize
> the asm statement away, which is the case if a plain `"+m"(*pStr)` is used.

I wasn't advocating a "+m" constraint in this case.  Obviously it's
wrong to say scasb modifies memory.

That aside though, I'm mainly interested in gcc-8, and I see "+m"(*p)
preventing dead code removal even when all outputs of the asm are unused
(including, of course, the array pointed at by p).  Probably a bug.

-- 
Alan Modra
Australia Development Lab, IBM
Re: [sparc64] kernel OOPS with gcc 7.1 / 7.2
On Wed, Aug 16, 2017 at 7:30 AM, David Miller wrote:
> From: Anatoly Pugachev
> Date: Tue, 15 Aug 2017 21:50:45 +0300
>
>> Together with Dmitry (ldv), we've discovered that running the test suite
>> from strace produces a kernel OOPS when the kernel is compiled with gcc 7.1
>> or with gcc 7.2, but not with gcc 6:
>
> Please try this patch:

Dave,

this patch fixes the OOPS, thanks.  Tested on an ldom (gcc 7.2, git
kernel + patch, git strace).

Going to check with a Sun SPARC V215 (sun4u), but it will take some time
to compile the patched kernel...
Re: How to migrate struct rtl_opt_pass to class for GCC v6.x?
for example:

    gcc -mmpx -fplugin=./plugin.so -flto bare.c -wrapper gdb,--args

or add

    flag_lto = "";

to the testcase plugin.cpp
https://github.com/xiangzhai/dragonball/blob/master/tests/plugin.cpp#L110

then pass_rtl_emit_function's execute will not be called: position_pass
(gcc/passes.c) fails to insert the *new pass* info at the proper position,
so when execute_pass_list_1 traverses the list of passes there is *no*
pass_rtl_emit_function at all, and execute_one_pass never calls its
execute hook.

but as ChangeLog-2014 mentioned:

2014-11-13  Ilya Verbin
            Ilya Tocar
            Andrey Turetskiy
            Bernd Schmidt

        ...
        Replace flag_lto with flag_generate_lto before
        lto_streamer_hooks_init.
        ...

why is GCC v4.6 still able to work
https://github.com/xiangzhai/dragonegg/blob/master/src/Backend.cpp#L1939
but GCC v6.x or v8.x (git-20170816) isn't?  Please give me some hint about
the difference between GCC v4.6's LTO and that of GCC v6.x or v8.x (git),
thanks a lot!

-- 
Regards,
Leslie Zhai - a LLVM developer https://reviews.llvm.org/p/xiangzhai/
How to migrate ggc_alloc_XXX for GCC v8.x (git-20170816)?
Hi GCC developers,

GCC v4.6's gengtype will auto-generate allocators for known structs and
unions, for example ggc_alloc_tree2WeakVH for tree2WeakVH:
https://github.com/xiangzhai/dragonegg/blob/master/include/dragonegg/gt-cache-4.6.inc#L24

but gengtype will not auto-generate ggc_alloc_XXX for GCC v6.x or v8.x
(git-20170816), for example for struct GTY((for_user)) tree2WeakVH:
https://github.com/xiangzhai/dragonegg/blob/master/include/dragonegg/gt-cache-8.0.inc#L1284

As ChangeLog-2014 mentioned:

2014-05-17  Trevor Saunders

        ...
        (ggc_alloc): Install the type's destructor as the finalizer if it
        might do something.

Please give me some hint about ggc_alloc migration, thanks a lot!

-- 
Regards,
Leslie Zhai - a LLVM developer https://reviews.llvm.org/p/xiangzhai/
Optimizing away deletion of null pointers with g++
When compiling the following code:

    int* ptr = nullptr;
    delete ptr;

GCC 7.1 on x86_64 generates a delete-operator-related call instruction in
the resulting program with both the -O2 and -O3 optimization flags.  This
is a nonsense piece of code, indeed, but imagine a class that has an
owning raw pointer (please, don't tell me to use unique_ptr, there are
such classes in the real world).  Then, this class needs to call delete in
its move assignment operator and destructor:

    class X {
       T* ptr_;
    public:
       X() : ptr_(nullptr) { }
       X& operator=(X&& rhs) {
          delete ptr_;
          ptr_ = rhs.ptr_;
          rhs.ptr_ = nullptr;
          return *this;
       }
       ~X() { delete ptr_; }
       ...
    };

Invoking std::swap with objects of class X then yields deletion of a null
pointer 3 times (2 times within the move assignment operator and once
within the destructor).  Since GCC doesn't optimize out these null-pointer
deletions and generates the corresponding call instructions, the whole
swap carries a performance penalty.

I found out that if I change 'delete ptr_;' to 'if (ptr_) delete ptr_;',
then no call instructions are generated.  So, I did a benchmark that
sorted 100M randomly-shuffled objects of class X (where T was int) with
std::sort.  Here are the measured sorting times:

1) 40.8 seconds with plain 'delete ptr_;',
2) 31.5 seconds with 'if (ptr_) delete ptr_;',
3) 31.3 seconds with an additional custom swap free function for class X.

This is quite a big performance difference between the first two cases.
Wouldn't it be nice if g++ were able to optimize out deletion of null
pointers?  I guess in cases such as the one above the null pointers can be
recognized by static analysis.  (As far as I know, omitting delete
operator calls is standard-compliant.  Both the Clang and Intel compilers
did this in my tests.)

Cheers,
Daniel
Re: Optimizing away deletion of null pointers with g++
On Wed, Aug 16, 2017 at 12:09 PM, Daniel Langr wrote:
> When compiling the following code:
>
> int* ptr = nullptr;
> delete ptr;
>
> GCC 7.1 on x86_64 generates a delete-operator-related call instruction in
> the resulting program with both -O2 and -O3 optimization flags.  This is a
> nonsense piece of code, indeed, but imagine a class that has an owning raw
> pointer (please, don't tell me to use unique_ptr, there are such classes
> in the real world).  Then, this class needs to call delete in its move
> assignment operator and destructor:
>
> class X {
>    T* ptr_;
> public:
>    X() : ptr_(nullptr) { }
>    X& operator=(X&& rhs) {
>       delete ptr_;
>       ptr_ = rhs.ptr_;
>       rhs.ptr_ = nullptr;
>       return *this;
>    }
>    ~X() { delete ptr_; }
>    ...
> };
>
> Invoking std::swap with objects of class X then yields deletion of a null
> pointer 3 times (2 times within the move assignment operator and once
> within the destructor).  Since GCC doesn't optimize out these null-pointer
> deletions and generates the corresponding call instructions, the whole
> swap has some performance penalty.
>
> I found out that if I change 'delete ptr_;' to 'if (ptr_) delete ptr_;',
> then no call instructions are generated.  So, I did a benchmark that
> sorted 100M randomly-shuffled objects of class X (where T was int) with
> std::sort.  Here are the measured sorting times:
>
> 1) 40.8 seconds with plain 'delete ptr_;',
> 2) 31.5 seconds with 'if (ptr_) delete ptr_;',
> 3) 31.3 seconds with an additional custom swap free function for class X.
>
> This is quite a big performance difference between the first two cases.
> Wouldn't it be nice if g++ were able to optimize out deletion of null
> pointers?  I guess in cases such as the one above the null pointers can
> be recognized by static analysis.  (As far as I know, omitting delete
> operator calls is standard-compliant.  Both the Clang and Intel compilers
> did this in my tests.)

I suppose one needs to mark operator delete specially, like we do for
new with DECL_IS_OPERATOR_NEW.  Then the middle-end can delete those
calls (the flag would mean there's no side-effect in case you pass NULL).

Richard.

> Cheers,
> Daniel
Re: Optimizing away deletion of null pointers with g++
Hi,

On 16/08/2017 12:09, Daniel Langr wrote:
> When compiling the following code:
>
> int* ptr = nullptr;
> delete ptr;

I didn't understand why we don't already handle the easy case:

    constexpr int* ptr = nullptr;
    delete ptr;

and the tiny front-end tweak below would take care of it.  But I'm not
sure how much of that we want in the front-end; I would appreciate a word
from Jason about this kind of early optimization.

Thanks, Paolo.

Index: decl2.c
===================================================================
--- decl2.c	(revision 251085)
+++ decl2.c	(working copy)
@@ -499,7 +499,7 @@ delete_sanity (tree exp, tree size, bool doing_vec
     }
 
   /* Deleting a pointer with the value zero is valid and has no effect.  */
-  if (integer_zerop (t))
+  if (integer_zerop (fold_non_dependent_expr (t)))
     return build1 (NOP_EXPR, void_type_node, t);
 
   if (doing_vec)
Re: How to migrate ggc_alloc_XXX for GCC v8.x (git-20170816)?
On Wed, Aug 16, 2017 at 05:32:10PM +0800, Leslie Zhai wrote:
> Hi GCC developers,
>
> GCC v4.6's gengtype will auto-generate allocators for known structs and
> unions, for example ggc_alloc_tree2WeakVH for tree2WeakVH:
> https://github.com/xiangzhai/dragonegg/blob/master/include/dragonegg/gt-cache-4.6.inc#L24
>
> but gengtype will not auto-generate ggc_alloc_XXX for GCC v6.x or v8.x
> (git-20170816), for example for struct GTY((for_user)) tree2WeakVH:
> https://github.com/xiangzhai/dragonegg/blob/master/include/dragonegg/gt-cache-8.0.inc#L1284
>
> As ChangeLog-2014 mentioned:
>
> 2014-05-17  Trevor Saunders
>
> ...
> (ggc_alloc): Install the type's destructor as the finalizer if it
> might do something.
>
> Please give me some hint about ggc_alloc migration, thanks a lot!

If you look at the patches, they convert ggc_alloc_foo to the template
ggc_alloc, and you should do the same.

Trev

-- 
Regards,
Leslie Zhai - a LLVM developer https://reviews.llvm.org/p/xiangzhai/
Re: [sparc64] kernel OOPS with gcc 7.1 / 7.2
On Wed, Aug 16, 2017 at 11:42 AM, Anatoly Pugachev wrote:
> On Wed, Aug 16, 2017 at 7:30 AM, David Miller wrote:
>> From: Anatoly Pugachev
>> Date: Tue, 15 Aug 2017 21:50:45 +0300
>>
>>> Together with Dmitry (ldv), we've discovered that running the test suite
>>> from strace produces a kernel OOPS when the kernel is compiled with gcc 7.1
>>> or with gcc 7.2, but not with gcc 6:
>>
>> Please try this patch:
>
> Dave,
>
> this patch fixes the OOPS, thanks.  Tested on an ldom (gcc 7.2, git
> kernel + patch, git strace).
> Going to check with a Sun SPARC V215 (sun4u), but it will take some
> time to compile the patched kernel...

The strace test suite run on the V215 (sun4u), git kernel + patch
(4.13.0-rc5-00067-g510c8a899caf-dirty), debian sid gcc-7.1, git strace,
passed without a kernel OOPS.
Re: gcc behavior on memory exhaustion
Hello.

Just a small note: a link to Nathan's patch that has been recently
accepted:
https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00878.html

It provides info about process termination.

Martin
Re: Overwhelmed by GCC frustration
On 31.07.2017 19:54, Jeff Law wrote:
> On 07/31/2017 11:23 AM, Segher Boessenkool wrote:
>> On Tue, Aug 01, 2017 at 01:12:41AM +0900, Oleg Endo wrote:
>>> I could probably write a similar rant.  This is the life of a
>>> "minority target programmer".  Most development efforts are being done
>>> with primary targets in mind.  And as a result, most changes are being
>>> tested only on such targets.
>> Also, many changes require retuning of all target backends.  This never

Got the message.  This means it's actually a waste of time to work on
these backends.  The code will finally end up in the dustbin, as cc0
backends are considered undesired ballast that has to be "jettisoned".

"Deprecate all cc0" is just a nicer formulation of "deprecate most of the
cc0 backends".

Just the fact that the backends that get the most attention and attract
the most developers don't use cc0 doesn't mean cc0 is a useless device.

First of all, LRA cannot cope with cc0 (yes, I know deprecating cc0 is
just a way to deprecate all non-LRA BEs).  LRA asserts that accessing the
frame doesn't change the condition code.  LRA doesn't provide a
replacement for LEGITIMIZE_RELOAD_ADDRESS.  Hence LRA focuses on just the
comfortable, orthogonal targets.

As far as cc0 is concerned, transforming the avr BE is not trivial.  It
would need rewriting almost all of its md files entirely.  It would need
rewriting a great deal of the avr.c code that handles insn output and
provides input to NOTICE_UPDATE_CC.

But my feeling is that opposing the deprecation of cc0 is futile: the
voices that support cc0 deprecation are in the majority, and the
usefulness of cc0 is not recognized.  Sooner or later these backends will
end up in /dev/null.

Johann
Re: Overwhelmed by GCC frustration
> Just the fact that the backends that get most attention and attract
> most developers don't use cc0 doesn't mean cc0 is a useless device.

Everything that can be done with cc0 can be done with the new
representation, at least theoretically, although this can require more
work.

> As far as cc0 is concerned, transforming the avr BE is not trivial.
> It would need rewriting almost all of its md files entirely.
> It would need rewriting a great deal of avr.c that handles
> insn output and provides input to NOTICE_UPDATE_CC.

I recently converted the Visium port, which is an architecture where
every integer instruction, including a simple move, clobbers the flags,
so it's doable even for such an annoying target (but Visium is otherwise
regular).  See for example https://gcc.gnu.org/wiki/CC0Transition for
some guidelines.

> But my feeling is that opposing deprecation of cc0 is futile,
> the voices that support cc0 deprecation are more and usefulness
> of cc0 is not recognized.

cc0 is just obsolete and inferior compared to the new representation.

-- 
Eric Botcazou
Re: Overwhelmed by GCC frustration
On Wed, 2017-08-16 at 15:53 +0200, Georg-Johann Lay wrote:
>
> This means it's actually a waste of time to work on these
> backends.  The code will finally end up in the dustbin as cc0
> backends are considered undesired ballast that has to be
> "jettisoned".
>
> "Deprecate all cc0" is just a nice formulation of "deprecate
> most of the cc0 backends".
>
> Just the fact that the backends that get most attention and attract
> most developers don't use cc0 doesn't mean cc0 is a useless device.

The desire to get rid of old, crusty and unmaintained stuff is somewhat
understandable...

> First of all, LRA cannot cope with cc0 (Yes, I know deprecating
> cc0 is just to deprecate all non-LRA BEs).  LRA asserts that
> accessing the frame doesn't change condition code.  LRA doesn't
> provide a replacement for LEGITIMIZE_RELOAD_ADDRESS.  Hence LRA
> focuses on just comfortable, orthogonal targets.

It seems LRA is being praised so much, but all those niche BEs and corner
cases get zero support.  There are several known instances of SH code
regressions with LRA, and that's why I haven't switched it to LRA.

I think the problem is that it's very difficult to make a register
allocator that works well for everything.  The last attempt ended in
reload.  And eventually LRA will go down the same route.  So instead of
trying to fit a round peg into a square hole, maybe we should just have
options for round and square pegs and holes.

Cheers,
Oleg
Re: Optimizing away deletion of null pointers with g++
On Wed, 2017-08-16 at 13:30 +0200, Paolo Carlini wrote:
>
> I didn't understand why we don't already handle the easy case:
>
>     constexpr int* ptr = nullptr;
>     delete ptr;
>

What about overriding the global delete operator with some user-defined
implementation?  Is there something in the C++ standard that says the
invocation can be completely omitted, i.e. on which side of the call the
nullptr check is being done?

One possible use case could be overriding the global delete operator to
count the number of invocations, incl. for nullptr.  Not sure how useful
that is, though.

Cheers,
Oleg
Re: Optimizing away deletion of null pointers with g++
On 16 August 2017 at 15:27, Oleg Endo wrote:
> On Wed, 2017-08-16 at 13:30 +0200, Paolo Carlini wrote:
>>
>> I didn't understand why we don't already handle the easy case:
>>
>>     constexpr int* ptr = nullptr;
>>     delete ptr;
>>
>
> What about overriding the global delete operator with some user defined
> implementation?  Is there something in the C++ standard that says the
> invocation can be completely omitted, i.e. on which side of the call
> the nullptr check is being done?
>
> One possible use case could be overriding the global delete operator to
> count the number of invocations, incl. for nullptr.  Not sure how
> useful that is though.

Users can replace the deallocation function "operator delete", but this
is the delete operator ... a subtly different thing.

Anyway, the standard says:

"If the value of the operand of the delete-expression is a null
pointer value, it is unspecified whether a deallocation function will
be called as described above."

So it's permitted to omit the call to operator delete.
Re: Optimizing away deletion of null pointers with g++
On 16 August 2017 at 15:40, Jonathan Wakely wrote:
> On 16 August 2017 at 15:27, Oleg Endo wrote:
>> On Wed, 2017-08-16 at 13:30 +0200, Paolo Carlini wrote:
>>>
>>> I didn't understand why we don't already handle the easy case:
>>>
>>>     constexpr int* ptr = nullptr;
>>>     delete ptr;
>>>
>>
>> What about overriding the global delete operator with some user defined
>> implementation?  Is there something in the C++ standard that says the
>> invocation can be completely omitted, i.e. on which side of the call
>> the nullptr check is being done?
>>
>> One possible use case could be overriding the global delete operator to
>> count the number of invocations, incl. for nullptr.  Not sure how
>> useful that is though.
>
> Users can replace the deallocation function "operator delete", but this
> is the delete operator ... a subtly different thing.
>
> Anyway, the standard says:
>
> "If the value of the operand of the delete-expression is a null
> pointer value, it is unspecified whether a deallocation function will
> be called as described above."
>
> So it's permitted to omit the call to operator delete.

Before C++11 the call was required:

"The delete-expression will call a deallocation function (3.7.3.2)."

This was changed by
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#348
tests for GCC internal functions
Is there a setup for writing and running, as part of the test suite, unit
tests that exercise internal GCC functions like error and warning?  I ask
because of a couple of bugs that were recently reported for the %s
directive in GCC's diagnostics (81859 and 81586) that were only caught by
running valgrind on GCC, but that could have been caught by GCC's own
tests for these APIs if they existed.

I can't find any except some in the gcc.dg/plugin directory (though they
weren't written to exercise the functions but rather the plugin
interface).  Would adding a plugin for this sort of thing be the right
way to do it?

Thanks
Martin
Re: tests for GCC internal functions
On Wed, Aug 16, 2017 at 08:46:20AM -0600, Martin Sebor wrote:
> Is there a setup for writing and running as part of the test
> suite unit tests that exercise internal GCC functions like
> error and warning?  I ask because of a couple of bugs that
> were recently reported for the %s directive in GCC's
> diagnostics (81859 and 81586) that were only caught by running
> valgrind on GCC, but that could have been caught by GCC's own
> tests for these APIs if they existed.
>
> I can't find any except some in the gcc.dg/plugin directory
> (though they weren't written to exercise the functions but
> rather the plugin interface).  Would adding a plugin for
> this sort of thing be the right way to do it?

Isn't -fself-test the thing you're looking for?

Marek
Re: [Bug web/?????] New: Fwd: failure notice: Bugzilla down.
On 8/15/17, Jonathan Wakely wrote:
> On 15 August 2017 at 04:10, Martin Sebor wrote:
>> On 08/14/2017 04:22 PM, Eric Gallager wrote:
>>>
>>> I'm emailing this manually to the list because Bugzilla is down and I
>>> can't file a bug on Bugzilla about Bugzilla being down.  The error
>>> message looks like this:
>
> Even if it were possible, there wouldn't be any point in filing a bug
> that bugzilla's down, and so not much point emailing gcc-bugs either
> (since that's for bugzilla mail).  Using gcc@ seems like the right
> list to me.
>
> N.B. since the server is being restored from a backup, all of
> yesterday's changes to bugzilla have been lost, including all Richi's
> 7.2 release changes, and Eric's housekeeping.
>
> I don't suggest redoing all that work until all services are fully
> restored, in case it's lost again.

I see Richi redid all his 7.2 release changes; does that imply that
the server restore is now complete?
Re: Overwhelmed by GCC frustration
On 08/16/2017 08:14 AM, Eric Botcazou wrote:
>> Just the fact that the backends that get most attention and attract
>> most developers don't use cc0 doesn't mean cc0 is a useless device.
>
> Everything that can be done with cc0 can be done with the new
> representation, at least theoretically, although this can require more
> work.

Yup.

>> As far as cc0 is concerned, transforming the avr BE is not trivial.
>> It would need rewriting almost all of its md files entirely.
>> It would need rewriting a great deal of avr.c that handles
>> insn output and provides input to NOTICE_UPDATE_CC.
>
> I recently converted the Visium port, which is an architecture where
> every integer instruction, including a simple move, clobbers the flags,
> so it's doable even for such an annoying target (but Visium is otherwise
> regular).  See for example https://gcc.gnu.org/wiki/CC0Transition for
> some guidelines.

Yup.  I'd strongly recommend anyone contemplating a conversion to read
your guidelines.

>> But my feeling is that opposing deprecation of cc0 is futile,
>> the voices that support cc0 deprecation are more and usefulness
>> of cc0 is not recognized.
>
> cc0 is just obsolete and inferior compared to the new representation.

And cc0 is inherently buggy if you know how to poke at it.  There are
places where we can't enforce that the cc0 user must immediately follow
the cc0 setter.  We've been faulting in work-arounds when ports trip over
those problems, but I'm certain more problems in this space remain.

jeff
Re: tests for GCC internal functions
On Wed, 2017-08-16 at 16:51 +0200, Marek Polacek wrote:
> On Wed, Aug 16, 2017 at 08:46:20AM -0600, Martin Sebor wrote:
>> Is there a setup for writing and running as part of the test
>> suite unit tests that exercise internal GCC functions like
>> error and warning?  I ask because of a couple of bugs that
>> were recently reported for the %s directive in GCC's
>> diagnostics (81859 and 81586) that were only caught by running
>> valgrind on GCC, but that could have been caught by GCC's own
>> tests for these APIs if they existed.
>>
>> I can't find any except some in the gcc.dg/plugin directory
>> (though they weren't written to exercise the functions but
>> rather the plugin interface).  Would adding a plugin for
>> this sort of thing be the right way to do it?
>
> Isn't -fself-test the thing you're looking for?

Indeed; looking at those bugs, it looks like you want to be adding
to/extending the selftests near the end of pretty-print.c (that way you
can easily re-run them under valgrind via "make selftest-valgrind"),
probably in selftest::test_pp_format, which already has a little coverage
for %s.
Re: Optimizing away deletion of null pointers with g++
On 08/16/2017 08:44 AM, Jonathan Wakely wrote:
> On 16 August 2017 at 15:40, Jonathan Wakely wrote:
>> On 16 August 2017 at 15:27, Oleg Endo wrote:
>>> On Wed, 2017-08-16 at 13:30 +0200, Paolo Carlini wrote:
>>>> I didn't understand why we don't already handle the easy case:
>>>>
>>>>     constexpr int* ptr = nullptr;
>>>>     delete ptr;
>>>
>>> What about overriding the global delete operator with some user defined
>>> implementation?  Is there something in the C++ standard that says the
>>> invocation can be completely omitted, i.e. on which side of the call
>>> the nullptr check is being done?
>>>
>>> One possible use case could be overriding the global delete operator to
>>> count the number of invocations, incl. for nullptr.  Not sure how
>>> useful that is though.
>>
>> Users can replace the deallocation function "operator delete" but this
>> is the delete operator ... a subtly different thing.
>>
>> Anyway, the standard says:
>>
>> "If the value of the operand of the delete-expression is a null
>> pointer value, it is unspecified whether a deallocation function will
>> be called as described above."
>>
>> So it's permitted to omit the call to operator delete.
>
> Before C++11 the call was required:
>
> "The delete-expression will call a deallocation function (3.7.3.2)."
>
> This was changed by
> http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#348

Which I think argues that we can safely remove a call to operator delete
when we know the pointer is null.  However, we can not assume that an
object passed to operator delete is non-null.

ISTM this would be better implemented in the optimizers rather than in
the front-end.  tree-ssa-dce.c would seem fairly natural.  The only
wrinkle is we can't do it in C++03 or earlier mode.

jeff
Re: tests for GCC internal functions
On 08/16/2017 09:20 AM, David Malcolm wrote:
> On Wed, 2017-08-16 at 16:51 +0200, Marek Polacek wrote:
>> On Wed, Aug 16, 2017 at 08:46:20AM -0600, Martin Sebor wrote:
>>> Is there a setup for writing and running as part of the test suite
>>> unit tests that exercise internal GCC functions like error and
>>> warning?  I ask because of a couple of bugs that were recently
>>> reported for the %s directive in GCC's diagnostics (81859 and 81586)
>>> that were only caught by running valgrind on GCC, but that could have
>>> been caught by GCC's own tests for these APIs if they existed.
>>>
>>> I can't find any except some in the gcc.dg/plugin directory (though
>>> they weren't written to exercise the functions but rather the plugin
>>> interface).  Would adding a plugin for this sort of thing be the
>>> right way to do it?
>>
>> Isn't -fself-test the thing you're looking for?
>
> Indeed; looking at those bugs, it looks like you want to be adding
> to/extending the selftests near the end of pretty-print.c (that way you
> can easily re-run them under valgrind via "make selftest-valgrind"),
> probably in selftest::test_pp_format, which already has a little
> coverage for %s.

You're both right, thanks!  It didn't occur to me to look in the file I'm
changing, where there already are a bunch of self tests for the pretty
printer.

Martin
Re: [sparc64] kernel OOPS with gcc 7.1 / 7.2
From: Anatoly Pugachev
Date: Wed, 16 Aug 2017 11:42:43 +0300

> On Wed, Aug 16, 2017 at 7:30 AM, David Miller wrote:
>> From: Anatoly Pugachev
>> Date: Tue, 15 Aug 2017 21:50:45 +0300
>>
>>> Together with Dmitry (ldv), we've discovered that running the test suite
>>> from strace produces a kernel OOPS when the kernel is compiled with gcc 7.1
>>> or with gcc 7.2, but not with gcc 6:
>>
>> Please try this patch:
>
> Dave,
>
> this patch fixes the OOPS, thanks.  Tested on an ldom (gcc 7.2, git
> kernel + patch, git strace).

Thanks for testing.
Re: [Bug web/?????] New: Fwd: failure notice: Bugzilla down.
On Mon, Aug 14, 2017 at 11:10 PM, Martin Sebor wrote:
> On 08/14/2017 04:22 PM, Eric Gallager wrote:
>>
>> I'm emailing this manually to the list because Bugzilla is down and I
>> can't file a bug on Bugzilla about Bugzilla being down.  The error
>> message looks like this:
>
> Bugzilla and the rest of gcc.gnu.org have been down much of
> the afternoon/evening due to a hard drive upgrade (the old disk
> apparently failed).  You're not the only one who found out about

There's no RAID?
Re: [Bug web/?????] New: Fwd: failure notice: Bugzilla down.
On Wed, 16 Aug 2017, Eric Gallager wrote:
> I see Richi redid all his 7.2 release changes; does that imply that
> the server restore is now complete?

No, there's still a search process ongoing to identify corrupted or
missing files by comparison with the last backup.

My expectation is that all the other Bugzilla changes from 13 and 14
August UTC need redoing manually (recreating bugs with new numbers in the
case of new bugs filed during that period, if those bugs are still
relevant, repeating comments, etc. - and possibly recreating accounts for
people who created accounts and filed bugs during that period).  But I
haven't seen any official announcement from overseers to all affected
projects (for both GCC and Sourceware Bugzillas) yet.

-- 
Joseph S. Myers
jos...@codesourcery.com
Re: [Bug web/?????] New: Fwd: failure notice: Bugzilla down.
On Wed, 16 Aug 2017, NightStrike wrote:
> On Mon, Aug 14, 2017 at 11:10 PM, Martin Sebor wrote:
>> On 08/14/2017 04:22 PM, Eric Gallager wrote:
>>>
>>> I'm emailing this manually to the list because Bugzilla is down and I
>>> can't file a bug on Bugzilla about Bugzilla being down.  The error
>>> message looks like this:
>>
>> Bugzilla and the rest of gcc.gnu.org have been down much of
>> the afternoon/evening due to a hard drive upgrade (the old disk
>> apparently failed).  You're not the only one who found out about
>
> There's no RAID?

The problem was apparently a kernel bug (in the 2.6.32 RHEL6 kernel; it
may or may not be in current kernels) that arose when adding a new SSD
mirror to the RAID with the old hard drives marked writemostly.

(Separately, one of the hard drives in the existing HDD RAID is showing
signs of failing and needs to be replaced.)

-- 
Joseph S. Myers
jos...@codesourcery.com
Re: Should --enable-checking=yes,rtl work on 32-bit hosts?
On Tue, Aug 15, 2017 at 5:43 PM, Daniel Santos wrote:
> On 08/15/2017 06:18 AM, Richard Biener wrote:
>> On Mon, Aug 14, 2017 at 5:23 PM, H.J. Lu wrote:
>>> For GCC 8, when --enable-checking=yes,rtl is used with x32 GCC,
>>> I got
>>>
>>> cc1plus: out of memory allocating 56137200 bytes after a total of
>>> 3139436544 bytes
>>> make[5]: *** [Makefile:1104: insn-extract.o] Error 1
>>> make[5]: *** Waiting for unfinished jobs
>>>
>>> gcc-7-branch is OK.  Is this expected?
>>
>> I suppose not.  Who allocates all the memory?
>>
>> Richard.
>
> Well, I didn't dig into it too deeply, but below is a backtrace just
> prior to the error.  I'm not at all intimate with gcc's memory
> management, or whether gcc keeps track of how much each component has
> allocated.
>
> (gdb) bt
> #0  __GI___libc_malloc (bytes=56137200) at malloc.c:2905
> #1  0x025bc8dc in xmalloc (size=56137200) at
>     /home/daniel/proj/sys/gcc/head/libiberty/xmalloc.c:147
> #2  0x0124ffa7 in (anonymous namespace)::pass_cprop_hardreg::execute
>     (this=0x32b8e50, fun=0xd359d270) at
>     /home/daniel/proj/sys/gcc/head/gcc/regcprop.c:1272
> #3  0x011c9f23 in execute_one_pass (pass= "cprop_hardreg"(299)>) at
>     /home/daniel/proj/sys/gcc/head/gcc/passes.c:2495
> #4  0x011ca2ac in execute_pass_list_1 (pass= "cprop_hardreg"(299)>) at
>     /home/daniel/proj/sys/gcc/head/gcc/passes.c:2584
> #5  0x011ca2de in execute_pass_list_1 (pass= "*all-postreload"(-1)>) at
>     /home/daniel/proj/sys/gcc/head/gcc/passes.c:2585
> #6  0x011ca2de in execute_pass_list_1 (pass= "*rest_of_compilation"(-1)>)
>     at /home/daniel/proj/sys/gcc/head/gcc/passes.c:2585
> #7  0x011ca340 in execute_pass_list (fn=0xd359d270, pass= "fixup_cfg"(94)>)
>     at /home/daniel/proj/sys/gcc/head/gcc/passes.c:2595
> #8  0x00cb317b in cgraph_node::expand (this= "insn_extract">) at
>     /home/daniel/proj/sys/gcc/head/gcc/cgraphunit.c:2054
> #9  0x00cb384b in expand_all_functions () at
>     /home/daniel/proj/sys/gcc/head/gcc/cgraphunit.c:2190
> #10 0x00cb4469 in symbol_table::compile (this=0xf64f60d8) at
>     /home/daniel/proj/sys/gcc/head/gcc/cgraphunit.c:2542
> #11 0x00cb46d4 in symbol_table::finalize_compilation_unit (this=0xf64f60d8)
>     at /home/daniel/proj/sys/gcc/head/gcc/cgraphunit.c:2631
> #12 0x013c7412 in compile_file () at
>     /home/daniel/proj/sys/gcc/head/gcc/toplev.c:496
> #13 0x013c9c51 in do_compile () at
>     /home/daniel/proj/sys/gcc/head/gcc/toplev.c:2037
> #14 0x013c9f76 in toplev::main (this=0xcb20, argc=77, argv=0xcc14) at
>     /home/daniel/proj/sys/gcc/head/gcc/toplev.c:2171
> #15 0x024fd468 in main (argc=77, argv=0xcc14) at
>     /home/daniel/proj/sys/gcc/head/gcc/main.c:39

I opened:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81869

-- 
H.J.
How to Verify GNU Releases
Hello,

I have no problem working with GnuPG.  However, files on ftp://ftp.gnu.org
are signed by individuals, at least for the projects I am interested in
(binutils, gcc, gdb).  Where might I find GNU's endorsement of the
signers?

I started another thread about a related issue; my apologies if I should
have continued it.

R0b0t1.
Re: How to Verify GNU Releases
On Wed, Aug 16, 2017 at 3:53 PM, R0b0t1 wrote:
> Hello,
>
> I have no problem working with GnuPG. However, files on ftp://ftp.gnu.org
> are signed by individuals, at least for the projects I am interested
> in (binutils, gcc, gdb). Where might I find GNU's endorsement of the
> signers?
>
> I started another thread about a related issue, my apologies if I
> should have continued it.

My apologies, the keys used are listed here:
https://gcc.gnu.org/mirrors.html. I had found this on my own and it was
previously related to me on this list. Hopefully my memory improves.

R0b0t1.
Re: Overwhelmed by GCC frustration
On Wed, Aug 16, 2017 at 03:53:24PM +0200, Georg-Johann Lay wrote:
> This means it's actually a waste of time to work on these backends. The
> code will finally end up in the dustbin as cc0 backends are considered
> undesired ballast that has to be "jettisoned".
>
> "Deprecate all cc0" is just a nice formulation of "deprecate
> most of the cc0 backends".

_All_ cc0 backends.  We cannot remove cc0 support without removing all
targets that depend on it.  The push for moving away from cc0 isn't
anything new.

> First of all, LRA cannot cope with cc0 (Yes, I know deprecating
> cc0 is just to deprecate all non-LRA BEs).

No, it isn't that at all.  CC0 is problematic in very many places.  It
is a blocker for removing old reload, that is true.

> As far as cc0 is concerned, transforming avr BE is not trivial.

That unfortunately is true for all cc0 backends.  It requires looking
over all of the backend code (not just the MD files even), and it
requires knowing the actual target behaviour in detail.  And it cannot
be done piecemeal, it's an all-or-nothing switch.

> But my feeling is that opposing deprecation of cc0 is futile; the
> voices that support cc0 deprecation are in the majority, and the
> usefulness of cc0 is not recognized.
>
> Sooner or later these backends will end up in /dev/null.

If they aren't converted, yes.  A more constructive question is: what
can be done to make conversion easier and less painful?


Segher
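For readers unfamiliar with what such a conversion entails: in a cc0 backend,
compares write the implicit (cc0) register and must immediately precede their
use, while a converted backend names an explicit flags register in CCmode so
data flow on the flags becomes visible to the RTL passes. A schematic sketch
(not taken from any real port; the insn names, CC_REGNUM constant, and output
templates are placeholders):

```lisp
;; cc0 style: condition codes are implicit; the compare must directly
;; precede the jump or other user, and no pass may separate them.
(define_insn "*cmpsi"
  [(set (cc0)
        (compare (match_operand:SI 0 "register_operand" "r")
                 (match_operand:SI 1 "register_operand" "r")))]
  ""
  "cmp %0,%1")

;; CCmode style: the flags register is explicit (CC_REGNUM is a
;; target-defined hard register number), so clobbers and uses of the
;; flags are tracked like any other register.
(define_insn "*cmpsi_cc"
  [(set (reg:CC CC_REGNUM)
        (compare:CC (match_operand:SI 0 "register_operand" "r")
                    (match_operand:SI 1 "register_operand" "r")))]
  ""
  "cmp %0,%1")
```

The all-or-nothing nature Segher describes follows from this: once the port
defines CCmode patterns, every pattern that implicitly clobbered cc0 (most
arithmetic on many of these targets) must gain an explicit clobber of the
flags register, or the generated code is silently wrong.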
Redundant loads for bitfield accesses
Hi,

Is there any reason for 3 loads being issued for these bitfield accesses,
given two of the loads are bytes, and one is a half; the compiler appears
to know the structure is aligned at a half word boundary. Secondly, the
riscv code is using a mixture of 32-bit and 64-bit adds and shifts.
Thirdly, with -Os the riscv code size is the same, but the schedule is
less than optimal, i.e. the 3rd load is issued much later.

- https://cx.rv8.io/g/2YDLTA

code:

struct foo {
  unsigned int a : 5;
  unsigned int b : 5;
  unsigned int c : 5;
};

unsigned int proc_foo(struct foo *p)
{
    return p->a + p->b + p->c;
}

riscv asm:

proc_foo(foo*):
  lhu a3,0(a0)
  lbu a4,0(a0)
  lbu a5,1(a0)
  srliw a3,a3,5
  andi a0,a4,31
  srli a5,a5,2
  andi a4,a3,31
  addw a0,a0,a4
  andi a5,a5,31
  add a0,a0,a5
  ret

x86_64 asm:

proc_foo(foo*):
  movzx edx, BYTE PTR [rdi]
  movzx eax, WORD PTR [rdi]
  mov ecx, edx
  shr ax, 5
  and eax, 31
  and ecx, 31
  lea edx, [rcx+rax]
  movzx eax, BYTE PTR [rdi+1]
  shr al, 2
  and eax, 31
  add eax, edx
  ret

hand coded riscv asm:

proc_foo(foo*):
  lhu a1,0(a0)
  srli a2,a1,5
  srli a3,a1,10
  andi a0,a1,31
  andi a2,a2,31
  andi a3,a3,31
  add a0,a0,a2
  add a0,a0,a3
  ret

Michael
Re: Redundant loads for bitfield accesses
Here’s a more extreme example:

- https://cx.rv8.io/g/2HWQje

The bitfield type is unsigned int, so one or two 32-bit loads should
suffice (depending on register pressure). GCC is issuing a lw at some
point in the asm.

struct foo {
  unsigned int a : 3;
  unsigned int b : 3;
  unsigned int c : 3;
  unsigned int d : 3;
  unsigned int e : 3;
  unsigned int f : 3;
  unsigned int g : 3;
  unsigned int h : 3;
  unsigned int i : 3;
  unsigned int j : 3;
};

unsigned int proc_foo(struct foo *p)
{
    return p->a + p->b + p->c + p->d + p->d +
           p->e + p->f + p->g + p->h + p->i + p->j;
}

> On 17 Aug 2017, at 10:29 AM, Michael Clark wrote:
>
> Hi,
>
> Is there any reason for 3 loads being issued for these bitfield accesses,
> given two of the loads are bytes, and one is a half; the compiler appears to
> know the structure is aligned at a half word boundary. Secondly, the riscv
> code is using a mixture of 32-bit and 64-bit adds and shifts. Thirdly, with
> -Os the riscv code size is the same, but the schedule is less than optimal.
> i.e. the 3rd load is issued much later.
>
> - https://cx.rv8.io/g/2YDLTA
>
> code:
>
> struct foo {
>   unsigned int a : 5;
>   unsigned int b : 5;
>   unsigned int c : 5;
> };
>
> unsigned int proc_foo(struct foo *p)
> {
>     return p->a + p->b + p->c;
> }
>
> riscv asm:
>
> proc_foo(foo*):
>   lhu a3,0(a0)
>   lbu a4,0(a0)
>   lbu a5,1(a0)
>   srliw a3,a3,5
>   andi a0,a4,31
>   srli a5,a5,2
>   andi a4,a3,31
>   addw a0,a0,a4
>   andi a5,a5,31
>   add a0,a0,a5
>   ret
>
> x86_64 asm:
>
> proc_foo(foo*):
>   movzx edx, BYTE PTR [rdi]
>   movzx eax, WORD PTR [rdi]
>   mov ecx, edx
>   shr ax, 5
>   and eax, 31
>   and ecx, 31
>   lea edx, [rcx+rax]
>   movzx eax, BYTE PTR [rdi+1]
>   shr al, 2
>   and eax, 31
>   add eax, edx
>   ret
>
> hand coded riscv asm:
>
> proc_foo(foo*):
>   lhu a1,0(a0)
>   srli a2,a1,5
>   srli a3,a1,10
>   andi a0,a1,31
>   andi a2,a2,31
>   andi a3,a3,31
>   add a0,a0,a2
>   add a0,a0,a3
>   ret
>
> Michael
Re: Redundant loads for bitfield accesses
On Wed, Aug 16, 2017 at 3:29 PM, Michael Clark wrote:
> Hi,
>
> Is there any reason for 3 loads being issued for these bitfield accesses,
> given two of the loads are bytes, and one is a half; the compiler appears to
> know the structure is aligned at a half word boundary. Secondly, the riscv
> code is using a mixture of 32-bit and 64-bit adds and shifts. Thirdly, with
> -Os the riscv code size is the same, but the schedule is less than optimal.
> i.e. the 3rd load is issued much later.

Well one thing is most likely SLOW_BYTE_ACCESS is set to 0.  This
forces byte access for bit-field accesses.  The macro is misnamed now,
as it only controls bit-field accesses right now (and one thing in
dojump dealing with comparisons with AND and a constant, but that might
be dead code).  Setting it should allow you to get the code in
hand-written form.

I suspect SLOW_BYTE_ACCESS support should be removed and be assumed to
be 1, but I have no time to look into each backend to see if it is
correct to do or not.  Maybe it is wrong for AVR.

Thanks,
Andrew Pinski

> - https://cx.rv8.io/g/2YDLTA
>
> code:
>
> struct foo {
>   unsigned int a : 5;
>   unsigned int b : 5;
>   unsigned int c : 5;
> };
>
> unsigned int proc_foo(struct foo *p)
> {
>     return p->a + p->b + p->c;
> }
>
> riscv asm:
>
> proc_foo(foo*):
>   lhu a3,0(a0)
>   lbu a4,0(a0)
>   lbu a5,1(a0)
>   srliw a3,a3,5
>   andi a0,a4,31
>   srli a5,a5,2
>   andi a4,a3,31
>   addw a0,a0,a4
>   andi a5,a5,31
>   add a0,a0,a5
>   ret
>
> x86_64 asm:
>
> proc_foo(foo*):
>   movzx edx, BYTE PTR [rdi]
>   movzx eax, WORD PTR [rdi]
>   mov ecx, edx
>   shr ax, 5
>   and eax, 31
>   and ecx, 31
>   lea edx, [rcx+rax]
>   movzx eax, BYTE PTR [rdi+1]
>   shr al, 2
>   and eax, 31
>   add eax, edx
>   ret
>
> hand coded riscv asm:
>
> proc_foo(foo*):
>   lhu a1,0(a0)
>   srli a2,a1,5
>   srli a3,a1,10
>   andi a0,a1,31
>   andi a2,a2,31
>   andi a3,a3,31
>   add a0,a0,a2
>   add a0,a0,a3
>   ret
>
> Michael
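For reference, the macro under discussion is a plain target-header
definition. A sketch of what the change would look like, assuming it goes in
the port's main header (the exact file and surrounding context vary per
target; the SPARC port's comment, quoted later in this thread, documents the
intended meaning):

```c
/* In the target's config header, e.g. gcc/config/riscv/riscv.h.
   Nonzero means access to memory by bytes is no faster than access by
   words, so bit-field accesses may as well load a whole word and mask
   out the field, instead of issuing several narrow loads.  */
#define SLOW_BYTE_ACCESS 1
```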
gcc-6-20170816 is now available
Snapshot gcc-6-20170816 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/6-20170816/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 6 SVN branch
with the following options:
  svn://gcc.gnu.org/svn/gcc/branches/gcc-6-branch revision 251135

You'll find:

 gcc-6-20170816.tar.xz  Complete GCC
   SHA256=a6ca77be4af6a168b128d2ebac660d74fc50d73f37709d507940fdf887a3f807
   SHA1=b1f31d50f304387ed4256558d7da47c1968d0fd7

Diffs from 6-20170809 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
Re: Redundant loads for bitfield accesses
> On 17 Aug 2017, at 10:41 AM, Andrew Pinski wrote:
>
> On Wed, Aug 16, 2017 at 3:29 PM, Michael Clark wrote:
>> Hi,
>>
>> Is there any reason for 3 loads being issued for these bitfield accesses,
>> given two of the loads are bytes, and one is a half; the compiler appears to
>> know the structure is aligned at a half word boundary. Secondly, the riscv
>> code is using a mixture of 32-bit and 64-bit adds and shifts. Thirdly, with
>> -Os the riscv code size is the same, but the schedule is less than optimal.
>> i.e. the 3rd load is issued much later.
>
> Well one thing is most likely SLOW_BYTE_ACCESS is set to 0.  This
> forces byte access for bit-field accesses.  The macro is misnamed now
> as it only controls bit-field accesses right now (and one thing in
> dojump dealing with comparisons with AND and a constant but that might
> be dead code).  This should allow for you to get the code in hand
> written form.
> I suspect SLOW_BYTE_ACCESS support should be removed and be assumed to
> be 1 but I have no time to look into each backend to see if it is
> correct to do or not.  Maybe it is wrong for AVR.

Thanks, that’s interesting.

So I should try compiling the riscv backend with SLOW_BYTE_ACCESS = 1?
Less risk than making a change to x86.

This is clearly distinct from slow unaligned access. It seems odd that
-O3 doesn’t coalesce loads even if byte access is slow, as one would
expect the additional cost of the extra loads to outweigh the fact that
byte accesses are not slow, unless something weird is happening with the
costs of loads of different widths.

x86 could also be helped here too. I guess subsequent loads will be
served from L1, but that’s not really an excuse for this codegen when
the element is 32-bit aligned (unsigned int).
Re: Redundant loads for bitfield accesses
When implementing the RISC-V port, I took the name of this macro at face
value.  It does seem that we should follow Andrew's advice and set it to 1.
(For your examples, doing so does improve code generation.)  We'll submit a
patch if the change doesn't regress.

On Wed, Aug 16, 2017 at 4:00 PM, Michael Clark wrote:
> ‘cc’ing Andrew Waterman
>
> I see this comment in SPARC:
>
> /* Nonzero if access to memory by bytes is slow and undesirable.
>    For RISC chips, it means that access to memory by bytes is no
>    better than access by words when possible, so grab a whole word
>    and maybe make use of that. */
> #define SLOW_BYTE_ACCESS 1
>
> The description says that byte access is no better than words, so as you
> mention, the macro seems to be misnamed. I think this should be set to 1 on
> RISC-V. I’m going to try it on the RISC-V backend.
>
> Andrew W, here is the example code-gen:
>
> - https://cx.rv8.io/g/2YDLTA
> - https://cx.rv8.io/g/2HWQje
>
> On 17 Aug 2017, at 10:52 AM, Michael Clark wrote:
>
> On 17 Aug 2017, at 10:41 AM, Andrew Pinski wrote:
>
> On Wed, Aug 16, 2017 at 3:29 PM, Michael Clark wrote:
>
> Hi,
>
> Is there any reason for 3 loads being issued for these bitfield accesses,
> given two of the loads are bytes, and one is a half; the compiler appears to
> know the structure is aligned at a half word boundary. Secondly, the riscv
> code is using a mixture of 32-bit and 64-bit adds and shifts. Thirdly, with
> -Os the riscv code size is the same, but the schedule is less than optimal.
> i.e. the 3rd load is issued much later.
>
> Well one thing is most likely SLOW_BYTE_ACCESS is set to 0.  This
> forces byte access for bit-field accesses.  The macro is misnamed now
> as it only controls bit-field accesses right now (and one thing in
> dojump dealing with comparisons with AND and a constant but that might
> be dead code).  This should allow for you to get the code in hand
> written form.
> I suspect SLOW_BYTE_ACCESS support should be removed and be assumed to
> be 1 but I have no time to look into each backend to see if it is
> correct to do or not.  Maybe it is wrong for AVR.
>
> Thanks, that’s interesting.
>
> So I should try compiling the riscv backend with SLOW_BYTE_ACCESS = 1?
> Less risk than making a change to x86.
>
> This is clearly distinct from slow unaligned access. It seems odd that
> -O3 doesn’t coalesce loads even if byte access is slow, as one would
> expect the additional cost of the extra loads to outweigh the fact that
> byte accesses are not slow, unless something weird is happening with the
> costs of loads of different widths.
>
> x86 could also be helped here too. I guess subsequent loads will be
> served from L1, but that’s not really an excuse for this codegen when
> the element is 32-bit aligned (unsigned int).
Re: Overwhelmed by GCC frustration
On Wed, Aug 16, 2017 at 11:23:27PM +0900, Oleg Endo wrote:
> > First of all, LRA cannot cope with cc0 (Yes, I know deprecating
> > cc0 is just to deprecate all non-LRA BEs).  LRA asserts that
> > accessing the frame doesn't change the condition code.  LRA doesn't
> > provide a replacement for LEGITIMIZE_RELOAD_ADDRESS.  Hence LRA
> > focuses on just the comfortable, orthogonal targets.
>
> It seems LRA is being praised so much, but all those niche BEs and
> corner cases get zero support.  There are several known instances of SH
> code regressions with LRA, and that's why I haven't switched it to
> LRA.

LRA is easier to work with than old reload, and that makes it better
maintainable.  Making LRA handle everything reload did is work, and
someone needs to do it.  LRA probably needs a few more target hooks
(a _few_) to guide its decisions.


Segher
Re: How to migrate ggc_alloc_XXX for GCC v8.x (git-20170816)?
Hi Trevor,

Thanks for your kind response!

On 2017-08-16 20:02, Trevor Saunders wrote:
> On Wed, Aug 16, 2017 at 05:32:10PM +0800, Leslie Zhai wrote:
>> Hi GCC developers,
>>
>> GCC v4.6's gengtype will auto-generate allocators for known structs
>> and unions, for example ggc_alloc_tree2WeakVH for tree2WeakVH:
>> https://github.com/xiangzhai/dragonegg/blob/master/include/dragonegg/gt-cache-4.6.inc#L24
>>
>> but gengtype will not auto-generate ggc_alloc_XXX for GCC v6.x or
>> v8.x (git-20170816), for example struct GTY((for_user)) tree2WeakVH:
>> https://github.com/xiangzhai/dragonegg/blob/master/include/dragonegg/gt-cache-8.0.inc#L1284
>>
>> As ChangeLog-2014 mentioned:
>>
>> 2014-05-17  Trevor Saunders
>> ...
>> (ggc_alloc): Install the type's destructor as the finalizer if it
>> might do something.
>>
>> Please give me some hint about ggc_alloc migration, thanks a lot!
>
> if you look at the patches they convert ggc_alloc_foo to ggc_alloc and
> you should do the same.

Thanks for your hint!  I do the same :)
https://github.com/xiangzhai/dragonegg/blob/master/src/Cache.cpp#L255

PS: how to find the relevant patch for a ChangeLog item?  I use Google,
for example: (ggc_alloc): Install the type's destructor as the finalizer
if it might do something.

-- 
Regards,
Leslie Zhai - a LLVM developer https://reviews.llvm.org/p/xiangzhai/
Release Signing Keys are Susceptible to Attack
After downloading and verifying the releases on ftp://ftp.gnu.org/gnu/, I
found that the maintainers used 1024-bit DSA keys with SHA1 content
digests. 1024-bit keys are considered susceptible to realistic attacks,
and SHA1 has been considered broken for some time.

http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-131Ar1.pdf, p17
https://shattered.io/

SHA1 is weak enough that a team of researchers was able to mount a
realistic attack at no great cost. As compilers and their utilities are a
high-value target, I would appreciate it if the maintainers moved to more
secure verification schemes.

Respectfully,
R0b0t1.
Re: GCC 7.2 Released
> The GNU Compiler Collection version 7.2 has been released.

Shouldn't the release have a tag on git?  It doesn't seem to be there:
https://gcc.gnu.org/git/?p=gcc.git;a=tags

git tag gcc-7_2_0-release 1bd23ca8c30f4827c4bea23deedf7ca33a86ffb5

BR, Klaus

> GCC 7.2 is a bug-fix release from the GCC 7 branch
> containing important fixes for regressions and serious bugs in
> GCC 7.1 with more than 95 bugs fixed since the previous release.
> This release is available from the FTP servers listed at:
>
>   http://www.gnu.org/order/ftp.html
>
> Please do not contact me directly regarding questions or comments
> about this release.  Instead, use the resources available from
> http://gcc.gnu.org.
>
> As always, a vast number of people contributed to this GCC release
> -- far too many to thank them individually!