Built-in function question
I came across some very interesting behaviour regarding built-in functions:

    int __builtin_popcount( unsigned x );

is a gcc built-in, which returns the number of 1 bits in x.

    int foo( unsigned x )
    {
        return __builtin_popcount( x );
    }

generates a call to the __popcountsi2 function in libgcc, for every target I tried (x86, ARM and m68k). However:

    int (*bar( void ))( unsigned )
    {
        return __builtin_popcount;
    }

returns the address of the label "__builtin_popcount", which does not exist:

    int main( int argc, char *argv[] )
    {
        (void) argv;
        return (*bar())( argc );
    }

fails to compile because of an undefined reference to __builtin_popcount. The compiler does not give any warning with -Wall -Wextra -pedantic, but it spits the dummy during the linking phase.

The next quite interesting thing is the effect of optimisation. With -O1 or above, bar() returns the address of the non-existent function __builtin_popcount() *but* main(), which dereferences the result of bar(), is optimised to simply call __popcountsi2 in the library. So the linking fails because bar() (which is not actually called by main()) refers to the nonexistent function. If bar() is made static, the optimisation gets rid of it and the linking succeeds.

A further point is that the compiler generates a .globl for __popcountsi2 but does not do that for __builtin_popcount, which is rather unusual (although not fatal, since gas treats all undefined symbols as globals). gcc normally pedantically emits a .globl for every global symbol it generates or refers to, but not in this case.

At least the 4.5.x compiler behaves like that.

The info page does not say that one cannot take the address of a built-in function (and the compiler does not issue a warning about it), so a link-time failure, which depends on whether the optimiser could eliminate the need for the actual function pointer or not, is somewhat surprising.

I understand that there are very special built-in functions: some work only at compile time, some show very funky argument handling behaviour and so on. However, many are (well, seem to be) stock standard functions, realised either as a call to libgcc or as a few machine instructions, that is, behaving like inline asm() wrapped in a static inline. Those functions, I think, should really behave like ordinary (possibly static inline asm) functions. Or, if not, at least one should be warned.

I believe that the above is an issue, but I don't know if it is a bug or a feature, i.e. a compiler or a documentation issue?

Thanks,

Zoltan
Re: Built-in function question
Zoltán Kócsi writes:

> However:
>
> int (*bar( void ))( unsigned )
> {
>     return __builtin_popcount;
> }
>
> returns the address of the label "__builtin_popcount", which does not exist:
>
> int main( int argc, char *argv[] )
> {
>     (void) argv;
>     return (*bar())( argc );
> }
>
> fails to compile because of an undefined reference to __builtin_popcount.

...

> I believe that the above is an issue, but I don't know if it is bug or a
> feature, i.e. a compiler or a documentation issue?

It's a documentation issue, and also a compiler issue in that the compiler should give an error for this.

__builtin_popcount will compile to the popcnt instruction on x86_64 if you use -march=corei7-avx.

Ian
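As a possible workaround for the problem discussed in this thread, the built-in can be wrapped in an ordinary function whose address does exist. The sketch below is only an illustration: popcount_wrapper is a hypothetical name that does not appear in the original posts, and bar() keeps the shape used in the question.

```c
/* Hypothetical wrapper (not from the thread): an ordinary function that
 * has a real address, unlike the built-in itself. */
static int popcount_wrapper(unsigned x)
{
    return __builtin_popcount(x);
}

/* Same signature as bar() in the question, but it returns the address of
 * the wrapper instead of the address of the built-in. */
int (*bar(void))(unsigned)
{
    return popcount_wrapper;
}
```

Calling (*bar())( x ) then goes through the wrapper, so the link step no longer depends on whether the optimiser removed the function pointer.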
Potentially merging cxx-mem-model with mainline.
I'd like to have the cxx-mem-model branch considered for merging with mainline before we end stage 1 for GCC 4.7.

What it is
==

GCC has had the __sync built-ins for atomic operations for a number of years now. They implement a "sequentially consistent" (AKA seq-cst) synchronization model, which is the most restrictive (i.e. expensive) form of synchronization. It requires that all processes can see a consistent value for other shared memory variables at the point of the atomic operation. The new C++ standard defines other, less restrictive modes which many newer architectures can make use of for improved performance in multi-threaded environments. These will also allow the optimizer more freedom in moving code around.

This branch has developed a new set of atomic built-ins which include a memory model parameter and meet the atomic requirements of C++11.

During development, I've kept a wiki page pretty up to date about what it's all about, rooted here: http://gcc.gnu.org/wiki/Atomic/GCCMM

What it involves
===

The new __atomic prefixed built-ins have been implemented with ease of target migration and maintenance in mind:

- The __atomic expanders first look for a new __atomic RTL pattern and use that if present.
- Failing that, they fall back to using the original __sync implementation and patterns. This is less efficient if the memory model specified isn't seq-cst, but is correct in functionality. This also means all the __atomic built-ins work correctly today if a target has __sync support.

The original __sync builtins now invoke the __atomic expanders with the seq-cst memory model. This means that if a target has not been migrated to the new __atomic patterns, the __sync functions behave exactly as they do today (since __atomic expanders fall back to the original __sync patterns). When a target does specify new __atomic RTL patterns, the legacy __sync routines will automatically start using those patterns.

This means that a target does not need to support both __atomic and __sync patterns. Migrating involves simply renaming the __sync patterns and modifying them to deal with the memory model parameter.

There are new generic __atomic routines which provide support for arbitrary sized objects. Whenever possible, atomic operations are mapped to lock-free instruction sequences. This is not always possible, either due to target restrictions or oddly/large sized objects.

The __atomic builtins leave external function calls for these cases, but they target a well defined library interface documented here: http://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary
I expect to spin off a small side project to implement this library in the next few months so that a library is easily available to resolve these calls.

With a few C++ template changes to utilize these new built-ins, libstdc++-v3 should have a fully functional implementation of atomics for this release.

What it doesn't do
=

It does not address other missing aspects of the C++ memory model. In particular, bitfields are still not compliant with not introducing new potential data races. There are flags for load and store data races in mainline already, and some work has gone into limiting store data races, but testing is by no means thorough.

What's left
===

Functionality is pretty much complete, but there are a few minor loose ends still to deal with. They could be done after a merge, in the next stage, or required before... you tell me :-)

- Potentially implement -f[no]-inline-atomics (to never produce inline code and always call the library) and -f[no]-atomic-compare-swap-loop (to not fall back to a compare_and_swap loop to implement missing functionality).
- Unaligned objects have undefined behaviour at the moment. Behaviour could be defined by adding alignment checks and a parameter to __atomic_is_lock_free() for alignment checking purposes. Anything which doesn't map to one of the properly aligned 5 sized built-ins gets a library call.
- A bit of C++ template restructuring in the include files to remove the old fall back locked implementation and fully use the new __atomic builtins. (*in progress now*)
- Change external library calls for __atomic_op_fetch routines. (*patch submitted already*)
- There are a bunch of new tests that have been developed along the way, but I expect to spend the next 2 months writing more detailed and specific runtime and compile time tests. And of course, fixing any of the fallout from those tests.
- There have been no new __atomic RTL patterns created. I was thinking about leaving this until the next release, but I suppose we could migrate a couple of the more popular targets.

The final word
=

So what is the opinion/consensus on merging the branch? It would be nice to get this infrastructure in place for this release so we can get people to start using it, and then w
Re: Potentially merging cxx-mem-model with mainline.
On 26 October 2011 16:38, Andrew MacLeod wrote:

> I'd like to have the cxx-mem-model branch considered for merging with
> mainline before we end stage 1 for GCC 4.7.
>
> What it is
> ==
>
> GCC has had the __sync built-ins for atomic operations for a number of years
> now. They implement a "sequential consistent" (AKA seq-cst) synchronization
> model which is the most restrictive (ie expensive) form of synchronization.
> It requires that all processes can see a consistent value for other shared
> memory variables at the point of the atomic operation. The new C++ standard
> defines other less restrictive modes which many newer architecture can make
> use of for improved performance in multi-threaded environments. These will
> also allow the optimizer more freedom in moving code around as well.
>
> This branch has developed a new set of atomic built-ins which include a
> memory model parameter and meet the atomic requirements of C++11.
>
> During development, I've kept a wiki page pretty updates about what its all
> about, rooted here: http://gcc.gnu.org/wiki/Atomic/GCCMM
>
> What it involves
> ===
>
> The new __atomic prefixed built-ins have been implemented with ease of
> target migration and maintenance in mind:
>
> - The __atomic expanders first look for a new __atomic RTL pattern and use
> that if present.
> - failing that, they fall back to using the original __sync implementation
> and patterns. This is less efficient if the memory model specified isn't
> seq-cst, but is correct in functionality. This also means all the __atomic
> built-ins work correctly today if a target has __sync support.
>
> The original __sync builtins now invoke the __atomic expanders with the
> seq-cst memory model. This means that if a target has not been migrated to
> the new __atomic patterns, the __sync functions behave exactly as they do
> today (since __atomic expanders fall back to the original __sync patterns).
> When a target does specify new __atomic RTL patterns, the legacy __sync
> routines will automatically start using those patterns.
>
> This means that a target does not need to support both __atomic and __sync
> patterns. Migrating involves simply renaming the __sync patterns and
> modifying them to deal with the memory model parameter.
>
> There are new generic __atomic routines which provide support for arbitrary
> sized objects. Whenever possible, atomic operations are mapped to lock-free
> instruction sequences. This is not always possible either due to target
> restrictions, or oddly/large sized objects.
>
> The __atomic builtins leave external function calls for these cases, but
> they target a well defined library interface documented here:
> http://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary I expect to spin off a small
> side project to implement this library in the next few months so that a
> library is easily available to resolve these calls.

So what happens on a target that implements the __sync calls via routines in libgcc (e.g. older ARM)? Will the __atomic's get turned into the __sync calls?

Dave
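To make the difference between the two families concrete, here is a minimal sketch contrasting a legacy __sync call with __atomic calls that take a memory model argument. It assumes the built-in names as they were later documented in the GCC manual (__atomic_fetch_add, __atomic_store_n, __atomic_load_n); the thread itself does not spell them out. The variables and helper functions are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical shared state, for illustration only. */
static int counter;
static int payload;
static int ready;

/* Legacy built-in: no memory model argument, sequential consistency implied. */
int bump_sync(void)
{
    return __sync_fetch_and_add(&counter, 1);
}

/* New-style built-in: the caller chooses the memory model explicitly. */
int bump_relaxed(void)
{
    return __atomic_fetch_add(&counter, 1, __ATOMIC_RELAXED);
}

/* Release store: the write to payload is ordered before the flag update. */
void publish(int value)
{
    payload = value;
    __atomic_store_n(&ready, 1, __ATOMIC_RELEASE);
}

/* Acquire load: if the flag is seen as set, payload is safe to read. */
bool try_consume(int *out)
{
    if (!__atomic_load_n(&ready, __ATOMIC_ACQUIRE))
        return false;
    *out = payload;
    return true;
}
```

Both bump functions return the previous counter value; only the ordering guarantees requested from the compiler and hardware differ.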
gnat cross compilation
Hi,

Is it possible to compile a gnat cross compiler based on gcc 4.5.2 using my pre-installed gnat native compiler based on gcc 3.4.6? Or should I try to build my own local native compiler based on gcc 4.5.2?

I ask because, for the moment, I'm stuck with the following error during make:

a-except.adb:45:01: warning: unrecognized pragma "Compiler_Unit"

And I think that "Compiler_Unit" may not exist in gcc 3.4.6...

Regards,

Selim
Re: Potentially merging cxx-mem-model with mainline.
> Whats left
> ===
> Functionality is pretty much complete, but there are a few minor lose
> ends still to deal with. They could be done after a merge, in the
> next stage, or required before... you tell me :-)
>
> - potentially implement -f[no]-inline-atomics (to never produce
> inline code and always call the library) and
> -f[no]-atomic-compare-swap-loop (To not fall back to a
> compare_and_swap loop to implement missing functionality)
>
> - unaligned objects have undefined behaviour at the moment.
> Behaviour could be defined and add alignment checks and a parameter
> to __atomic_is_lock_free() for alignment checking purposes. Anything
> which doesn't map to one of the properly aligned 5 sized built-ins
> gets a library call.

> - A bit of C++ template restructuring in the include files to remove
> the old fall back locked implementation and fully use the new
> __atomic builtins. (*in progress now*)

Hit me off-line about this. Hopefully I can help expedite.

> - Change external library calls for __atomic_op_fetch routines.
> (*patch submitted already*)
>
> - There are a bunch of new tests that have been developed along the
> way, but I I expect to spend the next 2 months writing more detailed
> and specific runtime and compile time tests. And of course, fixing
> any of the fall out from those tests.

Yes. I don't see this as a blocker for the merge.

> The final word
> =
> So what is the opinion/consensus on merging the branch? It would be
> nice to get this infrastructure in place for this release so we can
> get people to start using it, and then we can work out any issues
> that arise.
>
> I'd have Aldy do the actual merge because if I do something will go
> amok for sure. I wont be around this weekend to fix any fallout, but
> I am around until Friday evening. I'm around all next week. I don't
> anticipate much problem since this is all new functionality for the
> most part, and mainline was merged with the branch a week or two ago.

I am really expecting this branch to be merged for 4.7. The current status is very presentable IMHO.

-benjamin
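The "compare_and_swap loop" fallback mentioned in the quoted list can be pictured with a small sketch. It builds an atomic multiply, an operation with no dedicated built-in, from __atomic_compare_exchange_n as documented in the GCC manual. The function fetch_mul is a hypothetical helper written for this illustration and is not claimed to match what the compiler generates internally.

```c
/* Hypothetical helper: atomically multiply *ptr by factor and return the
 * previous value, using a compare-and-swap loop. */
unsigned fetch_mul(unsigned *ptr, unsigned factor)
{
    unsigned old = __atomic_load_n(ptr, __ATOMIC_RELAXED);

    /* When the exchange fails, the built-in stores the value currently in
     * *ptr into 'old', so each retry recomputes old * factor from fresh
     * data. The fourth argument (1) requests the weak variant, which may
     * fail spuriously and is therefore used inside a loop. */
    while (!__atomic_compare_exchange_n(ptr, &old, old * factor, 1,
                                        __ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
        ;

    return old;
}
```

The loop retries only when another thread changed the value between the load and the exchange, or when the weak exchange fails spuriously.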
Re: Potentially merging cxx-mem-model with mainline.
On Wed, 26 Oct 2011, Andrew MacLeod wrote:

> Whats left

Out of interest, do you have any plans for the C1X side of things (_Atomic, stdatomic.h etc.)? That would of course be for 4.8 or later.

-- 
Joseph S. Myers
jos...@codesourcery.com
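For reference, the C-language interface asked about here was standardised in C11 as the _Atomic qualifier and <stdatomic.h>. The sketch below assumes a compiler and headers that provide C11 atomics; the counter and function names are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical hit counter; static storage means it starts at zero. */
static _Atomic unsigned hits;

/* Increment with relaxed ordering and return the new count. */
unsigned record_hit(void)
{
    return atomic_fetch_add_explicit(&hits, 1u, memory_order_relaxed) + 1u;
}

/* Read the counter with acquire ordering. */
unsigned read_hits(void)
{
    return atomic_load_explicit(&hits, memory_order_acquire);
}
```

The memory_order_* arguments play the same role as the memory model parameter of the __atomic built-ins described earlier in the thread.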
Re: ARM Linux EABI: unwinding through a segfault handler
> >> So, suggestions welcome. Is there a nice way to detect a signal frame?

That just makes me ask why you're trying to detect a signal frame in the first place?

> > Libunwind also reads the IP to detect signal frames on ARM Linux:
> > http://git.savannah.gnu.org/gitweb/?p=libunwind.git;a=blob;f=src/arm/Gis_signal_frame.c;hb=HEAD
> >
> > I'd also be interested if there are better approaches to detect them. :)
>
> There aren't better ways - this is pretty much the standard for
> on-stack signal frames :-)
>
> I thought we used a handler in GLIBC that was properly annotated,
> nowadays, but I might be mistaken.

We do, but the annotation is fairly approximate. Short story is that the standard EABI unwinding opcodes can't describe a signal frame accurately, especially if we don't know (when building glibc) whether the application will be using VFP.

The good news is that the EABI unwinding tables delegate all interesting behavior to the personality routine (cf. DWARF unwinding, where the frame description is parsed by libunwind). In order to accurately unwind signal frames you need to have the sa_restorer functions reference a custom personality routine that does the unwinding.

If something is specifying a non-default sa_restorer, then you lose. Don't Do That :-)

While Richard is correct that the ARM EABI only requires unwinding information at call sites, in practice as long as you use and accept the limitations imposed by -fnon-call-exceptions, and ignore stack overflows, it should be sufficient.

Paul
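The "read the IP" approach mentioned in the quoted text can be sketched as a check for the instruction pair of a sigreturn trampoline. Everything here is an assumption made for illustration: the function name is hypothetical, the encodings are the ARM-mode EABI forms of "mov r7, #119" (sigreturn), "mov r7, #173" (rt_sigreturn) and "svc 0", and Thumb or OABI trampolines are not handled. It is not claimed to match the libunwind source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ARM-mode encodings of the Linux EABI sigreturn trampolines. */
#define ARM_MOV_R7_SIGRETURN    0xe3a07077u /* mov r7, #119 (sigreturn)    */
#define ARM_MOV_R7_RT_SIGRETURN 0xe3a070adu /* mov r7, #173 (rt_sigreturn) */
#define ARM_SVC_0               0xef000000u /* svc 0                       */

/* Hypothetical check: 'code' points at the two 32-bit instruction words
 * found at the return address of the frame being examined. */
bool looks_like_signal_trampoline(const uint32_t *code)
{
    bool sets_r7 = code[0] == ARM_MOV_R7_SIGRETURN ||
                   code[0] == ARM_MOV_R7_RT_SIGRETURN;

    return sets_r7 && code[1] == ARM_SVC_0;
}
```

A real unwinder would also need to read the instruction words safely from the target address space, which this sketch leaves out.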
tree-ssa-strlen vs. zero-length strings
The file tree-ssa-strlen.c causes several warnings when compiling the Go library. The warnings look like:

../../../gccgo2/libgo/go/http/transport.go: In function ‘http.String.pN29_libgo_http.http.connectMethod’:
../../../gccgo2/libgo/go/http/transport.go:437:1: warning: offset outside bounds of constant string [enabled by default]

They are occurring because Go uses strings which are not null terminated. Some of those strings are zero length. The code in tree-ssa-strlen.c in strlen_enter_block calls get_stridx on the PHI arguments. get_stridx calls c_strlen. c_strlen issues the above warning.

I don't think it is possible to trigger this using C because I don't think it is possible to have a zero-length STRING_CST in C. Here is a Go file which will trigger the warning when compiling with -O2.

package p
func A(b bool) string {
	s := ""
	if b {
		s = "a"
	}
	return s
}

foo.go: In function ‘p.A’:
foo.go:2:1: warning: offset outside bounds of constant string [enabled by default]

I'm not quite sure what get_stridx is doing here. Perhaps the fix is as simple as checking TREE_STRING_LENGTH before calling c_strlen.

Ian
Re: gnat cross compilation
> Is it possible to compile a gnat cross compiler based on gcc 4.5.2
> using my
> pre-installed gnat native compiler based on gcc 3.4.6 ? Or should I try
> to
> build my own local native compiler based on gcc 4.5.2 ?

You need a matching native compiler first in order to build a cross GNAT compiler. So yes, you need to first build a native GCC 4.5.2 in order to build a cross 4.5.2 compiler.

Arno