Hi Howard,

> Coincidentally I also explored this option in another product.  We
> ended up implementing it and it seemed to work quite well.  It did
> require the back end to "register" with the preprocessor those
> builtins it implemented, and quite frankly I don't know exactly how
> that registration worked.  But from a library writer's point of view,
> it was pretty cool.  For example:
>
> inline
> unsigned
> count_bits32(_CSTD::uint32_t x)
> {
> #if __has_intrinsic(__builtin_count_bits32)
>     return __builtin_count_bits32(x);
> #else
>     x -= (x >> 1) & 0x55555555;
>     x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
>     x = (x + (x >> 4)) & 0x0F0F0F0F;
>     x += x >> 8;
>     x += x >> 16;
>     return (unsigned)x & 0xFF;
> #endif
> }
>
> In general, if __builtin_XYZ is implemented in the BE, then
> __has_intrinsic(__builtin_XYZ) answers true, else it answers false.
> This offers a lot of generality to the library writer.  Generic
> implementations are written once and for all (in the library), and
> need not be inlined.

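For comparison, the same feature-test pattern can be sketched with facilities that exist in current compilers. This is only an illustration under stated assumptions: `__has_intrinsic` and `__builtin_count_bits32` belong to the other product mentioned in the quote, so the sketch substitutes `__has_builtin` (provided by Clang and by GCC 10 and later) and `__builtin_popcount`. The helper name `count_bits32_portable` is hypothetical.

```cpp
#include <cstdint>

// __has_builtin is provided by Clang and by GCC 10 and later. Define a
// conservative fallback so the #if below is still valid elsewhere.
#ifndef __has_builtin
#  define __has_builtin(x) 0
#endif

// Portable bit count: the same arithmetic as the #else branch of the
// quoted code. The name is hypothetical, chosen for this sketch.
inline unsigned count_bits32_portable(std::uint32_t x)
{
    x -= (x >> 1) & 0x55555555u;                       // 2-bit partial sums
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);  // 4-bit partial sums
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                  // 8-bit partial sums
    x += x >> 8;
    x += x >> 16;
    return static_cast<unsigned>(x) & 0xFFu;
}

// Uses the compiler builtin when the preprocessor reports it, otherwise
// the generic library code. Assumes unsigned int is at least 32 bits wide.
inline unsigned count_bits32(std::uint32_t x)
{
#if __has_builtin(__builtin_popcount)
    return static_cast<unsigned>(__builtin_popcount(x));
#else
    return count_bits32_portable(x);
#endif
}
```

Both branches return the same value for every input, so the choice made by the preprocessor affects only the generated code, not the result.
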
Indeed, we are striving for generality. The mechanism that you suggest would be rather easy to implement, in principle: as I wrote earlier today, it's pretty simple to extract the info and prepare a new preprocessor builtin that tells you for sure whether that specific target has the atomics implemented or not. Later I can also tell you the exact files and functions that would likely be touched, if needed; I have to dig a bit on my disk ;)

Then, however, where do we put the fallback *assembly* for each *target-specific* atomic builtin? The most natural choice seems to me libgcc, because our infrastructure of builtins *automatically* issues library calls at run time when the builtin is not available. That solution would avoid, once and for all, playing tricks with macros like the above, IMHO. And it's also flexible enough for many different targets, some even requiring different subversions (the obnoxious i386 vs i686...).

To repeat my "philosophy": the idea of having compiler builtins is good, very good, but then we should, IMHO, complete the offloading of this low-level issue to the compiler, letting the compiler (+ libgcc, if needed) take care of generating code for __exchange_and_add, or whatever.

Paolo.
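
As a rough sketch of the second approach, in which the library calls a builtin unconditionally and leaves the fallback to the compiler and its runtime library, the following uses `__atomic_fetch_add`, a builtin available in current GCC and Clang. It assumes a GCC-compatible compiler. The function name `exchange_and_add` is hypothetical, chosen to echo `__exchange_and_add` above; it is not the libstdc++ definition.

```cpp
// Hypothetical helper for this sketch; not the libstdc++ implementation
// of __exchange_and_add.
//
// No feature-test macro is needed here. Where the target has a suitable
// atomic instruction the compiler expands the builtin inline; where it
// does not, the compiler emits a call to an external runtime routine.
inline int exchange_and_add(int* mem, int val)
{
    // Atomically adds val to *mem and returns the value held before.
    return __atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL);
}
```

The library source stays the same for every target, which is the generality argued for above; the per-target differences live in the compiler and its support library.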