[Bug c++/35553] New: -fkeep-inline-functions and -O errors out in SSE headers
#include
int main(int argc, char** argv) {
    return 0;
}
---
If compiled with g++ -O -fkeep-inline-functions, this errors out with

/usr/lib/gcc/x86_64-pc-linux-gnu/4.3.1-pre20080306/include/emmintrin.h: In function ‘long long int __vector__ _mm_shuffle_epi32(long long int __vector__, int)’:
/usr/lib/gcc/x86_64-pc-linux-gnu/4.3.1-pre20080306/include/emmintrin.h:1382: error: mask must be an immediate
/usr/lib/gcc/x86_64-pc-linux-gnu/4.3.1-pre20080306/include/emmintrin.h: In function ‘long long int __vector__ _mm_shufflelo_epi16(long long int __vector__, int)’:
/usr/lib/gcc/x86_64-pc-linux-gnu/4.3.1-pre20080306/include/emmintrin.h:1376: error: mask must be an immediate

... and many more lines follow. This did not happen with 4.2.3. I cannot rule out bogus headers on the host involved, so I attached the preprocessed source.

--
Summary: -fkeep-inline-functions and -O errors out in SSE headers
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: gpiez at web dot de
GCC build triplet: x86_64-pc-linux-gnu
GCC host triplet: x86_64-pc-linux-gnu
GCC target triplet: x86_64-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35553
[Bug c++/35553] -fkeep-inline-functions and -O errors out in SSE headers
--- Comment #1 from gpiez at web dot de 2008-03-12 14:49 --- Created an attachment (id=15303) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15303&action=view) preprocessed source -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35553
[Bug middle-end/36041] Speed up builtin_popcountll
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36041 Gunther Piez changed: What|Removed |Added CC||gpiez at web dot de --- Comment #10 from Gunther Piez 2012-10-26 15:51:24 UTC --- Just noted the exceptional slowness of the provided __builtin_popcountll() even on ARMv5. I already used the above parallel bit count algorithm in the case that a native bit count instruction (like the SSE popcnt or NEON vcnt) is not present, but native 64 bit registers are available. But on a 32 bit architecture like ARM I figured it made sense to just use the __builtin_popcountll() because the many 64 bit instructions in the algorithm may be very slow without NEON or similar support on a pure 32 bit architecture. But "optimizing" my code with some macro magic to make it use the library popcount made the whole program 25% slower, although only a minor part of it actually does use the popcount instruction.
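For readers without the earlier comments at hand: the "parallel bit count algorithm" mentioned above is not quoted in this message, so the following is only a sketch of the widely known SWAR variant, which may differ in detail from the one posted in the bug. The name popcount64 is made up for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: counts set bits without a native popcount instruction.
// Classic SWAR reduction: 2-bit sums, then 4-bit sums, then per-byte sums,
// and finally a multiply that adds all eight bytes into the top byte.
inline unsigned popcount64(std::uint64_t x) {
    x = x - ((x >> 1) & 0x5555555555555555ULL);                           // 2-bit counts
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL); // 4-bit counts
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;                           // per-byte counts
    return static_cast<unsigned>((x * 0x0101010101010101ULL) >> 56);      // sum of bytes
}
```

On a pure 32 bit target each of these 64 bit operations is split into several instructions, which is the cost concern raised in the comment; whether it still beats the library routine would have to be measured per target.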
[Bug c/50168] New: __builtin_ctz() and intrinsics __bsr(), __bsf() generate suboptimal code on x86_64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50168

Bug #: 50168
Summary: __builtin_ctz() and intrinsics __bsr(), __bsf() generate suboptimal code on x86_64
Classification: Unclassified
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: gp...@web.de

Testcase:

#include
static inline long my_bsfq(long x) __attribute__((__always_inline__));
static inline long my_bsfq(long x) {
    long result;
    asm(" bsfq %1, %0 \n"
        : "=r"(result)
        : "r"(x)
    );
    return result;
}

long c[64];

long f(long i) {
    return c[ __bsfq(i) ];
}

long g(long i) {
    return c[ __builtin_ctzll(i) ];
}

long h(long i) {
    return c[ my_bsfq(i) ];
}
--
When I compile this with 'gcc -O3 -g testcase.c -c -o testcase.o && objdump -d testcase', I get
--
<f>:
   0:   48 0f bc ff             bsf    %rdi,%rdi
   4:   48 63 ff                movslq %edi,%rdi
   7:   48 8b 04 fd 00 00 00    mov    0x0(,%rdi,8),%rax
   e:   00
   f:   c3                      retq

0010 <g>:
  10:   48 0f bc ff             bsf    %rdi,%rdi
  14:   48 63 ff                movslq %edi,%rdi
  17:   48 8b 04 fd 00 00 00    mov    0x0(,%rdi,8),%rax
  1e:   00
  1f:   c3                      retq

0020 <h>:
  20:   48 0f bc ff             bsf    %rdi,%rdi
  24:   48 8b 04 fd 00 00 00    mov    0x0(,%rdi,8),%rax
  2b:   00
  2c:   c3                      retq
---
Please note the unneeded 32 to 64 bit conversion 'movslq ...' inserted by the compiler in functions f() and g(). It should look like h() instead. I suspect the cause is the prototype of the builtin, whose return type 'int' does not match the "natural" return type on x86_64, which is 64 bit, the same register size as the input register. If I replace the builtin/intrinsic with the hand-written asm one, I get a nice speedup of 2% in my chess engine.
[Bug c/50168] __builtin_ctz() and intrinsics __bsr(), __bsf() generate suboptimal code on x86_64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50168

--- Comment #3 from Gunther Piez 2011-08-23 21:54:40 UTC ---
On 23.08.2011 19:58, jakub at gcc dot gnu.org wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50168
>
> Jakub Jelinek changed:
>
> What|Removed |Added
> CC||uros at gcc dot gnu.org
>
> --- Comment #2 from Jakub Jelinek 2011-08-23 17:58:52 UTC ---
> Those aren't equivalent unfortunately, because bsf and bsr insns on x86 have
> undefined value if the source is zero. While __builtin_c[lt]z* documentation
> says that the result is undefined in that case, I wonder if it would be fine
> even if long l = (int) __builtin_c[lt]z* (x); gave a value that wasn't
> actually sign-extended to 64 bits.
> The combiner already simplifies zero or sign extension of popcount/parity/ffs
> and, if ctz or clz value is defined at zero, also those, but if it is
> undefined it assumes anything in any of the bits and thus can't optimize the
> sign/zero extension away. With -mbmi it will be optimized just fine, because
> for tzcnt (and lzcnt for -mlzcnt) insns are well defined even for source
> operand zero.
>
[Bug c/50168] __builtin_ctz() and intrinsics __bsr(), __bsf() generate suboptimal code on x86_64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50168

--- Comment #4 from Gunther Piez 2011-08-23 22:00:31 UTC ---
On 23.08.2011 19:58, jakub at gcc dot gnu.org wrote:
> While __builtin_c[lt]z* documentation
> says that the result is undefined in that case, I wonder if it would be fine
> even if long l = (int) __builtin_c[lt]z* (x); gave a value that wasn't
> actually sign-extended to 64 bits.

So software operating on the assumption that the value returned by __builtin_c[lt]z* is always an int, even in the undefined case, would break as soon as it sees a value outside the int range. Which could very well happen: AFAIK in the zero case the value of the target register is just left unchanged. IMHO this is ok. I doubt that such code exists, and even if it does, it is very broken by design :-)

Just my 2 cents.
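Code that cannot rule out a zero argument can avoid relying on the undefined case altogether. A minimal sketch, assuming a GCC-compatible compiler that provides __builtin_ctzll; the wrapper name ctz64_or and its fallback parameter are invented for this illustration and are not part of GCC:

```cpp
// Hypothetical wrapper: returns the index of the lowest set bit of x,
// or the caller-supplied fallback when x is zero. The builtin is never
// called with zero, so its undefined result is never observed.
inline long ctz64_or(unsigned long long x, long if_zero) {
    return x != 0 ? static_cast<long>(__builtin_ctzll(x)) : if_zero;
}
```

The extra test costs a branch or a conditional move, so it trades the speedup discussed above for a defined result; in a hot loop where the argument is known to be non-zero, the plain builtin remains the better fit.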
[Bug lto/48246] New: ICE in lto_wpa_write_files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48246

Summary: ICE in lto_wpa_write_files
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: lto
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: gp...@web.de

Created attachment 23754
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23754
testcase

I get an ICE when compiling the testcase with

g++ -r -nostdlib testcase.ii -O3 -flto -o /dev/null

Error message is

lto1: internal compiler error: in lto_wpa_write_files, at lto/lto.c:1518

This is gcc-4.6.0-rc2.
[Bug c++/44500] [C++0x] Bogus narrowing conversion error
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44500

--- Comment #18 from Gunther Piez 2011-03-24 11:45:47 UTC ---
I have chosen the "recommended" way and added a cast; -fpermissive would allow too many other dubious constructs to pass. Still, I think C++ should get rid of implicit integer conversions :-)
[Bug c++/44500] New: Bogus narrowing conversion error
Compiling with g++ -std=c++0x, using gcc-4.5.0:

struct A {
    char x;
};

template<char C> void f() {
    char y = 42;
    A a = { y+C };
}

int main() {
    f<1>();
}

yields an "error: narrowing conversion of ‘(((int)y) + 8)’ from ‘int’ to ‘char’ inside { }". If I change the template parameter type from "char C" to "int C" the error message persists; this seems wrong too, but I am not quite sure. If I leave out the "y", everything is fine.

--
Summary: Bogus narrowing conversion error
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: gpiez at web dot de
GCC build triplet: x86_64-pc-linux-gnu
GCC host triplet: x86_64-pc-linux-gnu
GCC target triplet: x86_64-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44500
[Bug c++/44500] [C++0x] Bogus narrowing conversion error
--- Comment #2 from gpiez at web dot de 2010-06-11 11:34 ---
Sorry for the unicode mess. The error message is

'error: narrowing conversion of "(((int)y) + 1)" from "int" to "char" inside { }'.

The same error happens with a non-templated function, but if I use two template parameters, the error disappears, even if they are too large. So this is at least very inconsistent.

no error:

struct A {
    char x;
};

template<...> void f() {
    A a = { C+D };
}

int main() {
    f<1,2>();
}

still no error:

struct A {
    char x;
};

template<...> void f() {
    A a = { C+D };
}

int main() {
    f<1,2>();
}

error:

struct A {
    char x;
};

void f(char C, char D) {
    A a = { C+D };
}

int main() {
    f(1,2);
}

I believe I should not get an error, even if the template parameter type is larger than a char, as long as the template parameter value fits in a char, so

template<...> void f() {
    char y = 42;
    A a = { y+C };
}

should give no error, as long as C fits in a char. IMHO ;-)

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44500
[Bug c++/44500] [C++0x] Bogus narrowing conversion error
--- Comment #5 from gpiez at web dot de 2010-06-11 12:09 --- So is it provable that for a "T op T" to be stored in T no narrowing takes place? If the answer for T == char is no and for T == int it is yes this is rather fishy ;-) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44500
[Bug c++/44500] [C++0x] Bogus narrowing conversion error
--- Comment #9 from gpiez at web dot de 2010-06-11 13:27 ---
I understand now that after the implicit promotion to int of a non-constant value, the result of the operation can't be guaranteed to fit in the original type. But I still think it shouldn't give an error, and if the standard says so, I think it is flawed in this regard ;-)

Consider

g(); // Warning, but no error, although it can be proven that the value will not fit and this is very likely an error.

As opposed to

char c,d;
A a = { c+d };

which is very likely not an error and would only require a mild warning. IMHO.

Manuel, in your testcase, you do not only warn, you error out if compiled with -std=c++0x.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44500
[Bug c++/44500] [C++0x] Bogus narrowing conversion error
--- Comment #13 from gpiez at web dot de 2010-06-12 08:47 ---
...

--
gpiez at web dot de changed:

What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution||INVALID

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44500
[Bug c++/44500] [C++0x] Bogus narrowing conversion error
--- Comment #12 from gpiez at web dot de 2010-06-12 08:46 ---
I am closing this: it isn't a gcc bug, since gcc behaves according to the standard. The bug is in the standard, as it mandates

f<1,1>         // ok
f()            // error
g()            // no error, but undefined behaviour
f(char, char)  // error
g(int, int)    // ok

which is inconsistent and surprising.

C++0x should really have got rid of the implicit integer promotion. Wasn't the intent of the implicit promotion to be able to write

char a,b,c,d;
a = b*c/d;

and get a correct result even if b*c > CHAR_MAX? I believe nobody writes code like this anymore, and even if someone does, you could simply say "undefined behaviour" ;-) It doesn't work for ints anyway.

Instead I now have an implicit integer promotion which forces me to use an explicit cast in compound initializers, where narrowing conversion isn't allowed, while in a simple assignment of course it is allowed (or else all hell would break loose...). Why not make -Wconversion an error? At least this would be consistent ;-)

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44500
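For anyone landing on this thread with the same error, here is a minimal sketch of the explicit-cast workaround mentioned above. It assumes a C++11 compiler on a platform where char promotes to int; struct A is taken from the testcase, while make_sum is a name invented for this example.

```cpp
#include <type_traits>

struct A {
    char x;
};

// The operands of + undergo integral promotion, so char + char has type int.
static_assert(std::is_same<decltype(char() + char()), int>::value,
              "char + char is expected to promote to int");

// Hypothetical helper: the explicit cast states the int -> char conversion
// that brace initialization refuses to perform implicitly.
inline A make_sum(char c, char d) {
    A a = { static_cast<char>(c + d) };
    return a;
}
```

The cast silences the diagnostic but does not check the range; a sum outside the range of char is still converted without any warning.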
[Bug c++/44811] New: non controlable bogus warning: right/left shift count is negative
template<...> uint64_t shift(uint64_t b) {
    if (N > 0)
        return b << N;
    else
        return b >> -N;
}

int main() {
    int a = shift<-5>(0x100);
    int b = shift<5>(0x100);
    return a+b;
}
---
I am using this function template in a header, and other warnings and even errors tend to get buried in the output of bogus "shift count is negative" warnings. I understand that dead code elimination happens only in the optimizer, probably too late to see that the negative shift count branch is never executed, and I could live with that if the warning were controllable with some "-W" option. Alas, it seems it is not. The only way to suppress this warning is using "-w", which also inhibits other, potentially useful warnings, so using "-w" everywhere is not really an option.

--
Summary: non controlable bogus warning: right/left shift count is negative
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: gpiez at web dot de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44811
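One way to sidestep the warning without any "-W" switch is to make sure the branch with the negative count is never instantiated at all. The following is only a sketch of that idea, assuming the template parameter is an int N with |N| < 64; the helper struct Shifter is a name invented here and does not come from the report.

```cpp
#include <cstdint>

// Hypothetical helper: the bool parameter selects the shift direction.
// The primary template is only declared; the two partial specializations
// each contain exactly one shift.
template<bool Left, int N> struct Shifter;

template<int N> struct Shifter<true, N> {
    static std::uint64_t apply(std::uint64_t b) { return b << N; }   // N > 0 here
};

template<int N> struct Shifter<false, N> {
    static std::uint64_t apply(std::uint64_t b) { return b >> -N; }  // N <= 0 here
};

// Same interface as the function template in the report, under the
// assumption stated above about its parameter list.
template<int N> std::uint64_t shift(std::uint64_t b) {
    return Shifter<(N > 0), N>::apply(b);
}
```

Because only the selected specialization is instantiated for a given N, the front end should never see a shift with a negative constant count, so there is nothing for it to warn about.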