[Bug target/58269] [4.9 Regression] ICE when building libobjc on x86_64-apple-darwin* after revision 201915
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58269

tocarip.intel at gmail dot com changed:

           What    |Removed                     |Added
                 CC|                            |tocarip.intel at gmail dot com

--- Comment #6 from tocarip.intel at gmail dot com ---

-      || (TARGET_SSE && SSE_REGNO_P (regno) && !fixed_regs[regno]));
+      || (TARGET_SSE && SSE_REGNO_P (regno)
+          && (regno < FIRST_SSE_REG + SSE_REGPARM_MAX)
+          && !fixed_regs[regno]));

Those changes are not needed. If TARGET_64BIT is false, all SSE registers
except xmm0-xmm7 should be fixed. The correct fix is:

index 0f4edb3..44b4b16 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4231,10 +4231,10 @@ ix86_conditional_register_usage (void)
   /* If AVX512F is disabled, squash the registers.  */
   if (! TARGET_AVX512F)
     {
-      for (i = FIRST_EXT_REX_SSE_REG; i < LAST_EXT_REX_SSE_REG; i++)
+      for (i = FIRST_EXT_REX_SSE_REG; i <= LAST_EXT_REX_SSE_REG; i++)
         fixed_regs[i] = call_used_regs[i] = 1, reg_names[i] = "";

-      for (i = FIRST_MASK_REG; i < LAST_MASK_REG; i++)
+      for (i = FIRST_MASK_REG; i <= LAST_MASK_REG; i++)
         fixed_regs[i] = call_used_regs[i] = 1, reg_names[i] = "";
     }
 }

-  if (TARGET_MACHO)
-    {
-      if (SSE_REGNO_P (regno) && TARGET_SSE)
-        return true;
-    }
-  else
-    {
-      if (TARGET_SSE && SSE_REGNO_P (regno)
-          && (regno < FIRST_SSE_REG + SSE_REGPARM_MAX))
-        return true;
-    }
+  if (TARGET_SSE && SSE_REGNO_P (regno)
+      && (regno < FIRST_SSE_REG + SSE_REGPARM_MAX))
+    return true;

It looks like this will break the ABI. Before, we returned true for e.g.
xmm10. I couldn't find the Darwin ABI to check which behavior is correct.
If we want to keep the current behavior, something like

index 0f4edb3..a603167 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5708,7 +5708,8 @@ ix86_function_arg_regno_p (int regno)
   if (TARGET_MACHO)
     {
-      if (SSE_REGNO_P (regno) && TARGET_SSE)
+      if (SSE_REGNO_P (regno) && TARGET_SSE
+          && ! EXT_REX_SSE_REGNO_P (regno))
         return true;
     }
   else

should work.
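The loop-bound change in the first hunk can be illustrated in isolation. The fix from `<` to `<=` suggests that the LAST_* macros name the last valid register, i.e. the range is inclusive. The sketch below is not GCC code: the DEMO_* constants and both function names are hypothetical, and the register numbers are made up for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register numbering; these are NOT GCC's real values.  */
enum {
  DEMO_FIRST_REG = 4,
  DEMO_LAST_REG  = 7,   /* inclusive upper bound, like LAST_MASK_REG */
  DEMO_NUM_REGS  = 10
};

/* Mark every register in [first, last] as fixed.  The bound is
   inclusive, so the loop condition must be `<=`.  */
void demo_squash_range (bool fixed[], int first, int last)
{
  assert (first <= last);
  for (int i = first; i <= last; i++)
    fixed[i] = true;
}

/* The same loop with `<`, as in the code before the fix.  It stops one
   register early and leaves `last` itself unfixed.  */
void demo_squash_range_buggy (bool fixed[], int first, int last)
{
  assert (first <= last);
  for (int i = first; i < last; i++)
    fixed[i] = true;
}
```

Calling both functions on zero-initialized arrays of DEMO_NUM_REGS entries shows the difference: only the `<=` version sets the entry for DEMO_LAST_REG.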
[Bug target/50038] redundant zero extensions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50038

--- Comment #2 from tocarip.intel at gmail dot com 2011-09-27 10:15:15 UTC ---
Created attachment 25369
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25369
Possible solution

Here is an experimental patch which solves this problem. I modified the
implicit-zee pass to also eliminate useless zero-extensions from QImode to
SImode. With this patch, the rgbyiqv test from the EEMBC 2.0 benchmark showed
a 6% improvement. However, after this patch implicit-zee may become useful
for additional targets. For example, it becomes beneficial for 32-bit x86
(+4% on rgbyiqv).

Here is a ChangeLog:

2011-09-27  Ilya Tocar

	* implicit-zee.c: Added 2011 to copyright.
	(combine_set_zero_extend): Add QImode.
	(merge_def_and_ze): Likewise.
	(add_removable_zero_extend): Likewise.
	(not_qi_to_si): New.
	(make_defs_and_copies_lists): Add check for QImode.
[Bug target/50038] redundant zero extensions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50038

tocarip.intel at gmail dot com changed:

           What    |Removed                     |Added
                 CC|                            |tocarip.intel at gmail dot com

--- Comment #3 from tocarip.intel at gmail dot com 2011-09-30 14:51:31 UTC ---
So, assuming this approach (modifying the implicit-zee pass) is right, we'll
have to enable this pass at least on 32-bit x86. Is that OK?

P.S. I forgot to mention that this patch bootstraps and passes make check.
[Bug target/50038] redundant zero extensions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50038

--- Comment #5 from tocarip.intel at gmail dot com 2011-10-04 09:52:03 UTC ---
This patch is experimental, and before sending it to the patches mailing list
I wanted to verify that at least the approach (modify the implicit-zee pass
and later enable it on 32-bit x86) is correct. Should I just send the
experimental version?
[Bug target/60204] struct with __m512i is mishandled in function parameter passing and return
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60204

--- Comment #5 from tocarip.intel at gmail dot com ---
Created attachment 32169
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32169&action=edit
Proposed patch.

Currently testing attached patch.
[Bug target/64387] ICE: in extract_insn, at recog.c:2327 (unrecognizable insn) with -ffloat-store -mavx512er
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64387

tocarip.intel at gmail dot com changed:

           What    |Removed                     |Added
                 CC|                            |tocarip.intel at gmail dot com

--- Comment #1 from tocarip.intel at gmail dot com ---
Created attachment 34343
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34343&action=edit
Proposed patch.
[Bug target/64387] ICE: in extract_insn, at recog.c:2327 (unrecognizable insn) with -ffloat-store -mavx512er
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64387

--- Comment #2 from tocarip.intel at gmail dot com ---
Can also be reproduced with -mavx2 instead of -mavx512er.
Proposed patch fixes both cases. Testing in progress.
[Bug target/64393] ICE: in extract_insn, at recog.c:2327 (unrecognizable insn) with -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64393

tocarip.intel at gmail dot com changed:

           What    |Removed                     |Added
                 CC|                            |tocarip.intel at gmail dot com

--- Comment #2 from tocarip.intel at gmail dot com ---
Created attachment 34346
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34346&action=edit
Proposed patch.

The attached (untested) patch allows k{or,xor,...} in the avx512vbmi case and
fixes this problem. However, I'm not sure that just enabling the whole of
avx512bw isn't a better idea.
[Bug target/64386] ICE: in extract_insn, at recog.c:2327 (unrecognizable insn) with -mavx512bw
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64386

tocarip.intel at gmail dot com changed:

           What    |Removed                     |Added
                 CC|                            |tocarip.intel at gmail dot com

--- Comment #2 from tocarip.intel at gmail dot com ---
Created attachment 34347
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34347&action=edit
Proposed patch.

This (untested) patch fixes it.
[Bug bootstrap/63853] [5.0 Regression] The use of strchrnul breaks bootstrap on x86_64-apple-darwin14.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63853

tocarip.intel at gmail dot com changed:

           What    |Removed                     |Added
                 CC|                            |tocarip.intel at gmail dot com

--- Comment #16 from tocarip.intel at gmail dot com ---
I've taken into account Jakub's input.
Committed as r218044.
[Bug target/50038] New: redundant zero extensions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50038

             Bug #: 50038
           Summary: redundant zero extensions
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: tocarip.in...@gmail.com

The following code

void t_run_test(int Pels, unsigned char * ImageInPtr, unsigned char * ImageOutPtr)
{
  int i;
  unsigned char xr, xg;
  unsigned char xy = 0;

  for (i = 0; i < Pels; i++)
    {
      xr = *ImageInPtr++;
      xg = *ImageInPtr++;
      xy = (unsigned char) ((19595*xr + 38470*xg) >> 16);
      *ImageOutPtr++ = xy;
    }
}

is compiled at -O2 by both gcc 4.5.1

(Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,lto --enable-plugin --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux)

and the trunk version

(Target: x86_64-unknown-linux-gnu
Configured with: ../configure --enable-languages=c --disable-bootsrap
Thread model: posix
gcc version 4.7.0 20110808 (experimental) (GCC))

to

        ...
        movzbl  (%rsi), %edi
        movzbl  1(%rsi), %eax
        movq    %rcx, %rsi
        movzbl  %dil, %edi      <- redundant
        movzbl  %al, %eax       <- redundant
        imull   $19595, %edi, %edi
        imull   $38470, %eax, %eax
        addl    %edi, %eax
        ...

For example, icc generates

        ...
        movzbl  (%rsi), %ecx
        incl    %eax
        movzbl  1(%rsi), %r8d
        addq    $2, %rsi
        imull   $19595, %ecx, %r10d

without the unnecessary zero extensions.
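Why the second movzbl in each pair is redundant can be modelled in plain C. This is only a sketch of the semantics, not of anything inside GCC: the three demo_* helpers are hypothetical names introduced here, one standing in for a byte load with zero extension and one for a register-to-register zero extension of the low byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models `movzbl (mem), %reg`: load one byte and zero-extend it to
   32 bits, so the upper 24 bits of the result are zero.  */
uint32_t demo_load_zext8 (const uint8_t *p)
{
  assert (p != NULL);
  return (uint32_t) *p;
}

/* Models `movzbl %dil, %edi`: keep the low byte of a 32-bit value and
   clear the upper 24 bits.  */
uint32_t demo_zext8 (uint32_t reg)
{
  return reg & 0xFFu;
}

/* Returns nonzero when extending an already zero-extended byte leaves
   the value unchanged, i.e. when the second extension is redundant.  */
int demo_second_zext_is_redundant (const uint8_t *p)
{
  uint32_t once  = demo_load_zext8 (p);
  uint32_t twice = demo_zext8 (once);
  return once == twice;
}
```

For every possible byte value the check holds, because the load has already cleared the upper bits that the second instruction would clear again.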