First: my apologies for the delay in this reply.
[Richard wrote:]
> Well, but we do have a pretty strong if-converter on RTL
> which has access to target specific information.

Yes, I have had a quick look at it.  It looks quite thorough.  I think you [Richard] are implying that the if converter at the GIMPLE level should not be trying to do all the if-conversion work that could possibly be done.  I agree with that.  However, AFAIK the RTL work is done strictly after autovectorization, so any if conversion that is strictly for the benefit of autovectorization must be done before autovectorization and therefore at the GIMPLE level.  Corrections are welcome.

[Abe wrote:]
>> The preceding makes me wonder: has anybody considered adding an optimization profile for GCC, […] that optimizes for the amount of energy consumed?
>> I don't remember reading about anything like that […]

[Richard wrote:]
> I think there were GCC summit papers/talks about this.

Thanks, but can you please be more specific?  After writing the message quoted above as "[Abe wrote:]", I found 2 or 3 papers about compiling code with an eye towards energy efficiency, but not a whole hell of a lot, and I haven't yet found anything GCC-specific on this topic.

[Abe wrote:]
>> The old one can, in some cases, produce code that e.g. dereferences a null pointer when the same program given the same inputs would have not
>> done so without the if-conversion "optimization".

[Richard wrote:]
> Testcase?  I don't think it can and if it can this bug needs to be fixed.

With the program below my sign-off, using stock GCC 4.9.2, the code compiles and runs OK even under "-O3", and even when using "-ftree-loop-if-convert", which I guess is what you [Richard] meant by a recent comment to me [Abe].  Sebastian confirmed in person that the old if converter treats even loads differently when GCC is invoked with the full "-ftree-loop-if-convert-stores" flag rather than with only "-ftree-loop-if-convert".  With the old converter, compiling with "-ftree-loop-if-convert-stores" yields a program that segfaults due to dereferencing a null pointer [it would deref. a _lot_ of them if it _could_ ;-)].

[tested using GCC 4.9.2 on both Cygwin (64-bit) for Windows 7, AMD64 compilation, and Mac OS X 10.6.8 -- also using GCC 4.9.2 -- compiling for both ia32 and AMD64]

I intend to adapt the test case to DejaGNU format and add it to the codebase from which the patch is being generated.

[Abe wrote:]
>> The new converter reduces/eliminates this problem.

[Richard wrote:]
> You mean the -ftree-loop-if-convert-stores path.

The old converter apparently produced code with the aforementioned crashing problem only when "-ftree-loop-if-convert-stores" is/was in use, yes.  The new one should not be producing code with that problem regardless of "-ftree-loop-if-convert-stores" or lack thereof.

I think the reason for the confusing ambiguity is that, since the old converter did conversion of stores in a way that was thread-unsafe for half hammocks [e.g. C source code like "if (condition) A[a] = something;" with no attached "else", assuming "condition" and "something" are both conversion-friendly], somebody used "-ftree-loop-if-convert-stores" to mean "if-convert as much as possible even if doing so requires converting unsafely".  In the new converter we have no such unsafety TTBOMK, so I propose to remove that flag.

Regards,

Abe



Makefile
========
all: foo_______if-converted foo_______if-converted_with_stores_flag foo___NOT_if-converted

foo_______if-converted_with_stores_flag: foo.c
	gcc -std=c99 -O3 -ftree-loop-if-convert-stores foo.c -o foo_______if-converted_with_stores_flag

foo_______if-converted: foo.c
	gcc -std=c99 -O3 -ftree-loop-if-convert        foo.c -o foo_______if-converted

foo___NOT_if-converted: foo.c
	gcc -std=c99 -O3                               foo.c -o foo___NOT_if-converted



foo.c
=====
/* intentionally not defining "SIZE" here: pretending that "choose" is compiled separately from "main" */

/* a "controlled copy-paste" that takes 4 arrays, 2 of which are full of pointers,
   and for each index deref.s one of the 2 pointers and shoves the result in one of the arrays,
   based on the value in the respective index of the remaining array */
void __attribute__((noinline)) /* forcing no inlining b/c inlining would theoretically allow sufficient analysis
                                  to allow the optimizer to "see" that "cond_array[...]" is full of nothing but 0s */
choose(char* cond_array, short ** pointer_array_1, short ** pointer_array_2, short * output_array, unsigned long long len) {
  for (unsigned long long index = 0; index < len; ++index)
    if (cond_array[index])
      output_array[index] = *pointer_array_1[index];
    else
      output_array[index] = *pointer_array_2[index];
}

#define SIZE 9

int main() {
  char    condition[SIZE];
  short   data_in  [SIZE];
  short   data_out[SIZE];
  short * pointer_array_1[SIZE];
  short * pointer_array_2[SIZE];

  for (unsigned short index = 0; index < SIZE; ++index) {
    condition      [index] = 0;                /*** all false ***/
    pointer_array_1[index] = 0;                /*** all null pointers ***/
    pointer_array_2[index] = &data_in[index];  /*** all good pointers ***/
  }

  choose(&condition[0], &pointer_array_1[0], &pointer_array_2[0], &data_out[0], SIZE);
}



result from running MacOSX/ia32 compilation under GDB 6.3.50*
=============================================================
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
0x00001e2b in choose ()



result from running MacOSX/AMD64 compilation under GDB 6.3.50*
==============================================================
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000000
0x0000000100000e90 in choose ()



* Apple-supplied version of GDB, identified as: GNU gdb 6.3.50-20050815 (Apple version gdb-1515) (Sat Jan 15 08:33:48 UTC 2011)
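
For readers who want the two hazards discussed above spelled out at the source level, here is a minimal hand-written sketch in C.  It is an illustration under stated assumptions only: none of these functions is the actual output of either the old or the new converter, and all function names are hypothetical.  The first pair shows the half-hammock store case; the second pair shows the load case exercised by "choose" in foo.c.

```c
#include <stddef.h>

/* Half hammock as written in the source: A[i] is stored to only when
   cond[i] is true, and is otherwise never touched. */
void store_as_written(int *A, const char *cond, size_t len, int something)
{
    for (size_t i = 0; i < len; ++i)
        if (cond[i])
            A[i] = something;
}

/* Hypothetical branch-free form: every A[i] is read and written back,
   even when cond[i] is false.  Single-threaded results are identical,
   but the extra store can overwrite a concurrent update to A[i] made by
   another thread, and it faults if A[i] is not writable. */
void store_branch_free_unsafe(int *A, const char *cond, size_t len, int something)
{
    for (size_t i = 0; i < len; ++i)
        A[i] = cond[i] ? something : A[i];
}

/* Hypothetical branch-free form of the "choose" loop that loads through
   BOTH pointers before selecting.  It dereferences p1[i] even when
   cond[i] is false, so a null p1[i] is undefined behavior and will
   typically crash. */
void choose_load_both_unsafe(const char *cond, short **p1, short **p2,
                             short *out, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        short a = *p1[i];
        short b = *p2[i];
        out[i] = cond[i] ? a : b;
    }
}

/* Hypothetical form that selects the POINTER first and then performs a
   single load, so only the pointer the source program would have used
   is ever dereferenced. */
void choose_select_then_load(const char *cond, short **p1, short **p2,
                             short *out, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        short *p = cond[i] ? p1[i] : p2[i];
        out[i] = *p;
    }
}
```

With an all-false condition array and null pointers in the first pointer array (the setup used in foo.c), "choose_select_then_load" returns normally, whereas "choose_load_both_unsafe" has undefined behavior on the same inputs.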