Trying to map absolute sections
Hi,

I am trying to map an ELF section to an absolute address. Is there any way to tell the linker (ld) not to relocate a section and to place it at an absolute address instead? I have tried placing the absolute address in sh_addr and updating the section name to SH_ABS, but it had no effect.

Thanks in advance.
Tarun
gcse pass: expression hash table
Hi,

During expression hash table construction in the gcse pass (gcc version 3.4.1), expressions like a*b do not get included in the expression hash table. Such expressions occur in a PARALLEL along with clobbers. This means that such expressions are not being subjected to PRE. Isn't that surprising? Can anyone throw some light on this?

Anticipating a quick reply.

Thanks,
Tarun Kawatra
Post Graduate Student, CSE Dept.
IIT Bombay, India
Re: gcse pass: expression hash table
On Wed, 23 Feb 2005, James E Wilson wrote:

> Tarun Kawatra wrote:
>> During expression hash table construction in gcse pass (gcc version 3.4.1), expressions like a*b does not get included into the expression hash table. Such expressions occur in PARALLEL along with clobbers.
>
> You didn't mention the target, or exactly what the mult looks like.

The target is i386, and the mult instruction looks like the following in RTL:

(insn 22 21 23 1 (parallel [
            (set (reg/v:SI 62 [ c ])
                (mult:SI (reg:SI 66 [ a ])
                    (reg:SI 67 [ b ])))
            (clobber (reg:CC 17 flags))
        ]) 172 {*mulsi3_1} (nil)
    (nil))

> However, this isn't hard to answer just by using the source. hash_scan_set calls want_to_gcse_p calls can_assign_to_reg_p calls added_clobbers_hard_reg_p which presumably returns true, which prevents the optimization. This makes sense. If the pattern clobbers a hard reg, then we can't safely insert it at any place in the function. It might be clobbering the hard reg at a point where it holds a useful value.

If that is the reason, then even the plus expression (shown below) should not be subjected to PRE, as it also clobbers a hard register (CC). But it is being subjected to PRE. The multiplication expression, although it looks the same, does not even get into the hash table.

(insn 35 34 36 1 (parallel [
            (set (reg/v:SI 74 [ c ])
                (plus:SI (reg:SI 78 [ a ])
                    (reg:SI 79 [ b ])))
            (clobber (reg:CC 17 flags))
        ]) 138 {*addsi_1} (nil)
    (nil))

-tarun

> While looking at this, I noticed can_assign_to_reg_p does something silly. It uses "FIRST_PSEUDO_REGISTER * 2" to try to generate a test pseudo register, but this can fail if a target has less than 4 registers, or if the set of virtual registers increases in the future. This should probably be LAST_VIRTUAL_REGISTER + 1 as used in another recent patch.
Re: gcse pass: expression hash table
On Wed, 23 Feb 2005, James E Wilson wrote:

> Tarun Kawatra wrote:
>> During expression hash table construction in gcse pass (gcc version 3.4.1), expressions like a*b does not get included into the expression hash table. Such expressions occur in PARALLEL along with clobbers.
>
> You didn't mention the target, or exactly what the mult looks like.
>
> However, this isn't hard to answer just by using the source. hash_scan_set calls want_to_gcse_p calls can_assign_to_reg_p calls added_clobbers_hard_reg_p which presumably returns true, which prevents the optimization. This makes sense. If the pattern clobbers a hard reg, then we can't safely insert it at any place in the function. It might be clobbering the hard reg at a point where it holds a useful value.
>
> While looking at this, I noticed can_assign_to_reg_p does something silly.

I could not find this function (can_assign_to_reg_p) anywhere in the gcc 3.4.1 source. However, FIRST_PSEUDO_REGISTER * 2 is used directly in a make_insn_raw call in want_to_gcse_p, as follows:

  if (test_insn == 0)
    {
      test_insn
        = make_insn_raw (gen_rtx_SET (VOIDmode,
                                      gen_rtx_REG (word_mode,
                                                   FIRST_PSEUDO_REGISTER * 2),
                                      const0_rtx));
      NEXT_INSN (test_insn) = PREV_INSN (test_insn) = 0;
    }

> It uses "FIRST_PSEUDO_REGISTER * 2" to try to generate a test pseudo register, but this can fail if a target has less than 4 registers, or if the set of virtual registers increases in the future. This should probably be LAST_VIRTUAL_REGISTER + 1 as used in another recent patch.

I could not get this point.

-tarun
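One possible reading of the FIRST_PSEUDO_REGISTER * 2 remark: the test register is meant to be an ordinary pseudo, so its number has to lie above the block of virtual registers that starts at FIRST_PSEUDO_REGISTER. The sketch below is a toy calculation, not GCC code. The block size of 5 virtual registers and all function names are assumptions made only for illustration.

```c
#include <stdbool.h>

/* Assumption: the virtual registers form a contiguous block starting at
   first_pseudo.  The size 5 is illustrative, not taken from the thread. */
enum { NUM_VIRTUAL_REGS = 5 };

/* Number of the last virtual register for a given FIRST_PSEUDO_REGISTER. */
int last_virtual_register(int first_pseudo)
{
    return first_pseudo + NUM_VIRTUAL_REGS - 1;
}

/* True if regno lies above the virtual register block. */
bool is_plain_pseudo(int regno, int first_pseudo)
{
    return regno > last_virtual_register(first_pseudo);
}

/* The choice quoted from want_to_gcse_p. */
int test_regno_doubling(int first_pseudo)
{
    return first_pseudo * 2;
}

/* The suggested alternative: first number after the virtual registers. */
int test_regno_after_virtuals(int first_pseudo)
{
    return last_virtual_register(first_pseudo) + 1;
}
```

Under these assumptions, doubling works when there are many hard registers, but with only 3 hard registers it yields 6, which falls inside the virtual block 3..7. The "last virtual register + 1" form gives a plain pseudo regardless of the hard register count.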
Re: gcse pass: expression hash table
On Thu, 24 Feb 2005, James E Wilson wrote:

> On Thu, 2005-02-24 at 03:15, Steven Bosscher wrote:
>> On Feb 24, 2005 11:13 AM, Tarun Kawatra <[EMAIL PROTECTED]> wrote:
>> Does GCSE look into stuff in PARALLELs at all? From gcse.c:
>
> Shrug. The code in hash_scan_set seems to be doing something reasonable. The problem I saw wasn't with finding expressions to gcse, it was with inserting them later. The insertion would create a cc reg clobber, so we don't bother adding it to the hash table. I didn't look any further, but it seemed reasonable that if it isn't in the hash table, then it isn't going to be optimized.

You are write here that if some expr doesn't get into the hash table, it will not get optimized. But plus expressions on x86 also clobber CC, as shown below:

(insn 40 61 42 2 (parallel [
            (set (reg/v:SI 74 [ c ])
                (plus:SI (reg:SI 86)
                    (reg:SI 85)))
            (clobber (reg:CC 17 flags))
        ]) 138 {*addsi_1} (nil)
    (nil))

So why does the same reasoning not apply to plus expressions? Why will their later insertion not create any problems?

Actually, I am trying to extend the PRE implementation so that it performs strength reduction as well. This requires multiplication expressions to get into the hash table. I am debugging the code to find where the two kinds of expressions are treated differently. I will let you all know if I find anything interesting. If you know this already, please share it with me.

Thanks
-tarun

> It seems that switching the x86 backend from using cc0 to using a cc hard register has effectively crippled the RTL gcse pass for it.
Re: gcse pass: expression hash table
On Thu, 24 Feb 2005, James E Wilson wrote:

> On Thu, 2005-02-24 at 03:15, Steven Bosscher wrote:
>> On Feb 24, 2005 11:13 AM, Tarun Kawatra <[EMAIL PROTECTED]> wrote:
>> Does GCSE look into stuff in PARALLELs at all? From gcse.c:
>
> Shrug. The code in hash_scan_set seems to be doing something reasonable. The problem I saw wasn't with finding expressions to gcse, it was with inserting them later. The insertion would create a cc reg clobber, so we don't bother adding it to the hash table. I didn't look any further, but it seemed reasonable that if it isn't in the hash table, then it isn't going to be optimized.

This is with reference to my latest mail. I found that when plus-type expressions are inserted, the inserted expression does not contain the clobber of CC, even though it is there in the original instruction. For example, for the instruction

(insn 40 61 42 2 (parallel [
            (set (reg/v:SI 74 [ c ])
                (plus:SI (reg:SI 86)
                    (reg:SI 85)))
            (clobber (reg:CC 17 flags))
        ]) 138 {*addsi_1} (nil)
    (nil))

the instruction inserted is

(insn 72 64 36 2 (set (reg:SI 87)
        (plus:SI (reg:SI 86 [ a ])
            (reg:SI 85 [ b ]))) 134 {*lea_1} (nil)
    (nil))

That is, it converts addsi_1 to lea_1.

-tarun

> It seems that switching the x86 backend from using cc0 to using a cc hard register has effectively crippled the RTL gcse pass for it.
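The difference between the two cases can be pictured with a toy model. This is a sketch under stated assumptions, not the logic of gcse.c: the pattern names are the ones from the dumps in this thread, while the table layout and the functions toy_recog_bare_set and toy_want_to_gcse are hypothetical. The idea is that a bare SET of a plus can be matched by a pattern that needs no flags clobber (*lea_1), whereas the only mult pattern listed here clobbers the flags.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy description of a machine pattern: the operation it implements and
   whether using it clobbers the flags hard register. */
struct toy_pattern {
    const char *name;
    const char *op;
    bool clobbers_flags;
};

/* Only the patterns that appear in the dumps quoted in this thread. */
static const struct toy_pattern toy_patterns[] = {
    { "*addsi_1",  "plus", true  },
    { "*lea_1",    "plus", false },
    { "*mulsi3_1", "mult", true  },
};

/* Return the first pattern that implements op without clobbering the
   flags, or NULL if every matching pattern needs the clobber. */
const struct toy_pattern *toy_recog_bare_set(const char *op)
{
    size_t n = sizeof toy_patterns / sizeof toy_patterns[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(toy_patterns[i].op, op) == 0
            && !toy_patterns[i].clobbers_flags)
            return &toy_patterns[i];
    return NULL;
}

/* An expression is a candidate only if it can be inserted anywhere
   without clobbering a hard register. */
bool toy_want_to_gcse(const char *op)
{
    return toy_recog_bare_set(op) != NULL;
}
```

In this model toy_want_to_gcse("plus") is true because *lea_1 is available, while toy_want_to_gcse("mult") is false, which mirrors the behaviour reported above.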
Re: gcse pass: expression hash table
On Thu, 24 Feb 2005, Andrew Pinski wrote:

> On Feb 24, 2005, at 3:55 PM, Tarun Kawatra wrote:
>> Actually I am trying to extend PRE implementation so that it performs strength reduction as well. it requires multiplication expressions to get into hash table.
>
> Why do you want to do that? Strength reduction is done already in loop.c.

We may then be able to get rid of the loop optimization pass, if the optimizations captured by the extended PRE approach are comparable to those of loop.c. Maybe not all of them, but this approach can also capture strength reduction in straight-line code (which need not depend on any loop, unlike optimizations based on induction variables).

-tarun

> Thanks,
> Andrew Pinski
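As a small illustration of what straight-line strength reduction means here, the sketch below shows a multiplication that becomes an addition once the earlier product is reused. The function names and the constant 4 are hypothetical, chosen only for the example; this is not output from any GCC pass.

```c
/* Before: the same scaling is computed twice around an increment. */
int scaled_sum_mult(int i)
{
    int x = i * 4;
    i = i + 1;
    int y = i * 4;      /* second multiplication */
    return x + y;
}

/* After: the second multiplication is replaced by an addition,
   using (i + 1) * 4 == i * 4 + 4.  No loop is involved. */
int scaled_sum_add(int i)
{
    int x = i * 4;
    int y = x + 4;      /* cheaper than multiplying again */
    return x + y;
}
```

Both functions return the same value for inputs that do not overflow, for example 44 for i = 5.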
Re: gcse pass: expression hash table
On Thu, 24 Feb 2005, James E Wilson wrote:

> On Thu, 2005-02-24 at 12:55, Tarun Kawatra wrote:
>> You are write here that if some expr doesn't get into hash table, it will
           ^^ right. -tarun
>> not get optimized.
>
> That was an assumption on my part. You shouldn't take it as the literal truth. I'm not an expert on all implementation details of the gcse.c pass.
>
>> But since plus expressions on x86 also clobber CC as shown below then why the same reasoning does not apply to plus expressions. Why will there insertion later will not create any problems?
>
> Obviously, plus expressions will have the same problem. That is why I question whether plus expressions are properly getting optimized. Since you haven't provided any example that shows that they are being optimized, or pointed me at anything in the gcse.c file I can look at, there isn't anything more I can do to help you. All I can do is tell you that you need to give more details, or debug the problem yourself.
>
>> Actually I am trying to extend PRE implementation so that it performs strength reduction as well. it requires multiplication expressions to get into hash table.
>
> Current sources have a higher level intermediate language (gimple) and SSA based optimization passes that operate on them. This includes a tree-ssa-pre.c pass. It might be more useful to extend this to do strength reduction than to try to extend the RTL gcse pass.
>
>> I am debugging the code to find where the differences for the two kind of expressions occur. Will let you all know if I found anything interesting.
>
> Good.
>
>> If you know this already please share with me.
>
> It is unlikely that anyone already knows this info offhand.
Re: gcse pass: expression hash table
> My assumption here was that if I gave you a few pointers, you would try to debug the problem yourself. If you want someone else to debug it for you, then you need to give much better info. See for instance http://gcc.gnu.org/bugs.html which gives info on how to properly report a bug. I have the target and gcc version, but I need a testcase, compiler options, and perhaps other info.

I will take this into consideration from now on. The test case I am using (for the multiplication expression) is

    #include <stdio.h>

    void foo();

    int main()
    {
      foo();
    }

    void foo()
    {
      int a, b, c;
      int cond;

      scanf(" %d %d %d", &a, &b, &cond);
      if( cond )
        c = a * b;
      c = a * b;
      printf("Value of C is %d", c);
    }

and for plus, a*b is replaced by a+b everywhere.

I am compiling it as

    gcc --param max-gcse-passes=2 -dF -dG -O3 filename.c

The reason for max-gcse-passes=2 is that in the first pass, the first and second occurrences of a+b use different sets of pseudo registers. After one gcse pass, both become the same (because of the intermediate constant/copy propagation passes). Then a+b gets optimized. This can be seen from the dumps filename.c.07.addressof and filename.c.08.gcse.

A part of the expression hash table for the program containing plus is:

Expression hash table (11 buckets, 11 entries)
Index 0 (hash value 3)
  (plus:SI (reg/f:SI 20 frame)
    (const_int -4 [0xfffc]))
Index 8 (hash value 1)
  (mem/f:SI (plus:SI (reg/f:SI 20 frame)
        (const_int -8 [0xfff8])) [2 b+0 S4 A32])
Index 9 (hash value 6)
  (plus:SI (reg:SI 78 [ a ])
    (reg:SI 79 [ b ]))
Index 10 (hash value 10)
  (plus:SI (reg:SI 80 [ a ])
    (reg:SI 81 [ b ]))

This clearly shows that the CC clobber in a+b is ignored when the expression that needs to be inserted will not itself contain a CC clobber.

> How do you know that adds are getting optimized? Did you judge this by

I am looking at dump files.

> looking at one of the dump files, or looking at the assembly output? Maybe you are looking at the wrong thing, or misunderstanding what you are looking at? You need to give more details here.

Regards,
-tarun
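For readers unfamiliar with PRE, the sketch below shows at source level the kind of transformation one would hope for on the test case in this message. It is only an illustration under assumptions: the function names and the temporary t are hypothetical, I/O is replaced by parameters so the two versions can be compared, and the real pass works on RTL rather than on C.

```c
/* Shape of the test case: a * b is evaluated twice when cond is true,
   so the second evaluation is partially redundant. */
int product_before_pre(int a, int b, int cond)
{
    int c = 0;
    if (cond)
        c = a * b;      /* evaluated only on this path */
    c = a * b;          /* redundant whenever cond was true */
    return c;
}

/* After PRE: the product is inserted on the path where it was missing,
   and the later evaluation becomes a copy from the temporary. */
int product_after_pre(int a, int b, int cond)
{
    int c = 0;
    int t;
    if (cond) {
        t = a * b;
        c = t;
    } else {
        t = a * b;      /* inserted computation */
    }
    c = t;              /* no multiplication here any more */
    return c;
}
```

Each path through product_after_pre performs exactly one multiplication, whereas the cond-true path of product_before_pre performs two. Inserting the computation on the else path is the step that requires the expression to be insertable without side effects such as a flags clobber.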