------- Comment #14 from amacleod at redhat dot com  2008-10-27 16:21 -------

TER's job is to create larger expressions for the expander so that we get better instruction selection during the initial expansion from trees/tuples to RTL.
It does this by simply expanding the definition of an ssa-name into its use location. This is only done if the definition has a single use; otherwise you would be executing the definition code more than once, which is generally undesirable.

The code in this example has a string of about 14 serial adds, followed by 14 related adds.

  s1.155 = s1.153 + (long unsigned int) MEM[base: buf.183, offset: 1]{*D.1237};
  s1.157 = s1.155 + (long unsigned int) MEM[base: buf.183, offset: 2]{*D.1240};
  s1.159 = s1.157 + (long unsigned int) MEM[base: buf.183, offset: 3]{*D.1243};
  s1.161 = s1.159 + (long unsigned int) MEM[base: buf.183, offset: 4]{*D.1246};
  <...>
  s2.156 = s2.154 + s1.155;
  s2.158 = s2.156 + s1.157;
  s2.160 = s2.158 + s1.159;
  s2.162 = s2.160 + s1.161;

Since s1.155 is used in 2 different places, TER is prevented from doing anything with it.

A register pressure reduction pass could alleviate this problem, either early near RTL expansion time or as part of the register allocator spilling subsystem. Both have been talked about, but I don't believe either has been worked on to any great degree.

Scheduling could help as well if it would see fit to start interleaving some of those adds. Since the addition producing s1.157 has to wait for s1.155 to finish, and s1.159 in turn has to wait for s1.157, s2.156 is ready to execute and could be interleaved between s1.157 and s1.159 while waiting for s1.157 to finish (which, since it has to go to memory, one would expect might be delayed). i.e.:

  s1.155 = s1.153 + (long unsigned int) MEM[base: buf.183, offset: 1]{*D.1237};
  s1.157 = s1.155 + (long unsigned int) MEM[base: buf.183, offset: 2]{*D.1240};
  s2.156 = s2.154 + s1.155;
  s1.159 = s1.157 + (long unsigned int) MEM[base: buf.183, offset: 3]{*D.1243};
  s2.158 = s2.156 + s1.157;
  s1.161 = s1.159 + (long unsigned int) MEM[base: buf.183, offset: 4]{*D.1246};
  s2.160 = s2.158 + s1.159;

This would, as a convenient side effect, solve the problem.
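To make the single-use rule concrete, here is a rough Python sketch (not GCC code) that counts uses of each ssa-name in the eight statements quoted above and lists the definitions with exactly one use. The names `STMTS`, `use_counts` and `ter_candidates` are made up for illustration, the memory operands are abbreviated to placeholders `m1`..`m4`, and the sketch models only the use-count condition, ignoring anything else the real pass may check.

```python
from collections import Counter

# (defined name, operands) for the eight statements quoted in this comment.
# m1..m4 are placeholders for the MEM[...] operands (an assumption of this sketch).
STMTS = [
    ("s1.155", ["s1.153", "m1"]),
    ("s1.157", ["s1.155", "m2"]),
    ("s1.159", ["s1.157", "m3"]),
    ("s1.161", ["s1.159", "m4"]),
    ("s2.156", ["s2.154", "s1.155"]),
    ("s2.158", ["s2.156", "s1.157"]),
    ("s2.160", ["s2.158", "s1.159"]),
    ("s2.162", ["s2.160", "s1.161"]),
]


def use_counts(stmts):
    """Count how many times each name appears as an operand."""
    return Counter(op for _, ops in stmts for op in ops)


def ter_candidates(stmts):
    """Return, in statement order, the definitions that have exactly one use."""
    uses = use_counts(stmts)
    return [name for name, _ in stmts if uses[name] == 1]


if __name__ == "__main__":
    uses = use_counts(STMTS)
    for name, _ in STMTS:
        print(name, "uses:", uses[name])
    print("single-use definitions:", ter_candidates(STMTS))
```

In this excerpt s1.155, s1.157 and s1.159 each have two uses (the next s1 add and the matching s2 add), so none of them would be substituted. s1.161 shows up as single-use only because the excerpt is cut off at `<...>`; in the full chain it would presumably feed the next s1 add as well.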
-- 

amacleod at redhat dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amacleod at redhat dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37916