------- Additional Comments From amacleod at redhat dot com 2005-09-19 16:41 -------
The culprit is in fact correct_uses. It was originally a linear function, and I had added some shortcut attempts which in many cases avoid traversing the entire list. Apparently pathological cases still exist which defeat that shortcut.
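To make the discussion below concrete, here is a minimal sketch in C of the immediate-use list and the two membership checks being described: the original walk back to the list root, and the current shortcut that only scans back to the nearest use sitting in an unmodified stmt. The names (use_node, ssa_name, stmt, list_owner_linear, use_in_correct_list_p, the 'modified' flag) are simplified stand-ins for illustration, not the actual tree-ssa-operands.h declarations.

#include <stddef.h>

/* Simplified stand-ins.  Each SSA name (the DEF) owns a doubly linked
   list of its uses; the list root lives in the DEF itself and is the
   only node with a NULL stmt.  */

struct stmt { int modified; /* ... */ };
struct ssa_name;

struct use_node
{
  struct use_node *prev, *next;
  struct stmt *stmt;            /* Statement containing the use; NULL on the root.  */
  struct ssa_name *use;         /* Which DEF this node thinks it uses.  */
};

struct ssa_name
{
  struct use_node root;         /* Root of this DEF's immediate-use list.  */
  /* ... */
};

/* Original check: strictly linear.  Walk back to the root node to find
   the DEF that owns the list this use is currently linked into.  */
static struct ssa_name *
list_owner_linear (struct use_node *node)
{
  while (node->stmt != NULL)
    node = node->prev;
  return (struct ssa_name *) ((char *) node
                              - offsetof (struct ssa_name, root));
}

/* Current shortcut: scan back only until a use in an unmodified stmt is
   found; that node is known to be linked into the right list, so its
   'use' field can be compared with ours.  */
static int
use_in_correct_list_p (struct use_node *node)
{
  struct use_node *p = node->prev;
  while (p->stmt != NULL && p->stmt->modified)
    p = p->prev;
  if (p->stmt == NULL)                 /* Reached the root; owner is known.  */
    return list_owner_linear (p) == node->use;
  return p->use == node->use;          /* Known-good neighbor: compare DEFs.  */
}

When every predecessor in the list has been modified, the shortcut degenerates into the original linear walk, which is the pathological behaviour described below.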
So, there are 2 choices. I have a patch for each method and am currently gathering some info on them, which I will post along with the 2 patches when I am finished.

The underlying issue is the lack of controlled access to the operand summary. Any optimization which bypasses the cache and directly manipulates the underlying trees can change information in the cache in an unknown way. The routine correct_uses is therefore forced to check whether a use has been manipulated this way, and if it has, put it back into the correct immediate-use list. Unfortunately, the only way to be sure is to look at the owning DEF of the list the use is currently linked into; if that isn't the right DEF for this use, delink it and link it into the correct list.

The original implementation simply scanned back to the root node and checked it. This was strictly linear. The current approach is to look back in the list just far enough to find a use which we are sure is in the right list, and see if it has the same use as this node. You cannot simply check the previous node, because it may have been manipulated as well and not yet updated. Instead the scanner looks back until it finds a use in the list that is in an unmodified stmt. We are then sure that node is in the right list and can compare uses. It seems that is not good enough in some cases.

There are 2 main solutions:

1 - Since this is only an issue when we directly manipulate the underlying trees, we can flag passes which do direct tree manipulations and only do this lookup during those passes. There are two issues with this solution: a) it requires finding and flagging the appropriate passes and keeping that information accurate, and b) it will still have this quasi-linear behaviour a percentage of the time.

2 - Add another pointer to each element in the list which points to the list owner. The check is then simply whether the list owner recorded in the node matches the use of the node: O(1), but at the cost of an additional word of memory per use. (A rough sketch of this variant is appended at the end of this comment.)

During the original implementation I had experimented with (2), but I was trying to be sensitive to the amount of memory required since I was already making quite an increase. At that point I hadn't found any pathological cases which caused this kind of explosion, and the shortcut solution appeared to work fine in most cases, so I stuck with it.

When I'm done testing, I'll make both patches available along with comparable numbers. The initial result for this test case on my machine using the preferred solution (2) drops the operand time from:

tree operand scan : 18.96 (44%) usr 0.09 (28%) sys 19.14 (44%) wall

to

tree operand scan : 0.11 ( 0%) usr 0.02 ( 8%) sys 0.13 ( 1%) wall

I gave stevenb a preliminary version of the patch last week, and it sped up the cc1-collection of files on x86-64 by 1% (negligible really), but I think it addressed his pathological case, which is really where the time was being spent. I will run results on both x86 and x86-64 over the next day for both patches and report back.

-- 
                 What      |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo        |unassigned at gcc dot gnu   |amacleod at redhat dot com
                           |dot org                     |
             Status        |NEW                         |ASSIGNED
Last reconfirmed date      |2005-07-25 04:44:04         |2005-09-19 16:41:31


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21430
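For reference, here is a matching sketch of option (2), again with made-up names layered on the simplified structures from the earlier sketch (use_node_v2, list_owner, link_use are illustrative, not the actual patch): each use node carries a back pointer to the DEF whose list it is linked into, maintained by the link/delink routines, so the validity check becomes a single pointer comparison at the cost of one extra word per use.

/* Forward declarations; see the earlier sketch for the simplified
   stmt/ssa_name stand-ins.  */
struct stmt;
struct ssa_name;

struct use_node_v2
{
  struct use_node_v2 *prev, *next;
  struct stmt *stmt;
  struct ssa_name *use;          /* Which DEF this node thinks it uses.  */
  struct ssa_name *list_owner;   /* DEF whose list this node is actually on.  */
};

/* The check is now O(1): the node is in the right place iff the list it
   is linked into is owned by the DEF it claims to use.  */
static int
use_in_correct_list_v2_p (struct use_node_v2 *node)
{
  return node->list_owner == node->use;
}

/* The cost is the extra word per node and the requirement that the
   link/delink routines keep the owner pointer current.  */
static void
link_use (struct use_node_v2 *node, struct ssa_name *def,
          struct use_node_v2 *list_root)
{
  node->list_owner = def;
  node->next = list_root->next;
  node->prev = list_root;
  list_root->next->prev = node;
  list_root->next = node;
}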