ON THE CALL: Shin-ming Liu (HP), Vladimir Makarov (Red Hat), Mark Smith (Gelato), Bob Kidd (UIUC), Andrey Belevantsev (RAS), Arutyun Avetisyan (RAS), Mark Davis (Intel)
Diego Novillo (Red Hat) was unable to join the call, but supplied an update to include in these notes.

The GCC track at the upcoming Gelato ICE conference is now finalized. Gerolf Hoflehner's talk on SPEC2006 had to be canceled because the release of SPEC2006 has been delayed. A new addition to the GCC track is Arutyun Avetisyan, who will give an overview of the RAS work and start soliciting input for the August 2006 GCC meeting in Moscow.

Confirmed topics/speakers for the Gelato ICE GCC track include:

* Russian Academy of Sciences work overview and plans for August GCC meeting in Moscow - Arutyun Avetisyan
* GCC IP issues - Dan Berlin
* LLVM - Chris Lattner
* LTO - Mark Mitchell
* ORC back end for GCC - Shin-ming Liu
* Aliasing update - Dan Berlin
* Russian Academy of Sciences scheduler improvement update - Andrey Belevantsev
* Superblock work - Bob Kidd
* Parallel programming with GCC - Diego Novillo
* Intel micro-architecture talk - Cameron McNairy

For a detailed list of confirmed speakers and topics for Gelato ICE 2006, visit: www.gelato.org/meeting#agenda

Updates from call participants can be found below.

NEXT MEETING: At the Gelato ICE meeting in San Jose, CA, April 24-26, 2006.

Andrey Belevantsev:
-------------------

Testing the aliasing patch against the latest mainline revealed changes in structure aliasing, so we had to rewrite some of the code that handles variables with structure field tags (SFTs). Small arrays can now also be decomposed into elements for the sake of better aliasing. We also made the propagation of the original tree expressions saved with MEMs during expand more accurate. We have sent an updated patch to the gcc-patches list.

Vladimir Makarov has approved the speculation patch and provided comments on the ia64 part of the patch. We have fixed all the issues Vladimir pointed out. After additional regtesting on ia64 and i686, the patch was committed to trunk as rev. 112129. An earlier version of the patch was also bootstrapped and regtested on sparc-solaris.
Using the patch on other platforms revealed some bugs (PR26275 and PR26734). Fixes for those PRs have been submitted to the list.

This month we tested the basic features of code motion. To do so, we wrote the main scheduling loop. A single iteration of the scheduling loop tries to form a group of instructions that can be executed in parallel during one cycle (this corresponds more or less to an IA-64 instruction group). We first tested code motion of entire instructions inside a basic block. We are now testing interblock motion, which may require creating bookkeeping code. Code motion of conditional branches is currently disabled. Our next step is to enable code motion of the right-hand sides of expressions.

Last but not least, our paper proposal for GCC Summit 2006 has been accepted. The paper will cover the new scheduler work, the proposed design, and the current state of the implementation.

Bob Kidd:
---------

(Bob had his paper proposal for the GCC Summit 2006 accepted. The paper will cover the GCC superblock work in detail.)

I checked the Superblock patch into the ia64-improvements tree. This patch has no significant effect on the overall estimated SPEC score for ia64 or ppc, and causes a slight degradation on x86_64. On IA64, some benchmarks run faster while others slow down. The overall score varies by one point. I'm looking into the changed benchmarks to see what causes the speedup or slowdown.

I investigated 300.twolf, which slows down when superblocks are formed at the Tree-SSA level. One function (new_dbox_a) is significantly slower with the superblock patch than without. This function takes a pointer to an integer as an argument and updates the value of that integer inside a hot loop. The loop is structured along these lines:

    for (hot)
      if (cond) (biased)
        a = ...
      else
        a = ...
      *arg += a
      ...

Tail duplication generates two copies of the *arg += ... line, which in turn generates two copies of the load and store of arg.
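
As an illustration only (this is not the 300.twolf source), the loop shape described above can be sketched in C. The helper functions, the data array, and the branch condition are all hypothetical stand-ins for the elided parts of the loop. The second function shows what the loop looks like once the join block is copied into each arm, and the third shows by hand what moving the load and store out of the loop amounts to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the elided right-hand sides ("a = ..."). */
static int likely_value(int x)   { return x * 2; }
static int unlikely_value(int x) { return x + 7; }

/* Original shape: both arms join before the single update of *arg. */
void loop_original(int *arg, const int *data, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int a;
        if (data[i] >= 0)              /* assumed biased: usually true */
            a = likely_value(data[i]);
        else
            a = unlikely_value(data[i]);
        *arg += a;                     /* one load and one store per iteration */
    }
}

/* After tail duplication: the join block is copied into each arm,
 * so the update of *arg now appears twice in the loop body. */
void loop_tail_duplicated(int *arg, const int *data, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (data[i] >= 0) {
            int a = likely_value(data[i]);
            *arg += a;                 /* copy 1 of the load/store */
        } else {
            int a = unlikely_value(data[i]);
            *arg += a;                 /* copy 2 of the load/store */
        }
    }
}

/* Hand-written hoisting: load once before the loop, store once after.
 * This is only valid if nothing else in the loop reads or writes *arg. */
void loop_hoisted(int *arg, const int *data, size_t n)
{
    int acc = *arg;
    for (size_t i = 0; i < n; i++) {
        if (data[i] >= 0)
            acc += likely_value(data[i]);
        else
            acc += unlikely_value(data[i]);
    }
    *arg = acc;
}

/* Sanity check: all three forms must compute the same final value. */
void check_variants(const int *data, size_t n, int start)
{
    int r1 = start, r2 = start, r3 = start;
    loop_original(&r1, data, n);
    loop_tail_duplicated(&r2, data, n);
    loop_hoisted(&r3, data, n);
    assert(r1 == r2);
    assert(r2 == r3);
}
```

check_variants only confirms that the three forms are equivalent on a given input; it says nothing about which form a particular compiler pass is able to produce.
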
Without tail duplication, PRE can move the load and store of arg out of the loop, but it is unable to do this in the superblock loop. I suspect that superblock formation needs to fix up the alias info so that later optimizers realize these two loads are the same.

Shin-ming Liu:
--------------

- HP has posted the GCC 4.1 release binary on the HP portal for HP-UX: www.hp.com/go/gcc
- HP submitted 11 patches to stock gcc and 3 patches to binutils.
- The Alternative backend project has made reasonable progress. The front end for this compiler is still at 3.3.2. Both C and Fortran are functional and achieve performance similar to ORC 2.1. The current focus is to update the back end to support the Itanium C++ ABI.

Vladimir Makarov:
-----------------

Robert Kidd's superblock scheduling in gcc will probably not give an improvement for x86 and x86_64, because interblock scheduling (before reload) is switched off for these architectures, mostly because reload cannot deal with RTL insns containing hard registers that were moved by the scheduler before reload. So the code will be bigger, with less code locality, and consequently slower. If superblock scheduling gives an improvement, it will most probably be for Itanium, which is the architecture least sensitive to code locality that I have seen.

From my point of view, the major problem with the gcc scheduler (for in-order execution processors like Itanium) is that it runs in the middle of the back-back end, and insn splitting and a lot of optimizations happen after it. So I think that the current RAS work on the scheduler is more promising.

The gcc Itanium port has no description of vector insns, although gcc now has a vectorizer. I proposed describing them. Mark Davis said that, according to Intel's experience, this will not improve the code.
After some thought, I guessed that this is because ia64 bundles contain a lot of nops: those slots can be filled with non-vector insns, which execute in the same time as a vector insn with a lot of nops in the same bundle.

Diego Novillo:
--------------

- OpenMP has been completely integrated in GCC mainline.
- I will be presenting a design/implementation document on OpenMP at the next GCC summit.
- We have been working with Dmitry on the tree->rtl alias export patch. I think we could add it to the ia64-improvements branch shortly, but I have been a bit sidetracked and haven't been able to check out their latest version.
- Bob said he was considering doing a mainline->branch merge on the ia64-improvements branch. I'm not sure whether he's finished that.
- I will continue to work on representation changes for our SSA form on the mem-ssa branch (http://gcc.gnu.org/ml/gcc/2006-02/msg00620.html).