On 21 May 2014 20:43, Steven Bosscher <stevenb....@gmail.com> wrote: > On Wed, May 21, 2014 at 11:58 AM, Zhenqiang Chen wrote: >> Hi, >> >> The patch fixes the gcc.target/i386/pr49095.c FAIL in PR61225. The >> test case tends to check a peephole2 optimization, which optimizes the >> following sequence >> >> 2: bx:SI=ax:SI >> 25: ax:SI=[bx:SI] >> 7: {ax:SI=ax:SI-0x1;clobber flags:CC;} >> 8: [bx:SI]=ax:SI >> 9: flags:CCZ=cmp(ax:SI,0) >> to >> 2: bx:SI=ax:SI >> 41: {flags:CCZ=cmp([bx:SI]-0x1,0);[bx:SI]=[bx:SI]-0x1;} >> >> The enhanced shrink-wrapping, which calls copyprop_hardreg_forward >> changes the INSN 25 to >> >> 25: ax:SI=[ax:SI] >> >> Then peephole2 can not optimize it since two memory_operands look like >> different. >> >> To fix it, the patch adds another peephole2 rule to read one more >> insn. From the register copy, it knows the address is the same. > > That is one complex peephole2 to deal with a transformation like this. > It seems to be like it's a too specific solution for a bigger problem. > > Could you please try one of the following solutions instead:
Thanks for the comments. > 1. Track register values for peephole2 and try different alternatives > based on known register equivalences? E.g. in your example, perhaps > there is already a REG_EQUAL/REG_EQUIV note available on insn 25 after > copyprop_hardreg_forward, to annotate that [ax:SI] is equivalent to > [bx:SI] at that point (or if that information is not available, it is > not very difficult to make it available). Then you could try applying > peephole2 on the original pattern but also on patterns modified with > the known equivalences (i.e. try peephole2 on multiple equivalent > patterns for the same insn). This may expose other peephole2 > opportunities, not just the specific one your patch addresses. I will try this one. > 2. Avoid copy-prop'ing ax:SI into [bx:SI] to begin with. At insn 7, > both ax:SI and bx:SI are live, so insn 2 is not dead (i.e. cannot be > eliminated) and there is no benefit in this transformation. It only > hides (or at least makes it harder to see) that [ax:SI] in insn 25 is > the same memory reference as [bx:SI] in insn 8. So perhaps the copy > propagation should simply not be done unless it turns at least one > instruction into dead code. This is a good heuristics. But it will lead copy-prop much more complexity. copy-prop pass scans INSN one by one to do the propagation. If there have multi reference INSNs, you can not make the decision until changing the last reference. > Any reason why this transformation isn't done much earlier, e.g. in combine? I do not know. The similar peephole2 rule was added to fix pr49095. Will try it if Solution 1 does not work. Thanks! -Zhenqiang