Thanks for giving it a try.

On Fri, Oct 31, 2014 at 3:43 AM, Jeff Law <l...@redhat.com> wrote:
> On 10/10/14 21:32, Bin.Cheng wrote:
>>
>> Mike already gave great answers, here are just some of my thoughts on
>> the specific questions.  See embedded below.
>
> Thanks to both of you for your answers.
>
> Fundamentally, what I see is this scheme requires us to be able to come up
> with a key based solely on information in a particular insn.  To get fusion
> another insn has to have the same or a closely related key.
>
> This implies that the two candidates for fusion are related, even if
> there isn't a data dependency between them.  The canonical example would be
> two loads with reg+d addressing modes.  If they use the same base register
> and the displacements differ by a word, then we don't have a data dependency
> between the insns, but the insns are closely related by their address
> computations and we can compute a key to ensure those two related insns end
> up consecutive.  At any given call to the hook, the only context we can
> directly see is the current insn.
>
> I'm pretty sure if I were to tweak the ARM bits ever-so-slightly it could
> easily model the load-load or store-store special case on the PA7xxx[LC]
> processors.  Normally a pair of loads or stores can't dual issue.  But if
> the two loads (or two stores) hit the upper and lower half of a double-word
> object, then the instructions can dual issue.
>
> I'd forgotten about that special case scheduling opportunity until I started
> looking at some unrelated enhancement for prefetching.
>
> Your code would also appear to allow significant cleanup of the old
> caller-save code that had a fair amount of bookkeeping added to issue
> double-word memory loads/stores rather than single word operations.  This
> *greatly* improved performance on the old sparc processors which had no
> call-saved FP registers.
>
> However, your new code doesn't handle fusing instructions which are totally
> independent and of different static types.  There just isn't a good way to
> compute a key that I can see.  And this is OK -- that case, if we cared to
> improve it, would be best handled by the SCHED_REORDER hooks.
>
>>>>
>>>> I guess another way to ask the question, are fusion priorities static
>>>> based on the insn/alternative, or can they vary?  And if they can vary, can
>>>> they vary each tick of the scheduler?
>>
>>
>> Though this pass works on predefined fusion types and priorities now,
>> there might be two possible fixes for this specific problem.
>> 1) Introduce another exclusive_pri, now it's like "fusion_pri,
>> priority, exclusive_pri".  The first one is assigned to mark
>> instructions belonging to same fusion type.  The second is assigned to
>> fuse each pair/consecutive instructions together.  The last one is
>> assigned to prevent specific pair of instructions from being fused,
>> just like "BC" mentioned.
>> 2) Extend the idea by using hook function
>> TARGET_SCHED_REORDER/TARGET_SCHED_REORDER2.  Now we can assign
>> fusion_pri at the first place, making sure instructions in same fusion
>> type will be adjacent to each other, then we can change priority (thus
>> reorder the ready list) at back-end's wish even per each tick of the
>> scheduler.
>
> #2 would be the best solution for the case I was pondering, but I don't
> think solving that case is terribly important given the processors for which
> it was profitable haven't been made for a very long time.

I am wondering whether it's possible to introduce pattern-directed fusion:
something like define_fusion, with the haifa scheduler adapted to use it.  I
agree there are two kinds of fusion (between related insns and between
unrelated ones), and it's not trivial to support both in one scheme.  Do you
have a specific example that I can try?
This is just a preliminary idea and definitely can't make it into GCC 5.0.
Moreover, so far I haven't seen such a requirement on ARM/AARCH64 or any
other target that I know of, so I would like to continue with this version
if that's fine.  Later I will send a patch pairing different kinds of
ldp/stp based on this for aarch64.

Thanks,
bin
>
> Jeff
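
For anyone following the thread, here is a rough illustration of the key scheme discussed above. It is a toy model, not GCC code: each load is reduced to a base register plus a displacement, and a two-part key is computed from that insn alone. The struct toy_load, the functions toy_fusion_priority and toy_schedule, and the "max_pri - value" encoding are all assumptions made up for this sketch; a plain sort by the key stands in for the scheduler's ordering.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a load insn "reg = [base_reg + offset]".
   None of these names come from GCC.  */
struct toy_load {
    int id;          /* position in the original insn stream */
    int base_reg;    /* base register number */
    int offset;      /* displacement in bytes */
    int fusion_pri;  /* groups insns that are fusion candidates */
    int pri;         /* orders insns inside one group */
};

/* Compute the key from this insn alone; no other insn is inspected.
   Assumed convention: a larger value is scheduled earlier.  */
static void toy_fusion_priority(struct toy_load *insn, int max_pri)
{
    assert(insn->base_reg >= 0 && insn->base_reg < max_pri);
    assert(insn->offset >= 0 && insn->offset < max_pri);
    insn->fusion_pri = max_pri - insn->base_reg;  /* same base, same group */
    insn->pri = max_pri - insn->offset;           /* lower address first */
}

/* Order by fusion_pri, then pri (both descending), then original order.  */
static int toy_cmp(const void *a, const void *b)
{
    const struct toy_load *x = a;
    const struct toy_load *y = b;
    if (x->fusion_pri != y->fusion_pri)
        return (x->fusion_pri < y->fusion_pri) - (x->fusion_pri > y->fusion_pri);
    if (x->pri != y->pri)
        return (x->pri < y->pri) - (x->pri > y->pri);
    return (x->id > y->id) - (x->id < y->id);
}

/* Key every insn independently, then sort so related loads become adjacent.  */
static void toy_schedule(struct toy_load *insns, size_t n, int max_pri)
{
    for (size_t i = 0; i < n; i++)
        toy_fusion_priority(&insns[i], max_pri);
    qsort(insns, n, sizeof insns[0], toy_cmp);
}
```

Given loads at [r1+8], [r2+0], [r1+4], [r2+4] in that order, the sort puts the two r1 loads next to each other (offset 4, then 8), followed by the two r2 loads. Two independent insns of different types get no useful shared key under this scheme, which is the limitation Jeff points out.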