On 11/12/2013, at 3:45 pm, Ramana Radhakrishnan <ramana....@googlemail.com> wrote:
> On Wed, Dec 11, 2013 at 12:02 AM, Maxim Kuvyrkov <ma...@kugelworks.com> wrote:
>> On 11/12/2013, at 11:14 am, Ramana Radhakrishnan <ramana....@googlemail.com>
>> wrote:
>>
>>> On Tue, Dec 10, 2013 at 9:44 PM, Maxim Kuvyrkov <ma...@kugelworks.com>
>>> wrote:
>>>> On 11/12/2013, at 5:17 am, Ramana Radhakrishnan
>>>> <ramana....@googlemail.com> wrote:
>>>>
>>>>> On Mon, Jul 1, 2013 at 5:31 PM, Paulo Matos <pma...@broadcom.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Near the start of schedule_block, find_modifiable_mems is called if
>>>>>> DONT_BREAK_DEPENDENCIES is not enabled for this scheduling pass. It
>>>>>> seems on c6x backend currently uses this.
>>>>>> However, it's quite strange that this is not a requirement for all
>>>>>> backends since find_modifiable_mems, moves all my dependencies in
>>>>>> SD_LIST_HARD_BACK to SD_LIST_SPEC_BACK even though I don't have
>>>>>> DO_SPECULATION enabled.
>>>>>>
>>>>>> Since dependencies are accessed later on from try_ready (for example), I
>>>>>> would have thought that it would be always good not to call
>>>>>> find_modifiable_mems, given that it seems to 'literally' break
>>>>>> dependencies.
>>>>>>
>>>>>> Is the behaviour of find_modifiable_mems a bug or somehow expected?
>>>>
>>>> "Breaking" a dependency in scheduler involves modification of instructions
>>>> that would allow scheduler to move one instruction past the other. The
>>>> most common case of breaking a dependency is "r2 = r1 + 4; r3 = [r2];"
>>>> which can be transformed into "r3 = [r1 + 4]; r2 = r1 + 4;". Breaking a
>>>> dependency is not ignoring it, speculatively or otherwise; it is an
>>>> equivalent code transformation to allow scheduler more freedom to fill up
>>>> CPU cycles.
>>>
>>>
>>> Yes, but there are times when it does this a bit too aggressively and
>>> this looks like the cause for a performance regression that I'm
>>> investigating on ARM. I was looking for a way of preventing this
>>> transformation and there doesn't seem to be an easy one other than the
>>> obvious hack.
>>
>> If you want to prevent a particular transformation from occurring, then you
>> need to investigate why scheduler thinks that there is nothing better to do
>> than to schedule an instruction which requires breaking a dependency.
>> "Breaking" a dependency only increases pool of instructions available to
>> schedule, and your problem seems to be lying in "why" the wrong instruction
>> is selected from that pool.
>>
>> Are you sure that the problem is introduced by dependency breaking, rather
>> than dependency breaking exposing a latent bug?
>
> From my reading because the dependency breaking is of addresses that
> are in a memcpy type loop which is unrolled and the original
> expectation is that by switching this to an add and a negative offset
> one can get more ILP in theory, but in practice the effects appear to
> be worse because of secondary issues that I'm still investigating.

Is this happening in the 1st or 2nd scheduling pass? From your comments I
get a feeling that dependency breaking is introducing an additional
instruction, rather than adding an offset to a memory reference.

Ideally, dependency breaking during 1st scheduling pass should be more
conservative and avoid too many new instructions (e.g., by breaking a
dependency only if nothing whatsoever can be scheduled on the current
cycle). Dependency breaking during 2nd scheduling pass can be more
aggressive as it can make sure that adding offset to a memory instruction
will not cause it to be split.

>
>>
>>>
>>> Additionally there appears to be no way to control "flags" in a
>>> backend hook for sched-rgn for DONT_BREAK_DEPENDENCIES . Again if the
>>> DONT_BREAK_DEPENDENCIES is meant to be disabled with these flags, then
>>> it looks like we should allow for these to also be handled or describe
>>> TARGET_SCHED_SET_SCHED_FLAGS as only a hook valid with the selective
>>> scheduler.
>>
>> I'm not sure I follow you here. Any port can define
>> TARGET_SCHED_SET_SCHED_FLAGS and set current_sched_info->flags to whatever
>> it thinks is appropriate. E.g., c6x does this to disable dependency
>> breaking for a particular kind of loops.
>
> Ah, that will probably work and that's probably what I was missing. I
> don't like the idea in general of the same interface setting global
> state randomly in a backend is probably not the best approach in the
> long term. Expecting to set global state in this form from an
> interface is something I wasn't expecting especially when it takes a
> parameter.

Originally TARGET_SCHED_SET_SCHED_FLAGS was setting
current_sched_info->flags and nothing else, hence the name. The parameter
spec_info appeared later to hold flags related to IA64-specific
speculative scheduling.

--
Maxim Kuvyrkov
www.kugelworks.com
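
For reference, the dependency-breaking example quoted earlier in the thread ("r2 = r1 + 4; r3 = [r2];" becoming "r3 = [r1 + 4]; r2 = r1 + 4;") can be sketched in plain C. This is only an illustration of why the two orderings produce the same register state; it is not GCC scheduler code. The names struct regs, load32, before_breaking and after_breaking are made up for this sketch, and the "registers" are ordinary struct fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register file for the illustration: r1 holds a byte offset
   into a memory buffer, r2 and r3 receive the results.  */
struct regs { size_t r1; size_t r2; int32_t r3; };

/* Models "[addr]": load a 32-bit value from mem at byte offset addr.  */
static int32_t load32(const unsigned char *mem, size_t mem_size, size_t addr)
{
    int32_t v;
    assert(addr + sizeof v <= mem_size);  /* stay inside the buffer */
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* Original order: the load reads r2, so it depends on the add.  */
static struct regs before_breaking(const unsigned char *mem, size_t mem_size,
                                   size_t r1)
{
    struct regs r = { r1, 0, 0 };
    r.r2 = r.r1 + 4;                      /* r2 = r1 + 4 */
    r.r3 = load32(mem, mem_size, r.r2);   /* r3 = [r2]   */
    return r;
}

/* After breaking the dependency: the offset is folded into the address,
   so the load no longer reads r2 and can be placed before the add.  */
static struct regs after_breaking(const unsigned char *mem, size_t mem_size,
                                  size_t r1)
{
    struct regs r = { r1, 0, 0 };
    r.r3 = load32(mem, mem_size, r.r1 + 4);  /* r3 = [r1 + 4] */
    r.r2 = r.r1 + 4;                         /* r2 = r1 + 4   */
    return r;
}
```

Both functions leave r2 and r3 with the same values for any valid r1. The sketch says nothing about which order is faster on a given target; that depends on whether the folded offset fits the addressing mode, which is the concern raised above about the load being split.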