On Mon, 8 Feb 2021, Benoît De Dinechin wrote:

> Hello, 
> 
> Is there a way to activate control speculation of loads in GCC, starting with
> the ia64 target? For a loop as simple as on GCC 7.5, I could not get any: 

I think the cost model in sel-sched estimates that load speculation would not
be profitable for that loop. With a long-latency operation after the load, I do
get a speculative load at -O3 (for the 'payload' field, but not for 'next'):

struct list {
  struct list *next;
  double payload;
};

double f(struct list *l)
{
  double result = 0;
  for (; l; l = l->next)
    result += 1 / l->payload;
  return result;
}
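
To make the effect concrete, here is a rough source-level sketch of what
speculating the 'payload' load amounts to. This is not compiler output: the
names f_speculated and load_payload_dismissable are made up for illustration,
and the helper only models a load that yields a dummy value instead of
faulting (on ia64 the real thing would be ld.s plus a check).

```c
#include <stddef.h>

struct list {
  struct list *next;
  double payload;
};

/* Hypothetical stand-in for a non-faulting (dismissable) load: returns 0.0
   instead of trapping when the address is null.  Real hardware does this in
   the load instruction itself; this helper only models the effect.  */
static double load_payload_dismissable(const struct list *l)
{
  return l ? l->payload : 0.0;
}

/* Original loop: each payload load is guarded by the loop test.  */
double f(struct list *l)
{
  double result = 0;
  for (; l; l = l->next)
    result += 1 / l->payload;
  return result;
}

/* Hand-written sketch: the payload load for the next node is issued before
   the loop test that guards it, so it overlaps with the division.  */
double f_speculated(struct list *l)
{
  double result = 0;
  double payload = load_payload_dismissable(l);
  while (l) {
    struct list *next = l->next;   /* 'next' is still loaded under the test */
    double next_payload = load_payload_dismissable(next);  /* speculative */
    result += 1 / payload;         /* long-latency operation */
    l = next;
    payload = next_payload;
  }
  return result;
}
```

Both functions add the terms in the same order, so they return the same value
for any list; the value loaded for a null 'next' is never used.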

> Kalray has developed a 64-bit Fisher-style VLIW architecture ('KVX') for use
> in a manycore processor it produces. These VLIW cores run Linux, and Kalray
> develops GCC and LLVM code generators for them (see kvx compilers on
> https://godbolt.org/z/ZJGzje ). VLIW performance on non-numerical code is
> critically dependent on the control speculation of loads. Being a
> Fisher-style VLIW, the kvx architecture has dismissable loads instead of
> control speculative loads, so there is no need to create speculation check
> with recovery code. 
> 
> I first tried in prepass scheduling with SCHED_RGN, hoping from various
> comments in the source file that it could move loads across blocks
> (sched-rgn.c:26 The first run performs interblock scheduling, moving insns
> between different blocks in the same "region"). SCHED_EBB is not available in
> prepass and SEL_SCHED does not work with control speculation: not only from
> experience with the kvx retargeting where it breaks dataflow invariants, but
> also as hinted by logic in ia64.c:ia64_set_sched_flags(). 

Can you elaborate on the dataflow issues you've encountered? I don't recall the
specific reason why control speculation before register allocation cannot be
enabled with sel-sched. I'd expect it has to do with two things: the interval
between the speculative load and the check, during which the register cannot be
stored to memory normally (it needs dedicated spill/fill instructions), and the
interaction with uninitialized variables assigned the same register.

If on KVX you don't need speculation checks, those concerns would not apply.

Why are you looking for pre-RA (prepass) scheduling specifically? To avoid
anti-dependencies created by register allocation?

> My question is whether GCC can or cannot do any control speculation of loads
> during prepass scheduling. From what I observed, enabling control speculation
> in region scheduling only enables the load instructions to get ready earlier
> in their home basic block; they are not scheduled in a dominator basic block,
> as would be needed to improve performance in the above example.

But there's no control flow inside a basic block, so the load can appear earlier
due to data speculation (or normal scheduling), not control speculation.

I think GCC may have correctness issues with ia64-style control speculation
before register allocation, but I can't think of a reason why check-free loads
would pose a problem.

Alexander
