On 10/17/2017 07:22 PM, Jan Hubicka wrote:
According to Agner's tables, gathers range from 12 ops (vgatherdpd)
to 66 ops (vpgatherdd). I assume that the CPU needs to do the following:
In our code, it is basically "don't care" how much work it is for a
gather instruction to do its work.
Without gat
> Please look at the testsuite fallout in detail. Note that only
> testcases that do not disable the cost model should be affected
> (all vect.exp testcases disable the cost model for example).
>
> The patch itself looks mostly good, I suppose if we also have
> separate costs for float vs. double
On Thu, 19 Oct 2017, Jan Hubicka wrote:
> Hi,
> this is a proof-of-concept patch for vectorizer costs to use the costs used for
> rtx_cost and register_move_cost, which are readily available in ix86_costs,
> instead of using its own set of random values. At least until we have proof of
> evidence tha
Hi,
this is a proof-of-concept patch for vectorizer costs to use the costs used for
rtx_cost and register_move_cost, which are readily available in ix86_costs,
instead of using its own set of random values. At least until we have proof of
evidence that vectorizer costs need to differ, I do not think w
> > Those instructions seem similarly expensive in the Intel implementation.
> > http://users.atw.hu/instlatx64/GenuineIntel0050654_SkylakeXeon9_InstLatX64.txt
> > lists latencies ranging from 18 to 32 cycles.
> >
> > Of course it may also be the case that the utility is measuring gathers
> > incorr
On Wed, 18 Oct 2017, Jan Hubicka wrote:
> > > According to Agner's tables, gathers range from 12 ops (vgatherdpd)
> > > to 66 ops (vpgatherdd). I assume that the CPU needs to do the following:
> > >
> > > 1) transfer the offsets sse->ALU unit for address generation (3 cycles
> > > each, 2 ops)
> > >
> > According to Agner's tables, gathers range from 12 ops (vgatherdpd)
> > to 66 ops (vpgatherdd). I assume that the CPU needs to do the following:
> >
> > 1) transfer the offsets sse->ALU unit for address generation (3 cycles
> > each, 2 ops)
> > 2) do the address calculation (2 ops, probably 4 ops
On Tue, 17 Oct 2017, Jan Hubicka wrote:
> > On Tue, 17 Oct 2017, Jan Hubicka wrote:
> >
> > > Hi,
> > > gather/scatter loads tend to be expensive (at least for x86), while we
> > > currently account for them as vector loads/stores, which are cheap. This
> > > patch adds a vectorizer cost entry for th
> On Tue, 17 Oct 2017, Jan Hubicka wrote:
>
> > Hi,
> > gather/scatter loads tend to be expensive (at least for x86), while we
> > currently account for them as vector loads/stores, which are cheap. This
> > patch adds a vectorizer cost entry for these so this can be modelled more
> > realistically.
> >
On Tue, 17 Oct 2017, Jan Hubicka wrote:
> Hi,
> gather/scatter loads tend to be expensive (at least for x86), while we
> currently account for them as vector loads/stores, which are cheap. This
> patch adds a vectorizer cost entry for these so this can be modelled more
> realistically.
>
> Bootstrapped/r
Hi,
gather/scatter loads tend to be expensive (at least for x86), while we
currently account for them as vector loads/stores, which are cheap. This
patch adds a vectorizer cost entry for these so this can be modelled more
realistically.
Bootstrapped/regtested x86_64-linux, OK?
Honza
2017-10-17 Jan Hubick
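
For readers unfamiliar with the distinction discussed in this thread, the following is a minimal sketch (not taken from the patch) of the two kinds of access. The function names gather_copy and contiguous_copy are hypothetical and used only for illustration. With AVX2 enabled, a vectorizing compiler may implement the indexed loop with a gather such as vgatherdpd, depending on tuning and the cost model, while the contiguous loop needs only plain vector loads. A cost model that prices both loads the same would underestimate the first loop.

```c
#include <stddef.h>

/* Indexed load: element i comes from a[idx[i]], so the addresses are
   not contiguous.  A vectorizer can only load these with a gather
   instruction or with a sequence of scalar loads.  */
void
gather_copy (double *restrict out, const double *restrict a,
             const int *restrict idx, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[idx[i]];
}

/* Contiguous load for comparison: consecutive elements of a are read,
   so an ordinary vector load is sufficient.  */
void
contiguous_copy (double *restrict out, const double *restrict a, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i];
}
```

Both functions produce the same result when idx is the identity permutation; only the addressing pattern, and therefore the realistic cost of the vectorized load, differs.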