On Wed, Sep 9, 2020 at 3:49 PM Segher Boessenkool
<seg...@kernel.crashing.org> wrote:
>
> Hi!
>
> On Tue, Sep 08, 2020 at 10:26:51AM +0200, Richard Biener wrote:
> > Hmm, yeah - I guess that's what should be addressed first then.
> > I'm quite sure that in case 'v' is not on the stack but in memory like
> > in my case a SImode store is better than what we get from
> > vec_insert - in fact vec_insert will likely introduce a RMW cycle
> > which is prone to inserting store-data-races?
>
> The other way around -- if it is in memory, and was stored as vector
> recently, then reading back something shorter from it is prone to
> SHL/LHS problems.  There is nothing special about the stack here, except
> of course it is more likely to have been stored recently if on the
> stack.  So it depends how often it has been stored recently which option
> is best.  On newer CPUs, although they can avoid SHL/LHS flushes more
> often, the penalty is relatively bigger, so memory does not often win.
>
> I.e.: it needs to be measured.  Intuition is often wrong here.

But the current method would simply do a direct store to memory
without a preceding read of the whole vector.

Richard.

>
>
> Segher
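
For illustration only, here is a minimal C sketch of the two shapes being
compared: a direct element-sized store versus a whole-vector
read-modify-write. The vec4 type and the function names insert_direct,
insert_rmw and same_result are hypothetical stand-ins, not GCC internals,
and an optimizing compiler is free to turn either form into the other.
The sketch only shows the memory traffic each form implies at the source
level; which one is faster would still need to be measured.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 4 x 32-bit vector held in memory (a stand-in for a
   V4SImode object); not a GCC type. */
typedef struct { int32_t lane[4]; } vec4;

/* Direct element store: only the 4 bytes of the selected lane are
   written, and nothing is read from the vector first. */
static void insert_direct(vec4 *v, unsigned idx, int32_t x)
{
    v->lane[idx & 3] = x;
}

/* Read-modify-write: load all 16 bytes, replace one lane in the
   temporary, then store all 16 bytes back. */
static void insert_rmw(vec4 *v, unsigned idx, int32_t x)
{
    vec4 tmp;
    memcpy(&tmp, v, sizeof tmp);   /* whole-vector read */
    tmp.lane[idx & 3] = x;         /* insert into the temporary */
    memcpy(v, &tmp, sizeof tmp);   /* whole-vector write; also rewrites
                                      the three untouched lanes */
}

/* In a single thread both forms leave the same bytes in memory.
   Returns 1 if the results match, 0 otherwise. */
static int same_result(void)
{
    vec4 a = {{1, 2, 3, 4}}, b = a;
    insert_direct(&a, 2, 42);
    insert_rmw(&b, 2, 42);
    return memcmp(&a, &b, sizeof a) == 0;
}
```

The results only differ when something else touches the vector: the RMW
form writes back lanes it never meant to change, so a concurrent store to
another lane could be overwritten with a stale value, while the direct
form reads back nothing but also writes only part of a recently stored
vector.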