> This patch would like to combine the vec_duplicate + vadd.vv to the
> vadd.vx.  From example as below:

I think we concluded a while ago that we don't want this turned on universally.
For the examples/tests you provide, it will be a de-optimization on any uarch
that has non-zero GPR -> VR latency.

So at the very least we need to define RTL costs for the combined variant and
make them depend on the GPR <-> VR costs, so that we don't do the combination
when that latency/cost is > 0.
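
To illustrate the kind of gating I mean, here is a rough standalone sketch.
It is not GCC code: the struct, the field gpr2vr_cost and both function names
are made up for illustration, and a real implementation would go through the
backend's RTL cost hooks and tuning structures instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-uarch tuning data.  gpr2vr_cost stands for the extra
   cost of feeding a scalar from a GPR into a vector instruction.  */
struct toy_uarch_costs
{
  int gpr2vr_cost;
};

/* Toy cost of the combined vadd.vx form: the cost of the plain vadd.vv
   plus whatever the GPR -> VR transfer costs on this uarch.  */
static int
toy_vadd_vx_cost (const struct toy_uarch_costs *costs, int vadd_vv_cost)
{
  assert (costs != NULL);
  return vadd_vv_cost + costs->gpr2vr_cost;
}

/* Only prefer the combined form when the GPR -> VR transfer is free,
   i.e. skip the combination if the latency/cost is > 0.  */
static bool
toy_prefer_vadd_vx (const struct toy_uarch_costs *costs)
{
  assert (costs != NULL);
  return costs->gpr2vr_cost == 0;
}
```

With gpr2vr_cost == 0 the predicate allows the vadd.vx form; with any
positive value the vec_duplicate + vadd.vv sequence is left alone.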

Does the optimization happen in combine or late-combine BTW?  I thought
late-combine because we need to look through the unary op (vec_duplicate).

-- 
Regards
 Robin
