> Do you have an example where wrong code is generated through the
> noce_convert_multiple_sets_p path (with or without bodged costs)?
>
> Both AArch64 and x86-64 reject your testcase along this codepath because
> of the constant set of 1. If we work around that by setting bla = n rather
> than bla = 1, I see this code generation for AArch64: [..]
> i.e. we have fresh registers. As you note, we do expect a later pass to
> clean up the redundant compare of (reg:SI 73) (reg:SI 77), which can
> throw the cost calculation off.
>
> If I hack up ix86_noce_conversion_profitable_p to always return true, then
> for your testcase I also see: [..]
> Again, no overlap.

Mhm, right. At first I used cond_move_process_if_block before realizing
that noce_convert_multiple_sets is better suited to do the work. Seeing
that the additional compare is not cleaned up on s390 (no overlap,
though), I implicitly assumed it has the same problems as the other path,
since it also uses noce_emit_cmove. Yet, apparently, no wrong code is
produced here. (Additionally, my local branch has a hack for the
const_int case which I forgot to include.)

On s390 the resulting assembly reads

    cr       %r5,%r1
    locrnhe  %r4,%r0
    cr       %r5,%r1
    locrnhe  %r1,%r5

I'll have to check why we don't manage to get rid of the second compare.

Regarding the cond_move.. case, I agree that it is of lesser importance,
but I'd still be interested in why the overlap condition is needed (if
not for implementation reasons that could be overcome like in
..multiple_sets).

Nevertheless, the cost situation seems strange. I'm not sure it's worth
caching the compare like in my patch just to get the costs "correct",
though. Your reply suggests that the current cost model works for aarch64
and my example, or did you include my hack?

Regards
 Robin