On 11/7/24 10:36 AM, Robin Dapp wrote:
I think it'd be better if I abstain from this.  I probably disagree too
much with the current structure and the way that the code is developing.
I won't object if anyone else approves it though.

It's not that I'm happy with the current state either and I thought about
how to rewrite it more than once.  So if you have an idea for a good
rework of it (or larger parts of ifcvt) I'd be all ears.

My argument right now would be that this patch doesn't make things worse in
terms of complexity and improves SPEC 2017's deepsjeng considerably on riscv.
IMHO it doesn't inhibit a future rewrite and even simplifies things
ever so slightly.
I went back and reviewed this thread from last summer & fall.

The second attempt is (in the original and Robin's adjustment) meant to refine the results of the first attempt. It's not supposed to do anything unexpectedly different. Robin's change does at least check that the second attempt doesn't do something crazy bad like generate code noce_convert_multiple_sets_1 can't handle.

Robin's patch also makes that refinement step unconditional.

If I'm reading Richard S's concerns correctly, he has more of a philosophical concern with the code rather than an implementation concern. For example, if we did life analysis we could easily know what values need to be preserved. And he's right. But the life analysis itself (ISTM) would be just as complex as running through the code again knowing the temporary set. It's also not clear (based on Robin's comment) that would be sufficient to avoid a fixup step. ie, it's not a given that Richard's approach would ultimately result in a cleaner implementation.


So I think this boils down to Richard's general structure concerns vs the Robin's pragmatic approach with measurable performance gains. Richard agreed to step aside, and since I'm the one that threw this in Robin's lap after seeing missed if-conversion in an important part of deepsjeng I'm inclined to ACK the V4 version for the trunk. I don't see that it makes the current implementation notably worse and in some ways cleans it up (IMHO).

Robin, given the time since submission, I would suggest a fresh bootstrap and regression test on x86. Given the V4 passed on x86 and aarch64 in the past and earlier versions tested well on s390, I think just the sanity check on x86 is needed.

Thanks for being patient here.

Jeff







Reply via email to