On 9/10/2021 10:05 AM, Aldy Hernandez wrote:
On 9/10/21 5:43 PM, Jeff Law wrote:
On 9/9/2021 3:21 AM, Aldy Hernandez wrote:
/* If this path does not thread through the loop latch, then we are
using the FSM threader to find old style jump threads. This
is good, except the FSM threader does not re-use an existing
threading path to reduce code duplication.
So for that case, drastically reduce the number of statements
we are allowed to copy. */
*blink*
Woah. The backward threader has been using FSM threads
indiscriminately as far as I can remember. I wonder what would
break if we "fixed it".
?!? I'm not sure what you're suggesting here. If you s/FSM
threader/backwards threader/ in the comment does it make more sense?
The term FSM really should largely have been dropped as the backwards
threader was improved to handle more cases.
back_threader_registry::register_path() uses EDGE_FSM_THREAD as the
thread type to register threads. I was wondering if it should have
been some combination of EDGE_START_JUMP_THREAD /
EDGE_*_COPY_SRC_BLOCK, etc. I (purposely) know nothing about the
underlying threading types ;-). But if the backwards threader has been
improved, then perhaps we should just remove the confusing FSM
references.
No we shouldn't change it to any of the other types. EDGE_FSM_THREAD
means a thread found by the backwards threader and it's the key used to
determine which of the two CFG updating mechanisms should be used, the
generic copier in the case of EDGE_FSM_THREAD.
Changing the name, yes, absolutely. I probably should have done that
when the original FSM threader was tweaked to handle generic threading.
As you've probably heard me mention before, all the EDGE_FSM_THREAD
stuff in the registry really could be pulled out. The registry's
purpose is to deal with the two stage nature of jump threading in DOM
(find threads, then optimize them later). I don't think any of the
backwards threading bits need that two stage handling.
My current thinking is that replacing the forward VRP threader with a
hybrid one is a gentler approach to the longer term goal of replacing
the forward threader altogether. However, all the work I've been
doing could go either way-- we could try the forward/VRP replacement
or a hybrid approach. It will all use the path solver underneath.
And that's probably a reasonable intermediate step on the way towards
removing the VRP threading.
My main problem with replacing the forward/VRP with a backward client
is that the cost models are so different that it was difficult to
compare how we fared. I constantly ran into threads the solver could
handle just fine, but profitable_path_p was holding it back.
Yea. Sorry about that tangle of insanity
FWIW, we get virtually everything the forward threader gets, minus a
very few things. At least when I plug in the solver to the
DOM/forwarder threader, it can solve everything it can (minus noise
and floats).
So once you plug those bits in, we don't have to carry around the
avail/copies tables for the threader anymore, right? That's a nice
cleanup in and of itself.
If you prefer a backward threader instance to replace the VRP/forward
threader, I'm game. It's just harder to compare. Either way (backward
threader or a hybrid forward+solver) uses the same underlying solver
which is solid.
I think we can go hybrid, then look at the next step, which could well
be bringing some consistency into the costing models.
c) DOM changes the IL as it goes. Though we could conceivably
divorce do the threading after DOM is done.
The only reason threading runs in parallel with DOM is so that it can
use the context sensitive equivalences. With the infrastructure
you're building, there's a reasonable chance we can move to a model
where we run DOM (and in the long term a simpler DOM) and threading
as distinct, independent passes.
Andrew mumbled something about replacing all of DOM eventually :-).
Well, except that value-numbering business I bet.
Essentially a realization of Bodik's work in GCC. The nugget in there
is it's a path sensitive optimizer. That's kindof what I've envisioned
DOM turning into.
1. We separate out jump threading from DOM.
2. We replace the bulk of DOM with FRE
3. The remnants of DOM turn into a path sensitive optimizer (and for
god's sake we don't want to call it DOM anymore :-)
Jeff