Hello, I've been working on an LTO points-to analysis pass for a little while. Because of LTO's design, gimple bodies are inaccessible during WPA. This essentially means that every LTO pass compiles down function bodies into their own IR which gets stored in function summaries and later read during WPA. This is also what I plan to do.
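To make the write/read split concrete, here is a rough standalone sketch of the idea. It is a toy model and not GCC code: FunctionSummary, write_summaries, read_summaries and round_trip_ok are hypothetical names, and a text stream stands in for the real LTO streamer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical per-function summary: the only data that survives into "WPA".
struct FunctionSummary {
  std::string name;
  std::uint32_t num_pointer_assignments;  // stand-in for real points-to facts
};

// "LGEN/write_summary": write one record per function into the stream.
void write_summaries(std::ostream &out,
                     const std::vector<FunctionSummary> &fns) {
  out << fns.size() << '\n';
  for (const FunctionSummary &f : fns)
    out << f.name << ' ' << f.num_pointer_assignments << '\n';
}

// "WPA/read_summary": rebuild the summaries; no function body is needed.
std::vector<FunctionSummary> read_summaries(std::istream &in) {
  std::size_t count = 0;
  in >> count;
  std::vector<FunctionSummary> fns;
  for (std::size_t i = 0; i < count; ++i) {
    FunctionSummary f{};
    in >> f.name >> f.num_pointer_assignments;
    fns.push_back(f);
  }
  return fns;
}

// Self-check: what was written is exactly what gets read back.
bool round_trip_ok() {
  std::vector<FunctionSummary> written = {{"foo", 2}, {"bar", 0}};
  std::stringstream stream;
  write_summaries(stream, written);
  std::vector<FunctionSummary> read = read_summaries(stream);
  return read.size() == 2 && read[0].name == "foo" &&
         read[0].num_pointer_assignments == 2 && read[1].name == "bar" &&
         read[1].num_pointer_assignments == 0;
}
```

The point of the sketch is only that the reader side works from the serialized records alone; anything not written during the first stage is simply unavailable later.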
I recently started looking into how IPA-CP works. I noticed that IPA-CP also compiles every function down into its own function summary. However, while reading the summaries, it selectively decides which information to store by looking at symtab_node::prevailing_p. I was not aware of this function, but from what I understand it is a way of deciding which symtab_node's body survives when duplicates are removed before the execute stage of the pass. Is this correct?

For IPA-CP, the summaries of those cgraph_nodes for which symtab_node::prevailing_p returns false are just read and discarded. This makes sense if the content is duplicated. I know that different cgraph_nodes might represent the same function, where one of them may have been specialized and another may have been inlined. I also think that two different cgraph_nodes might represent the same function implementation (i.e., they share the same body and the same information, but this information is duplicated during LGEN across partitions). I believe that it is not until WPA/execute (as opposed to WPA/read_summary) that the distinct cgraph_nodes are merged. Would it be correct to say that a more faithful description is that non-prevailing_p nodes are eliminated while the other ones remain?

However, cgraph_nodes which represent the same function but have been specialized will be marked as prevailing_p. Is this correct? (Here I am not sure about the internals of LTO. In some sense the points-to analysis hasn't run yet, but it is possible that other analyses have already run their WPA/execute stage and have decided that some function bodies need to be specialized, although at that moment they are still virtual clones. Related question: do virtual clones have a cgraph_node?)

I did a little experiment yesterday where I had the following control group:

1. encoded a cgraph_node during LGEN/write_summary
2. decoded a cgraph_node during WPA/read_summary and printed cnode->name ()

and compared it against the following experimental group:

1. encoded a cgraph_node during LGEN/write_summary
2. decoded a cgraph_node during WPA/read_summary
3. printed cnode->name () during WPA/execute

What I found was that during the run of the control group I was able to print all of the cnodes' names. However, during the run of the experimental group only some of the names were printed before a segmentation fault occurred. This might have been because those cgraph_nodes were deleted. My theory is that these are the non-prevailing_p cgraph_nodes, but I haven't confirmed it experimentally. Is this the case?

I also do not know whether all of the data pointed to by these cgraph_node pointers is corrupted or whether only some parts of the cgraph_node (like the name) have been removed from memory. Would a cgraph_node* from the experimental run still have some valid fields during WPA/execute, or should it be considered entirely invalid and not be accessed at all outside of WPA/read_summary?

Looking at the definition of prevailing_p, it seems that all functions without a gimple body will be marked as non-prevailing_p. What does this mean, though? There are definitely calls to external functions, so having a call to a non-prevailing_p node can just mean that you are calling a function with no defined body. But what does it mean for functions that were "merged" or removed because they are duplicates? Can you have a cgraph_edge to a non-prevailing_p cgraph_node whose function body was available at LGEN/write_summary but is no longer available during WPA/execute? If that's the case, how does one know the target of the call?

Sorry if these are too many questions. I do greatly appreciate all the support given to me on the mailing list. In the meantime, I'll continue looking into how ipa-cp works to see what I can learn from other sources.

Thanks
-Erick