On Thu, Jul 22, 2021 at 4:33 PM Erick Ochoa <eoc...@gcc.gnu.org> wrote:
>
> >
> > But the addresses are at LGEN time?
>
> The following is what runs at WPA time
>
> unsigned long pid = streamer_read_uhwi (&ib);
> unsigned long id = streamer_read_uhwi (&ib);
> lto_symtab_encoder_t encoder = file_data->symtab_node_encoder;
> cgraph_node *cnode =
>   dyn_cast<cgraph_node*>(lto_symtab_encoder_deref(encoder, id));
> logger ("%s %ld %ld %p\n", cnode->name (), pid, id, cnode);
>
> > Note the nodes are actually
> > streamed to different instances by input_symtab, then decls are merged
> > (lto_symtab_merge_decls), then I think the IPA
> > pass summaries are read in (to different unmerged instances!), _then_
> > the symtab merging process starts (lto_symtab_merge_symbols).
> > I think the last step eventually calls the cgraph/varpool removal hook
> > IPA passes registered.
>
> Ah, so what you are saying is that during the read_summary stage they
> will still be different, but during execute or
> write_optimization_summary (), will they be finally merged? I think
> maybe the terminology of LGEN/WPA/LTRANS should be expanded to be
> lgen_gen, lgen_write, lwpa_read, lwpa_exec/lwpa_write, ltrans_read,
> ltrans_exec?
>
> So, just to be a bit more concrete, when initializing the
> ipa_opt_pass_d instance one has to write functions which will be
> called by a parent process. Normally I see the following comments with
> them:
>
> generate_summary
> write_summary
> read_summary
> write_optimization_summary
> read_optimization_summary
>
> and finally there's the execute function that gets called.
>
> I am doing the following:
>
> generate_summary            /* generating pid */
> write_summary               /* generating id and writing pid and id */
> read_summary                /* reading and printing the info I described */
> write_optimization_summary  /* nothing yet */
> read_optimization_summary   /* nothing yet */
> execute                     /* nothing yet */
>
> And I think these correspond to the following "LGEN/WPA/LTRANS" stages
>
> 1. lgen (multiple processes) generate_summary
> 2. lgen (multiple processes) write_summary
> 3. wpa (single process) read_summary
> 4. wpa (single process) execute
> 5. wpa? (single process?) write_optimization_summary
> 6. ltrans (multiple processes) read_optimization_summary
>
>
> And you are telling me that cgraph_node and varpool_nodes will have
> the same address only after the beginning of the execute stage but not
> before that?
>
> Is the above correct?
>
> <OPEN EDIT>
>
> I did try printing cnode->name() during execute and it segfaulted, so
> perhaps those function bodies were merged to something else? Note
> that some names were successfully printed out. I'm wondering, can I
> use the function lto_symtab_encoder_deref during execute? I think this
> is unlikely... because in the past I've tried to use
> lto_symtab_encoder_encode during generate_summary and it caused
> segfaults. I'll still give it a try.
>
> Perhaps this is still a bit of progress? But now I'm wondering, if I
> can't use lto_symtab_encoder_deref and the nodes were indeed merged,
> are some of the varpool_node* I saved during read_summary pointing
> to random memory? How am I able to tell which ones survived?

As said, there are modification hooks and there's likely one missing
for your case (merge-A-and-B or at least B removal).

> <CLOSE EDIT>
>
> >
> > It might be that you need a replace hook to do what you want, I think
> > that for example IPA CP encodes references to global vars aka &global
> > as IPA_REF and those are transparently re-written.
> >
> > As said, I think it can be made work but the details, since this is the
> > first IPA pass needing this, can be incomplete infra-structure-wise.
> >
> > Basically you have summaries like
> >
> > 'global = <fn::1>_3'
> >
> > where the <fn::1> should eventually be implicit and the constraints
> > grouped into constraints generated from the respective function body
> > and constraints generated by call stmts (not sure here), and constraints
> > for global variable init. But for the above constraint the point is to
> > make the 'global' references from different LGEN units the same by
> > some means (but not streaming and comparing the actual assembler name).
> >
>
> I'll need some more time to read through how ipa-cp encodes references
> to global variables. Thanks for the suggestion!
>
> I don't really follow the paragraph that details what you think my
> summaries look like. I'm thinking that for
>
> global = <fn::1>_3
>
> global is a variable? and <fn::1>_3 states that it is an SSA variable
> in function 1? I think that can be a possible notation. I prefer to
> just use integers.
>
> What do you mean by implicit?

The <fn::1> should be implicit; the data can just contain '3'. That
assumes you have a set of constraints recorded for each function
definition (which then naturally refers only to SSA vars in that
function).

> But the idea is to essentially "compile" down all
> variables/functions/locals/ssa/etc into integers. And have the program
> represented as integers and relation of integers.
> For example:
>
> int *a;
>
> extern void foo (int *c);
>
> int main ()
> {
>   int b;
>   a = &b;
>   foo (a); // defined in a different function
> }
>
> Should have the following at LGEN time (more specifically write_summary)
>
> variable       -> long integer encoding
> --------------------------------------------
> abstract null  -> $null_id
> cgraph main    -> 0
> cgraph foo     -> 1
> varpool a      -> 2
> tree b         -> 0 x 0 // corresponds to main position 0
> real arg c     -> 1 x 0 // corresponds to foo position 0
>
> Technically we can also map the other way around, we just need to know
> in which "table" the information is stored. (I.e., the symbol_table,
> the local_decl table or the ssa_table...)
>
> Then, we give them a unique id
>
> id for lgen <-> variable       <-> long integer encoding
> --------------------------------------------------------------
> $null_id    <-> abstract null  -> $null_id
> 0           <-> cgraph main    -> 0
> 1           <-> cgraph foo     -> 1
> 2           <-> varpool a      -> 2
> 3           <-> tree b         -> 0 x 0
> 4           <-> real arg c     -> 1 x 0
>
> Then we can generate the constraints
>
> 2 = &3 // a = &b
> 4 = 2 // parm c = a
> call foo
>
> The problem is that because this is happening in parallel the other
> partition might generate the following constraints:
>
> void foo(int *c)
> {
>   c = NULL;
> }
>
> abstract null  -> $null_id
> cgraph foo     -> 0
> formal arg c   -> 0 x 0
>
> Give the following global id:
>
> $null_id <-> abstract null  -> $null_id
> 0        <-> cgraph foo     -> 0
> 1        <-> formal arg c   -> 0 x 0
>
> And have the following constraint:
>
> 1 = $null_id
>
> and so if we were to merge the constraints from both partitions
> naively, we would get that 0 and 1 refer to different parts of the
> program.

Well, yeah - you have to remember and stream this mapping to WPA and
then produce a new merged mapping and rewrite the integers. So I'd
complexify the initial items to not be all integers but tuples of
pieces that remap naturally with the LGEN -> WPA merging process.
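
To make the "tuples that remap" idea a bit more concrete, here is a
minimal standalone sketch in plain C++, not GCC code. It assumes each
LGEN unit streams items as (kind, per-partition symbol index, position)
tuples, and that WPA can resolve every per-partition symbol index to
some identity for the merged symbol. That identity is modelled as a
plain int here; how you actually obtain it from the merged symtab is
the open question above. All names (Item, Remapper, MergedSym, ...) are
hypothetical.

```cpp
#include <cassert>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for the identity of a merged symbol at WPA time.
using MergedSym = int;

enum Kind { SYMBOL, LOCAL, REAL_ARG, FORMAL_ARG };

// What an LGEN unit streams for each of its local ids: a tuple, not a
// bare integer.  `position` is 0 by convention when kind == SYMBOL.
struct Item
{
  Kind kind;
  unsigned sym_index;  // symbol index within the streaming partition
  unsigned position;   // slot relative to that symbol
};

// A constraint over ids; address_of distinguishes "lhs = &rhs".
struct Constraint
{
  unsigned lhs, rhs;
  bool address_of;
};

struct Remapper
{
  // Key that no longer depends on the partition it came from.
  using MergedKey = std::tuple<MergedSym, Kind, unsigned>;

  std::map<MergedKey, unsigned> wpa_ids;                    // key -> WPA id
  std::map<std::pair<unsigned, unsigned>, unsigned> remap;  // (pid, lgen id) -> WPA id

  // Register one streamed item.  resolved[i] is the merged symbol that
  // symbol index i of partition `pid` ended up as.
  unsigned add (unsigned pid, unsigned lgen_id, const Item &item,
                const std::vector<MergedSym> &resolved)
  {
    MergedKey key (resolved.at (item.sym_index), item.kind, item.position);
    auto it = wpa_ids.find (key);
    if (it == wpa_ids.end ())
      {
        unsigned next = wpa_ids.size ();  // allocate a fresh WPA id
        it = wpa_ids.emplace (key, next).first;
      }
    remap[{pid, lgen_id}] = it->second;
    return it->second;
  }

  unsigned lookup (unsigned pid, unsigned lgen_id) const
  {
    return remap.at ({pid, lgen_id});
  }

  // Rewrite a constraint from partition-local ids to WPA ids.
  Constraint rewrite (unsigned pid, Constraint c) const
  {
    c.lhs = lookup (pid, c.lhs);
    c.rhs = lookup (pid, c.rhs);
    return c;
  }
};
```

Fed with the two partitions from your example in the order you list
them, the two 'cgraph foo' entries collapse to one WPA id while the
real and formal arg c stay distinct, because the key carries the kind
as well as the symbol and position.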
> I am trying to get the primary IDs to match at WPA time to be something like:
>
> FROM PARTITION pid 1
> 0 <-> cgraph main  -> 0
> 1 <-> cgraph foo   -> 1
> 2 <-> varpool a    -> 2
> 3 <-> tree b       -> 0 x 0
> 4 <-> real arg c   -> 1 x 0
>
> 2 = &3 // a = &b
> 4 = 2 // parm c = a
> call 1
>
> FROM PARTITION pid 2
> $null_id <-> abstract null  -> $null_id
> 0        <-> cgraph foo     -> 0
> 1        <-> formal arg c   -> 0 x 0
>
> 1 = $null_id
>
> MERGED with a map back to their old PID
> wpa id, pid x lgen id, var,
> 0 <-> 1 x 0 <-> cgraph main   -> 0
> 1 <-> 1 x 1 <-> cgraph foo    -> 1
> 1 <-> 2 x 0 <-> cgraph foo    -> 0
> 2 <-> 1 x 2 <-> varpool a     -> 2
> 3 <-> 1 x 3 <-> tree b        -> 0 x 0
> 4 <-> 1 x 4 <-> real arg c    -> 1 x 0
> 5 <-> 2 x 1 <-> formal arg c  -> 1 x 0
>
> 2 = &3 // a = &b
> 4 = 2 // real arg c = a
> call 1 // call foo
> 5 = $null_id // formal arg c = NULL
>
> Finally, with this information we can run points-to analysis using
> integers standing in for memory locations and can output a pointer
> pointee relationship also as integers.
>
> I don't want to go through the whole derivation (and I already omitted
> details and probably have made some silly mistakes here) but in the
> end, for example we should at least have:
>
> Pointer, pointee
> ---------------------
> 2, 3 // a may-points-to b
> 4, 3 // real arg c may-points-to b
> 2, $null_id // a may-points-to NULL
> 5, $null_id // formal arg c may-points-to NULL
> 5, 3 // formal arg c may-points-to b
>
> And we can use these numbers to map back to the gimple source.
>
> This might be inefficient and there's room for removing some
> redundancy, but that's kinda what I'm thinking about.
>
> >
> >
> > One node is dropped and all references are adjusted. And somehow
> > IPA passes are notified about this _after_ they have read their
> > summaries.
> >
> > Richard.