On Thu, Jul 22, 2021 at 4:33 PM Erick Ochoa <eoc...@gcc.gnu.org> wrote:
>
> >
> > But the addresses are at LGEN time?
>
> The following is what runs at WPA time
>
> unsigned long pid = streamer_read_uhwi (&ib);
> unsigned long id = streamer_read_uhwi (&ib);
> lto_symtab_encoder_t encoder = file_data->symtab_node_encoder;
> cgraph_node *cnode =
> dyn_cast<cgraph_node*>(lto_symtab_encoder_deref(encoder, id));
> logger ("%s %ld %ld %p\n", cnode->name (), pid, id, cnode);
>
> > Note the nodes are actually
> > streamed to different instances by input_symtab, then decls are merged
> > (lto_symtab_merge_decls), then I think the IPA
> > pass summaries are read in (to different unmerged instances!), _then_
> > the symtab merging process starts (lto_symtab_merge_symbols).
> > I think the last step eventually calls the cgraph/varpool removal hook
> > IPA passes registered.
>
> Ah, so what you are saying is that during the read_summary stage they
> will still be different, but during execute or
> write_optimization_summary (), will they be finally merged? I think
> maybe the terminology of LGEN/WPA/LTRANS should be expanded to be
> lgen_gen, lgen_write, lwpa_read, lwpa_exec/lwpa_write, ltrans_read,
> ltrans_exec?
>
> So, just to be a bit more concrete, when initializing the
> ipa_opt_pass_d instance one has to write functions which will be
> called by a parent process. Normally I see the following comments with
> them:
>
> generate_summary
> write_summary
> read_summary
> write_optimization_summary
> read_optimization_summary
>
> and finally there's the execute function that gets called.
>
> I am doing the following:
>
> generate_summary, /* generating pid */
> write_summary /* generating id and writing pid and id */
> read_summary /* reading and printing the info described above */
> write_optimization_summary /* nothing yet */
> read_optimization_summary /* nothing yet */
> execute /* nothing yet */
>
> And I think these correspond to the following "LGEN/WPA/LTRANS" stages
>
> 1. lgen (multiple processes) generate_summary
> 2. lgen (multiple processes) write_summary
> 3. wpa (single process) read_summary
> 4. wpa (single process) execute
> 5. wpa? (single process?) write_optimization_summary
> 6. ltrans (multiple processes) read_optimization_summary
>
>
> And you are telling me that cgraph_node and varpool_nodes will have
> the same address only after the beginning of the execute stage but not
> before that?
>
> Is the above correct?
>
> <OPEN EDIT>
>
> I did try printing cnode->name() during execute and it segfaulted, so
> perhaps those function bodies were merged into something else? Note
> that some names were successfully printed out. I'm wondering, can I
> use the function lto_symtab_encoder_deref during execute? I think this
> is unlikely... because in the past I've tried to use
> lto_symtab_encoder_encode during generate_summary and it caused
> segfaults. I'll still give it a try.
>
> Perhaps this is still a bit of progress? But now I'm wondering, if I
> can't use lto_symtab_encoder_deref and the nodes were indeed merged,
> are some of the varpool_node* I saved during read_summary now pointing
> to random memory? How can I tell which ones survived?

As I said, there are modification hooks, and there is likely one missing
for your case (merge A and B, or at least removal of B).
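
To make the removal-hook idea concrete, here is a minimal standalone sketch.
It does not use GCC's real types: Node, SymbolTable and SummaryTable are
hypothetical stand-ins for a symtab node, the symbol table and your pass's
table of nodes saved in read_summary. The point is only that the pass
registers a hook and drops its saved pointer when a node goes away, so it
never dereferences a stale one.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a symbol table node (not GCC's cgraph_node).
struct Node {
  std::string name;
};

// Hypothetical stand-in for the symbol table: it owns the nodes and calls
// every registered hook just before a node is destroyed.
class SymbolTable {
public:
  using RemovalHook = std::function<void(Node *)>;

  Node *create(const std::string &name) {
    nodes_.push_back(std::make_unique<Node>(Node{name}));
    return nodes_.back().get();
  }

  void add_removal_hook(RemovalHook hook) { hooks_.push_back(std::move(hook)); }

  // Remove NODE, for example because it was merged into another node.
  void remove(Node *node) {
    for (const auto &hook : hooks_)
      hook(node);  // the node is still valid while the hooks run
    nodes_.erase(std::remove_if(nodes_.begin(), nodes_.end(),
                                [node](const std::unique_ptr<Node> &p) {
                                  return p.get() == node;
                                }),
                 nodes_.end());
  }

private:
  std::vector<std::unique_ptr<Node>> nodes_;
  std::vector<RemovalHook> hooks_;
};

// Pass-local table: (pid, id) as read in read_summary -> node pointer.
class SummaryTable {
public:
  explicit SummaryTable(SymbolTable &symtab) {
    symtab.add_removal_hook([this](Node *dead) { forget(dead); });
  }
  // The hook captures `this`, so the table must not be copied.
  SummaryTable(const SummaryTable &) = delete;
  SummaryTable &operator=(const SummaryTable &) = delete;

  void record(unsigned long pid, unsigned long id, Node *node) {
    nodes_[{pid, id}] = node;
  }

  // Returns nullptr if the node was never recorded or has been removed.
  Node *lookup(unsigned long pid, unsigned long id) const {
    auto it = nodes_.find({pid, id});
    return it == nodes_.end() ? nullptr : it->second;
  }

private:
  // Drop every entry that still points at the node being removed.
  void forget(Node *dead) {
    for (auto it = nodes_.begin(); it != nodes_.end();) {
      if (it->second == dead)
        it = nodes_.erase(it);
      else
        ++it;
    }
  }

  std::map<std::pair<unsigned long, unsigned long>, Node *> nodes_;
};
```

With this shape, "which ones survived?" is simply whichever entries lookup
still returns. Note that a pure removal hook only tells you B went away; it
does not tell you that A replaced it, which is the part I suspect is missing.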

> <CLOSE EDIT>
>
> >
> > It might be that you need a replace hook to do what you want, I think
> > that for example IPA CP encodes references to global vars aka &global
> > as IPA_REF and those are transparently re-written.
> >
> > As said, I think it can be made to work, but since this is the first
> > IPA pass needing this, the infrastructure may be incomplete in places.
> >
> > Basically you have summaries like
> >
> >  'global = <fn::1>_3'
> >
> > where the <fn::1> should eventually be implicit and the constraints
> > grouped into constraints generated from the respective function body
> > and constraints generated by call stmts (not sure here), and constraints
> > for global variable init.  But for the above constraint the point is to
> > make the 'global' references from different LGEN units the same by
> > some means (but not streaming and comparing the actual assembler name).
> >
>
> I'll need some more time to read through how ipa-cp encodes references
> to global variables. Thanks for the suggestion!
>
> I don't really follow the paragraph that details what you think my
> summaries look like. I'm thinking that for
>
> global = <fn::1>_3
>
> global is a variable? and <fn::1>_3 states that it is an SSA variable
> in function 1? I think that can be a possible notation. I prefer to
> just use integers.
>
> What do you mean by implicit?

The <fn::1> should be implicit; the data can just contain '3'.  This assumes
that you have a set of constraints recorded for each function definition
(which then naturally refers only to SSA vars in that function).
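
A rough sketch of what I mean, with made-up types (Operand, Constraint,
FunctionSummary and qualify are all hypothetical, not existing GCC
structures): the function index is stored once per summary, and an SSA
operand carries only its version number.

```cpp
#include <cassert>
#include <vector>

enum class OperandKind { Global, Ssa };

struct Operand {
  OperandKind kind;
  unsigned index;  // symbol index for Global, SSA version for Ssa
};

// One constraint of the form "lhs = rhs".
struct Constraint {
  Operand lhs;
  Operand rhs;
};

// All constraints generated from one function body. The function is stored
// once here, so SSA operands inside do not repeat it.
struct FunctionSummary {
  unsigned fn_index;
  std::vector<Constraint> constraints;
};

// Fully qualified SSA name, rebuilt only when something needs it.
struct QualifiedSsa {
  unsigned fn_index;
  unsigned ssa_version;
};

QualifiedSsa qualify(const FunctionSummary &fn, const Operand &op) {
  assert(op.kind == OperandKind::Ssa);
  return {fn.fn_index, op.index};
}
```

So 'global = <fn::1>_3' would be stored as { Global 0, Ssa 3 } inside the
summary for function 1, where 0 is just an assumed index for 'global'.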

> But the idea is to essentially "compile" down all
> variables/functions/locals/ssa/etc into integers. And have the program
> represented as integers and relation of integers. For example:
>
> int* a;
>
> extern void foo (int* c);
>
> int main ()
> {
>   int b;
>   a = &b;
>   foo (a); // foo is defined in a different partition
> }
>
> Should have the following at LGEN time (more specifically write_summary)
>
> variable -> long integer encoding
> --------------------------------------------
> abstract null -> $null_id
> cgraph main -> 0
> cgraph foo -> 1
> varpool a -> 2
> tree b -> 0 x 0  // corresponds to main position 0
> real arg c -> 1 x 0 // corresponds to foo position 0
>
> Technically we can also map the other way around, we just need to know
> in which "table" the information is stored. (I.e., the symbol_table,
> the local_decl table or the ssa_table...)
>
> Then, we give them a unique id
>
> id for lgen <-> variable <-> long integer encoding
> --------------------------------------------------------------
> $null_id <-> abstract null -> $null_id
> 0 <-> cgraph main -> 0
> 1 <-> cgraph foo -> 1
> 2 <-> varpool a -> 2
> 3 <-> tree b -> 0 x 0
> 4 <-> real arg c -> 1 x 0
>
> Then we can generate the constraints
>
> 2 = &3 // a = &b
> 4 = 2   // parm c = a
> call foo
>
> The problem is that because this is happening in parallel the other
> partition might generate the following constraints:
>
> void foo(int *c)
> {
>   c = NULL;
> }
>
> abstract null -> $null_id
> cgraph foo -> 0
> formal arg c -> 0 x 0
>
> Give the following global id:
>
> $null_id <-> abstract null -> $null_id
> 0 <-> cgraph foo -> 0
> 1 <-> formal arg c -> 0 x 0
>
> And have the following constraint:
>
> 1 = $null_id
>
> and so if we were to merge the constraints from both partitions
> naively, we would get that 0 and 1 refer to different parts of the
> program.

Well, yeah - you have to remember and stream this mapping to
WPA and then produce a new merged mapping and rewrite the
integers.  So I'd complexify the initial items to not be all integers
but tuples of pieces that remap naturally with the LGEN -> WPA
merging process.
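
Something along these lines, as a sketch only. LgenItem and WpaRemapper are
hypothetical names, and bind_symbol stands for whatever tells you, after
symtab merging, which merged symbol a given (pid, LGEN symbol index) pair
ended up as. How you obtain that binding is exactly the open
infrastructure question; the sketch simply assumes it is available.

```cpp
#include <cassert>
#include <map>
#include <tuple>
#include <utility>

enum class ItemKind { Symbol, Local };

// An item as streamed from one LGEN unit: a tuple instead of a bare integer.
struct LgenItem {
  unsigned pid;       // partition (LGEN unit) that produced the item
  ItemKind kind;
  unsigned symbol;    // LGEN-local symbol index (owning function for Local)
  unsigned position;  // position inside the owning function, 0 for Symbol
};

class WpaRemapper {
public:
  // Record that LGEN symbol (pid, lgen_symbol) is MERGED_SYMBOL after merging.
  void bind_symbol(unsigned pid, unsigned lgen_symbol, unsigned merged_symbol) {
    merged_[{pid, lgen_symbol}] = merged_symbol;
  }

  // Return the WPA-wide id for ITEM, allocating a fresh one on first sight.
  // Throws std::out_of_range if the item's symbol was never bound.
  unsigned remap(const LgenItem &item) {
    unsigned merged = merged_.at({item.pid, item.symbol});
    Key key{item.kind, merged, item.position};
    auto it = ids_.emplace(key, static_cast<unsigned>(ids_.size())).first;
    return it->second;
  }

private:
  using Key = std::tuple<ItemKind, unsigned, unsigned>;
  std::map<std::pair<unsigned, unsigned>, unsigned> merged_;
  std::map<Key, unsigned> ids_;
};
```

Because the key is built from the merged symbol rather than from the
per-partition integer, 'foo' seen as 1 in one partition and as 0 in another
gets a single WPA id, and every constraint is rewritten by running its
operands through remap.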

> I am trying to get the primary IDs to match at WPA time, something like:
>
> FROM PARTITION pid 1
> 0 <-> cgraph main -> 0
> 1 <-> cgraph foo -> 1
> 2 <-> varpool a -> 2
> 3 <-> tree b -> 0 x 0
> 4 <-> real arg c -> 1 x 0
>
> 2 = &3 // a = &b
> 4 = 2   // parm c = a
> call 1
>
> FROM PARTITION pid 2
> $null_id <-> abstract null -> $null_id
> 0 <-> cgraph foo -> 0
> 1 <-> formal arg c -> 0 x 0
>
> 1 = $null_id
>
> MERGED with a map back to their old PID
> wpa id, pid x lgen id, var,
> 0 <-> 1 x 0 <-> cgraph main -> 0
> 1 <-> 1 x 1 <-> cgraph foo -> 1
> 1 <-> 2 x 0 <-> cgraph foo -> 0
> 2 <-> 1 x 2 <-> varpool a -> 2
> 3 <-> 1 x 3 <-> tree b -> 0 x 0
> 4 <-> 1 x 4 <-> real arg c -> 1 x 0
> 5 <-> 2 x 1 <-> formal arg c -> 0 x 0
>
> 2 = &3 // a = &b
> 4 = 2   // real arg c = a
> call 1  //  call foo
> 5 = $null_id  // formal arg c = NULL
>
> Finally, with this information we can run points-to analysis using
> integers standing in for memory locations and can output a pointer
> pointee relationship also as integers.
>
> I don't want to go through the whole derivation (and I already omitted
> details and probably have made some silly mistakes here) but in the
> end, for example we should at least have:
>
> Pointer, pointee
> ---------------------
> 2, 3  // a may-points-to b
> 4, 3  // real arg c may-points-to b
> 2, $null_id // a may-points-to NULL
> 5, $null_id // formal arg c may-points-to NULL
> 5, 3 // formal arg c may-points-to b
>
> And we can use these numbers to map back to the gimple source.
>
> This might be inefficient and there's room for removing some
> redundancy, but that's kinda what I'm thinking about.
>
>
> >
> > One node is dropped and all references are adjusted.  And somehow
> > IPA passes are notified about this _after_ they have read their
> > summaries.
> >
> > Richard.
