Hi everyone,
I have some question regarding the analyzer.
I'm currently working on an fully out-of-tree static analyzer plugin.
I started development on commit tagged as /basepoints/gcc-13/, but
recently moved my code to a more recent /trunk/ branch (build /20230309/).
From my different experiments and analyzer's log readings, I have the
intuition that '/ana::state_machine::state/' class is mostly (if not
only) linked with '/ana::svalue/', i.e. the lvalue of a memory region.
In my analysis case, I would like to also be able to track state for
some rvalue of a memory region, i.e. '/ana::region/' objects.
So, first question: is there any way to associate and track the state of
a rvalue, independently of its lvalue?
To try to clarify the question, here's an example:
'''
int __attribute__("__some_attribute__") x = 42;
/* STEP-1
From now on, we consider the state of x as being marked by some_attribute.
But in fact, in the log, we can observe that we'll have something like
this in the new '/ana::program_state/':
{0x4b955b0: (int)42: marked_state (‘x’)} */
int *y = &x;
/* STEP-2
For analysis purpose, you want to know that from now on, y is pointing
to marked data.
So you set state of the LHS of the GIMPLE statement (i.e. some ssa_name
instance of y) accordingly, with a state named 'points-to_marked_data'
and setting 'x' as the origin of the state (in the sense of the argument
/origin/ from '/ana::sm_context::on_transition/'.
What we now have in the new '/ana::program_state/' is this:
{0x4b9acb0: &x: points-to-marked_data (‘&x’) (origin: 0x4b955b0: (int)42
(‘x’)), 0x4b955b0: (int)42: marked_state (‘x’)} */
int z = *y;
/* STEP-3
Now you have implicit copy of marked data, and you want to report it.
So you state the LHS of the GIMPLE statement (i.e. some ssa_name
instance of z) as being marked, with 'y' as the origin.
What we now have in the new '/ana::program_state/' is this:
{0x4b9acb0: &x: points-to-marked_data (‘&x’) (origin: 0x4b955b0: (int)42
(‘x’)), 0x4b955b0: (int)42: marked_state (‘x’)} */
'''
In STEP-2:
We lost the information saying that the rvalue of y (i.e. y), is
pointing to marked data.
Only remain the information that its lvalue (i.e. &x), is pointing to
marked data, which is of course true.
In STEP-3:
No information is added regarding z in the new program_state.
In fact, if one have a closer look at the log, we see that the LHS of
the GIMPLE statement (the ssa_name instance of z), is already in the
state of 'marked_data'.
Through the log to the call to '/sm_context::on_transition/' this can be
seen:
marked_sm: state transition of ‘z_5’: marked_data -> marked_data
All of this step is somehow leading to confusing diagnostic.
For example, you miss the fact that 'z' is being marked, because no
event happen as it is somehow aliasing 'x' svalue.
Though it might not be true in case of missed optimization.
Of course, if anyone wants to have a look at my code, I can provide it
to you (it is not yet publicly available at the moment).
Also, let me know if there is any imprecise information.
Thank you for your time,
Pierrick