Re: analyzer: Weekly update on extending C++ support (2)

David Malcolm via Gcc Mon, 07 Aug 2023 08:49:21 -0700

On Mon, 2023-08-07 at 15:04 +0200, Benjamin Priour wrote:
> Hi Dave,

Hi Benjamin.


> 
> I made some more progress on moving tests from gcc.dg/analyzer/ to
> c-c++-common/analyzer.

Thanks.

It sounds like you have a large amount of "pending" work that is
accumulating on your hard drive.  This makes me nervous; if we disagree
on how an aspect of the work should be done, it would be better to sort
that out sooner rather than when you've done all the work.  So in the
spirit of "release early, release often", are you able to post a small
representative subset of the work to gcc-patches?

Also, do you have backups?

Further comments inline below...

> I'll only detail the most noteworthy tests I encountered.
> 
> gcc.dg/analyzer/pr103892.c troubled me with an Excess error when
> compiled
> with g++
> analysis bailed out early (91 'after-snode' enodes; 314 enodes)
> [-Wanalyzer-too-complex]
> 
> pr103892.c is compiled with optimization level -O2.
> Analysis bails out when the number of "after-SN" enodes explodes,
> i.e.
> exceeds a certain proportion of the total number of SG nodes.
> limit(after-SN) = m_sg.num_nodes ()  * param-bb-explosion-factor.
> The reason why C and C++ differs is simply due to what '-O2' does to
> each
> of them.
> Under -O0 or -O1, there is no failure whatsoever, under -O2 only C++
> fails,
> whereas with -O3 both gcc and g++ emits -Wanalyzer-too-complex.
> With -O2, although GCC produces a greater total number of "after-SN"
> enodes
> than G++, their proportion barely stays under limit(after-SN) as
> the total number of SN is also bigger, hence no warning is emitted.

Is the gimple seen by -fanalyzer different between the C and C++
frontends for this case?

> 
> All in all, I don't believe there is a significant difference here
> between
> C and C++, nor is there much we can do about this.
> Therefore I'll simply add a { dg-warning "analysis bailed out early"
> "" {
> target c++ } }

Putting in a dg-warning directive would mean that we expect the warning
at that line, which would be over-specifying the behavior of the test.

It's usually better to add -Wno-analyzer-too-complex to the options. 
Looking at git log, that test was added in
3d41408c5d28105e7a3ea2eb2529431a70b96369 as part of a fix for state
explosions.  So the absense of -Wanalyzer-too-complex is something that
the test is effectively asserting.  Hence the -Wno-analyzer-too-complex
should be conditional on { target c++ }.


> An interesting divergence between GCC and G++ was in the handling of
> built-ins.
> Tests gcc.dg/analyzer/{pr104062.c,sprintf-2.c} are failing when
> compiling
> with G++.
> FAIL: c-c++-common/analyzer/pr104062.c  leak of ap7 at line 13 (test
> for
> warnings, line 12)
> The reason is neither realloc nor sprintf are considered by G++ as a
> built-in, contrary to GCC.

Do you know why this happens?  I see both of those tests have their own
prototypes for the functions in question, rather than using system
headers; maybe those prototypes don't match some property that the C++
FE is checking for?

> C built-ins are mapped within the analyzer to known_functions using
> their
> 'enum combined_fn' code
> (See kf.cc:register_known_functions), not their function name.
> Therefore, we find a built-in known function by checking
> tree.h:fndecl_built_in_p returns true
> and calling known_function_manager::get_normal_builtin.
> The former returns false for 'realloc' and 'sprintf' when using G++.
> 
> To fix that, I've derived builtin_known_function from known_function.
> 
> /* Subclass of known_function for builtin functions.  */
> 
> class builtin_known_function : public known_function
> {
> public:
>   virtual enum built_in_function builtin_code () const = 0;
>   tree builtin_decl () const {
>     gcc_assert (builtin_code () < END_BUILTINS);
>     return builtin_info[builtin_code ()].decl;
>   }
> 
>   virtual const builtin_known_function *
>   dyn_cast_builtin_kf () const { return this; }
>   virtual builtin_known_function *
>   dyn_cast_builtin_kf () { return this; }
> };
> 
> And 'kfm.add (BUILT_IN_REALLOC, make_unique<kf_realloc> ());' becomes
> kfm.add ("realloc", make_unique<kf_realloc> ());
> kfm.add ("__builtin_realloc", make_unique<kf_realloc> ());
> 
> Unfortunately we have to register the built-ins using their function
> name
> which is more apt to bugs,
> and the double call to kfm.add is quite messy. Having two instances
> of
> kf_realloc however is not that troubling,
> as builtin_known_function objects are lightweight.
> 
> That GCC built-ins are not recognized by G++ doesn't only impede
> their
> detection,
> but also how we process them, and what information we have of them.
> In gcc.dg/analyzer/sprintf-2.c, sprintf signature follows the manual
> and
> cppreference.
> Yet we expect sm-malloc to warn about use of NULL when either of the
> argument is NULL,
> although there is no attribute(nonnull) specified.
> In fact, sm-malloc isn't using the signature of sprintf as specified
> in the
> test case, but rather the one provided
> by GCC in builtins.def
> 
> /* See e.g. https://en.cppreference.com/w/c/io/fprintf
>    and https://www.man7.org/linux/man-pages/man3/sprintf.3.html */
> 
>   extern int
>   sprintf(char* dst, const char* fmt, ...)
>     __attribute__((__nothrow__)); // No attribute(nonnull)
> 
> int
> test_null_dst (void)
> {
>   return sprintf (NULL, "hello world"); /* { dg-warning "use of NULL
> where
> non-null expected" } */
> }
> 
> int
> test_null_fmt (char *dst)
> {
>   return sprintf (dst, NULL);  /* { dg-warning "use of NULL where
> non-null
> expected" } */
> }
> 
> Signature in builtins.def: DEF_LIB_BUILTIN        (BUILT_IN_SPRINTF,
> "sprintf", BT_FN_INT_STRING_CONST_STRING_VAR,
> ATTR_NOTHROW_NONNULL_1_FORMAT_PRINTF_2_3)
> 
> My question is then : Should G++ behave as GCC and ignore the user's
> signatures of built-ins, instead using the attributes specified by
> GCC ?
> At the moment I went with a "yes". If the function is a builtin, I'm
> operating on its builtin_known_function::builtin_decl () (see above)
> rather
> than the callee_fndecl
> deduced from a gimple call, therefore overriding the user's
> signature.
> 
> Doing so led to tests sprintf-2.c and realloc-1.c to pass both in GCC
> and
> G++.

I think I agree with "yes" here.

> 
> 
> Last test I'd like to discuss is analyzer/pr99193-1.c

I'll reply to this in a separate email

Thanks; hope the above makes sense
Dave

Re: analyzer: Weekly update on extending C++ support (2)

Reply via email to