Re: [PATCH] analyzer: warn on the use of floating points in the size argument [PR106181]

David Malcolm via Gcc-patches Mon, 15 Aug 2022 11:16:23 -0700

On Mon, 2022-08-15 at 14:35 +0200, Tim Lange wrote:
> This patch fixes the ICE reported in PR106181 and adds a new warning
> to
> the analyzer complaining about the use of floating point operands.


Thanks for the patch.

Various comments inline...

> 
> I decided to move the warning for floats inside the size argument out
> of
> the allocation size checker code and toward the allocation such that
> the
> warning only appears once.
> I'm not sure about the wording of the diagnostic, my current wording
> feels
> a bit bulky. 

Agreed, and the warning itself is probably setting a new record for
option length: -Wanalyzer-imprecise-floating-point-arithmetic is 45
characters.  I'm not without sin here: I think the current record is -
Wanalyzer-unsafe-call-within-signal-handler, which is 43.

How about:
 -Wanalyzer-imprecise-float-arithmetic
 -Wanalyzer-imprecise-fp-arithmetic
instead?  (ideas welcome)


> Here is how the diagnostics look like:
> 
> /path/to/main.c: In function ‘test_1’:
> /path/to/main.c:10:14: warning: use of floating point arithmetic
> inside the size argument might yield unexpected results

https://gcc.gnu.org/codingconventions.html#Spelling
says we should use "floating-point" rather than "floating point".

How about just this:

"warning: use of floating-point arithmetic here might yield unexpected
results"

here...

> [-Wanalyzer-imprecise-floating-point-arithmetic]
>    10 |   int *ptr = malloc (sizeof (int) * n); /* { dg-line test_1 }
> */
>       |              ^~~~~~~~~~~~~~~~~~~~~~~~~
>   ‘test_1’: event 1
>     |
>     |   10 |   int *ptr = malloc (sizeof (int) * n); /* { dg-line
> test_1 } */
>     |      |              ^~~~~~~~~~~~~~~~~~~~~~~~~
>     |      |              |
>     |      |              (1) operand ‘n’ is of type ‘float’
>     |
> /path/to/main.c:10:14: note: only use operands of a type that
> represents whole numbers inside the size argument

...and make the note say:

"only use operands of an integer type inside the size argument"

which tells the user that it's a size that we're complaining about.


> /path/to/main.c: In function ‘test_2’:
> /path/to/main.c:20:14: warning: use of floating point arithmetic
> inside the size argument might yield unexpected results [-Wanalyzer-
> imprecise-floating-point-arithmetic]
>    20 |   int *ptr = malloc (n * 3.1); /* { dg-line test_2 } */
>       |              ^~~~~~~~~~~~~~~~
>   ‘test_2’: event 1
>     |
>     |   20 |   int *ptr = malloc (n * 3.1); /* { dg-line test_2 } */
>     |      |              ^~~~~~~~~~~~~~~~
>     |      |              |
>     |      |              (1) operand ‘3.1000000000000001e+0’ is of
> type ‘double’
>     |
> /path/to/main.c:20:14: note: only use operands of a type that
> represents whole numbers inside the size argument
> 
> Also, another point to discuss is the event note in case the
> expression is
> wrapped in a variable, such as in test_3:
> /path/to/main.c:30:10: warning: use of floating point arithmetic
> inside the size argument might yield unexpected results [-Wanalyzer-
> imprecise-floating-point-arithmetic]
>    30 |   return malloc (size); /* { dg-line test_3 } */
>       |          ^~~~~~~~~~~~~
>   ‘test_3’: events 1-2
>     |
>     |   37 | void test_3 (float f)
>     |      |      ^~~~~~
>     |      |      |
>     |      |      (1) entry to ‘test_3’
>     |   38 | {
>     |   39 |   void *ptr = alloc_me (f); /* { dg-message "calling
> 'alloc_me' from 'test_3'" } */
>     |      |               ~~~~~~~~~~~~
>     |      |               |
>     |      |               (2) calling ‘alloc_me’ from ‘test_3’
>     |
>     +--> ‘alloc_me’: events 3-4
>            |
>            |   28 | void *alloc_me (size_t size)
>            |      |       ^~~~~~~~
>            |      |       |
>            |      |       (3) entry to ‘alloc_me’
>            |   29 | {
>            |   30 |   return malloc (size); /* { dg-line test_3 } */
>            |      |          ~~~~~~~~~~~~~
>            |      |          |
>            |      |          (4) operand ‘f’ is of type ‘float’
>            |
> 
> I'm not sure if it is easily discoverable that event (4) does refer
> to
> 'size'. I thought about also printing get_representative_tree
> (capacity)
> in the event but that would clutter the event message if the argument
> does hold the full expression. I don't have any strong feelings about
> the
> decision here but if I had to decide I'd leave it as is (especially
> because the warning is probably quite unusual).

Yeah; get_representative_tree tries gets a tree, but trees don't give
us a good way of referring to a local var within a particular stack
frame within a path.  So leaving it as is is OK.

> The index of the argument would also be a possibility, but that would
> get
> tricky for calloc.

[...snip...]

> 
> diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt
> index 61b58c575ff..bef15eae2c3 100644
> --- a/gcc/analyzer/analyzer.opt
> +++ b/gcc/analyzer/analyzer.opt
> @@ -98,6 +98,10 @@ Wanalyzer-free-of-non-heap
>  Common Var(warn_analyzer_free_of_non_heap) Init(1) Warning
>  Warn about code paths in which a non-heap pointer is freed.
>  
> +Wanalyzer-imprecise-floating-point-arithmetic
> +Common Var(warn_analyzer_imprecise_floating_point_arithmetic)
> Init(1) Warning
> +Warn about code paths in which floating point arithmetic is use in

"use" -> "used".

> locations where precise computation is needed.
> +
>  Wanalyzer-jump-through-null
>  Common Var(warn_analyzer_jump_through_null) Init(1) Warning
>  Warn about code paths in which a NULL function pointer is called.
> diff --git a/gcc/analyzer/region-model-impl-calls.cc
> b/gcc/analyzer/region-model-impl-calls.cc
> index 8eebd122d42..c4d7e186963 100644
> --- a/gcc/analyzer/region-model-impl-calls.cc
> +++ b/gcc/analyzer/region-model-impl-calls.cc
> @@ -237,6 +237,7 @@ region_model::impl_call_alloca (const
> call_details &cd)
>    const region *new_reg = create_region_for_alloca (size_sval,
> cd.get_ctxt ());
>    const svalue *ptr_sval
>      = m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg);
> +  check_region_capacity_for_floats (ptr_sval, cd.get_ctxt ());
>    cd.maybe_set_lhs (ptr_sval);
>  }
>  
> @@ -393,6 +394,7 @@ region_model::impl_call_calloc (const
> call_details &cd)
>      {
>        const svalue *ptr_sval
>         = m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg);
> +      check_region_capacity_for_floats (ptr_sval, cd.get_ctxt ());
>        cd.maybe_set_lhs (ptr_sval);
>      }
>  }
> @@ -497,6 +499,7 @@ region_model::impl_call_malloc (const
> call_details &cd)
>      {
>        const svalue *ptr_sval
>         = m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg);
> +      check_region_capacity_for_floats (ptr_sval, cd.get_ctxt ());
>        cd.maybe_set_lhs (ptr_sval);
>      }
>  }

These checks would be simpler if they were consolidated into a single
call to check_region_capacity_for_floats in:
  region_model::set_dynamic_extents
akin to how we have check_dynamic_size_for_taint called there - though
that would make it harder to complain about specific arguments at
function calls.  Given that we're not doing the latter, please
consolidate them.

[...snip...]

>  
> +/* Abstract class for diagnostics related touse of floating point

"touse" -> "to use"

> +   arithmetic where precision is needed.  */
> +
> +class imprecise_floating_point_arithmetic
> +: public
> pending_diagnostic_subclass<imprecise_floating_point_arithmetic>
> +{
> +public:
> +  imprecise_floating_point_arithmetic ()
> +  {}

Do we need thic ctor?  There aren't any fields to initialize.

> +
> +  const char *get_kind () const final override
> +  {
> +    return "imprecise_floating_point_arithmetic";
> +  }

get_kind should probably only be implemented for concrete subclasses,
such as float_as_size_arg.

> +
> +  int get_controlling_option () const final override
> +  {
> +    return OPT_Wanalyzer_imprecise_floating_point_arithmetic;
> +  }
> +
> +  bool
> +  operator== (const imprecise_floating_point_arithmetic &other
> +             ATTRIBUTE_UNUSED) const
> +  {
> +    return true;
> +  }

I don't think it makes sense to have an operator== here for an abstract
class with no fields.

> +};
> +
> +/* Concrete diagnostic to complain about uses of floating point
> arithmetic
> +   in the size argument of malloc etc.  */
> +
> +class float_as_size_arg : public imprecise_floating_point_arithmetic
> +{
> +public:
> +  float_as_size_arg (tree arg) : m_arg (arg)
> +  {}
> +
> +  bool operator== (const float_as_size_arg &other) const
> +  {
> +    return pending_diagnostic::same_tree_p (m_arg, other.m_arg);
> +  }
> +
> +  bool emit (rich_location *rich_loc) final override
> +  {
> +    diagnostic_metadata m;
> +    bool warned = warning_meta (rich_loc, m, get_controlling_option
> (),
> +                               "use of floating point arithmetic
> inside the"
> +                               " size argument might yield
> unexpected"
> +                               " results");
> +    if (warned)
> +      inform (rich_loc->get_loc (), "only use operands of a type
> that"
> +                                   " represents whole numbers inside
> the"
> +                                   " size argument");
> +    return warned;
> +  }
> +
> +  label_text describe_final_event (const evdesc::final_event &ev)
> final
> +  override
> +  {
> +    if (m_arg)
> +      return ev.formatted_print ("operand %qE is of type %qT",
> +                                m_arg, TREE_TYPE (m_arg));
> +    return ev.formatted_print ("at least one operand of the size
> argument is"
> +                              " of a floating point type");
> +  }
> +
> +private:
> +  tree m_arg;
> +};
> +
> +/* Visitor to find uses of floating point variables/constants in an
> svalue.  */
> +
> +class contains_floating_point_visitor : public visitor
> +{
> +public:
> +  contains_floating_point_visitor (const svalue *root_sval) :
> m_result (NULL)
> +  {
> +    root_sval->accept (this);
> +  }
> +
> +  const svalue *get_svalue_to_report ()
> +  {
> +    return m_result;
> +  }
> +
> +  void visit_constant_svalue (const constant_svalue *sval) final
> override
> +  {
> +    /* At the point the analyzer runs, constant integer operands in
> a floating
> +       point expression are already implictly converted to floating
> points.
> +       Thus, we do prefer to report non-constants such that the
> diagnostic
> +       always reports a floating point operand.  */

If the constant is seen first, then the non-constant won't be favored
(though perhaps binary ops get canonicalized so that constants are on
the RHS?).

Maybe rework this so that the visitor accumulates an auto_vec<const
svalue *> of all floating point svalues it sees, and have
get_svalue_to_report do a qsort of them (if non-empty), and pick the
first, using a custom comparator to order them into preference of
reporting?  Though this is possibly overkill.

> +    tree type = sval->get_type ();
> +    if (type && FLOAT_TYPE_P (type) && !m_result)
> +      m_result = sval;
> +  }
> +
> +  void visit_conjured_svalue (const conjured_svalue *sval) final
> override
> +  {
> +    tree type = sval->get_type ();
> +    if (type && SCALAR_FLOAT_TYPE_P (type))
> +      m_result = sval;
> +  }
> +
> +  void visit_initial_svalue (const initial_svalue *sval) final
> override
> +  {
> +    tree type = sval->get_type ();
> +    if (type && SCALAR_FLOAT_TYPE_P (type))
> +      m_result = sval;
> +  }
> +
> +private:
> +  /* Non-null if at least one floating point operand was found.  */
> +  const svalue *m_result;
> +};
> +
> +/* May complain about uses of floating point operands in the
> capacity of SVAL.
> +
> +   SVAL is expected to be a region_svalue.  */
> +
> +void
> +region_model::check_region_capacity_for_floats (const svalue *sval,
> +                                               region_model_context
> *ctxt)
> +const
> +{
> +  if (!ctxt)
> +    return;
> +
> +  if (const region_svalue *reg_sval = dyn_cast <const region_svalue
> *> (sval))
> +    {
> +      const svalue *capacity = get_capacity (reg_sval->get_pointee
> ());
> +      contains_floating_point_visitor v (capacity);
> +      if (const svalue *float_sval = v.get_svalue_to_report ())
> +       {
> +         tree diag_arg = get_representative_tree (float_sval);
> +         ctxt->warn (new float_as_size_arg (diag_arg));
> +       }
> +    }
> +}
> +
>  /* Set the value of the region given by LHS_REG to the value given
>     by RHS_SVAL.
>     Use CTXT to report any warnings associated with writing to
> LHS_REG.  */

[...snip...]

> @@ -9758,6 +9759,7 @@ Enabling this option effectively enables the
> following warnings:
>  -Wanalyzer-fd-use-without-check @gol
>  -Wanalyzer-file-leak @gol
>  -Wanalyzer-free-of-non-heap @gol
> +-Wno-analyzer-imprecise-floating-point-arithmetic @gol

Lose the "no-" here.


>  -Wanalyzer-jump-through-null @gol
>  -Wanalyzer-malloc-leak @gol
>  -Wanalyzer-mismatching-deallocation @gol
> @@ -9946,6 +9948,18 @@ is called on a non-heap pointer (e.g. an on-
> stack buffer, or a global).
>  
>  See @uref{https://cwe.mitre.org/data/definitions/590.html, CWE-590:
> Free of Memory not on the Heap}.
>  
> +@item -Wno-analyzer-imprecise-floating-point-arithmetic
> +@opindex Wanalyzer-imprecise-floating-point-arithmetic
> +@opindex Wno-analyzer-imprecise-floating-point-arithmetic
> +This warning requires @option{-fanalyzer}, which enables it; use
> +@option{-Wno-analyzer-imprecise-floating-point-arithmetic}
> +to disable it.
> +
> +This diagnostic warns for paths through the code in which floating
> point
> +arithmetic is used in locations where precise computation is
> needed.  This
> +diagnostic only warns on use of floating points operands inside the
> +allocation size at the moment.

"inside the allocation size " -> "inside the calculation of an
allocation size".

[...snip...]

Other than the above, the patch looks good; thanks.

Dave

Re: [PATCH] analyzer: warn on the use of floating points in the size argument [PR106181]

Reply via email to