On Mon, 2022-08-15 at 14:35 +0200, Tim Lange wrote: > This patch fixes the ICE reported in PR106181 and adds a new warning > to > the analyzer complaining about the use of floating point operands.
Thanks for the patch. Various comments inline... > > I decided to move the warning for floats inside the size argument out > of > the allocation size checker code and toward the allocation such that > the > warning only appears once. > I'm not sure about the wording of the diagnostic, my current wording > feels > a bit bulky. Agreed, and the warning itself is probably setting a new record for option length: -Wanalyzer-imprecise-floating-point-arithmetic is 45 characters. I'm not without sin here: I think the current record is - Wanalyzer-unsafe-call-within-signal-handler, which is 43. How about: -Wanalyzer-imprecise-float-arithmetic -Wanalyzer-imprecise-fp-arithmetic instead? (ideas welcome) > Here is how the diagnostics look like: > > /path/to/main.c: In function ‘test_1’: > /path/to/main.c:10:14: warning: use of floating point arithmetic > inside the size argument might yield unexpected results https://gcc.gnu.org/codingconventions.html#Spelling says we should use "floating-point" rather than "floating point". How about just this: "warning: use of floating-point arithmetic here might yield unexpected results" here... > [-Wanalyzer-imprecise-floating-point-arithmetic] > 10 | int *ptr = malloc (sizeof (int) * n); /* { dg-line test_1 } > */ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > ‘test_1’: event 1 > | > | 10 | int *ptr = malloc (sizeof (int) * n); /* { dg-line > test_1 } */ > | | ^~~~~~~~~~~~~~~~~~~~~~~~~ > | | | > | | (1) operand ‘n’ is of type ‘float’ > | > /path/to/main.c:10:14: note: only use operands of a type that > represents whole numbers inside the size argument ...and make the note say: "only use operands of an integer type inside the size argument" which tells the user that it's a size that we're complaining about. > /path/to/main.c: In function ‘test_2’: > /path/to/main.c:20:14: warning: use of floating point arithmetic > inside the size argument might yield unexpected results [-Wanalyzer- > imprecise-floating-point-arithmetic] > 20 | int *ptr = malloc (n * 3.1); /* { dg-line test_2 } */ > | ^~~~~~~~~~~~~~~~ > ‘test_2’: event 1 > | > | 20 | int *ptr = malloc (n * 3.1); /* { dg-line test_2 } */ > | | ^~~~~~~~~~~~~~~~ > | | | > | | (1) operand ‘3.1000000000000001e+0’ is of > type ‘double’ > | > /path/to/main.c:20:14: note: only use operands of a type that > represents whole numbers inside the size argument > > Also, another point to discuss is the event note in case the > expression is > wrapped in a variable, such as in test_3: > /path/to/main.c:30:10: warning: use of floating point arithmetic > inside the size argument might yield unexpected results [-Wanalyzer- > imprecise-floating-point-arithmetic] > 30 | return malloc (size); /* { dg-line test_3 } */ > | ^~~~~~~~~~~~~ > ‘test_3’: events 1-2 > | > | 37 | void test_3 (float f) > | | ^~~~~~ > | | | > | | (1) entry to ‘test_3’ > | 38 | { > | 39 | void *ptr = alloc_me (f); /* { dg-message "calling > 'alloc_me' from 'test_3'" } */ > | | ~~~~~~~~~~~~ > | | | > | | (2) calling ‘alloc_me’ from ‘test_3’ > | > +--> ‘alloc_me’: events 3-4 > | > | 28 | void *alloc_me (size_t size) > | | ^~~~~~~~ > | | | > | | (3) entry to ‘alloc_me’ > | 29 | { > | 30 | return malloc (size); /* { dg-line test_3 } */ > | | ~~~~~~~~~~~~~ > | | | > | | (4) operand ‘f’ is of type ‘float’ > | > > I'm not sure if it is easily discoverable that event (4) does refer > to > 'size'. I thought about also printing get_representative_tree > (capacity) > in the event but that would clutter the event message if the argument > does hold the full expression. I don't have any strong feelings about > the > decision here but if I had to decide I'd leave it as is (especially > because the warning is probably quite unusual). Yeah; get_representative_tree tries gets a tree, but trees don't give us a good way of referring to a local var within a particular stack frame within a path. So leaving it as is is OK. > The index of the argument would also be a possibility, but that would > get > tricky for calloc. [...snip...] > > diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt > index 61b58c575ff..bef15eae2c3 100644 > --- a/gcc/analyzer/analyzer.opt > +++ b/gcc/analyzer/analyzer.opt > @@ -98,6 +98,10 @@ Wanalyzer-free-of-non-heap > Common Var(warn_analyzer_free_of_non_heap) Init(1) Warning > Warn about code paths in which a non-heap pointer is freed. > > +Wanalyzer-imprecise-floating-point-arithmetic > +Common Var(warn_analyzer_imprecise_floating_point_arithmetic) > Init(1) Warning > +Warn about code paths in which floating point arithmetic is use in "use" -> "used". > locations where precise computation is needed. > + > Wanalyzer-jump-through-null > Common Var(warn_analyzer_jump_through_null) Init(1) Warning > Warn about code paths in which a NULL function pointer is called. > diff --git a/gcc/analyzer/region-model-impl-calls.cc > b/gcc/analyzer/region-model-impl-calls.cc > index 8eebd122d42..c4d7e186963 100644 > --- a/gcc/analyzer/region-model-impl-calls.cc > +++ b/gcc/analyzer/region-model-impl-calls.cc > @@ -237,6 +237,7 @@ region_model::impl_call_alloca (const > call_details &cd) > const region *new_reg = create_region_for_alloca (size_sval, > cd.get_ctxt ()); > const svalue *ptr_sval > = m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg); > + check_region_capacity_for_floats (ptr_sval, cd.get_ctxt ()); > cd.maybe_set_lhs (ptr_sval); > } > > @@ -393,6 +394,7 @@ region_model::impl_call_calloc (const > call_details &cd) > { > const svalue *ptr_sval > = m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg); > + check_region_capacity_for_floats (ptr_sval, cd.get_ctxt ()); > cd.maybe_set_lhs (ptr_sval); > } > } > @@ -497,6 +499,7 @@ region_model::impl_call_malloc (const > call_details &cd) > { > const svalue *ptr_sval > = m_mgr->get_ptr_svalue (cd.get_lhs_type (), new_reg); > + check_region_capacity_for_floats (ptr_sval, cd.get_ctxt ()); > cd.maybe_set_lhs (ptr_sval); > } > } These checks would be simpler if they were consolidated into a single call to check_region_capacity_for_floats in: region_model::set_dynamic_extents akin to how we have check_dynamic_size_for_taint called there - though that would make it harder to complain about specific arguments at function calls. Given that we're not doing the latter, please consolidate them. [...snip...] > > +/* Abstract class for diagnostics related touse of floating point "touse" -> "to use" > + arithmetic where precision is needed. */ > + > +class imprecise_floating_point_arithmetic > +: public > pending_diagnostic_subclass<imprecise_floating_point_arithmetic> > +{ > +public: > + imprecise_floating_point_arithmetic () > + {} Do we need thic ctor? There aren't any fields to initialize. > + > + const char *get_kind () const final override > + { > + return "imprecise_floating_point_arithmetic"; > + } get_kind should probably only be implemented for concrete subclasses, such as float_as_size_arg. > + > + int get_controlling_option () const final override > + { > + return OPT_Wanalyzer_imprecise_floating_point_arithmetic; > + } > + > + bool > + operator== (const imprecise_floating_point_arithmetic &other > + ATTRIBUTE_UNUSED) const > + { > + return true; > + } I don't think it makes sense to have an operator== here for an abstract class with no fields. > +}; > + > +/* Concrete diagnostic to complain about uses of floating point > arithmetic > + in the size argument of malloc etc. */ > + > +class float_as_size_arg : public imprecise_floating_point_arithmetic > +{ > +public: > + float_as_size_arg (tree arg) : m_arg (arg) > + {} > + > + bool operator== (const float_as_size_arg &other) const > + { > + return pending_diagnostic::same_tree_p (m_arg, other.m_arg); > + } > + > + bool emit (rich_location *rich_loc) final override > + { > + diagnostic_metadata m; > + bool warned = warning_meta (rich_loc, m, get_controlling_option > (), > + "use of floating point arithmetic > inside the" > + " size argument might yield > unexpected" > + " results"); > + if (warned) > + inform (rich_loc->get_loc (), "only use operands of a type > that" > + " represents whole numbers inside > the" > + " size argument"); > + return warned; > + } > + > + label_text describe_final_event (const evdesc::final_event &ev) > final > + override > + { > + if (m_arg) > + return ev.formatted_print ("operand %qE is of type %qT", > + m_arg, TREE_TYPE (m_arg)); > + return ev.formatted_print ("at least one operand of the size > argument is" > + " of a floating point type"); > + } > + > +private: > + tree m_arg; > +}; > + > +/* Visitor to find uses of floating point variables/constants in an > svalue. */ > + > +class contains_floating_point_visitor : public visitor > +{ > +public: > + contains_floating_point_visitor (const svalue *root_sval) : > m_result (NULL) > + { > + root_sval->accept (this); > + } > + > + const svalue *get_svalue_to_report () > + { > + return m_result; > + } > + > + void visit_constant_svalue (const constant_svalue *sval) final > override > + { > + /* At the point the analyzer runs, constant integer operands in > a floating > + point expression are already implictly converted to floating > points. > + Thus, we do prefer to report non-constants such that the > diagnostic > + always reports a floating point operand. */ If the constant is seen first, then the non-constant won't be favored (though perhaps binary ops get canonicalized so that constants are on the RHS?). Maybe rework this so that the visitor accumulates an auto_vec<const svalue *> of all floating point svalues it sees, and have get_svalue_to_report do a qsort of them (if non-empty), and pick the first, using a custom comparator to order them into preference of reporting? Though this is possibly overkill. > + tree type = sval->get_type (); > + if (type && FLOAT_TYPE_P (type) && !m_result) > + m_result = sval; > + } > + > + void visit_conjured_svalue (const conjured_svalue *sval) final > override > + { > + tree type = sval->get_type (); > + if (type && SCALAR_FLOAT_TYPE_P (type)) > + m_result = sval; > + } > + > + void visit_initial_svalue (const initial_svalue *sval) final > override > + { > + tree type = sval->get_type (); > + if (type && SCALAR_FLOAT_TYPE_P (type)) > + m_result = sval; > + } > + > +private: > + /* Non-null if at least one floating point operand was found. */ > + const svalue *m_result; > +}; > + > +/* May complain about uses of floating point operands in the > capacity of SVAL. > + > + SVAL is expected to be a region_svalue. */ > + > +void > +region_model::check_region_capacity_for_floats (const svalue *sval, > + region_model_context > *ctxt) > +const > +{ > + if (!ctxt) > + return; > + > + if (const region_svalue *reg_sval = dyn_cast <const region_svalue > *> (sval)) > + { > + const svalue *capacity = get_capacity (reg_sval->get_pointee > ()); > + contains_floating_point_visitor v (capacity); > + if (const svalue *float_sval = v.get_svalue_to_report ()) > + { > + tree diag_arg = get_representative_tree (float_sval); > + ctxt->warn (new float_as_size_arg (diag_arg)); > + } > + } > +} > + > /* Set the value of the region given by LHS_REG to the value given > by RHS_SVAL. > Use CTXT to report any warnings associated with writing to > LHS_REG. */ [...snip...] > @@ -9758,6 +9759,7 @@ Enabling this option effectively enables the > following warnings: > -Wanalyzer-fd-use-without-check @gol > -Wanalyzer-file-leak @gol > -Wanalyzer-free-of-non-heap @gol > +-Wno-analyzer-imprecise-floating-point-arithmetic @gol Lose the "no-" here. > -Wanalyzer-jump-through-null @gol > -Wanalyzer-malloc-leak @gol > -Wanalyzer-mismatching-deallocation @gol > @@ -9946,6 +9948,18 @@ is called on a non-heap pointer (e.g. an on- > stack buffer, or a global). > > See @uref{https://cwe.mitre.org/data/definitions/590.html, CWE-590: > Free of Memory not on the Heap}. > > +@item -Wno-analyzer-imprecise-floating-point-arithmetic > +@opindex Wanalyzer-imprecise-floating-point-arithmetic > +@opindex Wno-analyzer-imprecise-floating-point-arithmetic > +This warning requires @option{-fanalyzer}, which enables it; use > +@option{-Wno-analyzer-imprecise-floating-point-arithmetic} > +to disable it. > + > +This diagnostic warns for paths through the code in which floating > point > +arithmetic is used in locations where precise computation is > needed. This > +diagnostic only warns on use of floating points operands inside the > +allocation size at the moment. "inside the allocation size " -> "inside the calculation of an allocation size". [...snip...] Other than the above, the patch looks good; thanks. Dave