https://gcc.gnu.org/g:148066bd0560b5136692991dacba15c9f21caf96
commit r15-2287-g148066bd0560b5136692991dacba15c9f21caf96
Author: David Malcolm <dmalc...@redhat.com>
Date:   Wed Jul 24 18:07:54 2024 -0400

    diagnostics: SARIF output: potentially add escaped renderings of source (§3.3.4)

    This patch adds support to our SARIF output for cases where
    rich_loc.escape_on_output_p () is true, such as for -Wbidi-chars.

    In such cases, the pertinent SARIF "location" object gains a property
    bag with property "gcc/escapeNonAscii": true, and the "artifactContent"
    within the location's physical location's snippet gains a "rendered"
    property (§3.3.4) that escapes non-ASCII text in the snippet, such as:

      "rendered": {"text":

    where "text" has a string value such as (for a "trojan source" attack):

      "9 | /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */\n"
      " | ~~~~~~~~ ~~~~~~~~ ^\n"
      " | | | |\n"
      " | | | end of bidirectional context\n"
      " | U+202E (RIGHT-TO-LEFT OVERRIDE) U+2066 (LEFT-TO-RIGHT ISOLATE)\n"

    where the escaping is affected by -fdiagnostics-escape-format=; with
    -fdiagnostics-escape-format=bytes, the rendered text of the above is:

      "9 | /*<e2><80><ae> } <e2><81><a6>if (isAdmin)<e2><81><a9> <e2><81><a6> begin admins only */\n"
      " | ~~~~~~~~~~~~ ~~~~~~~~~~~~ ^\n"
      " | | | |\n"
      " | U+202E (RIGHT-TO-LEFT OVERRIDE) U+2066 (LEFT-TO-RIGHT ISOLATE) end of bidirectional context\n"

    The patch also refactors/adds enough selftest machinery to be able to
    test the snippet generation from within the selftest framework, rather
    than just within DejaGnu (where the regex-based testing isn't
    sophisticated enough to verify such properties as the above).

    gcc/ChangeLog:
            * Makefile.in (OBJS-libcommon): Add selftest-json.o.
            * diagnostic-format-sarif.cc: Include "selftest.h",
            "selftest-diagnostic.h", "selftest-diagnostic-show-locus.h",
            "selftest-json.h", and "text-range-label.h".
            (class content_renderer): New.
            (sarif_builder::m_rules_arr): Convert to std::unique_ptr.
            (sarif_builder::make_location_object): Add class
            escape_nonascii_renderer.  If rich_loc.escape_on_output_p (), pass
            a nonnull escape_nonascii_renderer to
            maybe_make_physical_location_object as its snippet_renderer, and
            add a property bag property "gcc/escapeNonAscii" to the SARIF
            location object.  For other overloads of make_location_object,
            pass nullptr for the snippet_renderer.
            (sarif_builder::maybe_make_region_object_for_context): Add
            "snippet_renderer" param and pass it to
            maybe_make_artifact_content_object.
            (sarif_builder::make_tool_object): Drop "const".
            (sarif_builder::make_driver_tool_component_object): Likewise.
            Use typesafe unique_ptr variant of object::set for setting "rules"
            property on driver_obj.
            (sarif_builder::maybe_make_artifact_content_object): Add param "r"
            and use it to potentially set the "rendered" property (§3.3.4).
            (selftest::test_make_location_object): New.
            (selftest::diagnostic_format_sarif_cc_tests): New.
            * diagnostic-show-locus.cc: Include "text-range-label.h" and
            "selftest-diagnostic-show-locus.h".
            (selftest::diagnostic_show_locus_fixture::diagnostic_show_locus_fixture):
            New.
            (selftest::test_layout_x_offset_display_utf8): Use
            diagnostic_show_locus_fixture to simplify and consolidate setup
            code.
            (selftest::test_diagnostic_show_locus_one_liner): Likewise.
            (selftest::test_one_liner_colorized_utf8): Likewise.
            (selftest::test_diagnostic_show_locus_one_liner_utf8): Likewise.
            * gcc-rich-location.h (class text_range_label): Move to new file
            text-range-label.h.
            * selftest-diagnostic-show-locus.h: New file, based on material in
            diagnostic-show-locus.cc.
            * selftest-json.cc: New file.
            * selftest-json.h: New file.
            * selftest-run-tests.cc (selftest::run_tests): Call
            selftest::diagnostic_format_sarif_cc_tests.
            * selftest.h (selftest::diagnostic_format_sarif_cc_tests): New decl.

    gcc/testsuite/ChangeLog:
            * c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c: Verify
            that we have a property bag with property "gcc/escapeNonAscii": true.
            Verify that we have a "rendered" property for a snippet.
            * gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Include
            "text-range-label.h".

    gcc/ChangeLog:
            * text-range-label.h: New file, taking class text_range_label from
            gcc-rich-location.h.

    libcpp/ChangeLog:
            * include/rich-location.h
            (semi_embedded_vec::semi_embedded_vec): Add copy ctor.
            (rich_location::rich_location): Remove "= delete" from decl of
            copy ctor.  Add deleted decl of move ctor.
            (rich_location::operator=): Remove "= delete" from decl of
            copy assignment.  Add deleted decl of move assignment.
            (fixit_hint::fixit_hint): Add copy ctor decl.  Add deleted decl of
            move.
            (fixit_hint::operator=): Add copy assignment decl.  Add deleted
            decl of move assignment.
            * line-map.cc (rich_location::rich_location): New copy ctor.
            (fixit_hint::fixit_hint): New copy ctor.

    Signed-off-by: David Malcolm <dmalc...@redhat.com>

Diff:
---
 gcc/Makefile.in                                    |   1 +
 gcc/diagnostic-format-sarif.cc                     | 229 +++++++++++++++++++--
 gcc/diagnostic-show-locus.cc                       |  98 ++++-----
 gcc/gcc-rich-location.h                            |  17 --
 gcc/selftest-diagnostic-show-locus.h               |  82 ++++++++
 gcc/selftest-json.cc                               | 119 +++++++++++
 gcc/selftest-json.h                                | 100 +++++++++
 gcc/selftest-run-tests.cc                          |   1 +
 gcc/selftest.h                                     |   1 +
 .../diagnostic-format-sarif-file-Wbidi-chars.c     |   9 +
 .../plugin/diagnostic_plugin_test_show_locus.c     |   1 +
 gcc/text-range-label.h                             |  42 ++++
 libcpp/include/rich-location.h                     |  31 ++-
 libcpp/line-map.cc                                 |  28 +++
 14 files changed, 669 insertions(+), 90 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 4fc86ed7938b..8fba8f7db6a2 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1832,6 +1832,7 @@ OBJS-libcommon = diagnostic-spec.o diagnostic.o diagnostic-color.o \
         vec.o input.o hash-table.o ggc-none.o memory-block.o \
         selftest.o selftest-diagnostic.o sort.o \
         selftest-diagnostic-path.o \
+        selftest-json.o \
         selftest-logical-location.o \
         text-art/box-drawing.o \
         text-art/canvas.o \
diff --git a/gcc/diagnostic-format-sarif.cc b/gcc/diagnostic-format-sarif.cc
index 6f61d89363f2..847e1eb9bdfc 100644
--- a/gcc/diagnostic-format-sarif.cc
+++ b/gcc/diagnostic-format-sarif.cc
@@ -37,9 +37,16 @@ along with GCC; see the file COPYING3.  If not see
 #include "ordered-hash-map.h"
 #include "sbitmap.h"
 #include "make-unique.h"
+#include "selftest.h"
+#include "selftest-diagnostic.h"
+#include "selftest-diagnostic-show-locus.h"
+#include "selftest-json.h"
+#include "text-range-label.h"
 
 /* Forward decls.  */
 class sarif_builder;
+class content_renderer;
+  class escape_nonascii_renderer;
 
 /* Subclasses of sarif_object.
    Keep these in order of their descriptions in the specification.  */
@@ -284,6 +291,20 @@ public:
                                      sarif_builder &builder);
 };
 
+/* Abstract base class for use when making an "artifactContent"
+   object (SARIF v2.1.0 section 3.3): generate a value for the
+   3.3.4 "rendered" property.
+   Can return nullptr, for "no property".  */
+
+class content_renderer
+{
+public:
+  virtual ~content_renderer () {}
+
+  virtual std::unique_ptr<sarif_multiformat_message_string>
+  render (const sarif_builder &builder) const = 0;
+};
+
 /* A class for managing SARIF output (for -fdiagnostics-format=sarif-stderr
    and -fdiagnostics-format=sarif-file).
 
@@ -312,7 +333,6 @@ public:
      property (SARIF v2.1.0 section 3.14.11), as invocation objects
      (SARIF v2.1.0 section 3.20), but we'd want to capture the arguments to
      toplev::main, and the response files.
-   - doesn't capture escape_on_output_p
    - doesn't capture secondary locations within a rich_location
      (perhaps we should use the "relatedLocations" property: SARIF v2.1.0
      section 3.27.22)
@@ -379,7 +399,8 @@ private:
   std::unique_ptr<sarif_physical_location>
   maybe_make_physical_location_object (location_t loc,
                                        enum diagnostic_artifact_role role,
-                                       int column_override);
+                                       int column_override,
+                                       const content_renderer *snippet_renderer);
   std::unique_ptr<sarif_artifact_location>
   make_artifact_location_object (location_t loc);
   std::unique_ptr<sarif_artifact_location>
@@ -390,7 +411,8 @@ private:
   maybe_make_region_object (location_t loc,
                             int column_override) const;
   std::unique_ptr<sarif_region>
-  maybe_make_region_object_for_context (location_t loc) const;
+  maybe_make_region_object_for_context (location_t loc,
+                                        const content_renderer *snippet_renderer) const;
   std::unique_ptr<sarif_region>
   make_region_object_for_hint (const fixit_hint &hint) const;
   std::unique_ptr<sarif_multiformat_message_string>
@@ -402,9 +424,9 @@ private:
   make_run_object (std::unique_ptr<sarif_invocation> invocation_obj,
                    std::unique_ptr<json::array> results);
   std::unique_ptr<sarif_tool>
-  make_tool_object () const;
+  make_tool_object ();
   std::unique_ptr<sarif_tool_component>
-  make_driver_tool_component_object () const;
+  make_driver_tool_component_object ();
   std::unique_ptr<json::array> maybe_make_taxonomies_array () const;
   std::unique_ptr<sarif_tool_component>
   maybe_make_cwe_taxonomy_object () const;
@@ -430,7 +452,8 @@ private:
   std::unique_ptr<sarif_artifact_content>
   maybe_make_artifact_content_object (const char *filename,
                                       int start_line,
-                                      int end_line) const;
+                                      int end_line,
+                                      const content_renderer *r) const;
   std::unique_ptr<sarif_fix>
   make_fix_object (const rich_location &rich_loc);
   std::unique_ptr<sarif_artifact_change>
@@ -460,7 +483,7 @@ private:
 
   bool m_seen_any_relative_paths;
   hash_set <free_string_hash> m_rule_id_set;
-  json::array *m_rules_arr;
+  std::unique_ptr<json::array> m_rules_arr;
 
   /* The set of all CWE IDs we've seen, if any.  */
   hash_set <int_hash <int, 0, 1> > m_cwe_id_set;
@@ -1086,21 +1109,74 @@ sarif_builder::make_location_object (const rich_location &rich_loc,
                                      const logical_location *logical_loc,
                                      enum diagnostic_artifact_role role)
 {
+  class escape_nonascii_renderer : public content_renderer
+  {
+  public:
+    escape_nonascii_renderer (const rich_location &richloc,
+                              enum diagnostics_escape_format escape_format)
+    : m_richloc (richloc),
+      m_escape_format (escape_format)
+    {}
+
+    std::unique_ptr<sarif_multiformat_message_string>
+    render (const sarif_builder &builder) const final override
+    {
+      diagnostic_context dc;
+      diagnostic_initialize (&dc, 0);
+      dc.m_source_printing.enabled = true;
+      dc.m_source_printing.colorize_source_p = false;
+      dc.m_source_printing.show_labels_p = true;
+      dc.m_source_printing.show_line_numbers_p = true;
+
+      rich_location my_rich_loc (m_richloc);
+      my_rich_loc.set_escape_on_output (true);
+
+      dc.set_escape_format (m_escape_format);
+      diagnostic_show_locus (&dc, &my_rich_loc, DK_ERROR);
+
+      std::unique_ptr<sarif_multiformat_message_string> result
+        = builder.make_multiformat_message_string
+            (pp_formatted_text (dc.printer));
+
+      diagnostic_finish (&dc);
+
+      return result;
+    }
+  private:
+    const rich_location &m_richloc;
+    enum diagnostics_escape_format m_escape_format;
+  } the_renderer (rich_loc,
+                  m_context.get_escape_format ());
+
   auto location_obj = ::make_unique<sarif_location> ();
 
   /* Get primary loc from RICH_LOC.  */
   location_t loc = rich_loc.get_loc ();
 
   /* "physicalLocation" property (SARIF v2.1.0 section 3.28.3).  */
+  const content_renderer *snippet_renderer
+    = rich_loc.escape_on_output_p () ? &the_renderer : nullptr;
   if (auto phs_loc_obj
         = maybe_make_physical_location_object (loc, role,
-                                               rich_loc.get_column_override ()))
+                                               rich_loc.get_column_override (),
+                                               snippet_renderer))
     location_obj->set<sarif_physical_location> ("physicalLocation",
                                                 std::move (phs_loc_obj));
 
   /* "logicalLocations" property (SARIF v2.1.0 section 3.28.4).  */
   set_any_logical_locs_arr (*location_obj, logical_loc);
 
+  /* A flag for hinting that the diagnostic involves issues at the
+     level of character encodings (such as homoglyphs, or misleading
+     bidirectional control codes), and thus that it will be helpful
+     to the user if we show some representation of
+     how the characters in the pertinent source lines are encoded.  */
+  if (rich_loc.escape_on_output_p ())
+    {
+      sarif_property_bag &bag = location_obj->get_or_create_properties ();
+      bag.set_bool ("gcc/escapeNonAscii", rich_loc.escape_on_output_p ());
+    }
+
   return location_obj;
 }
 
@@ -1115,7 +1191,8 @@ sarif_builder::make_location_object (const diagnostic_event &event,
 
   /* "physicalLocation" property (SARIF v2.1.0 section 3.28.3).  */
   location_t loc = event.get_location ();
-  if (auto phs_loc_obj = maybe_make_physical_location_object (loc, role, 0))
+  if (auto phs_loc_obj
+        = maybe_make_physical_location_object (loc, role, 0, nullptr))
     location_obj->set<sarif_physical_location> ("physicalLocation",
                                                 std::move (phs_loc_obj));
 
@@ -1144,7 +1221,8 @@ std::unique_ptr<sarif_physical_location>
 sarif_builder::
 maybe_make_physical_location_object (location_t loc,
                                      enum diagnostic_artifact_role role,
-                                     int column_override)
+                                     int column_override,
+                                     const content_renderer *snippet_renderer)
 {
   if (loc <= BUILTINS_LOCATION || LOCATION_FILE (loc) == nullptr)
     return nullptr;
@@ -1161,7 +1239,8 @@ maybe_make_physical_location_object (location_t loc,
     phys_loc_obj->set<sarif_region> ("region", std::move (region_obj));
 
   /* "contextRegion" property (SARIF v2.1.0 section 3.29.5).  */
-  if (auto context_region_obj = maybe_make_region_object_for_context (loc))
+  if (auto context_region_obj
+        = maybe_make_region_object_for_context (loc, snippet_renderer))
     phys_loc_obj->set<sarif_region> ("contextRegion",
                                      std::move (context_region_obj));
 
@@ -1339,7 +1418,10 @@ sarif_builder::maybe_make_region_object (location_t loc,
    the pertinent source.  */
 
 std::unique_ptr<sarif_region>
-sarif_builder::maybe_make_region_object_for_context (location_t loc) const
+sarif_builder::
+maybe_make_region_object_for_context (location_t loc,
+                                      const content_renderer *snippet_renderer)
+  const
 {
   location_t caret_loc = get_pure_location (loc);
 
@@ -1373,7 +1455,8 @@ sarif_builder::maybe_make_region_object_for_context (location_t loc) const
   if (auto artifact_content_obj
         = maybe_make_artifact_content_object (exploc_start.file,
                                               exploc_start.line,
-                                              exploc_finish.line))
+                                              exploc_finish.line,
+                                              snippet_renderer))
     region_obj->set<sarif_artifact_content> ("snippet",
                                              std::move (artifact_content_obj));
 
@@ -1716,7 +1799,7 @@ make_run_object (std::unique_ptr<sarif_invocation> invocation_obj,
 /* Make a "tool" object (SARIF v2.1.0 section 3.18).  */
 
 std::unique_ptr<sarif_tool>
-sarif_builder::make_tool_object () const
+sarif_builder::make_tool_object ()
 {
   auto tool_obj = ::make_unique<sarif_tool> ();
 
@@ -1777,7 +1860,7 @@ sarif_builder::make_tool_object () const
    calls the "driver" (see SARIF v2.1.0 section 3.18.1).  */
 
 std::unique_ptr<sarif_tool_component>
-sarif_builder::make_driver_tool_component_object () const
+sarif_builder::make_driver_tool_component_object ()
 {
   auto driver_obj = ::make_unique<sarif_tool_component> ();
 
@@ -1809,7 +1892,7 @@ sarif_builder::make_driver_tool_component_object () const
     }
 
   /* "rules" property (SARIF v2.1.0 section 3.19.23).  */
-  driver_obj->set ("rules", m_rules_arr);
+  driver_obj->set<json::array> ("rules", std::move (m_rules_arr));
 
   return driver_obj;
 }
@@ -1971,12 +2054,16 @@ sarif_builder::get_source_lines (const char *filename,
 }
 
 /* Make an "artifactContent" object (SARIF v2.1.0 section 3.3) for the given
-   run of lines within FILENAME (including the endpoints).  */
+   run of lines within FILENAME (including the endpoints).
+   If R is non-NULL, use it to potentially set the "rendered"
+   property (3.3.4).  */
 
 std::unique_ptr<sarif_artifact_content>
-sarif_builder::maybe_make_artifact_content_object (const char *filename,
-                                                   int start_line,
-                                                   int end_line) const
+sarif_builder::
+maybe_make_artifact_content_object (const char *filename,
+                                    int start_line,
+                                    int end_line,
+                                    const content_renderer *r) const
 {
   char *text_utf8 = get_source_lines (filename, start_line, end_line);
 
@@ -1994,6 +2081,12 @@ sarif_builder::maybe_make_artifact_content_object (const char *filename,
   artifact_content_obj->set_string ("text", text_utf8);
   free (text_utf8);
 
+  /* 3.3.4 "rendered" property.  */
+  if (r)
+    if (std::unique_ptr<sarif_multiformat_message_string> rendered
+          = r->render (*this))
+      artifact_content_obj->set ("rendered", std::move (rendered));
+
   return artifact_content_obj;
 }
 
@@ -2260,3 +2353,99 @@ diagnostic_output_format_init_sarif_stream (diagnostic_context &context,
                                                    formatted,
                                                    stream));
 }
+
+#if CHECKING_P
+
+namespace selftest {
+
+static void
+test_make_location_object (const line_table_case &case_)
+{
+  diagnostic_show_locus_fixture_one_liner_utf8 f (case_);
+  location_t line_end = linemap_position_for_column (line_table, 31);
+
+  /* Don't attempt to run the tests if column data might be unavailable.  */
+  if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
+    return;
+
+  test_diagnostic_context dc;
+
+  sarif_builder builder (dc, "MAIN_INPUT_FILENAME", true);
+
+  const location_t foo
+    = make_location (linemap_position_for_column (line_table, 1),
+                     linemap_position_for_column (line_table, 1),
+                     linemap_position_for_column (line_table, 8));
+  const location_t bar
+    = make_location (linemap_position_for_column (line_table, 12),
+                     linemap_position_for_column (line_table, 12),
+                     linemap_position_for_column (line_table, 17));
+  const location_t field
+    = make_location (linemap_position_for_column (line_table, 19),
+                     linemap_position_for_column (line_table, 19),
+                     linemap_position_for_column (line_table, 30));
+
+  text_range_label label0 ("label0");
+  text_range_label label1 ("label1");
+  text_range_label label2 ("label2");
+
+  rich_location richloc (line_table, foo, &label0, nullptr);
+  richloc.add_range (bar, SHOW_RANGE_WITHOUT_CARET, &label1);
+  richloc.add_range (field, SHOW_RANGE_WITHOUT_CARET, &label2);
+  richloc.set_escape_on_output (true);
+
+  std::unique_ptr<sarif_location> location_obj
+    = builder.make_location_object
+        (richloc, nullptr, diagnostic_artifact_role::analysis_target);
+  ASSERT_NE (location_obj, nullptr);
+
+  auto physical_location
+    = EXPECT_JSON_OBJECT_WITH_OBJECT_PROPERTY (location_obj.get (),
+                                               "physicalLocation");
+  {
+    auto region
+      = EXPECT_JSON_OBJECT_WITH_OBJECT_PROPERTY (physical_location, "region");
+    ASSERT_JSON_INT_PROPERTY_EQ (region, "startLine", 1);
+    ASSERT_JSON_INT_PROPERTY_EQ (region, "startColumn", 1);
+    ASSERT_JSON_INT_PROPERTY_EQ (region, "endColumn", 7);
+  }
+  {
+    auto context_region
+      = EXPECT_JSON_OBJECT_WITH_OBJECT_PROPERTY (physical_location,
+                                                 "contextRegion");
+    ASSERT_JSON_INT_PROPERTY_EQ (context_region, "startLine", 1);
+
+    {
+      auto snippet
+        = EXPECT_JSON_OBJECT_WITH_OBJECT_PROPERTY (context_region, "snippet");
+
+      /* We expect the snippet's "text" to be a copy of the content.  */
+      ASSERT_JSON_STRING_PROPERTY_EQ (snippet, "text", f.m_content);
+
+      /* We expect the snippet to have a "rendered" whose "text" has a
+         pure ASCII escaped copy of the line (with labels, etc).  */
+      {
+        auto rendered
+          = EXPECT_JSON_OBJECT_WITH_OBJECT_PROPERTY (snippet, "rendered");
+        ASSERT_JSON_STRING_PROPERTY_EQ
+          (rendered, "text",
+           "1 | <U+1F602>_foo = <U+03C0>_bar.<U+1F602>_field<U+03C0>;\n"
+           " | ^~~~~~~~~~~~~ ~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~\n"
+           " | | | |\n"
+           " | label0 label1 label2\n");
+      }
+    }
+  }
+}
+
+/* Run all of the selftests within this file.  */
+
+void
+diagnostic_format_sarif_cc_tests ()
+{
+  for_each_line_table_case (test_make_location_object);
+}
+
+} // namespace selftest
+
+#endif /* CHECKING_P */
diff --git a/gcc/diagnostic-show-locus.cc b/gcc/diagnostic-show-locus.cc
index d0fc2ff1b6d2..8079809be93d 100644
--- a/gcc/diagnostic-show-locus.cc
+++ b/gcc/diagnostic-show-locus.cc
@@ -29,8 +29,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "diagnostic.h"
 #include "diagnostic-color.h"
 #include "gcc-rich-location.h"
+#include "text-range-label.h"
 #include "selftest.h"
 #include "selftest-diagnostic.h"
+#include "selftest-diagnostic-show-locus.h"
 #include "cpplib.h"
 #include "text-art/types.h"
 #include "text-art/theme.h"
@@ -3291,6 +3293,18 @@ namespace selftest {
 
 /* Selftests for diagnostic_show_locus.  */
 
+diagnostic_show_locus_fixture::
+diagnostic_show_locus_fixture (const line_table_case &case_,
+                               const char *content)
+: m_content (content),
+  m_tmp_source_file (SELFTEST_LOCATION, ".c", content),
+  m_ltt (case_),
+  m_fc ()
+{
+  linemap_add (line_table, LC_ENTER, false,
+               m_tmp_source_file.get_filename (), 1);
+}
+
 /* Verify that cpp_display_width correctly handles escaping.  */
 
 static void
@@ -3395,11 +3409,9 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
      no multibyte characters earlier on the line.  */
   const int emoji_col = 102;
 
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
-  file_cache fc;
-  line_table_test ltt (case_);
+  diagnostic_show_locus_fixture f (case_, content);
 
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  linemap_add (line_table, LC_ENTER, false, f.get_filename (), 1);
 
   location_t line_end = linemap_position_for_column (line_table, line_bytes);
 
@@ -3407,16 +3419,16 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
   if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
     return;
 
-  ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+  ASSERT_STREQ (f.get_filename (), LOCATION_FILE (line_end));
   ASSERT_EQ (1, LOCATION_LINE (line_end));
   ASSERT_EQ (line_bytes, LOCATION_COLUMN (line_end));
 
-  char_span lspan = fc.get_source_line (tmp.get_filename (), 1);
+  char_span lspan = f.m_fc.get_source_line (f.get_filename (), 1);
   ASSERT_EQ (line_display_cols,
              cpp_display_width (lspan.get_buffer (), lspan.length (),
                                 def_policy ()));
   ASSERT_EQ (line_display_cols,
-             location_compute_display_column (fc,
+             location_compute_display_column (f.m_fc,
                                               expand_location (line_end),
                                               def_policy ()));
   ASSERT_EQ (0, memcmp (lspan.get_buffer () + (emoji_col - 1),
@@ -4215,10 +4227,8 @@ test_diagnostic_show_locus_one_liner (const line_table_case &case_)
      ....................0000000001111111.
      ....................1234567890123456.  */
   const char *content = "foo = bar.field;\n";
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
-  line_table_test ltt (case_);
 
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  diagnostic_show_locus_fixture f (case_, content);
 
   location_t line_end = linemap_position_for_column (line_table, 16);
 
@@ -4226,7 +4236,7 @@ test_diagnostic_show_locus_one_liner (const line_table_case &case_)
   if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
     return;
 
-  ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+  ASSERT_STREQ (f.get_filename (), LOCATION_FILE (line_end));
   ASSERT_EQ (1, LOCATION_LINE (line_end));
   ASSERT_EQ (16, LOCATION_COLUMN (line_end));
 
@@ -4246,27 +4256,17 @@ test_diagnostic_show_locus_one_liner (const line_table_case &case_)
   test_one_liner_labels ();
 }
 
-/* Version of all one-liner tests exercising multibyte awareness.  For
-   simplicity we stick to using two multibyte characters in the test, U+1F602
-   == "\xf0\x9f\x98\x82", which uses 4 bytes and 2 display columns, and U+03C0
-   == "\xcf\x80", which uses 2 bytes and 1 display column.  Note: all of the
-   below asserts would be easier to read if we used UTF-8 directly in the
-   string constants, but it seems better not to demand the host compiler
-   support this, when it isn't otherwise necessary.  Instead, whenever an
-   extended character appears in a string, we put a line break after it so that
-   all succeeding characters can appear visually at the correct display column.
+/* Version of all one-liner tests exercising multibyte awareness.
+   These are all called from test_diagnostic_show_locus_one_liner,
+   which uses diagnostic_show_locus_fixture_one_liner_utf8 to create
+   the test file; see the notes in diagnostic-show-locus-selftest.h.
 
-   All of these work on the following 1-line source file:
-
-     .0000000001111111111222222 display
-     .1234567890123456789012345 columns
-      "SS_foo = P_bar.SS_fieldP;\n"
-     .0000000111111111222222223 byte
-     .1356789012456789134567891 columns
-
-   which is set up by test_diagnostic_show_locus_one_liner and calls
-   them.  Here SS represents the two display columns for the U+1F602 emoji and
-   P represents the one display column for the U+03C0 pi symbol.  */
+   Note: all of the below asserts would be easier to read if we used UTF-8
+   directly in the string constants, but it seems better not to demand the
+   host compiler support this, when it isn't otherwise necessary.  Instead,
+   whenever an extended character appears in a string, we put a line break
+   after it so that all succeeding characters can appear visually at the
+   correct display column.  */
 
 /* Just a caret.  */
 
@@ -4784,25 +4784,27 @@ test_one_liner_colorized_utf8 ()
   ASSERT_STR_CONTAINS (first_pi + 2, "\xcf\x80");
 }
 
+static const char * const one_liner_utf8_content
+  /* Display columns.
+     0000000000000000000000011111111111111111111111111111112222222222222
+     1111111122222222345678900000000123456666666677777777890123444444445 */
+  = "\xf0\x9f\x98\x82_foo = \xcf\x80_bar.\xf0\x9f\x98\x82_field\xcf\x80;\n";
+  /* 0000000000000000000001111111111111111111222222222222222222222233333
+     1111222233334444567890122223333456789999000011112222345678999900001
+     Byte columns.  */
+
+diagnostic_show_locus_fixture_one_liner_utf8::
+diagnostic_show_locus_fixture_one_liner_utf8 (const line_table_case &case_)
+: diagnostic_show_locus_fixture (case_, one_liner_utf8_content)
+{
+}
+
 /* Run the various one-liner tests.  */
 
 static void
 test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
 {
-  /* Create a tempfile and write some text to it.  */
-  const char *content
-    /* Display columns.
-       0000000000000000000000011111111111111111111111111111112222222222222
-       1111111122222222345678900000000123456666666677777777890123444444445 */
-    = "\xf0\x9f\x98\x82_foo = \xcf\x80_bar.\xf0\x9f\x98\x82_field\xcf\x80;\n";
-    /* 0000000000000000000001111111111111111111222222222222222222222233333
-       1111222233334444567890122223333456789999000011112222345678999900001
-       Byte columns.  */
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
-  file_cache fc;
-  line_table_test ltt (case_);
-
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  diagnostic_show_locus_fixture_one_liner_utf8 f (case_);
 
   location_t line_end = linemap_position_for_column (line_table, 31);
 
@@ -4810,14 +4812,14 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
   if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
     return;
 
-  ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+  ASSERT_STREQ (f.get_filename (), LOCATION_FILE (line_end));
   ASSERT_EQ (1, LOCATION_LINE (line_end));
   ASSERT_EQ (31, LOCATION_COLUMN (line_end));
 
-  char_span lspan = fc.get_source_line (tmp.get_filename (), 1);
+  char_span lspan = f.m_fc.get_source_line (f.get_filename (), 1);
   ASSERT_EQ (25, cpp_display_width (lspan.get_buffer (), lspan.length (),
                                     def_policy ()));
-  ASSERT_EQ (25, location_compute_display_column (fc,
+  ASSERT_EQ (25, location_compute_display_column (f.m_fc,
                                                   expand_location (line_end),
                                                   def_policy ()));
 
diff --git a/gcc/gcc-rich-location.h b/gcc/gcc-rich-location.h
index 0cd19aab1a7e..55798847726f 100644
--- a/gcc/gcc-rich-location.h
+++ b/gcc/gcc-rich-location.h
@@ -121,21 +121,4 @@ class gcc_rich_location : public rich_location
                                      location_t indent);
 };
 
-/* Concrete subclass of libcpp's range_label.
-   Simple implementation using a string literal.  */
-
-class text_range_label : public range_label
-{
- public:
-  text_range_label (const char *text) : m_text (text) {}
-
-  label_text get_text (unsigned /*range_idx*/) const final override
-  {
-    return label_text::borrow (m_text);
-  }
-
- private:
-  const char *m_text;
-};
-
 #endif /* GCC_RICH_LOCATION_H */
diff --git a/gcc/selftest-diagnostic-show-locus.h b/gcc/selftest-diagnostic-show-locus.h
new file mode 100644
index 000000000000..6454b09ac21d
--- /dev/null
+++ b/gcc/selftest-diagnostic-show-locus.h
@@ -0,0 +1,82 @@
+/* Support for selftests involving diagnostic_show_locus.
+   Copyright (C) 1999-2024 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_SELFTEST_DIAGNOSTIC_SHOW_LOCUS_H
+#define GCC_SELFTEST_DIAGNOSTIC_SHOW_LOCUS_H
+
+#include "selftest.h"
+
+/* The selftest code should entirely disappear in a production
+   configuration, hence we guard all of it with #if CHECKING_P.  */
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* RAII class for use in selftests involving diagnostic_show_locus.
+
+   Manages creating and cleaning up the following:
+   - writing out a temporary .c file containing CONTENT
+   - temporarily override the global "line_table" (using CASE_) and
+     push a line_map starting at the first line of the temporary file
+   - provide a file_cache.  */
+
+struct diagnostic_show_locus_fixture
+{
+  diagnostic_show_locus_fixture (const line_table_case &case_,
+                                 const char *content);
+
+  const char *get_filename () const
+  {
+    return m_tmp_source_file.get_filename ();
+  }
+
+  const char *m_content;
+  temp_source_file m_tmp_source_file;
+  line_table_test m_ltt;
+  file_cache m_fc;
+};
+
+/* Fixture for one-liner tests exercising multibyte awareness.  For
+   simplicity we stick to using two multibyte characters in the test, U+1F602
+   == "\xf0\x9f\x98\x82", which uses 4 bytes and 2 display columns, and U+03C0
+   == "\xcf\x80", which uses 2 bytes and 1 display column.
+
+   This works with the following 1-line source file:
+
+     .0000000001111111111222222 display
+     .1234567890123456789012345 columns
+      "SS_foo = P_bar.SS_fieldP;\n"
+     .0000000111111111222222223 byte
+     .1356789012456789134567891 columns
+
+   Here SS represents the two display columns for the U+1F602 emoji and
+   P represents the one display column for the U+03C0 pi symbol.  */
+
+struct diagnostic_show_locus_fixture_one_liner_utf8
+  : public diagnostic_show_locus_fixture
+{
+  diagnostic_show_locus_fixture_one_liner_utf8 (const line_table_case &case_);
+};
+
+} // namespace selftest
+
+#endif /* #if CHECKING_P */
+
+#endif /* GCC_SELFTEST_DIAGNOSTIC_SHOW_LOCUS_H */
diff --git a/gcc/selftest-json.cc b/gcc/selftest-json.cc
new file mode 100644
index 000000000000..86f27cb82999
--- /dev/null
+++ b/gcc/selftest-json.cc
@@ -0,0 +1,119 @@
+/* Selftest support for JSON.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   Contributed by David Malcolm <dmalc...@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#define INCLUDE_MEMORY
+#include "system.h"
+#include "coretypes.h"
+#include "diagnostic.h"
+#include "selftest.h"
+#include "selftest-json.h"
+
+/* The selftest code should entirely disappear in a production
+   configuration, hence we guard all of it with #if CHECKING_P.  */
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* Assert that VALUE is a non-null json::object,
+   returning it as such, failing at LOC if this isn't the case.  */
+
+const json::object *
+expect_json_object (const location &loc,
+                    const json::value *value)
+{
+  ASSERT_NE_AT (loc, value, nullptr);
+  ASSERT_EQ_AT (loc, value->get_kind (), json::JSON_OBJECT);
+  return static_cast<const json::object *> (value);
+}
+
+/* Assert that VALUE is a non-null json::object that has property
+   PROPERTY_NAME.
+   Return the value of the property.
+   Use LOC for any failures.  */
+
+const json::value *
+expect_json_object_with_property (const location &loc,
+                                  const json::value *value,
+                                  const char *property_name)
+{
+  const json::object *obj = expect_json_object (loc, value);
+  const json::value *property_value = obj->get (property_name);
+  ASSERT_NE_AT (loc, property_value, nullptr);
+  return property_value;
+}
+
+/* Assert that VALUE is a non-null json::object that has property
+   PROPERTY_NAME, and that the value of that property is a non-null
+   json::integer_number equalling EXPECTED_VALUE.
+   Use LOC for any failures.  */
+
+void
+assert_json_int_property_eq (const location &loc,
+                             const json::value *value,
+                             const char *property_name,
+                             long expected_value)
+{
+  const json::value *property_value
+    = expect_json_object_with_property (loc, value, property_name);
+  ASSERT_EQ_AT (loc, property_value->get_kind (), json::JSON_INTEGER);
+  long actual_value
+    = static_cast<const json::integer_number *> (property_value)->get ();
+  ASSERT_EQ_AT (loc, expected_value, actual_value);
+}
+
+/* Assert that VALUE is a non-null json::object that has property
+   PROPERTY_NAME, and that the property value is a non-null JSON object.
+   Return the value of the property as a json::object.
+   Use LOC for any failures.  */
+
+const json::object *
+expect_json_object_with_object_property (const location &loc,
+                                         const json::value *value,
+                                         const char *property_name)
+{
+  const json::value *property_value
+    = expect_json_object_with_property (loc, value, property_name);
+  ASSERT_EQ_AT (loc, property_value->get_kind (), json::JSON_OBJECT);
+  return static_cast<const json::object *> (property_value);
+}
+
+/* Assert that VALUE is a non-null json::object that has property
+   PROPERTY_NAME, and that the value of that property is a non-null
+   JSON string equalling EXPECTED_VALUE.
+   Use LOC for any failures.  */
+
+void
+assert_json_string_property_eq (const location &loc,
+                                const json::value *value,
+                                const char *property_name,
+                                const char *expected_value)
+{
+  const json::value *property_value
+    = expect_json_object_with_property (loc, value, property_name);
+  ASSERT_EQ_AT (loc, property_value->get_kind (), json::JSON_STRING);
+  const json::string *str = static_cast<const json::string *> (property_value);
+  ASSERT_STREQ_AT (loc, expected_value, str->get_string ());
+}
+
+} // namespace selftest
+
+#endif /* #if CHECKING_P */
diff --git a/gcc/selftest-json.h b/gcc/selftest-json.h
new file mode 100644
index 000000000000..75a20d519a4c
--- /dev/null
+++ b/gcc/selftest-json.h
@@ -0,0 +1,100 @@
+/* Selftest support for JSON.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   Contributed by David Malcolm <dmalc...@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_SELFTEST_JSON_H
+#define GCC_SELFTEST_JSON_H
+
+#include "json.h"
+
+/* The selftest code should entirely disappear in a production
+   configuration, hence we guard all of it with #if CHECKING_P.  */
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* Assert that VALUE is a non-null json::object,
+   returning it as such, failing at LOC if this isn't the case.
*/ + +const json::object * +expect_json_object (const location &loc, + const json::value *value); + +/* Assert that VALUE is a non-null json::object that has property + PROPERTY_NAME. + Return the value of the property. + Use LOC for any failures. */ + +const json::value * +expect_json_object_with_property (const location &loc, + const json::value *value, + const char *property_name); + +/* Assert that VALUE is a non-null json::object that has property + PROPERTY_NAME, and that the value of that property is a non-null + json::integer_number equalling EXPECTED_VALUE. + Use LOC for any failures. */ + +void +assert_json_int_property_eq (const location &loc, + const json::value *value, + const char *property_name, + long expected_value); +#define ASSERT_JSON_INT_PROPERTY_EQ(JSON_VALUE, PROPERTY_NAME, EXPECTED_VALUE) \ + assert_json_int_property_eq ((SELFTEST_LOCATION), \ + (JSON_VALUE), \ + (PROPERTY_NAME), \ + (EXPECTED_VALUE)) + +/* Assert that VALUE is a non-null json::object that has property + PROPERTY_NAME, and that the property value is a non-null JSON object. + Return the value of the property as a json::object. + Use LOC for any failures. */ + +const json::object * +expect_json_object_with_object_property (const location &loc, + const json::value *value, + const char *property_name); +#define EXPECT_JSON_OBJECT_WITH_OBJECT_PROPERTY(JSON_VALUE, PROPERTY_NAME) \ + expect_json_object_with_object_property ((SELFTEST_LOCATION), \ + (JSON_VALUE), \ + (PROPERTY_NAME)) + +/* Assert that VALUE is a non-null json::object that has property + PROPERTY_NAME, and that the value of that property is a non-null + JSON string equalling EXPECTED_VALUE. + Use LOC for any failures. 
*/ + +void +assert_json_string_property_eq (const location &loc, + const json::value *value, + const char *property_name, + const char *expected_value); +#define ASSERT_JSON_STRING_PROPERTY_EQ(JSON_VALUE, PROPERTY_NAME, EXPECTED_VALUE) \ + assert_json_string_property_eq ((SELFTEST_LOCATION), \ + (JSON_VALUE), \ + (PROPERTY_NAME), \ + (EXPECTED_VALUE)) + +} // namespace selftest + +#endif /* #if CHECKING_P */ + +#endif /* GCC_SELFTEST_JSON_H */ diff --git a/gcc/selftest-run-tests.cc b/gcc/selftest-run-tests.cc index e6779206c470..d6c88f864ba7 100644 --- a/gcc/selftest-run-tests.cc +++ b/gcc/selftest-run-tests.cc @@ -97,6 +97,7 @@ selftest::run_tests () diagnostic_color_cc_tests (); diagnostic_show_locus_cc_tests (); diagnostic_format_json_cc_tests (); + diagnostic_format_sarif_cc_tests (); edit_context_cc_tests (); fold_const_cc_tests (); spellcheck_cc_tests (); diff --git a/gcc/selftest.h b/gcc/selftest.h index dcb1463ed906..5afc9399c619 100644 --- a/gcc/selftest.h +++ b/gcc/selftest.h @@ -222,6 +222,7 @@ extern void cgraph_cc_tests (); extern void convert_cc_tests (); extern void diagnostic_color_cc_tests (); extern void diagnostic_format_json_cc_tests (); +extern void diagnostic_format_sarif_cc_tests (); extern void diagnostic_path_cc_tests (); extern void diagnostic_show_locus_cc_tests (); extern void digraph_cc_tests (); diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c b/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c index 283df75670d0..8a287d6c8683 100644 --- a/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c +++ b/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c @@ -20,4 +20,13 @@ int main() { { dg-final { scan-sarif-file {"text": "unpaired UTF-8 bidirectional control characters detected"} } } { dg-final { scan-sarif-file {"text": "unpaired UTF-8 bidirectional control characters detected"} } } + + Verify that the expected property bag property is present. 
+ { dg-final { scan-sarif-file {"gcc/escapeNonAscii": true} } } + + Verify that the snippets have a "rendered" property. + We check the contents of the property via a selftest. + + { dg-final { scan-sarif-file {"rendered": } } } + */ diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c index 3bd8cab8c7f3..ad84a4a12ce2 100644 --- a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c +++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c @@ -61,6 +61,7 @@ #include "context.h" #include "print-tree.h" #include "gcc-rich-location.h" +#include "text-range-label.h" int plugin_is_GPL_compatible; diff --git a/gcc/text-range-label.h b/gcc/text-range-label.h new file mode 100644 index 000000000000..8507720e929b --- /dev/null +++ b/gcc/text-range-label.h @@ -0,0 +1,42 @@ +/* Simple implementation of range_label. + Copyright (C) 2014-2024 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +#ifndef GCC_TEXT_RANGE_LABEL_H +#define GCC_TEXT_RANGE_LABEL_H + +#include "rich-location.h" + +/* Concrete subclass of libcpp's range_label. + Simple implementation using a string literal. 
*/ + +class text_range_label : public range_label +{ + public: + text_range_label (const char *text) : m_text (text) {} + + label_text get_text (unsigned /*range_idx*/) const final override + { + return label_text::borrow (m_text); + } + + private: + const char *m_text; +}; + +#endif /* GCC_TEXT_RANGE_LABEL_H */ diff --git a/libcpp/include/rich-location.h b/libcpp/include/rich-location.h index ae4886f13bac..11c181ef0755 100644 --- a/libcpp/include/rich-location.h +++ b/libcpp/include/rich-location.h @@ -91,6 +91,7 @@ class semi_embedded_vec public: semi_embedded_vec (); ~semi_embedded_vec (); + semi_embedded_vec (const semi_embedded_vec &other); unsigned int count () const { return m_num; } T& operator[] (int idx); @@ -115,6 +116,21 @@ semi_embedded_vec<T, NUM_EMBEDDED>::semi_embedded_vec () { } +/* Copy constructor for semi_embedded_vec. */ + +template <typename T, int NUM_EMBEDDED> +semi_embedded_vec<T, NUM_EMBEDDED>::semi_embedded_vec (const semi_embedded_vec &other) +: m_num (0), + m_alloc (other.m_alloc), + m_extra (nullptr) +{ + if (other.m_extra) + m_extra = XNEWVEC (T, m_alloc); + + for (int i = 0; i < other.m_num; i++) + push (other[i]); +} + /* semi_embedded_vec's dtor. Release any dynamically-allocated memory. */ template <typename T, int NUM_EMBEDDED> @@ -387,11 +403,10 @@ class rich_location /* Destructor. */ ~rich_location (); - /* The class manages the memory pointed to by the elements of - the M_FIXIT_HINTS vector and is not meant to be copied or - assigned. */ - rich_location (const rich_location &) = delete; - void operator= (const rich_location &) = delete; + rich_location (const rich_location &); + rich_location (rich_location &&) = delete; + rich_location &operator= (const rich_location &) = delete; + rich_location &operator= (rich_location &&) = delete; /* Accessors. 
*/ location_t get_loc () const { return get_loc (0); } @@ -547,6 +562,8 @@ protected: mutable expanded_location m_expanded_location; + /* The class manages the memory pointed to by the elements of + the m_fixit_hints vector. */ static const int MAX_STATIC_FIXIT_HINTS = 2; semi_embedded_vec <fixit_hint *, MAX_STATIC_FIXIT_HINTS> m_fixit_hints; @@ -605,7 +622,11 @@ class fixit_hint fixit_hint (location_t start, location_t next_loc, const char *new_content); + fixit_hint (const fixit_hint &other); + fixit_hint (fixit_hint &&other) = delete; ~fixit_hint () { free (m_bytes); } + fixit_hint &operator= (const fixit_hint &) = delete; + fixit_hint &operator= (fixit_hint &&) = delete; bool affects_line_p (const line_maps *set, const char *file, diff --git a/libcpp/line-map.cc b/libcpp/line-map.cc index 41aee987b292..05c4dafd89dd 100644 --- a/libcpp/line-map.cc +++ b/libcpp/line-map.cc @@ -2175,6 +2175,26 @@ rich_location::rich_location (line_maps *set, location_t loc, add_range (loc, SHOW_RANGE_WITH_CARET, label, label_highlight_color); } +/* Copy ctor for rich_location. + Take a deep copy of the fixit hints, which are owned; + everything else is borrowed. */ + +rich_location::rich_location (const rich_location &other) +: m_line_table (other.m_line_table), + m_ranges (other.m_ranges), + m_column_override (other.m_column_override), + m_have_expanded_location (other.m_have_expanded_location), + m_seen_impossible_fixit (other.m_seen_impossible_fixit), + m_fixits_cannot_be_auto_applied (other.m_fixits_cannot_be_auto_applied), + m_escape_on_output (other.m_escape_on_output), + m_expanded_location (other.m_expanded_location), + m_fixit_hints (), + m_path (other.m_path) +{ + for (unsigned i = 0; i < other.m_fixit_hints.count (); i++) + m_fixit_hints.push (new fixit_hint (*other.m_fixit_hints[i])); +} + /* The destructor for class rich_location.
*/ rich_location::~rich_location () @@ -2595,6 +2615,14 @@ fixit_hint::fixit_hint (location_t start, { } +fixit_hint::fixit_hint (const fixit_hint &other) +: m_start (other.m_start), + m_next_loc (other.m_next_loc), + m_bytes (xstrdup (other.m_bytes)), + m_len (other.m_len) +{ +} + /* Does this fix-it hint affect the given line? */ bool