On Tue, Aug 15, 2023 at 03:39:40PM -0400, David Malcolm wrote: > On Tue, 2023-08-15 at 13:58 -0400, Lewis Hyatt wrote: > > On Tue, Aug 15, 2023 at 11:43:05AM -0400, David Malcolm wrote: > > > On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote: > > > > Class file_cache_slot in input.cc is used to query specific lines > > > > of source > > > > code from a file when needed by diagnostics infrastructure. This > > > > will be > > > > extended in a subsequent patch to support obtaining the source > > > > code from > > > > in-memory generated buffers rather than from a file. The present > > > > patch > > > > refactors class file_cache_slot, putting most of the logic into a > > > > new base > > > > class cache_data_source, in preparation for reusing that code in > > > > the next > > > > patch. There is no change in functionality yet. > > > > > > [...snip...] > > > > > > > I confess I had to reread both this and patch 4/8 to make sense of > > > this; this is probably one of those cases where it's harder to read > > > in > > > patch form than as source, but I think I now understand the new > > > implementation. > > > > Yes, sorry about that. I hope at least splitting into two patches > > here made it > > a little easier. > > > > > > > > Did you try testing this with valgrind (e.g. "make selftest- > > > valgrind")? > > > > > > > Oh interesting, was not aware of this. I think it shows that new > > leaks were > > not introduced with the patch series. > > > > [...snip...] > > > > > > > > I don't think we have any selftest coverage for "\r" in the line- > > > break > > > handling; that would be good to add. > > > > > > This patch is OK for trunk once the rest of the kit is approved. > > > > Thank you. To be clear, were you suggesting to add selftest coverage > > for \r > > endings now, or in a follow up? > > The former, please, so that we can sure that the patch doesn't > introduce any buffer overreads etc. > > Thanks > Dave >
The following (incremental to patch 5/8 or after) adds selftest coverage for alternate line endings. I hope things aren't too unclear this way; I can resend updated versions of some or all of the patches from scratch, if useful. AFAIK this is the current status of things: Patch 1/8: Reviewed, updated version incorporating feedback has not been acked yet, at: https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627250.html Patch 2/8: OKed, pending tweak to reject fixit hints in generated data, which was sent incrementally here: https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627405.html Patch 3/8: OKed, pending new selftest attached to this email. Patch 4/8: OKed, pending tweak to assert on non-NULL buffers which was sent incrementally here: https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628283.html Patch 5/8: OKed Patch 6/8: OKed Patch 7/8: Not reviewed yet Patch 8/8: Waiting additional feedback from you, perhaps SARIF need not worry about this for now and should just ignore generated data locations. Thanks again for taking the time to go through this, I hope it will prove worth it. -Lewis -- >8 -- gcc/ChangeLog: * input.cc (test_reading_source_line): Test additional cases, including generated data and alternate line endings. (input_cc_tests): Adapt to test_reading_source_line() changes. diff --git a/gcc/input.cc b/gcc/input.cc index 4c99df7a205..72274732c6c 100644 --- a/gcc/input.cc +++ b/gcc/input.cc @@ -2392,30 +2392,51 @@ test_make_location_nonpure_range_endpoints (const line_table_case &case_) /* Verify reading of input files (e.g. for caret-based diagnostics). */ static void -test_reading_source_line () +test_reading_source_line (bool generated, const char *e1, const char *e2) { /* Create a tempfile and write some text to it. */ + const char *line1 = "01234567890123456789"; + const char *line2 = "This is the test text"; + const char *line3 = "This is the 3rd line"; + char content[72]; + const int content_len = snprintf (content, sizeof (content), + "%s%s%s%s%s", + line1, e1, line2, e2, line3); + ASSERT_LT (content_len, (int)sizeof (content)); temp_source_file tmp (SELFTEST_LOCATION, ".txt", - "01234567890123456789\n" - "This is the test text\n" - "This is the 3rd line"); + content, content_len, generated); - /* Read back a specific line from the tempfile. */ - char_span source_line = location_get_source_line (tmp.get_filename (), 3); + /* Read back some specific lines from the tempfile, not all in order. */ + const source_id src = generated + ? source_id (tmp.content_buf, tmp.content_len) + : source_id (tmp.get_filename ()); + + char_span source_line = location_get_source_line (src, 1); + ASSERT_TRUE (source_line); + ASSERT_TRUE (source_line.get_buffer () != NULL); + /* N.B. If the line terminator is \r\n, the returned char_span will include + the \r as part of the line. */ + const size_t off1 = strlen (e1) - 1; + ASSERT_EQ (20 + off1, source_line.length ()); + ASSERT_TRUE (!strncmp (line1, source_line.get_buffer (), + source_line.length () - off1)); + + source_line = location_get_source_line (src, 3); ASSERT_TRUE (source_line); ASSERT_TRUE (source_line.get_buffer () != NULL); ASSERT_EQ (20, source_line.length ()); - ASSERT_TRUE (!strncmp ("This is the 3rd line", - source_line.get_buffer (), source_line.length ())); + ASSERT_TRUE (!strncmp (line3, source_line.get_buffer (), + source_line.length ())); - source_line = location_get_source_line (tmp.get_filename (), 2); + source_line = location_get_source_line (src, 2); ASSERT_TRUE (source_line); ASSERT_TRUE (source_line.get_buffer () != NULL); - ASSERT_EQ (21, source_line.length ()); - ASSERT_TRUE (!strncmp ("This is the test text", - source_line.get_buffer (), source_line.length ())); + const size_t off2 = strlen (e2) - 1; + ASSERT_EQ (21 + off2, source_line.length ()); + ASSERT_TRUE (!strncmp (line2, source_line.get_buffer (), + source_line.length () - off2)); - source_line = location_get_source_line (tmp.get_filename (), 4); + source_line = location_get_source_line (src, 4); ASSERT_FALSE (source_line); ASSERT_TRUE (source_line.get_buffer () == NULL); } @@ -4311,7 +4332,11 @@ input_cc_tests () for_each_line_table_case (test_lexer_string_locations_raw_string_unterminated); for_each_line_table_case (test_lexer_char_constants); - test_reading_source_line (); + const char *const line_endings[] = {"\n", "\r", "\r\n"}; + for (bool generated : {false, true}) + for (const char *e1 : line_endings) + for (const char *e2: line_endings) + test_reading_source_line (generated, e1, e2); test_line_offset_overflow ();