David Malcolm <dmalc...@redhat.com> writes: > On Mon, 2019-09-02 at 09:16 +0000, Andrea Corallo wrote: >> Hi all, >> yesterday I've found an interesting bug in libgccjit. >> Seems we have an hard limitation of 200 characters for literal >> strings. >> Attempting to create longer strings lead to ICE during pass_expand >> while performing a sanity check in get_constant_size. >> >> Tracking down the issue seems the code we have was inspired from >> c-family/c-common.c:c_common_nodes_and_builtins were >> array_domain_type >> is actually defined with a size of 200. >> The comment that follows that point sounded premonitory :) :) >> >> /* Make a type for arrays of characters. >> With luck nothing will ever really depend on the length of this >> array type. */ >> >> At least in the current implementation the type is set by >> fix_string_type were the actual string length is taken in account. >> >> I attach a patch updating the logic accordingly and a new testcase >> for that. >> >> make check-jit is passing clean. >> >> Best Regards >> Andrea > > Sorry about the long delay in reviewing this patch. > >> gcc/jit/ChangeLog >> 2019-??-?? Andrea Corallo <andrea.cora...@arm.com> >> >> * jit-playback.h >> (gcc::jit::recording::context m_recording_ctxt): Remove > ^^^^^^^^^ > "playback" here > >> m_char_array_type_node field. > > [...] > >> @@ -670,9 +669,12 @@ playback::rvalue * >> playback::context:: >> new_string_literal (const char *value) >> { >> - tree t_str = build_string (strlen (value), value); >> - gcc_assert (m_char_array_type_node); >> - TREE_TYPE (t_str) = m_char_array_type_node; >> + /* Compare with c-family/c-common.c: fix_string_type. */ >> + size_t len = strlen (value); >> + tree i_type = build_index_type (size_int (len)); >> + tree a_type = build_array_type (char_type_node, i_type); >> + tree t_str = build_string (len, value); >> + TREE_TYPE (t_str) = a_type; > > This code works with string lengths and string sizes which always > requires a little extra care. I'd like to see at least a comment > discussing this, as it's not immediately clear to me that we're > correctly handling the NUL terminator here. > > Consider the string "foo". > This has strlen == 3, and its size is 4 chars. > > build_string's comment says "Note that for a C string literal, LEN > should include the trailing NUL." > > However, build_string appears to allocate one byte more than LEN, and > write a NUL in that final position. > > fix_string_type has: > > int length = TREE_STRING_LENGTH (value); > > where tree.h has: > > /* In C terms, this is sizeof, not strlen. */ > #define TREE_STRING_LENGTH(NODE) (STRING_CST_CHECK (NODE)- >>string.length) > > and: > nchars = length / charsz; > > where for our purposes charsz == 1 and so nchars == length. > > fix_string_type has: > > i_type = build_index_type (size_int (nchars - 1)); > > So I *think* the patch is correct, but there ought to be a comment at > least. > > Or maybe use TREE_STRING_LENGTH (t_str) to more closely follow > fix_string_type? >
Mmmm when I wrote the patch I looked into the build_string code missing the comment and that was my understanding too. But looking better now in the compiler I think the patch is *not* correct. The trouble is that if we call build_string (3, "foo") (as we would do now) we set TREE_STRING_LENGTH to 3 and this appears to be wrong. Actually build_string can be called using the strlen as argument (see for example the various definition of DEF_ATTR_STRING) but apparently this is not how is done for C strings in the frontends. Given that this is what the comment suggests and that modifying the patch with the +1 keeps it functional I think doing what the front-end does is the right thing. > Also, please add an assertion and comment to the testcase to assert > that the very long string is indeed longer than the previous limit of > 200. Ack > Thanks > Dave Thanks for reviewing Andrea