Hi Jonathan,

On 11/14/22 14:14, Jonathan Wakely wrote:
> On Mon, 14 Nov 2022 at 11:38, Alejandro Colomar via Gcc <gcc@gcc.gnu.org> wrote:
>> BTW, I had another idea to add a suffix to string literals to make them
>> unterminated:
>>
>> char foo[3] = "foo"u;  // OK
>> char bar[4] = "bar";   // OK
>>
>> char baz[4] = "baz"u;  // Warning: initializer is too short.
>> char etc[3] = "etc";   // Warning: unterminated string.
>>
>> Is that doable?  Do you think it makes sense?
>
> IMHO no. This is not useful enough to add a language extension, it's
> an incredibly niche use case.

I agree it's way too niche.

> Your suggested syntax also looks very
> confusing with UTF-16 string literals,

Maybe.

> and is not sufficiently
> distinct from a normal string literal to be obvious when quickly
> reading the code. People expect string literals in C to be
> null-terminated, having a subtle suffix that changes that would be a
> bug farm.

But you have to combine the suffix with the corresponding size (one less than for a normal string), so the programmer has to be doing this consciously. For readers of the code there may be more of a readability issue, especially if they don't know the extension. But once you stop to check what that suffix does and notice that the size looks odd, a reasonable programmer should at least ask, or check the documentation.

Regarding safety, I have that very much in mind too. In an attempt to get the compiler on my side, I decided to use 'char *' for NUL-terminated strings and 'u_char *' for u_nterminated strings. That helps the compiler notice when one is used in place of the other, which, as you say, would be a source of bugs.

Maybe having the type of these new strings be u_char[] instead of char[] would give more type safety. I didn't suggest this because it would depart from how strings in C have always been typed. However, considering that they are not really strings, it could make sense.
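
To make the char/u_char split concrete, here is a minimal sketch. The helper names (term_len, unterm_equal, demo) are made up for illustration, and I spell the type as plain 'unsigned char' so the snippet does not depend on where u_char is declared. As far as I know, GCC reports a mix-up like the commented-out call under -Wpointer-sign, which -Wall implies for C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: only accepts NUL-terminated strings. */
static size_t term_len(const char *s)
{
    return strlen(s);
}

/* Hypothetical helper: compares two unterminated byte sequences of n bytes. */
static int unterm_equal(const unsigned char *a, const unsigned char *b,
                        size_t n)
{
    return memcmp(a, b, n) == 0;
}

static void demo(void)
{
    const char          name[4] = "baz";            /* terminated   */
    const unsigned char tag[3]  = {'b', 'a', 'z'};  /* unterminated */
    const unsigned char same[3] = {'b', 'a', 'z'};

    assert(term_len(name) == 3);
    assert(unterm_equal(tag, same, sizeof(tag)));

    /* term_len(tag);  <- mixes the two kinds; the pointer signedness
     * differs, so the compiler has something to diagnose. */
}
```

The length always travels separately for the unterminated kind, so nothing downstream has to look for a NUL.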


> You can do {'b', 'a', 'z'} if you want an explicitly unterminated array of char.

A bit unreadable :)
I think I'll keep using normal literals, and maybe some workaround to disable the warnings for specific cases. Not my preference, but it can work.
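
For reference, a small sketch of the two spellings side by side. ISO C accepts a string literal as the initializer of a char array that has room for every character except the terminating NUL, so both arrays below hold the same three bytes. The names (etc_lit, etc_chars, same_bytes) are only for illustration, and whether a compiler warns about the first form depends on its version and flags.

```c
#include <assert.h>
#include <string.h>

/* Normal literal: valid C, the NUL is dropped because there is no room. */
static const char etc_lit[3] = "etc";

/* The brace-list alternative: same bytes, spelled out one by one. */
static const char etc_chars[3] = {'e', 't', 'c'};

/* Hypothetical helper: returns 1 if both arrays hold the same 3 bytes. */
static int same_bytes(void)
{
    return memcmp(etc_lit, etc_chars, sizeof(etc_lit)) == 0;
}
```

Note this is C only; C++ rejects the first initializer.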

Cheers,

Alex

--
<http://www.alejandro-colomar.es/>
