Re: [PATCH] libcody: allow non-ASCII module names [PR120458]

Nathaniel Shead Fri, 27 Feb 2026 02:46:15 -0800

Thank you for the patch!  Overall this looks good to me; just a few nits
and suggestions below.


On Thu, Feb 26, 2026 at 04:58:06PM +0200, Jean-Christian CÎRSTEA wrote:
> Before this commit, attempting to use non-ASCII characters in quoted
> words failed, even though the protocol allows the usage of such
> characters in quoted words. To fix this:
> 
> 1. Remove `c >= 0x7f` comparison when parsing a quoted word.
> 2. Use `unsigned char` instead of `char` such that `c < 0x20` fails for
>    non-ASCII characters.
> 
>       PR 120458

This should include the bug component, so 'PR c++/120458'.

> 
> libcody/ChangeLog:
> 
>       * buffer.cc (S2C):
>       (MessageBuffer::Lex):
>       * cody.hh:
> 
> gcc/testsuite/ChangeLog:
> 
>       * g++.dg/README:

Each of these changelog entries should include a very short note
describing what changed.

>       * g++.dg/modules/pr120458_a.C: New test.
>       * g++.dg/modules/pr120458_b.C: New test.
> 
> Signed-off-by: Jean-Christian CÎRSTEA <[email protected]>
> ---
>  gcc/testsuite/g++.dg/README               |  1 +
>  gcc/testsuite/g++.dg/modules/pr120458_a.C |  9 +++++++++
>  gcc/testsuite/g++.dg/modules/pr120458_b.C | 11 +++++++++++
>  libcody/buffer.cc                         |  6 +++---
>  libcody/cody.hh                           |  4 ++--
>  5 files changed, 26 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/modules/pr120458_a.C
>  create mode 100644 gcc/testsuite/g++.dg/modules/pr120458_b.C
> 
> diff --git a/gcc/testsuite/g++.dg/README b/gcc/testsuite/g++.dg/README
> index a7b3d5b783b..3301f17b4df 100644
> --- a/gcc/testsuite/g++.dg/README
> +++ b/gcc/testsuite/g++.dg/README
> @@ -15,6 +15,7 @@ inherit      Tests for inheritance -- virtual functions, multiple inheritance, etc.
>  init  Tests for initialization semantics, constructors/destructors, etc.
>  lookup        Tests for lookup semantics, namespaces, using, etc.
>  lto   Tests for Link Time Optimization.
> +modules  Tests for C++20 modules.
>  opt   Tests for fixes of bugs with particular optimizations.
>  overload Tests for overload resolution and conversions.
>  parse         Tests for parsing.
> diff --git a/gcc/testsuite/g++.dg/modules/pr120458_a.C b/gcc/testsuite/g++.dg/modules/pr120458_a.C
> new file mode 100644
> index 00000000000..0774102b0c9
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/modules/pr120458_a.C
> @@ -0,0 +1,9 @@
> +// check internals by name unless SCC
> +// { dg-additional-options "-fmodules-ts -fdump-lang-module-uid" }

I think we only need "-fmodules" here, since this testcase doesn't use
the dump.

> +
> +export module étrange;
> +// { dg-module-cmi étrange }
> +
> +export unsigned f(unsigned x) {
> +     return x + 3;
> +}
> diff --git a/gcc/testsuite/g++.dg/modules/pr120458_b.C b/gcc/testsuite/g++.dg/modules/pr120458_b.C
> new file mode 100644
> index 00000000000..49835b94323
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/modules/pr120458_b.C
> @@ -0,0 +1,11 @@
> +// check internals by name unless SCC
> +// { dg-additional-options "-fmodules-ts -fdump-lang-module-uid" }
> +
> +export module bar;
> +// { dg-module-cmi bar }

Maybe we could use the example from the PR, "汉字", for a tiny bit of
extra testing.  I think I'd also like to see a simple testcase using a
module mapper file; see e.g. g++.dg/modules/map-1* for an example.

Nathaniel

> +
> +import étrange;
> +
> +export unsigned g(unsigned x) {
> +     return f(x) * 7;
> +}
> diff --git a/libcody/buffer.cc b/libcody/buffer.cc
> index 85c066fef71..d27882b7d4a 100644
> --- a/libcody/buffer.cc
> +++ b/libcody/buffer.cc
> @@ -30,7 +30,7 @@
>  namespace Cody {
>  namespace Detail {
>  
> -static const char CONTINUE = S2C(u8";");
> +static const unsigned char CONTINUE = S2C(u8";");
>  
>  void MessageBuffer::BeginLine ()
>  {
> @@ -239,7 +239,7 @@ int MessageBuffer::Lex (std::vector<std::string> &result)
>  
>    for (std::string *word = nullptr;;)
>      {
> -      char c = *iter;
> +      unsigned char c = *iter;
>  
>        ++iter;
>        if (c == S2C(u8" ") || c == S2C(u8"\t"))
> @@ -292,7 +292,7 @@ int MessageBuffer::Lex (std::vector<std::string> &result)
>                 return EINVAL;
>               }
>  
> -           if (c < S2C(u8" ") || c >= 0x7f)
> +           if (c < S2C(u8" "))
>               goto malformed;
>  
>             ++iter;
> diff --git a/libcody/cody.hh b/libcody/cody.hh
> index 93bce93aa94..7c852eb3aa1 100644
> --- a/libcody/cody.hh
> +++ b/libcody/cody.hh
> @@ -49,14 +49,14 @@ namespace Detail  {
>  
>  #if __cpp_char8_t >= 201811
>  template<unsigned I>
> -constexpr char S2C (char8_t const (&s)[I])
> +constexpr unsigned char S2C (char8_t const (&s)[I])
>  {
>    static_assert (I == 2, "only single octet strings may be converted");
>    return s[0];
>  }
>  #else
>  template<unsigned I>
> -constexpr char S2C (char const (&s)[I])
> +constexpr unsigned char S2C (char const (&s)[I])
>  {
>    static_assert (I == 2, "only single octet strings may be converted");
>    return s[0];
> -- 
> 2.53.0
>
