No, not quite. Valid Punycode characters are `[A-Za-z0-9-]`. This proposal includes `-`, as well as `#` and `;` for HTML entities.
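To make the character sets concrete, here is a rough sketch rather than anything from the patch itself. It assumes the proposed label alphabet is the existing ltree set (taken here to be `A-Za-z0-9_`) plus `-`, `#`, and `;`. The names `PROPOSED_LABEL`, `is_valid_label`, and `to_punycode` are made up for illustration, and the encoding uses Python's built-in `punycode` codec.

```python
import re

# Assumption: existing ltree label characters (A-Za-z0-9_) plus the
# three characters this proposal adds: '-', '#', and ';'.
PROPOSED_LABEL = re.compile(r"[A-Za-z0-9_\-#;]+")


def is_valid_label(label: str) -> bool:
    """Return True if every character of the label is in the proposed set."""
    return PROPOSED_LABEL.fullmatch(label) is not None


def to_punycode(text: str) -> str:
    """Encode arbitrary Unicode text using Python's built-in punycode codec."""
    return text.encode("punycode").decode("ascii")


encoded = to_punycode("m\u00fcnchen")  # "mnchen-3ya"
print(encoded, is_valid_label(encoded))
print(is_valid_label("a!b"))  # '!' stays reserved for the query syntax
```

The encoded form stays within `[A-Za-z0-9-]`, so it passes the check, while a label containing `!` or `%` is still rejected.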
I double-checked the RFC to see the valid Punycode characters and the set above is indeed correct: https://datatracker.ietf.org/doc/html/draft-ietf-idn-punycode-02#section-5

While it would be nice for ltree labels to support *any* printable character, they can't, because symbols like `!` and `%` already have special meaning in the query syntax. This proposal leaves those as is and does not depend on any existing special character.

On Tue, Oct 4, 2022 at 6:32 PM Nathan Bossart <nathandboss...@gmail.com> wrote:

> On Tue, Oct 04, 2022 at 12:54:46PM -0400, Garen Torikian wrote:
> > The punycode range of characters is the exact same set as the existing
> > ltree range, with the addition of a hyphen (-). Within this system, any
> > human language can be encoded using just A-Za-z0-9-.
>
> IIUC ASCII characters like '!' and '<' are valid Punycode characters, but
> even with your proposal, those wouldn't be allowed.
>
> --
> Nathan Bossart
> Amazon Web Services: https://aws.amazon.com