On 12/09/2024 11:16, Simon Wolfe wrote:
I have one file name that uses Unicode character U+318DF, which is in the
Tertiary Ideographic Plane, more precisely CJK Unified Ideographs Extension H.
touch 𱣟
ls
returns:
''$'\360\261\243\237'
Extension H was introduced in Unicode 15.0 in 2022.
I also notice that this bug occurs with any character in Extension I
(introduced in 2023).
Extension G seems to work okay.
ls 9.4 works as expected for me with glibc-2.39 in a UTF-8 locale.
I.e., the file name is displayed directly.
Now if I set the locale to a non-UTF-8 one, it will display the form above
(which works in all locales BTW).
$ touch ''$'\360\261\243\237'
$ ls ''$'\360\261\243\237'
𱣟
$ LC_ALL=C ls ''$'\360\261\243\237'
''$'\360\261\243\237'
So I suspect your system libs are not updated to recognize this character,
hence the fallback format is used.
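As a quick sanity check, the fallback form is just the name's UTF-8 bytes
written as octal escapes, so it can be decoded by hand. The following is a
minimal sketch (the variable names b1..b4 and cp are only illustrative) that
rebuilds the code point from the four bytes using shell arithmetic:

```shell
# The four bytes from the escaped name \360\261\243\237 (octal literals).
b1=$((0360)); b2=$((0261)); b3=$((0243)); b4=$((0237))

# A 4-byte UTF-8 sequence has the layout 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
# Mask off the marker bits and concatenate the remaining payload bits.
cp=$(( ((b1 & 0x07) << 18) | ((b2 & 0x3F) << 12) | ((b3 & 0x3F) << 6) | (b4 & 0x3F) ))

# Print the result in the usual U+XXXX notation.
printf 'U+%X\n' "$cp"
```

This should print U+318DF, matching the character in the report, so the
escaped output names the same file and only its display differs.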
cheers,
Pádraig.