On Sat, 23 Nov 2024 at 11:44, Cedric Blancher <cedric.blanc...@gmail.com> wrote:
>
> Good morning!
>
> /bin/ls -l cannot handle printable Unicode characters outside the BMP
>
> Example using '𝒯'
> bash -c 'printf "\U0001D4AF\n"' # MATHEMATICAL SCRIPT CAPITAL T
> (yes, our mathematicians want to use THAT as file name)
>
> On Linux:
> LC_ALL=en_US.UTF-8 bash -c 't="$(printf "\U0001D4AF\n")" ; touch "$t" "$t$t"'
> ls -la
> total 8
> -rw-r--r-- 1 ced staden 0 Nov 23 11:29 ööööööö
> -rw-r--r-- 2 ced staden 4 Nov 23 11:31 𝒯
> -rw-r--r-- 2 ced staden 4 Nov 23 11:31 𝒯𝒯
>
> On Cygwin:
> LC_ALL=en_US.UTF-8 bash -c 't="$(printf "\U0001D4AF\n")" ; touch "$t" "$t$t"'
> $ ls -la
> -rw-r--r-- 1 ced staden 0 Nov 23 11:29 ööööööö
> -rw-r--r-- 2 ced staden 4 Nov 23 11:31 ''$'\360\235\222\257'
> -rw-r--r-- 2 ced staden 4 Nov 23 11:31 ''$'\360\235\222\257\360\235\222\257'
>
> Looks like the Cygwin locale has a problem with non-BMP chars.
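
A quick sanity check on the octal escapes that ls prints on Cygwin. This is a sketch, not a diagnosis. It dumps the bytes \360\235\222\257 as hex and compares them with F0 9D 92 AF, the UTF-8 encoding of U+1D4AF. The helper names quoted_bytes and to_hex are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Hypothetical helpers, named only for this sketch.

# Emit the four bytes exactly as ls quoted them on Cygwin.
quoted_bytes() {
    printf '\360\235\222\257'
}

# Hex-dump stdin without offsets, spaces or newlines.
to_hex() {
    od -An -tx1 | tr -d ' \n'
}

name_hex=$(quoted_bytes | to_hex)
echo "bytes from ls output: $name_hex"

# U+1D4AF encodes in UTF-8 as F0 9D 92 AF.
if [ "$name_hex" = "f09d92af" ]; then
    echo "bytes match the UTF-8 encoding of U+1D4AF"
else
    echo "bytes do NOT match the UTF-8 encoding of U+1D4AF"
fi
```

If the bytes match, the name ls read from the directory is intact, and ls is quoting it only because the character is not treated as printable. Whether that comes from the locale tables or from Cygwin's 16-bit wchar_t is an assumption that still needs checking.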
find(1) is even worse:
$ find .
.
./ööööööö
./????
./x??x

The Microsoft Explorer GUI shows the file names correctly, so IMO this
is not a Windows or Win32 API problem.

Ced
-- 
Cedric Blancher <cedric.blanc...@gmail.com>
[https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple