Re: UTF-8 characters in filenames with Lilypond 2.23

David Wright Wed, 25 May 2022 19:46:06 -0700

On Sat 21 May 2022 at 23:36:26 (-0600), David F. wrote:
> > On May 21, 2022, at 5:01 PM, Jean Abou Samra <j...@abou-samra.fr> wrote:
> > Le 21/05/2022 à 23:20, David F. a écrit :
> >> System: Intel-based macOS
> >> 
> >> I make extensive use of UTF-8 characters in the filenames of my Lilypond 
> >> files.  This works in Lilypond up through version 2.22.  But Lilypond 2.23 
> >> cannot handle UTF-8 characters in filenames.
> >> 
> >> Filename: tést.ly
> >> ====
> >> \version "2.22"
> >> 
> >> { c' }
> >> ====
> >> 
> >> Under v2.22, this file compiles without problem.  With 2.23 (including the 
> >> just-released 2.23.9) I get the following error:
> >> 
> >> Starting lilypond 2.23.9 [tést.ly]...
> >> warning: cannot find file: `/Users/david/Projects/Lily Scratch/te??st.ly'
> >> fatal error: failed files: "/Users/david/Projects/Lily Scratch/te??st.ly"
> >> Exited with return code 1.
> >> 
> >> I thought that this problem had already been reported, but I can’t find 
> >> any mention of it now.  So I’m reporting it.
> > 
> > This is actually not an issue with LilyPond, but with Frescobaldi,
> > when you have "run LilyPond with English messages" enabled in the
> > Preferences. This problem is known for some time already, but
> > I didn't see an issue in the Frescobaldi tracker, so I just
> > opened one:
> > 
> > https://github.com/frescobaldi/frescobaldi/issues/1438
> > 
> > The workaround is to uncheck "Run LilyPond with English messages"
> > in Edit > Preferences > LilyPond Preferences.
> 
> No, I get the same error running lilypond from the command line:
> 
> $ lilypond --png tést.ly 
> GNU LilyPond 2.23.9 (running Guile 2.2)
> warning: cannot find file: `t??st.ly'
> fatal error: failed files: "t??st.ly"


I didn't spot this until I read Hraban's post about macOS, about which
I know almost nothing, but your two errors are not the same. Placed
one above the other, the difference is easier to see:

  warning: cannot find file: `/Users/david/Projects/Lily Scratch/te??st.ly'
                                                                  ↑
                                     warning: cannot find file: `t??st.ly'

So it looks as if the first error is a normalisation error: é has been
decomposed into e + ´, or rather e + COMBINING ACUTE ACCENT U+0301,
the latter being the two bytes CC 81 in UTF-8, whereas the second error
looks like a composed é (U+00E9), which is C3 A9 in UTF-8. Each two-byte
sequence appears to generate two question marks, one per byte, but the
first case leaves the initial e in place, so three chars in all.
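
To make the byte counting concrete, here is a small Python sketch. The
mangle() helper is purely hypothetical: it assumes that whatever prints
the message replaces each non-ASCII byte with a question mark, which is
only my guess at what is happening, not something taken from the
LilyPond or Guile sources.

```python
# Two spellings of the same visible filename.
composed = "t\u00e9st.ly"      # é as a single code point, U+00E9
decomposed = "te\u0301st.ly"   # e followed by COMBINING ACUTE ACCENT, U+0301


def mangle(name: str) -> str:
    """Hypothetical: show every non-ASCII UTF-8 byte as '?'."""
    return "".join(chr(b) if b < 0x80 else "?" for b in name.encode("utf-8"))


# UTF-8 bytes of each spelling, as hex.
print(composed.encode("utf-8").hex(" "))
print(decomposed.encode("utf-8").hex(" "))

# What the assumed byte-wise replacement would print for each.
print(mangle(composed))
print(mangle(decomposed))
```

Under that assumption the composed name comes out as t??st.ly and the
decomposed one as te??st.ly, matching the two messages above.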

From glancing at pages like:
https://mjtsai.com/blog/2017/03/24/apfss-bag-of-bytes-filenames/
https://news.ycombinator.com/item?id=14495823
and the article discussed in the latter (there's a link at the top),
I don't envy anyone supporting these filesystems.

Is Python's unicodedata.normalize("NFKC", … ) meant to handle all this?
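
For what it's worth, a minimal sketch of comparing two names after
normalisation. The same_name() helper is made up for illustration, and
I'm not suggesting this is what LilyPond or Frescobaldi do or should do.

```python
import unicodedata

composed = "t\u00e9st.ly"      # é as U+00E9
decomposed = "te\u0301st.ly"   # e + U+0301


def same_name(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical helper: compare two names after normalising both."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# The raw strings differ, but they agree once both are normalised.
print(composed == decomposed)
print(same_name(composed, decomposed))

# NFD turns the composed spelling into the decomposed one.
print(unicodedata.normalize("NFD", composed) == decomposed)

# NFKC also folds compatibility characters; NFC leaves them alone.
ligature = "\ufb01le.ly"       # starts with LATIN SMALL LIGATURE FI
print(unicodedata.normalize("NFC", ligature) == ligature)
print(unicodedata.normalize("NFKC", ligature))
```

So for the é case alone plain NFC (or NFD) looks sufficient. NFKC goes
further and would rewrite a name containing, say, the fi ligature, which
may be more than one wants when the aim is just to find a file on disk.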

Cheers,
David.
