Re: Wrong letter in title

David Kastrup Sun, 30 Sep 2018 05:54:02 -0700

Davide Liessi <davide.lie...@gmail.com> writes:

> On Sun, 20 May 2018 at 18:35, Davide Liessi
> <davide.lie...@gmail.com> wrote:
>> The file
>>
>> \version "2.19.81"
>> \header { title = "č" }
>> { b1 }
>>
>> results in a PDF with correct printed title (lowercase c with caron)
>> but wrong title field in metadata (Ċ, i.e. uppercase c with dot
>> above).
>
> On Sun, 20 May 2018 20:52:58 +0200 David Kastrup wrote:
>> Ghostscript bug when converting PostScript output to PDF.  The
>> PostScript reads (pasted from less' display)
>>
>> mark /Creator (LilyPond 2.21.0)
>> /Title (<FE><FF>^A^M)
>> /DOCINFO pdfmark
>>
>> which is the correct UTF-16BE string with BOM (FE FF is the big-endian
>> byte order mark).  Ghostscript however converts the ^M (0x0d) into ^J
>> (0x0a), basically converting an ASCII CR to an ASCII LF.  Unfortunately,
>> we are not in the middle of ASCII here.
>
> Actually, it turns out that the behaviour of Ghostscript is not wrong
> and this is probably a bug in how LilyPond produces the PostScript
> file.
>
> PostScript strings must either properly escape non-ASCII or ASCII
> non-printable bytes, e.g., as \ddd with ddd the octal representation,
> or they must be defined as a hexadecimal string (see [1], pages
> 29–31).


Uh WHAT?  To quote:

    The \ddd form may be used to include any 8-bit character constant in
    a string.  One, two, or three octal digits may be specified, with
    high-order overflow ignored. This notation is preferred for
    specifying a character outside the recommended ASCII character set
    for the PostScript language, since the notation itself stays within
    the standard set and thereby avoids possible difficulties in
    transmitting or storing the text of the program. It is recommended
    that three octal digits always be used, with leading zeros as
    needed, to prevent ambiguity. The string (\0053), for example,
    contains two characters—an ASCII 5 (Control-E) followed by the digit
    3—whereas the strings (\53) and (\053) contain one character, the
    ASCII character whose code is octal 53 (plus sign).

Recommended/preferred is not at all equivalent to "must".  However, one
problem indeed is that strings as such have no notion of encoding, and
an unescaped CR, LF, or CRLF inside a literal string is read as the same
single newline.  So at least those bytes, when they occur as part of
UTF-16, would warrant escaping.
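
To make the byte-level problem concrete, here is a minimal Python sketch.
It is not LilyPond's actual code; the helper names ps_escape_utf16 and
ps_hex_utf16 are made up for illustration.  It shows the title bytes for
"č", what a CR-to-LF rewrite turns them into, and two ways of writing the
string so that no raw control byte reaches the PostScript scanner.

```python
import codecs


def utf16_with_bom(text: str) -> bytes:
    """Encode text as big-endian UTF-16 preceded by the FE FF byte order mark."""
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")


def ps_escape_utf16(text: str) -> str:
    """Hypothetical helper: PostScript literal string with every byte
    written as a three-digit octal escape (\\ddd)."""
    return "(" + "".join(f"\\{b:03o}" for b in utf16_with_bom(text)) + ")"


def ps_hex_utf16(text: str) -> str:
    """Hypothetical helper: the same bytes as a PostScript hexadecimal string."""
    return "<" + utf16_with_bom(text).hex().upper() + ">"


raw = utf16_with_bom("č")                 # FE FF 01 0D
mangled = raw.replace(b"\r", b"\n")       # simulate CR being read as LF
decoded = mangled[2:].decode("utf-16-be")  # skip the BOM, decode the rest

print(raw.hex(" "))
print(mangled.hex(" "))
print(decoded)
print(ps_escape_utf16("č"))
print(ps_hex_utf16("č"))
```

Either escaped form keeps the 0x0d byte out of the raw program text, so
the end-of-line handling never gets a chance to touch it.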

-- 
David Kastrup

_______________________________________________
bug-lilypond mailing list
bug-lilypond@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-lilypond
