> On Feb 25, 2017, at 11:41 AM, lilypond-user-requ...@gnu.org wrote:
>
> Date: Sat, 25 Feb 2017 17:34:54 +0100 (CET)
> From: Karl Hammar <k...@opal.lcl.aspodata.se>
> To: Joseph Austin <drtechda...@gmail.com>
>
> <snip>
>
> And, rp26 clearly states in section 5:
>
>   In addition, if a byte order mark which specifies UNICODE such as
>   'FF FE' or 'FE FF' exists, the character code SET should be treated
>   as UNICODE.
>
> There is such a "byte order mark" for utf8, see [2]. And then by
> extension, you just have to insert that BOM somewhere in the midi
> file (exists == not restricted to the lyrics meta event, preferably
> in track 0 at time 0) and it would be legal (according to the
> recommendation) to use utf8 straight out of the box.
>
> [2] http://www.unicode.org/faq/utf_bom.html#BOM
>
> <snip>
>
>> only ASCII chars between 0 and 127 are allowed.
>
> Your wording is too hard. complete_midi_96-1-3.pdf, p.137 (or [1]
> p.10) clearly says "should", but
>
>   "other characters codes
>   using the high-order bit may be used for interchange of files between
>   different programs on the same computer which supports an extended
>   character set. Programs on a computer which does not support
>   non-ASCII characters should ignore those characters."

I stand corrected.

>> But if we are going to use a "private standard", we might as well
>> imitate the "official" standard and insert something like
>> FF 05 07 { @ U T F 8 }
>> And lobby AMEI/MMA to adopt an official UTF8 position.
>
> Could be good, but why just not capitalize on the BOM and just use
> utf8.
>
> Regards,
> /Karl Hammar

OK, the UTF-8 BOM is the three bytes EF BB BF (hex).

But given that a MIDI file is not a "text file" but a binary file with
text fields scattered throughout, normally embedded in various MIDI
Meta-events, where should the BOM be placed?

Interpreting your suggestion, we could add a Lyric Meta-Event to Track 0
at Time 0 with the BOM as its text field. That should work for lyrics,
but RP-26 indicates that the lyrics "language encoding" should not extend
to other types of text events. For other text events, it seems we would
need to prefix every UTF-8 text field with the BOM.

---
Joe Austin
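
To make the byte layout concrete, here is a minimal Python sketch of the three options discussed above: a Lyric Meta-Event carrying only the BOM, the "{@UTF8}" tag I floated earlier, and a Text Meta-Event whose text field is prefixed with the BOM. The helper names (vlq, meta_event) are my own and hypothetical, the "{@UTF8}" tag is only a proposal and not part of RP-26, and the sample text is made up. It builds raw event bytes only; it does not write a complete MIDI file.

```python
UTF8_BOM = b"\xef\xbb\xbf"   # the UTF-8 byte order mark: EF BB BF

LYRIC = 0x05                 # meta-event type for lyrics (FF 05)
TEXT = 0x01                  # meta-event type for generic text (FF 01)


def vlq(n: int) -> bytes:
    """Encode n as a MIDI variable-length quantity (7 bits per byte)."""
    if n < 0:
        raise ValueError("variable-length quantities must be non-negative")
    out = [n & 0x7F]                     # last byte has the high bit clear
    n >>= 7
    while n:
        out.append((n & 0x7F) | 0x80)    # earlier bytes have the high bit set
        n >>= 7
    return bytes(reversed(out))


def meta_event(meta_type: int, data: bytes, delta: int = 0) -> bytes:
    """Build <delta-time> FF <type> <length> <data> (hypothetical helper)."""
    return vlq(delta) + bytes([0xFF, meta_type]) + vlq(len(data)) + data


# Option 1: a Lyric Meta-Event at time 0 whose text field is just the BOM.
bom_lyric = meta_event(LYRIC, UTF8_BOM)

# Option 2 (proposal only, not in RP-26): FF 05 07 { @ U T F 8 }
tag_lyric = meta_event(LYRIC, b"{@UTF8}")

# Option 3: prefix an individual text field with the BOM (sample text is made up).
text_event = meta_event(TEXT, UTF8_BOM + "Ä".encode("utf-8"))

for name, ev in [("bom_lyric", bom_lyric),
                 ("tag_lyric", tag_lyric),
                 ("text_event", text_event)]:
    print(f"{name:10s} {ev.hex(' ').upper()}")
```

The leading 00 in each event is the delta-time, so the event sits at time 0 when it is the first event in its track chunk.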

_______________________________________________
lilypond-user mailing list
lilypond-user@gnu.org
https://lists.gnu.org/mailman/listinfo/lilypond-user