On 12.03.19 at 01:14, Aaron Hill wrote:
On 2019-03-11 3:40 pm, David Kastrup wrote:
Urs Liska <li...@openlilylib.org> writes:
On 11.03.19 at 20:22, Aaron Hill wrote:
On 2019-03-11 11:30 am, David Kastrup wrote:
Urs Liska <li...@openlilylib.org> writes:
Hi,
I've written a poor man's implementation of a simple \letterspaced
markup command:
#(define-markup-command (letterspaced layout props text) (markup?)
   (let* ((chars (string->list text))
          (dummy (ly:message "Chars: ~a" chars))
          (spaced-text (string-join (map string chars) " ")))
     (interpret-markup layout props
       (markup spaced-text))))
However, this scrambles umlauts and presumably other non-ASCII
(multi-byte UTF-8) characters, as you can see with
{
s1 ^\markup \letterspaced "Täst"
}
=>Chars: (T � � s t)
Obviously the characters are wrongly encoded or decoded along the way, which
makes me wonder whether I have simply forgotten an encoding setting
somewhere (although I have no idea where and how I would set
that) or whether the whole routine is just clumsy.
Any pointer would be appreciated.
Guile-1.8 has only byte strings, not Unicode character strings.
However, the regexp procedures are locale aware, so you can use
something like
/./ isn't smart enough to match Unicode graphemes. You would need
/\X/, however that is not supported in POSIX ERE. Neither is the
approximation /\P{M}\p{M}*+/.
I can confirm that the suggestion doesn't work for me, even with the
given example. It's still "T s t" (see attached).
Do you have a UTF-8 locale set?
That's because the file you attached was not in UTF-8. I was able to
open it as ISO 8859-1. When the file is read as UTF-8, the lone byte
0xe4 used for ä is invalid and shows up as U+FFFD (Replacement
Character). It should have been encoded as 0xc3 0xa4.
Either fixing the encoding or just retyping the ä gives the
expected result.
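To make the byte-level point concrete: in UTF-8, ä is the two bytes 0xc3 0xa4, and every continuation byte has the form 10xxxxxx (0x80 to 0xBF). A minimal sketch, assuming Guile 1.8 byte strings and a UTF-8 encoded input file, could group each lead byte with its continuation bytes instead of relying on string->list. The names utf8-continuation? and utf8-split are hypothetical helpers, not part of Guile or LilyPond.

```scheme
;; Sketch only: assumes Guile 1.8, where a string is a sequence of bytes
;; and (char->integer c) yields a value in 0..255.
;; utf8-continuation? and utf8-split are hypothetical helper names.

(define (utf8-continuation? c)
  ;; Continuation bytes have the bit pattern 10xxxxxx (0x80..0xBF).
  (= (logand (char->integer c) #xC0) #x80))

(define (utf8-split str)
  ;; Return a list of strings, one per UTF-8 encoded code point.
  (let loop ((bytes (string->list str))
             (current '())   ; bytes of the code point in progress, reversed
             (result '()))   ; finished code points, reversed
    (cond
     ((null? bytes)
      ;; Flush the last code point, if any, and restore the order.
      (reverse (if (null? current)
                   result
                   (cons (list->string (reverse current)) result))))
     ((and (utf8-continuation? (car bytes)) (pair? current))
      ;; Continuation byte: it belongs to the code point in progress.
      (loop (cdr bytes) (cons (car bytes) current) result))
     (else
      ;; Any other byte starts a new code point.
      (loop (cdr bytes)
            (list (car bytes))
            (if (null? current)
                result
                (cons (list->string (reverse current)) result)))))))
```

With a correctly encoded file, (string-join (utf8-split "Täst") " ") should then yield "T ä s t". A stray ISO 8859-1 byte such as 0xe4 is still passed through as its own one-byte element, so this does not repair a wrongly encoded file.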
Not for me. Also, when I inject David's procedure into my actual project
it doesn't seem to work. And I assume you are *not* talking about the
encoding of the procedure definition?
Also, I should have been clearer before. David's code should work for
most cases. I was just being pedantic in noting that /./ would not work if the
input has combining characters. For instance, if you type U+0308
(Combining Diaeresis) after an 'a', you'll get an ä. But the simple
regex would not treat that as a single grapheme. The result would be
"T a ̈ s t".
I did understand it that way, and it would not be an issue in the
project I'm working on. There it's just some umlauts.
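For input that does contain combining characters, the point about U+0308 can be sketched as follows. This is only an approximation, assuming Guile 1.8 byte strings and a list that already holds one string per UTF-8 code point. It covers just the Combining Diacritical Marks block (U+0300 to U+036F), not full grapheme clustering as /\X/ would. The names combining-mark? and attach-combining are hypothetical helpers, not part of Guile or LilyPond.

```scheme
;; Sketch only: assumes Guile 1.8 byte strings and input that has already
;; been split into one string per UTF-8 code point.
;; combining-mark? and attach-combining are hypothetical helper names.

(define (combining-mark? s)
  ;; U+0300..U+036F encode in UTF-8 as 0xCC 0x80..0xBF or 0xCD 0x80..0xAF.
  (and (= (string-length s) 2)
       (let ((b1 (char->integer (string-ref s 0)))
             (b2 (char->integer (string-ref s 1))))
         (or (and (= b1 #xCC) (<= #x80 b2 #xBF))
             (and (= b1 #xCD) (<= #x80 b2 #xAF))))))

(define (attach-combining code-points)
  ;; Append each combining mark to the element before it, so that
  ;; "a" followed by U+0308 stays together as one unit.
  (let loop ((rest code-points) (result '()))
    (cond
     ((null? rest) (reverse result))
     ((and (combining-mark? (car rest)) (pair? result))
      (loop (cdr rest)
            (cons (string-append (car result) (car rest)) (cdr result))))
     (else
      (loop (cdr rest) (cons (car rest) result))))))
```

Joining the resulting list with spaces would then keep an 'a' and a following U+0308 together instead of separating them.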
Urs
-- Aaron Hill
_______________________________________________
lilypond-user mailing list
lilypond-user@gnu.org
https://lists.gnu.org/mailman/listinfo/lilypond-user