I noticed that in st, combining Unicode characters don't seem to be
preserved in memory. For example, if I run "printf 'AB\xcd\x9dCDE\n'" in
an Xterm and then select the resulting line, the clipboard data includes
the combining character's bytes (cd 9d, i.e. U+035D):

    ~% echo $TERM
    xterm-256color
    ~% printf 'AB\xcd\x9dCDE\n'
    AB͝CDE
    ~% xclip -o | xxd
    0000000: 4142 cd9d 4344 450a                      AB..CDE.

However, with st, the sequence vanishes:

    ~% echo $TERM
    st-256color
    ~% printf 'AB\xcd\x9dCDE\n'
    ABCDE
    ~% xclip -o | xxd
    0000000: 4142 4344 450a                           ABCDE.
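
For anyone who wants to repeat the comparison without reading hex dumps
by eye, here is a rough sketch of a check. The function name has_u035d
is made up for illustration; it only looks for the byte pair cd 9d on
stdin, using POSIX od, tr and grep, and assumes nothing about st itself.

```shell
# Hypothetical helper: succeeds if stdin contains the UTF-8 encoding of
# U+035D, i.e. the byte 0xcd immediately followed by 0x9d.
has_u035d() {
    # Dump every byte as two hex digits separated by spaces, join the
    # od output lines into one, and squeeze repeated spaces so that
    # adjacent bytes are always separated by exactly one space.
    od -An -v -tx1 | tr '\n' ' ' | tr -s ' ' | grep -q 'cd 9d'
}

# Sanity check on the raw input (octal escapes are the portable
# spelling of \xcd\x9d for printf).
if printf 'AB\315\235CDE\n' | has_u035d; then
    echo "sequence present"
else
    echo "sequence missing"
fi
```

After selecting the line in the terminal, "xclip -o | has_u035d" should
succeed in a terminal that keeps the sequence and fail in one that
drops it.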

Urxvt behaves the same as Xterm, with an added bonus: it actually
renders the combining sequence, whereas on Xterm and st the tie
character is not visible (although if you paste "AB\u035d" into st with
no trailing characters, the tie appears, albeit glitchily).

I don't have a patch or any immediate plans to look into patching it,
but perhaps improved Unicode support could be added to the TODO list.

Eric
