On 10/22/2016 06:25 AM, Juha Manninen via Lazarus wrote:
On Sat, Oct 22, 2016 at 4:12 AM, Martin Frb via Lazarus
<lazarus@lists.lazarus-ide.org> wrote:
Which ones does it not support?
When I added it to SynEdit it was complete. It had all the combining ranges
that the Unicode standard had back then (at least those I could find in the
documentation).
Of course, if a new combining range is added, it will not contain it. If that
is needed, one needs an external (OS or otherwise) library that can/will be
updated on those occasions.
Mind that "combining codepoints" have nothing to do with how many codepoints
will be represented by one glyph.
Ok, I was confusing the Unicode terms again.
I guess the biggest complexity is in glyphs and ligatures. I still
don't understand their details.
However, for a program that must care about Unicode, like a text layout
app, the rules for combining codepoints and glyphs are equally
important. Codepoints for one glyph should never be split or copied
separately. Isn't it so?
SynEdit is a text layout app, too.
In that sense the function IsCombining is not enough for practical
purposes. A comprehensive library function should take care of glyphs
(+ other rules), too.
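
To illustrate why a per-codepoint "is combining" test alone does not give
you units that are safe to cut or copy, here is a minimal sketch in Python.
It is not SynEdit or LazUnicode code: simple_clusters is a hypothetical
helper, and unicodedata.combining from the Python standard library stands in
for an IsCombining-style check. It only attaches combining marks to the
preceding base codepoint; the full grapheme cluster rules in Unicode cover
more cases than this.

```python
import unicodedata

def simple_clusters(text):
    """Group each base codepoint with the combining marks that follow it.

    Simplification: only codepoints with a non-zero canonical combining
    class are attached to the previous cluster. Real grapheme cluster
    segmentation has additional rules that are not handled here.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch      # attach the mark to its base
        else:
            clusters.append(ch)     # start a new cluster
    return clusters

# 'e' followed by U+0301 COMBINING ACUTE ACCENT
word = "cafe\u0301"
print(len(word))              # number of codepoints
print(simple_clusters(word))  # units that should not be split
```

An editor that moves the caret or cuts text per cluster rather than per
codepoint would never separate the accent from its base letter.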
I looked at Bero's PUCU and the other links:
http://forum.lazarus.freepascal.org/index.php/topic,33064.msg214342.html#msg214342
but it went over my head. I must study the issue more later.
* A reality check! *
Despite problems and incompleteness of our Unicode support, it is
actually better than most other solutions out there.
Ok, most programming tools support Unicode somehow, but people use them
incorrectly.
A good example is our forum SMF software. It deals with text layout
and definitely should handle Unicode but it does not.
It does not even handle single codepoints beyond the BMP, which should be
the easiest case! No combining rules needed or anything.
Try to add this text to a forum post: (I hope the mail SW can deal with it...)
"Have 🍷 for FPC 💓 Lazarus."
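
As a rough illustration of the beyond-BMP case, the sketch below (Python,
not forum or LazUnicode code) takes the sample sentence and reports how each
codepoint above U+FFFF is stored. The helper name utf16_units is made up for
this example. Software that assumes one character fits in 16 bits, or in at
most 3 UTF-8 bytes, breaks on exactly these codepoints.

```python
# The sample sentence from the post; both symbols lie outside the BMP.
text = "Have 🍷 for FPC 💓 Lazarus."

def utf16_units(s):
    """Number of 16-bit code units needed to store s as UTF-16."""
    return len(s.encode("utf-16-le")) // 2

# Codepoints above U+FFFF are the ones outside the Basic Multilingual Plane.
beyond_bmp = [ch for ch in text if ord(ch) > 0xFFFF]

for ch in beyond_bmp:
    print(f"U+{ord(ch):04X}: {len(ch.encode('utf-8'))} UTF-8 bytes, "
          f"{utf16_units(ch)} UTF-16 code units")

print(len(text), "codepoints,", utf16_units(text), "UTF-16 code units")
```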
Now the fact is that code made with FPC / Lazarus using the LazUnicode
functions and enumerators already supports Unicode much better than
most code out there!
Juha
I think that there is a degree of confusion about the use of ligatures.
Ligatures (at least in English) are typographical elements, not language
elements. Not all typefaces support them, and the code point for a ligature
should never appear in the source text. It is the function of the
display software to combine adjacent characters and display the
appropriate ligature if and only if the font that is used supports them.
A proportional typeface may display the character sequence 'fl' by using
the appropriate ligature glyph. A monospaced typeface would display the
same sequence as two characters, as would any typeface that did not
include the ligature glyphs.
Ligatures improve the appearance of text but are strictly a display
function and shouldn't actually appear in the text itself. This may not
be true for other writing systems and other languages but is certainly
true for English and perhaps other western European languages as well.
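
For text that already contains a ligature code point, here is a small
sketch (Python, standard library only) of one way to get back to the plain
letters. Unicode treats U+FB02 as a compatibility character, so NFKC
normalization maps it to 'f' + 'l', while NFC leaves it alone. The function
name fold_ligatures is just for this example. Note that NFKC also folds
other compatibility characters (full-width forms, superscripts and so on),
so this is a blunt tool and an assumption about what is acceptable for the
text at hand.

```python
import unicodedata

def fold_ligatures(text):
    """Replace compatibility characters, including the Latin ligatures,
    with their plain equivalents via NFKC normalization."""
    return unicodedata.normalize("NFKC", text)

ligature = "\ufb02"                  # U+FB02
print(unicodedata.name(ligature))    # official character name
print(fold_ligatures("re\ufb02ow"))  # ligature replaced by 'f' + 'l'

# NFC (canonical normalization only) does not touch the ligature.
print(unicodedata.normalize("NFC", ligature) == ligature)
```

The stored text then holds only the letters, and whether an 'fl' ligature
glyph is drawn is left to the display software and the font.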
--
TRUTH in her dress finds facts too tight.
In fiction she moves with ease.
Stray Birds by Rabindranath Tagore