2013/3/22 Asmus Freytag <[email protected]>:
> The number of conventions that can be applicable to certain punctuation
> characters is truly staggering, and it seems unlikely that Unicode is the
> right place to
> a) discover all of them or
> b) standardize an expression for them.

My intent is certainly not to discover and encode all of them. But existing characters are well known for having very common distinct semantics which merit separate encodings. This notably includes their use as numeric grouping separators or decimal separators. Such common semantic modifiers would be easier to support than encoding many new special variants of characters (which won't even be rendered by most applications, and thus won't be used).

Some examples: the invisible multiplication sign, the invisible function sign, and even the Latin/Greek mathematical letter-symbols, which were encoded only to capture style differences that occasionally, but rarely, carry a semantic difference. For me, adding those variants was really pseudo-coding: it broke the fundamental encoding model, complicated the task for font creators and renderer designers, and greatly increased the size and complexity of collation tables. Many of these character variants could have been expressed as a base character plus some modifier (whose distinct rendering would be only optional), allowing much easier integration and better use. Because of that, the UCD is full of added variants that are almost never used, and we have to live with encoded texts that persist in using ambiguous characters for the most common possible distinctions.
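For readers who want to look at the characters mentioned above, here is a minimal Python sketch using only the standard `unicodedata` module. The four sample code points and the `describe` helper are my own choices for illustration, not something taken from the discussion. It prints each character's name, general category, and NFKC (compatibility) form.

```python
import unicodedata

# Sample code points chosen for illustration (an assumption, not from the thread).
samples = {
    "invisible multiplication": "\u2062",
    "invisible function sign": "\u2061",
    "mathematical bold Latin A": "\U0001D400",
    "mathematical bold Greek alpha": "\U0001D6C2",
}


def describe(ch):
    """Return (name, general category, NFKC form) for a single character."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.normalize("NFKC", ch),
    )


for label, ch in samples.items():
    name, category, nfkc = describe(ch)
    print(f"{label}: U+{ord(ch):04X} {name} [{category}] NFKC -> {nfkc!r}")
```

The mathematical letters fold back to their plain base letters under compatibility normalization, which is roughly the "base character plus optional styling" relationship argued for above, whereas the invisible operators are separate format characters with no decomposition.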

