On Mon, 2004-04-26 at 08:12, Dan Sugalski wrote:
> At 9:34 PM -0400 4/25/04, Bryan C. Warnock wrote:
> >On Sun, 2004-04-25 at 16:34, Dan Sugalski wrote:
> >>  Just a heads up, there are two things that have been pointed out.
> >>
> >>  First, the transset op is transcharset. The abbreviation was a bit sloppy.
> >>
> >>  Second, in spots where "character" is used, substitute "grapheme", as
> >>  I'm going to. Noting, of course, that a grapheme is *not* a glyph.
> >>  Glyphs are display things that we're staying very very (very!) far
> >>  away from. The change'll go into the op names--getglyph instead of
> >>  getcharacter and suchlike things.
> >>
> >>  Hopefully using a different word'll help people remember that
> >>  glyph!=codepoint, though we'll see how well that one works.
> >
> >I don't understand.  Substitute grapheme for character, as you're
> >staying away from glyphs, but "getglyph" for "getcharacter"?
> 
> Gah. And that sound is the sound of me banging my head against the 
> wall because I'm an idiot. It's grapheme, everywhere.
> 
> >And what about codepoints that *are* glyphs and/but aren't graphemes?
> 
> Where do we have those? (I'm getting tempted instead to just call 
> them fred--it'll at least avoid some of this confusion...)

Beats me.  I don't know what you mean by grapheme.  Or glyph.
:-)

The web has a wide variety of definitions, most of them centered on some
association with a spoken language (the grapheme/phoneme
association).

While that certainly covers what I think you mean - letters,
ideographs, diacritical combinations, etc. - and I'm fairly
certain that extends to other written representations of
language - punctuation, white space, numerics - I don't know if
it extends to things that aren't.  The Arabic tatweel (0x0640),
for instance, is purely a typesetting construct.

Then you've got non-language things like math operators,
arrows, and "dingbats". 

And *then* you've got several ranges of "Presentation Forms",
which Unicode explicitly references as glyphs.  For instance,
see 0xFB50 - 0xFDFF, Arabic Presentation Forms-A.
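
To make the distinction concrete, here is a rough sketch that asks
the Unicode Character Database about the cases above.  It assumes
Python and its standard unicodedata module (neither is part of
Parrot), and the describe() helper is made up for illustration.
It only reports what the database records for each codepoint; it
doesn't settle what "grapheme" ought to mean.

```python
import unicodedata


def describe(ch):
    """Return (name, general category, decomposition mapping) for one codepoint."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.decomposition(ch),
    )


# One user-perceived character built from two codepoints:
# LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT.
e_acute = "e\u0301"
print(len(e_acute))                                # number of codepoints
print(len(unicodedata.normalize("NFC", e_acute)))  # after canonical composition

# The tatweel: a codepoint that exists to stretch the joining line between letters.
tatweel = "\u0640"
print(describe(tatweel))

# First codepoint of Arabic Presentation Forms-A.  Its decomposition mapping
# is tagged as a positional form of an ordinary letter.
alef_wasla_isolated = "\uFB50"
print(describe(alef_wasla_isolated))

# Compatibility normalization folds the presentation form back to that letter.
print(unicodedata.normalize("NFKC", alef_wasla_isolated) == "\u0671")
```

The presentation form carries a compatibility decomposition, so the
database itself treats it as a display variant of another codepoint,
while the tatweel has no decomposition at all.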

Perhaps fred *is* better.


-- 
Bryan C. Warnock
bwarnock@(gtemail.net|raba.com)
