On Mon, 2004-04-26 at 08:12, Dan Sugalski wrote:
> At 9:34 PM -0400 4/25/04, Bryan C. Warnock wrote:
> >On Sun, 2004-04-25 at 16:34, Dan Sugalski wrote:
> >> Just a heads up, there are two things that have been pointed out.
> >>
> >> First, the transset op is transcharset. The abbreviation was a bit sloppy.
> >>
> >> Second, in spots where "character" is used, substitute "grapheme", as
> >> I'm going to. Noting, of course, that a grapheme is *not* a glyph.
> >> Glyphs are display things that we're staying very very (very!) far
> >> away from. The change'll go into the op names--getglyph instead of
> >> getcharacter and suchlike things.
> >>
> >> Hopefully using a different word'll help people remember that
> >> glyph!=codepoint, though we'll see how well that one works.
> >
> >I don't understand. Substitute grapheme for character, as you're
> >staying away from glyphs, but "getglyph" for "getcharacter"?
>
> Gah. And that sound is the sound of me banging my head agains the
> wall because I'm an idiot. It's grapheme, everywhere.
>
> >And what about codepoints that *are* glyphs and/but aren't graphemes?
>
> Where do we have those? (I'm getting tempted instead to just call
> them fred--it'll at least avoid some of this confusion...)

Beats me. I don't know what you mean by grapheme. Or glyph. :-)

The web has a wide variety of definitions, most of them centered on some
association with a spoken language (the grapheme/phoneme association).
While that certainly covers what I think you mean - letters, ideographs,
diacritical combinations, etc. - and I'm fairly certain that extends to
other written representations of language - punctuation, white space,
numerics - I don't know if it extends to things that aren't.

The Arabic tatweel (0x0640), for instance, is purely a typesetting
construct. Then you've got non-language things like math operators,
arrows, and "dingbats".

And *then* you've got several ranges of "Presentation Forms", which
Unicode explicitly references as glyphs. For instance, see 0xFB50 -
0xFDFF, Arabic Presentation Forms-A.

Perhaps fred *is* better.

-- 
Bryan C. Warnock
bwarnock@(gtemail.net|raba.com)
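
For anyone who wants to poke at the codepoints mentioned above, here is a
minimal sketch using Python's standard unicodedata module. It is not Parrot
code and says nothing about how the ops will behave; describe() and the
sample labels are hypothetical names made up for illustration. It only shows
what the Unicode database records for the tatweel, for one Arabic
Presentation Forms-A codepoint, and for an arrow, plus the usual case of two
codepoints that most people would read as a single unit.

```python
import unicodedata


def describe(ch):
    """Return (codepoint, name, general category, decomposition) for one codepoint."""
    return (
        "U+%04X" % ord(ch),
        unicodedata.name(ch, "<unnamed>"),
        unicodedata.category(ch),
        unicodedata.decomposition(ch),
    )


# Codepoints taken from the message above; the labels are informal.
samples = {
    "tatweel": "\u0640",
    "presentation form": "\uFB50",
    "arrow": "\u2192",
}

for label, ch in samples.items():
    print(label, describe(ch))

# A base letter followed by a combining acute accent: two codepoints.
combined = "e\u0301"
print(len(combined))

# NFC normalization composes the pair into one precomposed codepoint.
print(len(unicodedata.normalize("NFC", combined)))
```

The decomposition field for U+FB50 carries a compatibility tag pointing back
at U+0671, which is how the database marks it as a presentation variant of
an ordinary letter rather than a separate one.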