Steven D'Aprano <st...@pearwood.info> writes:

> On Sun, 16 Jul 2017 12:33:10 +1000, Ben Finney wrote:
>
> > And yet the ASCII and Unicode standard says code point 0x0A (U+000A
> > LINE FEED) is a character, by definition.
> [...]
> > > Is an acute accent a character?
> >
> > Yes, according to Unicode. ‘´’ (U+0301 ACUTE ACCENT) is a character.
>
> Do you have references for those claims?

The Unicode Standard <URL:http://www.unicode.org/versions/Unicode10.0.0/>
frequently uses “character” as the unit of semantic value that Unicode
deals in. See the “Contents” table for many references.

In §2.2 under the sub-heading “Characters, Not Glyphs” it defines the
term, and thereafter uses “character” in a way that includes all such
units, even formatting codes.

See §2.11 “Combining Characters” for a definition that includes accent
characters like U+0301:

    Combining Characters. Characters intended to be positioned relative
    to an associated base character are depicted in the character code
    charts above, below, or through a dotted circle.

The standard even uses the terms “control characters” and “format
characters” to refer to code points with a functional purpose and no
glyph representation, such as U+000A LINE FEED.

> Because I'm pretty sure that Unicode is very, very careful to never
> use the word "character" in a formal or normative manner, only as an
> informal term for "the kinds of things that regular folk consider
> letters or characters or similar".

I don't know whether you consider the Core Specification document to be
speaking in a “formal or normative manner”. Either way, that doesn't
affect my point that Unicode does define “character” and it includes all
code points in that definition.

If you're going to disqualify anything that isn't written in a “formal
or normative manner” from what we may take the Unicode Standard to be
telling us is a character, then you're going to have to either disregard
most of the Core Specification document, or accept it as formal and/or
normative.

> And I don't think regular folks would know what a line feed was if it
> jumped out of their computer and bit them :-)

Are we talking about definitions, or are we talking about what regular
folks would know?

Regular folks know that “fish” has meaning, but I wouldn't want to try
matching that regular-folk knowledge with a definition of what a “fish”
is and is not.

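As a quick sanity check on the code points mentioned above, here is a
minimal sketch using Python's standard ‘unicodedata’ module. It is only
an illustration of how the Unicode Character Database classifies them,
not something the standard itself prescribes; the helper name
‘describe’ and the “<no name>” fallback are my own inventions. Note
that the database name for U+0301 is COMBINING ACUTE ACCENT, while the
spacing ACUTE ACCENT is U+00B4.

```python
import unicodedata

def describe(ch):
    """Return (code point label, general category, name) for one character."""
    # Control codes such as U+000A have no formal name in the database,
    # so supply a fallback instead of letting name() raise ValueError.
    name = unicodedata.name(ch, "<no name>")
    return "U+%04X" % ord(ch), unicodedata.category(ch), name

# LINE FEED, the combining acute accent, and the spacing acute accent.
for ch in ("\n", "\u0301", "\u00b4"):
    print(describe(ch))
```

All three come back with a general category (“Cc”, “Mn” and “Sk”
respectively), so the database treats each of them as an assigned
character even though only the last has a glyph that stands on its own.
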
Quite frequently, a definition useful for a formal standard is *not*
coterminous with what regular folk will think is in or out of that
definition.

-- 
 \     “I have said to you to speak the truth is a painful thing. To |
  `\     be forced to tell lies is much worse.” —Oscar Wilde, _De |
_o__)                                              Profundis_, 1897 |
Ben Finney