Victor Gaultney wrote,

> Treating italic like punctuation is a win for a lot of people:

Italic Unicode encoding is a win for a lot of people regardless of approach.  Each of the listed wins remains essentially true whether treated as punctuation, encoded atomically, or selected with VS.

> My main point in suggesting that Unicode needs these characters is that
> italic has been used to indicate specific meaning - this text is somehow
> special - for over 400 years, and that content should be preserved in plain
> text.

From chapter 2 of The Unicode Standard, Version 11.0 ( http://www.unicode.org/versions/Unicode11.0.0/ch02.pdf ):

"Plain text must contain enough information to permit the text to be rendered legibly, and nothing more."

The argument is that text can still be read after its italic information has been stripped.  A persuasive argument for encoding would need to negate that; it would have to show that removing italic information results in a loss of meaning.

The decision makers at Unicode are familiar with italic use conventions such as those shown in "The Chicago Manual of Style" (first published in 1906).  The question of plain-text italics has arisen before on this list and has been quickly dismissed.

Unicode began with the idea of standardizing existing code pages for the exchange of computer text using a unique double-byte encoding rather than relying on code page switching.  Latin was "grandfathered" into the standard.  Nobody ever submitted a formal proposal for Basic Latin.  There was no outreach to establish contact with the user community -- the actual people who used the script as opposed to the "computer nerds" who grew up with ANSI limitations and subsequent ISO code pages.  Because that's how Unicode rolled back then.  Unicode did what it was supposed to do WRT Basic Latin.

When someone points out that italics are used for disambiguation as well as stress, the replies are consistent.

"That's not what plain-text is for."  "That's not how plain-text works."  "That's just styling and so should be done in rich-text." "Since we do that in rich-text already, there's no reason to provide for it in plain-text."  "You can already hack it in plain-text by enclosing the string with slashes."  And so it goes.

But if variant letter form information is stripped from a string like "Jackie Brown", the primary indication that the string represents either a person's name or a Tarantino flick title is also stripped.  "Thorstein Veblen" is either a dead economist or the name of a fictional yacht in the Travis McGee series.  And so forth.
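The loss can be sketched in a few lines of Python.  This is only an illustration under a loud assumption: it borrows the existing Mathematical Italic letters (U+1D434 onward, plus U+210E for "h") as a stand-in for atomically encoded italics.  Those characters are meant for mathematical notation, not for titles, and the helper name to_math_italic is hypothetical.  NFKC normalization plays the part of whatever process strips the letter form information.

```python
import unicodedata


def to_math_italic(text: str) -> str:
    """Map ASCII letters to Mathematical Italic letters (stand-in only)."""
    out = []
    for ch in text:
        if ch == "h":
            # U+1D455 is reserved; italic small h is U+210E PLANCK CONSTANT.
            out.append("\u210E")
        elif "A" <= ch <= "Z":
            out.append(chr(0x1D434 + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(0x1D44E + ord(ch) - ord("a")))
        else:
            out.append(ch)  # spaces, digits, punctuation stay as they are
    return "".join(out)


person = "Jackie Brown"                   # the name
film = to_math_italic("Jackie Brown")     # the title, "italic" stand-in

# Before stripping, the two strings are distinguishable.
assert person != film

# After compatibility normalization, the distinction is gone.
assert unicodedata.normalize("NFKC", film) == person
```

Once normalized, nothing in the string itself says whether the person or the film was meant; only surrounding context can.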

Computer text tradition aside, nobody seems to offer any legitimate reason why such information isn't worthy of being preservable in plain-text.  Perhaps there isn't one.

I'm not qualified to assess the impact of italic Unicode inclusion on the rich-text world as mentioned by David Starner.  Maybe another list member will offer additional insight or a second opinion.
