You are not very explicit about the tag encoding you use for these styles. Of course it must not be a language tag, so the introducer is not U+E0001, nor a cancel-all tag, so it is not prefixed by U+E007F. It also cannot use the letter-like, digit-like and hyphen-like tag characters for its introducer. So you probably use some prefix in U+E0002..U+E001F and some additional tag (tag "I" for italic, tag "B" for bold, tag "U" for underline, tag "S" for strikethrough?), and the cancel tag to return to normal text (terminating the tagged sequence).
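To make the code point ranges I am talking about concrete, here is a minimal Python sketch that classifies a code point of the Tags block (U+E0000..U+E007F). The function name and the returned labels are my own, purely illustrative; the assignments reflect the Tags block as I understand it, where only U+E0001 and U+E0020..U+E007F are assigned, so any prefix taken from U+E0002..U+E001F would be an unassigned code point.

```python
def classify_tag_code_point(cp: int) -> str:
    """Describe where a code point sits in the Unicode Tags block.

    The labels are illustrative only, not names from any standard API.
    """
    if cp == 0xE0001:
        return "language tag (deprecated)"
    if cp == 0xE007F:
        return "cancel tag"
    if 0xE0020 <= cp <= 0xE007E:
        # Tag characters mirror printable ASCII, shifted by 0xE0000.
        return "tag character for " + repr(chr(cp - 0xE0000))
    if 0xE0000 <= cp <= 0xE007F:
        return "unassigned"
    return "not in Tags block"


if __name__ == "__main__":
    for cp in (0xE0001, 0xE0002, 0xE0049, 0xE007F, 0x0041):
        print(f"U+{cp:04X}: {classify_tag_code_point(cp)}")
```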
Or maybe you just use the standard HTML encoding, adding U+E0000 to each character of the HTML tag syntax (including attributes and close tags, allowing embedding?). So you would use the "<" and ">" tag characters (possibly also the space tag U+E0020, or a TAB tag U+E0009, for separating attributes, and the quotation-mark tags for attribute values)? Does your proposal also allow embedding other HTML objects (such as SVG)?

In that case all you do is remap the HTML syntax outside the standard text. If an attribute value contains standard text (such as <span title="Some text">...</span>), do you also remap the attribute value, i.e. "Some text"? Do you remap the technical name of the HTML element itself, i.e. "span" in the last example? And what is then the benefit compared to standard HTML (it is not more compact, and just adds another layer on top of it), except that it can be embedded in places where plain HTML would be restricted by form inputs, or would be reconverted using character entities that hide the effect of "<", ">" and "&" so that they are interpreted as plain-text characters rather than as HTML?

Now let's suppose that your convention starts being decoded and used in some applications: it could be used to transport sensitive active scripts (e.g. JavaScript event handlers or plain <script> elements). These applications would then need an extra layer of security, plus updates to security tools and antivirus scanners. In fact I bet that tag characters are most often restricted in text input forms, and will be silently discarded, or the whole text will be rejected.

For me the tag characters are just a quirk for trying to embed in text some higher-level protocol which is actually not part of the text but only metadata, and that includes their use for language tags (in HTML/SVG we can already use lang="..." or xml:lang="..." for that purpose, and in MIME and HTTP(S) we can already use the "Content-Language:" and "Accept-Language:" headers).
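As an illustration of the remapping I am hypothesizing (adding U+E0000 to each ASCII character of the markup), and of the filtering that input forms would then likely apply, here is a minimal Python sketch. It is only my guess at such a scheme, not a description of what BabelPad actually does, and the function names `to_tags`, `from_tags` and `strip_tags` are hypothetical.

```python
TAG_BASE = 0xE0000                     # offset between ASCII and the Tags block
TAG_FIRST, TAG_LAST = 0xE0020, 0xE007E  # tag counterparts of printable ASCII


def to_tags(markup: str) -> str:
    """Shift printable ASCII markup (U+0020..U+007E) into tag characters."""
    out = []
    for ch in markup:
        cp = ord(ch)
        if not 0x20 <= cp <= 0x7E:
            raise ValueError(f"no tag character for U+{cp:04X}")
        out.append(chr(cp + TAG_BASE))
    return "".join(out)


def from_tags(text: str) -> str:
    """Map tag characters back to ASCII, leaving all other characters alone."""
    return "".join(
        chr(ord(ch) - TAG_BASE) if TAG_FIRST <= ord(ch) <= TAG_LAST else ch
        for ch in text
    )


def strip_tags(text: str) -> str:
    """Drop every code point of the Tags block, as a cautious form might do."""
    return "".join(ch for ch in text if not 0xE0000 <= ord(ch) <= 0xE007F)


if __name__ == "__main__":
    hidden = to_tags("<script>alert(1)</script>")
    message = "Hello" + hidden + " world"
    print(strip_tags(message))   # what a filtering application would keep
    print(from_tags(message))    # what a decoding application would see
```

A decoder that applies `from_tags` and then treats the result as HTML would bring the hidden script element back to life, which is exactly the extra security layer I am worried about; a filter like `strip_tags` makes the whole convention invisible instead.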
We were told that these tag characters were deprecated, and in fact even their use for language tags has not found any significant adoption beyond some trials (and there are now better technologies available in a lot of software, APIs and services, application design/development tools, and document editing/publishing tools).

On Sun, 27 Jan 2019 at 21:10, James Kass via Unicode <unicode@unicode.org> wrote:

>
> A new beta of BabelPad has been released which enables input, storing,
> and display of italics, bold, strikethrough, and underline in plain-text
> using the tag characters method described earlier in this thread. This
> enhancement is described in the release notes linked on this download page:
>
> http://www.babelstone.co.uk/Software/index.html
>