On Mon, 21 Jan 2019 00:29:42 -0800 David Starner via Unicode <unicode@unicode.org> wrote:

> The superscripts show a problem with multiple encoding; even if you
> think they should be Unicode superscripts, and they look like Unicode
> superscripts, they might be HTML superscripts. Same thing would happen
> with italics if they were encoded in Unicode.

But if one strips the mark-up out, and searching is then based on the collation elements of the text, then this is not a problem. Mathematical and ASCII capitals differ only at the identity level.

Searching on the basis of codepoint sequences would come unstuck with scriptio continua scripts - WJ and ZWSP can be optionally inserted to improve line-breaking, and even to overcome spell-checkers.

Richard.
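
P.S. A minimal sketch of the difference between the two kinds of search, for illustration only. It does not implement collation-element matching: it approximates it by applying NFKC compatibility folding and dropping WJ and ZWSP, which is an assumption of this sketch. The names search_key and contains are hypothetical.

```python
import unicodedata

# Format characters that may be optionally inserted into running text.
# Assumption: only these two are handled; real collation ignores many more.
IGNORABLE = {"\u2060", "\u200B"}  # WORD JOINER, ZERO WIDTH SPACE


def search_key(text: str) -> str:
    """Rough stand-in for a collation-based search key (not real UCA)."""
    # NFKC folds compatibility variants such as mathematical capitals
    # and superscript digits onto their ordinary counterparts.
    folded = unicodedata.normalize("NFKC", text)
    # Drop the optional format characters so they cannot block a match.
    return "".join(ch for ch in folded if ch not in IGNORABLE)


def contains(haystack: str, needle: str) -> bool:
    """Substring search on folded keys rather than raw codepoints."""
    return search_key(needle) in search_key(haystack)


if __name__ == "__main__":
    thai_with_zwsp = "ภาษา\u200Bไทย"  # ZWSP inserted between the words
    print("ภาษาไทย" in thai_with_zwsp)           # raw codepoint search
    print(contains(thai_with_zwsp, "ภาษาไทย"))   # folded search
```

A raw codepoint search misses the Thai phrase once ZWSP has been inserted, whereas the folded search still finds it. A proper implementation would compare collation elements at a chosen strength, for which a UCA library is needed.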