On 06/12/2016 11:16 PM, Steven D'Aprano wrote:
> "Safe to transmit in text protocols" surely should mean "any Unicode code
> point", since all of Unicode is text. What's so special about the base64
> ones?
>
> Well, that depends on your context. For somebody who cares about sending
> bits over a physical wire, their idea of "text" is not Unicode, but a
> subset of ASCII *bytes*.

Not necessarily. The encoding of the text that holds the base64 output does not matter, provided the letters and numbers used in base64 can be represented. I could take the text, paste it into an email, and send it as UTF-8 or as UTF-16. It won't make a difference, provided the decoder can handle that specific Unicode encoding. The person at the other end could even cut the base64 letters and numbers out of the email body and paste them into a decoder. How the letters and numbers got to him is irrelevant.

Sure, in the context of email, base64 data is usually sent using UTF-8 encoding these days. But there's no requirement that base64 data always has to be encoded in ASCII, UTF-8, or LATIN1.

> The end result is that after you've base64ed your "binary" data, to
> get "text" data, what are you going to do with it? Treat it as Unicode code
> points? Probably not.

Sure. Why not? Write it to a text file. Put it in an email. Place it in a Word doc. Print it. Whatever.

> Squirt it down a wire as bytes? Almost certainly.

Sometimes yes. But not always.

> Looking at this from the high-level perspective of Python, that makes it
> conceptually bytes not text.

I don't see how this is always the case. From a high-level Python perspective it's definitely text. That's the whole point of base64!
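
To make the point about encodings concrete, here is a minimal sketch, assuming Python 3 and the standard-library base64 module. The sample payload and the variable names are made up for illustration; it only shows that the same base64 text survives a trip through either UTF-8 or UTF-16.

```python
import base64

# Arbitrary binary payload (hypothetical sample data).
payload = bytes([0, 255, 16, 128, 7, 200])

# b64encode returns bytes drawn only from the base64 alphabet;
# decoding them as ASCII gives an ordinary Python str.
text = base64.b64encode(payload).decode("ascii")

# Carry that same text through two different Unicode encodings.
via_utf8 = text.encode("utf-8")
via_utf16 = text.encode("utf-16")

# The bytes that would travel over the wire differ...
wire_bytes_differ = via_utf8 != via_utf16

# ...but after decoding with the matching codec, the text, and
# therefore the original binary data, is recovered either way.
recovered_utf8 = base64.b64decode(via_utf8.decode("utf-8"))
recovered_utf16 = base64.b64decode(via_utf16.decode("utf-16"))
```

Both recovered values should compare equal to the original payload, even though the two encoded byte strings are different.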