Re: Managing Google Groups headaches

Ned Batchelder Fri, 06 Dec 2013 18:27:02 -0800

On 12/6/13 8:03 AM, rusi wrote:

>I think you're off on the wrong track here.  This has nothing to do with
>plain text (ascii or otherwise).  It has to do with divorcing how you
>store and transport messages (be they plain text, HTML, or whatever)
>from how a user interacts with them.


Evidently (and completely inadvertently) this exchange has just
illustrated one of the inadmissible assumptions:

"unicode as a medium is universal in the same way that ASCII used to be"

I wrote a number of ellipsis characters, i.e. code point U+2026, as in:

   - human communication…
(is not very different from)
   - machine communication…

Somewhere between my sending and your quoting, those ellipses became
the replacement character U+FFFD:

> >   - human communication�
> >(is not very different from)
> >   - machine communication�

Leaving aside whose fault this is (very likely buggy Google Groups),
this mojibake could not happen if the assumption "All text is ASCII"
were to hold uniformly.
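
As a rough illustration of how a single U+2026 can turn into a single
U+FFFD, here is a minimal Python sketch. It is not a diagnosis of what
actually happened in this thread: the codec pair (Windows-1252 on the
sending side, UTF-8 on the reading side) and the helper name garble are
assumptions chosen only because they reproduce the symptom.

```python
# Hypothetical reconstruction: one side encodes as Windows-1252 and a
# later hop decodes the same bytes as UTF-8. The codecs are assumptions.
ELLIPSIS = "\u2026"  # HORIZONTAL ELLIPSIS


def garble(text, sent_as="cp1252", read_as="utf-8"):
    """Encode with one codec, decode with another, replacing bad bytes."""
    return text.encode(sent_as).decode(read_as, errors="replace")


original = "human communication" + ELLIPSIS
received = garble(original)

# The ellipsis is the single byte 0x85 in cp1252, which is not valid
# UTF-8 on its own, so the decoder substitutes U+FFFD.
print(ascii(received))

# Pure ASCII text passes through the same mismatch unchanged.
print(ascii(garble("human communication")))
```

Only the non-ASCII character is damaged; the ASCII part of the line
survives because both codecs agree on those bytes.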

Of course with Unicode this can also be made not to happen, but that
is fragile and error-prone.  And that is because ASCII (not extended)
is ONE thing, whereas Unicode is hopelessly a motley, inconsistent
variety.
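
One way to make the "ASCII is ONE thing" point concrete is to compare
the bytes that different codecs produce for the same text. The sketch
below is only an illustration; the helper name distinct_encodings and
the particular codec lists are assumptions, not anything taken from the
thread.

```python
def distinct_encodings(text, codecs):
    """Return the set of distinct byte strings the codecs produce."""
    return {text.encode(codec) for codec in codecs}


# ASCII-only text: these common codecs all agree on the bytes.
ascii_forms = distinct_encodings(
    "human communication", ["ascii", "utf-8", "latin-1", "cp1252"]
)

# The ellipsis U+2026: each codec that can represent it uses
# different bytes (and latin-1 cannot represent it at all).
ellipsis_forms = distinct_encodings("\u2026", ["utf-8", "utf-16-le", "cp1252"])

print(len(ascii_forms), len(ellipsis_forms))
```

The ASCII string has one byte representation across these codecs, while
the ellipsis has a different one for each, so the reader must know which
codec the writer used.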

You seem to be suggesting that we should stick to ASCII.  There are of
course languages that need more than just the Latin alphabet.  How would
you suggest we support them?  Or maybe I don't understand?


--Ned.

--
https://mail.python.org/mailman/listinfo/python-list
