Re: python 3.3 repr

2013-11-15 Thread Steven D'Aprano
On Fri, 15 Nov 2013 17:47:01 +, Neil Cerutti wrote:
> The unicode support I'm learning in Go is, "Everything is utf-8, right?
> RIGHT?!?" It also has the interesting behavior that indexing strings
> retrieves bytes, while iterating over them results in a sequence of
> runes.
>
> It comes with

Re: python 3.3 repr

2013-11-15 Thread Terry Reedy
On 11/15/2013 6:28 AM, Robin Becker wrote:
I'm trying to understand what's going on with this simple program
    if __name__=='__main__':
        print("repr=%s" % repr(u'\xc1'))
        print("%%r=%r" % u'\xc1')
On my windows XP box this fails miserably if run directly at a terminal
    C:\tmp> \Python33\p

Re: python 3.3 repr

2013-11-15 Thread Gene Heskett
On Friday 15 November 2013 13:52:40 Mark Lawrence did opine:
> On 15/11/2013 16:36, Gene Heskett wrote:
> > On Friday 15 November 2013 11:28:19 Joel Goldstick did opine:
> >> On Fri, Nov 15, 2013 at 10:03 AM, Robin Becker wrote:
> >>> ...
> >>>
> > became popular.
> >

Unicode stdin/stdout (was: Re: python 3.3 repr)

2013-11-15 Thread random832
Of course, the real solution to this issue is to replace sys.stdout on windows with an object that can handle Unicode directly with the WriteConsoleW function - the problem there is that it will break code that expects to be able to use sys.stdout.buffer for binary I/O. I also wasn't able to get th

Re: python 3.3 repr

2013-11-15 Thread Mark Lawrence
On 15/11/2013 16:36, Gene Heskett wrote:
On Friday 15 November 2013 11:28:19 Joel Goldstick did opine:
On Fri, Nov 15, 2013 at 10:03 AM, Robin Becker wrote:
...
became popular.
Really? you cried and laughed over 7 vs. 8 bits? That's lovely (?). ;). That eighth bit sure was less

Re: python 3.3 repr

2013-11-15 Thread Neil Cerutti
On 2013-11-15, Chris Angelico wrote:
> Other languages _have_ gone for at least some sort of Unicode
> support. Unfortunately quite a few have done a half-way job and
> use UTF-16 as their internal representation. That means there's
> no difference between U+0012, U+0123, and U+1234, but U+12345
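The UTF-16 point quoted above can be checked from Python by counting 16-bit code units. This is only a sketch; `utf16_code_units` is a hypothetical helper name, not something from the thread.

```python
def utf16_code_units(ch):
    """Return how many 16-bit code units UTF-16 needs for one code point."""
    # Encode without a BOM; every code unit is exactly two bytes.
    return len(ch.encode("utf-16-le")) // 2

for cp in (0x0012, 0x0123, 0x1234, 0x12345):
    # Code points above U+FFFF need a surrogate pair, i.e. two code units.
    print("U+%04X -> %d code unit(s)" % (cp, utf16_code_units(chr(cp))))
```

The first three code points fit in one code unit each, while U+12345 lies outside the Basic Multilingual Plane and takes two.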

Re: python 3.3 repr

2013-11-15 Thread Cousin Stanley
> > We don't say len({42: None}) to discover
> that the dict requires 136 bytes,
> why would you use len("heåvy")
> to learn that it uses 23 bytes ?
>
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
illustrate the difference in length of python objects and the size of thei
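The script above is cut off in this archive, so what follows is not a reconstruction of it. It is only a minimal Python 3 sketch of the same distinction, assuming the å is the single precomposed code point U+00E5.

```python
import sys

text = "he\xe5vy"                 # "heåvy", with å written as U+00E5
encoded = text.encode("utf-8")    # the bytes you would write to a file

print(len(text))                  # number of code points in the str
print(len(encoded))               # number of bytes in the UTF-8 encoding
print(sys.getsizeof(text))        # in-memory size; implementation-specific
```

The three numbers answer three different questions, and the last one depends on the interpreter and build, so no particular value should be expected from it.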

Re: python 3.3 repr

2013-11-15 Thread Serhiy Storchaka
15.11.13 17:32, Roy Smith wrote:
> Anybody remember RAD-50? It let you represent a 6-character filename (plus a 3-character extension) in a 16 bit word. RT-11 used it, not sure if it showed up anywhere else.
In three 16-bit words.
-- https://mail.python.org/mailman/listinfo/python-list
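The arithmetic behind "three 16-bit words" can be sketched as follows: three characters from a 40-symbol alphabet give 40**3 = 64000 combinations, which fits in 16 bits. The character table below is the PDP-11 ordering as best recalled and should be treated as an assumption; `rad50_pack` and `rad50_filename` are hypothetical helper names.

```python
# Assumed 40-symbol table: space, A-Z, '$', '.', one unused slot ('%'), 0-9.
RAD50_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"

def rad50_pack(three):
    """Pack exactly three table characters into one integer below 40**3."""
    a, b, c = (RAD50_CHARS.index(ch) for ch in three)
    return (a * 40 + b) * 40 + c

def rad50_filename(name, ext):
    """Pack a 6-character name and 3-character extension into three words."""
    padded = name.ljust(6)[:6] + ext.ljust(3)[:3]
    return [rad50_pack(padded[i:i + 3]) for i in range(0, 9, 3)]

words = rad50_filename("FOO", "TXT")
print(words, all(w < 2 ** 16 for w in words))
```

Nine characters therefore need three words, not one, which matches the correction in the reply above.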

Re: python 3.3 repr

2013-11-15 Thread Chris Angelico
On Sat, Nov 16, 2013 at 4:10 AM, Steven D'Aprano wrote:
> No, UTF-8 is okay for writing to files, but it's not suitable for text
> strings.
Correction: It's _great_ for writing to files (and other fundamentally byte-oriented streams, like network connections). Does a superb job as the default enc
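A small Python 3 sketch of that division of labour: text stays a `str` inside the program and becomes UTF-8 bytes only at the byte-oriented boundary. The sample string is an arbitrary illustration, not taken from the thread.

```python
text = "he\xe5vy \u20ac"          # a str of code points, including a euro sign

data = text.encode("utf-8")       # bytes, suitable for a file or a socket
restored = data.decode("utf-8")   # back to a str on the way in

print(len(text), len(data), restored == text)
```

The byte count is larger than the code point count because å takes two bytes and the euro sign takes three in UTF-8.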

Re: python 3.3 repr

2013-11-15 Thread Steven D'Aprano
On Fri, 15 Nov 2013 14:43:17 +, Robin Becker wrote:
> Things went wrong when utf8 was not adopted as the standard encoding
> thus requiring two string types, it would have been easier to have a len
> function to count bytes as before and a glyphlen to count glyphs. Now as
> I understand it we

Re: python 3.3 repr

2013-11-15 Thread Chris Angelico
On Sat, Nov 16, 2013 at 4:06 AM, Zero Piraeus wrote:
> :
>
> On Fri, Nov 15, 2013 at 10:32:54AM -0500, Roy Smith wrote:
>> Anybody remember RAD-50? It let you represent a 6-character filename
>> (plus a 3-character extension) in a 16 bit word. RT-11 used it, not
>> sure if it showed up anywhere

Re: python 3.3 repr

2013-11-15 Thread Zero Piraeus
:
On Fri, Nov 15, 2013 at 10:32:54AM -0500, Roy Smith wrote:
> Anybody remember RAD-50? It let you represent a 6-character filename
> (plus a 3-character extension) in a 16 bit word. RT-11 used it, not
> sure if it showed up anywhere else.
Presumably 16 is a typo, but I just had a moderate amou

Re: python 3.3 repr

2013-11-15 Thread Gene Heskett
On Friday 15 November 2013 11:28:19 Joel Goldstick did opine:
> On Fri, Nov 15, 2013 at 10:03 AM, Robin Becker wrote:
> > ...
> >
> >>> became popular.
> >>
> >> Really? you cried and laughed over 7 vs. 8 bits? That's lovely (?).
> >> ;). That eighth bit sure was less confusing than

Re: python 3.3 repr

2013-11-15 Thread William Ray Wing
On Nov 15, 2013, at 10:18 AM, Robin Becker wrote:
> On 15/11/2013 15:07, Joel Goldstick wrote:
>
>> Cool, someone here is older than me! I came in with the 8080, and I
>> remember split octal, but sixes are something I missed out on.
>
> The pdp 10/15 had 18 bit words and

Re: python 3.3 repr

2013-11-15 Thread Chris Angelico
On Sat, Nov 16, 2013 at 2:39 AM, Robin Becker wrote:
>> Dealing with bytes and Unicode is complicated, and the 2->3 transition is
>> not easy, but let's please not spread the misunderstanding that somehow the
>> Flexible String Representation is at fault. However you store Unicode code
>> points,

Re: python 3.3 repr

2013-11-15 Thread Antoon Pardon
On 15-11-13 16:39, Robin Becker wrote:
> .
>>
>> Dealing with bytes and Unicode is complicated, and the 2->3 transition
>> is not easy, but let's please not spread the misunderstanding that
>> somehow the Flexible String Representation is at fault. However you
>> store Unicode code point

Re: python 3.3 repr

2013-11-15 Thread Robin Becker
. Dealing with bytes and Unicode is complicated, and the 2->3 transition is not easy, but let's please not spread the misunderstanding that somehow the Flexible String Representation is at fault. However you store Unicode code points, they are different than bytes, and it is complex
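For readers unfamiliar with the Flexible String Representation (PEP 393, CPython 3.3 and later), the sketch below estimates how many bytes CPython spends per character depending on the widest code point in the string. It relies on `sys.getsizeof`, so it is CPython-specific; `bytes_per_char` is a hypothetical helper name.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate storage per character by comparing two string lengths."""
    short = ch * 2
    long_ = ch * (n + 2)
    # The fixed header cancels out; only the per-character cost remains.
    return (sys.getsizeof(long_) - sys.getsizeof(short)) // n

widths = [bytes_per_char(c) for c in ("a", "\xc1", "\u0123", "\U00012345")]
print(widths)
```

Whatever widths are reported, `len()` and indexing still work in code points, which is the point made in the quoted text: the storage format is separate from the bytes-versus-text distinction.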

Re: python 3.3 repr

2013-11-15 Thread Roy Smith
On Nov 15, 2013, at 10:18 AM, Robin Becker wrote:
> The pdp 10/15 had 18 bit words and could be organized as 3*6 or 2*9
I don't know about the 15, but the 10 had 36 bit words (18-bit halfwords). One common character packing was 5 7-bit characters per 36 bit word (with the sign bit left over).
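The packing arithmetic is 5 × 7 = 35 bits, leaving one bit of the 36 spare. The sketch below follows the description in the post and leaves the top bit unused; where the spare bit sat on real hardware is treated as an assumption here, and `pack_five` and `unpack_five` are hypothetical names.

```python
def pack_five(chars):
    """Pack five 7-bit characters into the low 35 bits of a 36-bit word."""
    if len(chars) != 5:
        raise ValueError("need exactly five characters")
    word = 0
    for ch in chars:
        code = ord(ch)
        if code > 0x7F:
            raise ValueError("not a 7-bit character: %r" % ch)
        word = (word << 7) | code
    return word  # always below 2**35, so bit 35 (the top bit) stays zero

def unpack_five(word):
    """Recover the five characters from a packed word."""
    return "".join(chr((word >> shift) & 0x7F) for shift in (28, 21, 14, 7, 0))

packed = pack_five("HELLO")
print(oct(packed), unpack_five(packed))
```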

Re: python 3.3 repr

2013-11-15 Thread Robin Becker
On 15/11/2013 15:07, Joel Goldstick wrote:
Cool, someone here is older than me! I came in with the 8080, and I remember split octal, but sixes are something I missed out on.
The pdp 10/15 had 18 bit words and could be organized as 3*6 or 2*9, pdp 8s had 12 bits I think, then came

Re: python 3.3 repr

2013-11-15 Thread Chris Angelico
On Sat, Nov 16, 2013 at 1:43 AM, Robin Becker wrote:
> ..
>
>> I'm still stuck on Python 2, and while I can understand the controversy
>> ("It breaks my Python 2 code!"), this seems like the right thing to have
>> done. In Python 2, unicode is an add-on. One of the big design drivers in

Re: python 3.3 repr

2013-11-15 Thread Ned Batchelder
On Friday, November 15, 2013 9:43:17 AM UTC-5, Robin Becker wrote:
> Things went wrong when utf8 was not adopted as the standard encoding thus
> requiring two string types, it would have been easier to have a len function to
> count bytes as before and a glyphlen to count glyphs. Now as I unde

Re: python 3.3 repr

2013-11-15 Thread Joel Goldstick
On Fri, Nov 15, 2013 at 10:03 AM, Robin Becker wrote:
> ...
>
>>> became popular.
>>>
>> Really? you cried and laughed over 7 vs. 8 bits? That's lovely (?).
>> ;). That eighth bit sure was less confusing than codepoint
>> translations
>
> no we had 6 bits in 60 bit words as I recall;

Re: python 3.3 repr

2013-11-15 Thread Robin Becker
... became popular. Really? you cried and laughed over 7 vs. 8 bits? That's lovely (?). ;). That eighth bit sure was less confusing than codepoint translations no we had 6 bits in 60 bit words as I recall; extracting the nth character involved division by 6; smart people did trick

Re: python 3.3 repr

2013-11-15 Thread Robin Becker
On 15/11/2013 14:40, Serhiy Storchaka wrote:
..
and then use repr throughout.
Or rather
    try:
        ascii
    except NameError:
        ascii = repr
and then use ascii throughout.
apparently you can import ascii from future_builtins and the print() function is available as

Re: python 3.3 repr

2013-11-15 Thread Joel Goldstick
>> Some of us have been doing this long enough to remember when "just plain
>> text" meant only a single case of the alphabet (and a subset of ascii
>> punctuation). On an ASR-33, your C program would print like:
>>
>> MAIN() \(
>> PRINTF("HELLO, ASCII WORLD");
>> \)
>>
>> because ASR-33's

Re: python 3.3 repr

2013-11-15 Thread Robin Becker
.. I'm still stuck on Python 2, and while I can understand the controversy ("It breaks my Python 2 code!"), this seems like the right thing to have done. In Python 2, unicode is an add-on. One of the big design drivers in Python 3 was to make unicode the standard. The idea behind re

Re: python 3.3 repr

2013-11-15 Thread Serhiy Storchaka
15.11.13 15:54, Ned Batchelder wrote:
No, but I've found that significant programs that run on both 2 and 3 need to have some shims to make the code work anyway. You could do this:
    try:
        repr = ascii
    except NameError:
        pass
and then use repr throughout.
Or ra
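A runnable sketch of the second kind of shim discussed in this thread, the one that defines `ascii` where it is missing instead of rebinding `repr`. The print call at the end is only an illustrative use, not something from the posts.

```python
# On Python 3 the builtin ascii() exists, so the name lookup succeeds and
# nothing changes. On Python 2 the lookup raises NameError and repr(),
# which already escapes non-ASCII characters there, is used instead.
try:
    ascii
except NameError:
    ascii = repr

# The output contains only ASCII characters, so it is safe on any console.
print("repr=%s" % ascii(u'\xc1'))
```

Leaving `repr` alone keeps the builtin's meaning intact for any other code in the module that expects the native behavior.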

Re: python 3.3 repr

2013-11-15 Thread Robin Becker
On 15/11/2013 13:54, Ned Batchelder wrote:
.
No, but I've found that significant programs that run on both 2 and 3 need to have some shims to make the code work anyway. You could do this:
    try:
        repr = ascii
    except NameError:
        pass
yes I tried that, but

Re: python 3.3 repr

2013-11-15 Thread Roy Smith
In article , Ned Batchelder wrote:
> In Python3, repr() will return a Unicode string, and will preserve existing
> Unicode characters in its arguments. This has been controversial. To get
> the Python 2 behavior of a pure-ascii representation, there is the new
> builtin ascii(), and a corres
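The difference described in the quoted text can be seen directly in Python 3. The sketch compares lengths and avoids printing the non-ASCII character itself, so it runs on any console; the `%a` conversion is the format counterpart of `ascii()`.

```python
s = '\xc1'                   # one code point, LATIN CAPITAL LETTER A WITH ACUTE

r = repr(s)                  # keeps the character: quote, character, quote
a = ascii(s)                 # escapes it: quote, backslash, x, c, 1, quote

print(len(r), len(a))        # 3 6
print(a)                     # '\xc1'
print("%a" % s == a)         # True: %a formats with ascii()
print("%r" % s == r)         # True: %r formats with repr()
```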

Re: python 3.3 repr

2013-11-15 Thread Ned Batchelder
On Friday, November 15, 2013 7:16:52 AM UTC-5, Robin Becker wrote:
> On 15/11/2013 11:38, Ned Batchelder wrote:
> ..
> >
> > In Python3, repr() will return a Unicode string, and will preserve existing
> > Unicode characters in its arguments. This has been controversial. To get
> > the P

Re: python 3.3 repr

2013-11-15 Thread Robin Becker
On 15/11/2013 11:38, Ned Batchelder wrote:
..
In Python3, repr() will return a Unicode string, and will preserve existing Unicode characters in its arguments. This has been controversial. To get the Python 2 behavior of a pure-ascii representation, there is the new builtin ascii(),

Re: python 3.3 repr

2013-11-15 Thread Ned Batchelder
On Friday, November 15, 2013 6:28:15 AM UTC-5, Robin Becker wrote:
> I'm trying to understand what's going on with this simple program
>
> if __name__=='__main__':
>     print("repr=%s" % repr(u'\xc1'))
>     print("%%r=%r" % u'\xc1')
>
> On my windows XP box this fails miserably if run direc

python 3.3 repr

2013-11-15 Thread Robin Becker
I'm trying to understand what's going on with this simple program
    if __name__=='__main__':
        print("repr=%s" % repr(u'\xc1'))
        print("%%r=%r" % u'\xc1')
On my windows XP box this fails miserably if run directly at a terminal
    C:\tmp> \Python33\python.exe bang.py
    Traceback (most rece
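The traceback is cut off in this archive. Assuming the failure is a UnicodeEncodeError raised because the console encoding cannot represent the character that repr() now preserves, one hedged workaround is to escape whatever the target encoding cannot handle before printing. `console_safe` is a hypothetical helper, not part of the standard library.

```python
import sys

def console_safe(text, encoding=None):
    """Return text with characters the encoding lacks replaced by escapes."""
    # Fall back to ASCII when stdout reports no encoding (e.g. when piped).
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, "backslashreplace").decode(encoding)

# Forcing ASCII here simulates a console that cannot display the character.
print(console_safe("repr=%s" % repr('\xc1'), encoding="ascii"))
```

With an encoding that does contain the character, such as UTF-8, the text passes through unchanged, so the helper only alters output that would otherwise have raised.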