Re: [Python-Dev] [Python-checkins] cpython: Elaborate that sizeof only accounts for the object itself.

2012-06-17 Thread Eli Bendersky
On Sun, Jun 17, 2012 at 11:42 AM, martin.v.loewis wrote:

> http://hg.python.org/cpython/rev/cddaf96c8149
> changeset:   77484:cddaf96c8149
> parent:      77482:1f6c23ed8218
> user:        Martin v. Löwis
> date:        Sun Jun 17 10:40:16 2012 +0200
> summary:
>  Elaborate that sizeof only accounts for the object itself.
>
> files:
>  Doc/library/sys.rst |  3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
>
>
> diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst
> --- a/Doc/library/sys.rst
> +++ b/Doc/library/sys.rst
> @@ -441,6 +441,9 @@
>     does not have to hold true for third-party extensions as it is implementation
>     specific.
>
> +   Only the memory consumption directly attributed to the object is
> +   accounted for, not the memory consumption of objects it refers to.
> +
>     If given, *default* will be returned if the object does not provide means to
>     retrieve the size.  Otherwise a :exc:`TypeError` will be raised.
>
>
Great, thanks.

Eli
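
To illustrate the documentation change quoted above, here is a minimal sketch, assuming CPython (other implementations may not provide sys.getsizeof at all). The helper shallow_plus_items is a hypothetical name used only for this example; it is not part of the sys module.

```python
import sys

def shallow_plus_items(seq):
    """Size of a flat container plus the sizes of the items it holds.

    Illustrative helper only: it does not recurse into nested containers
    and does not account for items that are shared between containers.
    """
    return sys.getsizeof(seq) + sum(sys.getsizeof(item) for item in seq)

short_text = "x"
long_text = "x" * 100000
small = [short_text]
big = [long_text]

# Each list holds exactly one reference, so the lists themselves report
# the same size even though `big` refers to a much larger string.
print(sys.getsizeof(small) == sys.getsizeof(big))
print(sys.getsizeof(long_text) > 100000)
print(shallow_plus_items(big) > shallow_plus_items(small))
```

Anyone who needs a "deep" size has to walk the referenced objects themselves, as the helper does for one level.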
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] VS 11 Express is Metro only.

2012-06-17 Thread Martin v. Löwis
> Microsoft is now claiming that it will be able to using something called
> "multi-targeting", "later this year.
> 
> http://blogs.msdn.com/b/vcblog/archive/2012/06/15/10320645.aspx

Interesting. Unlike the re-addition of desktop app support for Express,
which was a purely political decision, this change likely involves
significant code changes to the CRT, in particular if they aim for a
single CRT that works on XP but uses newer features on Vista+.

On the downside for CPython, this probably also means that they will not
extend the life of VS 2010 just to support XP, as they can tell people
to switch to VS 2012 and still support XP. So VS 2010 will probably
expire on 07/14/2015 for mainstream support as planned today.

So for 3.4, we have to ask whether to switch to VS 2012 or stay with
VS 2010, and for 3.5, we have to ask whether to switch to VS 2014
(assuming that's when the next release is made).

Regards,
Martin


[Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Christian Heimes
Hello,

the topic came up on the python-users list today. The raw string syntax
has a minor inconsistency. The ru"" notation is a syntax error although
we support rb"". Neither rb"" nor ru"" are supported on Python 2.7.

Python 3.3:

  works: r"", ur"", br"", rb""
  syntax error: ru""

Python 2.7:

  works: r"", ur"", br""
  syntax error: ru"", rb""

The ru"" notation isn't necessary for Python 2 compatibility but it's
still an inconsistency. The docs [1] also state that 'r' is a prefix,
not a suffix. On the other hand the lexical definition doesn't mention
the u"" notation yet.
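
A quick way to see which prefix spellings a given interpreter accepts is to compile an empty literal for each one. This is only a sketch: literal_compiles is a hypothetical helper name, and the answers for "ur" and "ru" depend on the Python version it runs under, so no expected output is given for them.

```python
def literal_compiles(prefix):
    """Return True if prefix + '""' is accepted as a literal by this interpreter."""
    try:
        compile(prefix + '""', "<prefix-check>", "eval")
    except SyntaxError:
        return False
    return True

# Probe the spellings discussed in this message.
results = {p: literal_compiles(p) for p in ("r", "u", "ur", "ru", "br", "rb")}
for prefix, ok in sorted(results.items()):
    print(prefix, "works" if ok else "syntax error")
```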

Christian

[1]
http://docs.python.org/py3k/reference/lexical_analysis.html#string-and-bytes-literals



Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Nick Coghlan
On Sun, Jun 17, 2012 at 9:45 PM, Christian Heimes  wrote:
> Hello,
>
> the topic came up on the python-users list today. The raw string syntax
> has a minor inconsistency. The ru"" notation is a syntax error although
> we support rb"". Neither rb"" nor ru"" are supported on Python 2.7.
>
> Python 3.3:
>
>  works: r"", ur"", br"", rb""
>  syntax error: ru""
>
> Python 2.7:
>
>  works: r"", ur"", br""
>  syntax error: ru"", rb""
>
> The ru"" notation isn't necessary for Python 2 compatibility but it's
> still an inconsistency. The docs [1] also state that 'r' is a prefix,
> not a suffix. On the other hand the lexical definition doesn't mention
> the u"" notation yet.

I suggest we drop the "raw Unicode" syntax altogether for 3.3, as its
current implementation in 3.x doesn't match 2.x, and the 2.x "not
really raw" mechanism only made any sense because the language support
for embedding Unicode characters directly in string literals wasn't as
robust as it is in 3.x (Terry Reedy pointed out this problem a while
back, but I failed to follow up on it at the time).

$ python
Python 2.7.3 (default, Apr 30 2012, 21:18:11)
[GCC 4.7.0 20120416 (Red Hat 4.7.0-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print(ur'\u03B3')
γ

$ ./python
Python 3.3.0a4+ (default:cfbf6aa5c9e3+, Jun 17 2012, 15:25:45)
[GCC 4.7.0 20120507 (Red Hat 4.7.0-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> ur'\u03B3'
'\\u03B3'
[73691 refs]
>>> r'\u03B3'
'\\u03B3'
[73691 refs]
>>> '\u03B3'
'γ'

Better to have a noisy conversion error than silently risking
producing different output.

So, while PEP 414 will allow u"" to run unmodified, ur"" will still
need to be changed to something else, because that partially escaped
behaviour isn't available in 3.x and we don't want to reintroduce it.

I've created http://bugs.python.org/issue15096 to track this reversion.

Cheers,
Nick.



-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] [Python-checkins] cpython: Improve an internal ipaddress test, add a comment explaining why treating

2012-06-17 Thread Nadeem Vawda
On Sun, Jun 17, 2012 at 8:33 AM, nick.coghlan
 wrote:
> +    @property
> +    def version(self):
> +        msg = '%200s has no version specified' % (type(self),)
> +        raise NotImplementedError(msg)
> +

Shouldn't that be "%.200s", rather than "%200s"?
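
The difference between the two format specifiers can be checked directly. In this sketch the sample strings are made up for illustration: "%200s" sets a minimum field width (padding only), while "%.200s" sets a precision (truncation).

```python
name = "A" * 300

padded = "%200s" % ("abc",)     # minimum width: short input is padded to 200
untrimmed = "%200s" % (name,)   # minimum width: long input passes through whole
trimmed = "%.200s" % (name,)    # precision: at most 200 characters are kept

print(len(padded), len(untrimmed), len(trimmed))
```

So if the intent is to cap the length of the type name in the message, the precision form is the one that does it.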


Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Martin v. Löwis
> So, while PEP 414 will allow u"" to run unmodified, ur"" will still
> need to be changed to something else, because that partially escaped
> behaviour isn't available in 3.x and we don't want to reintroduce it.

Given that the PEP currently explicitly supports ur, I think the
reversal of the reversal will need some discussion in the PEP.

(this reminds me of Germany's path wrt. nuclear power - where the
previous government decided to pull out of nuclear power (der
Ausstieg), the current government reverted that (Ausstieg vom Ausstieg),
and then, after the Fukushima accident, decided to revert that decision
(der Ausstieg vom Ausstieg vom Ausstieg aus der Atomenergie)).

Regards,
Martin



Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Terry Reedy

On 6/17/2012 10:59 AM, "Martin v. Löwis" wrote:

>> So, while PEP 414 will allow u"" to run unmodified, ur"" will still
>> need to be changed to something else, because that partially escaped
>> behaviour isn't available in 3.x and we don't want to reintroduce it.
>
> Given that the PEP currently explicitly supports ur, I think the
> reversal of the reversal will need some discussion in the PEP.


Definitely. The current version of the PEP is contradictory.

"Combination of the unicode prefix with the raw string prefix will also 
be supported, just as it was in Python 2.


No changes are proposed to Python 3's actual Unicode handling, only to 
the acceptable forms for string literals."


Because there is an (unintuitive and obviously forgettable) interaction 
effect between 'u' and 'r' in 2.7, truly supporting 'ur', *just as it 
was in Python 2*, means changing "Python 3's actual Unicode handling".


The premise of the discussion of adding 'u', and of Guido's acceptance, 
was that "it's about as harmless as they come". I do not remember any 
discussion of 'ur' and what it really means in 2.x, and that supporting 
it meant adding back 2.x's interaction effect. Indeed, Nick's version 
goes on to say "This PEP was originally written by Armin Ronacher, and 
Guido's approval was given based on that version." Armin's original 
version (and subsequent edit) only proposed adding 'u' (and 'U') and 
made no mention of 'ur'. Nick's seemingly innocuous addition of also 
adding 'ur' came after Guido's approval, and as discovered, is not so 
innocuous.


I do not think he needs to discuss adding and deleting support, but 
merely state that 'ur' support is not added because 'ur' has a special 
meaning that would require changing literal handling. The sentence about 
supporting 'ur' could be negated and moved after the sentence about not 
changing Unicode handling. A possibility:


"Combination of the unicode prefix with the raw string prefix will not 
be supported because in Python 2, the combination 'ur' has a special 
meaning that would require changing the handling of unicode literals"


--
Terry Jan Reedy






Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Nick Coghlan
On Mon, Jun 18, 2012 at 3:54 AM, Terry Reedy  wrote:
> The premise of the discussion of adding 'u', and of Guido's acceptance, was
> that "it's about as harmless as they come". I do not remember any discussion
> of 'ur' and what it really means in 2.x, and that supporting it meant adding
> back 2.x's interaction effect. Indeed, Nick's version goes on to say "This
> PEP was originally written by Armin Ronacher, and Guido's approval was given
> based on that version." Armin's original version (and subsequent edit) only
> proposed adding 'u' (and 'U') and made no mention of 'ur'. Nick's seemingly
> innocuous addition of also adding 'ur' came after Guido's approval, and as
> discovered, is not so innocuous.

Right, that matches my recollection as well - we (or at least I) thought
mapping "ur" to the Python 3 "r" prefix was sufficient, but it turns
out doing so means there are some 2.x string literals that will
silently behave differently in 3.x.

Martin's right that that part of the PEP should definitely be amended
(along with the relevant section in What's New)

> I do not think he needs to discuss adding and deleting support, but merely
> state that 'ur' support is not added because 'ur' has a special meaning that
> would require changing literal handling. The sentence about supporting 'ur'
> could be negated and moved after the sentence about not changing Unicode
> handling. A possibility:
>
> "Combination of the unicode prefix with the raw string prefix will not be
> supported because in Python 2, the combination 'ur' has a special meaning
> that would require changing the handling of unicode literals"

In addition to changing the proposal section to only cover "u" and
"U", I'll actually add a new subsection along the lines of the
following:

Exclusion of Raw Unicode Strings
--------------------------------

Python 2.x includes a concept of "raw Unicode" strings. These are
partially raw string literals that still support the "\u" and "\U"
escape codes for Unicode character entry, but otherwise treat "\" as a
literal backslash character. As 3.x has no such concept of a partially
raw string literal, explicit raw Unicode literals are still not
supported. Such literals in Python 2 code will need to be converted to
ordinary Unicode literals for forward compatibility with Python 3.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Guido van Rossum
Would it make sense to detect and reject these in 3.3 if the 2.7 syntax is
used?

--Guido van Rossum (sent from Android phone)
On Jun 17, 2012 1:13 PM, "Nick Coghlan"  wrote:

> On Mon, Jun 18, 2012 at 3:54 AM, Terry Reedy  wrote:
> > The premise of the discussion of adding 'u', and of Guido's acceptance, was
> > that "it's about as harmless as they come". I do not remember any discussion
> > of 'ur' and what it really means in 2.x, and that supporting it meant adding
> > back 2.x's interaction effect. Indeed, Nick's version goes on to say "This
> > PEP was originally written by Armin Ronacher, and Guido's approval was given
> > based on that version." Armin's original version (and subsequent edit) only
> > proposed adding 'u' (and 'U') and made no mention of 'ur'. Nick's seemingly
> > innocuous addition of also adding 'ur' came after Guido's approval, and as
> > discovered, is not so innocuous.
>
> Right, that matches my recollection as well - we (or at least I) thought
> mapping "ur" to the Python 3 "r" prefix was sufficient, but it turns
> out doing so means there are some 2.x string literals that will
> silently behave differently in 3.x.
>
> Martin's right that that part of the PEP should definitely be amended
> (along with the relevant section in What's New)
>
> > I do not think he needs to discuss adding and deleting support, but merely
> > state that 'ur' support is not added because 'ur' has a special meaning that
> > would require changing literal handling. The sentence about supporting 'ur'
> > could be negated and moved after the sentence about not changing Unicode
> > handling. A possibility:
> >
> > "Combination of the unicode prefix with the raw string prefix will not be
> > supported because in Python 2, the combination 'ur' has a special meaning
> > that would require changing the handling of unicode literals"
>
> In addition to changing the proposal section to only cover "u" and
> "U", I'll actually add a new subsection along the lines of the
> following:
>
> Exclusion of Raw Unicode Strings
> -
>
> Python 2.x includes a concept of "raw Unicode" strings. These are
> partially raw string literals that still support the "\u" and "\U"
> escape codes for Unicode character entry, but otherwise treat "\" as a
> literal backslash character. As 3.x has no such concept of a partially
> raw string literal, explicit raw Unicode literals are still not
> supported. Such literals in Python 2 code will need to be converted to
> ordinary Unicode literals for forward compatibility with Python 3.
>
> Cheers,
> Nick.


Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Nick Coghlan
On Mon, Jun 18, 2012 at 6:41 AM, Guido van Rossum  wrote:
> Would it make sense to detect and reject these in 3.3 if the 2.7 syntax is
> used?

Possibly - I'm trying not to actually *change* any of the internals of
the string literal processing, though. (If I recall the way we
implemented the change correctly, by the time we get to processing the
string contents, we've forgotten which specific prefix was used)

However, this question did remind me of another detail I wanted to
check after realising this discrepancy existed: it turns out this
semantic inconsistency already arises if you use "from __future__
import unicode_literals" to get supposedly "Python 3 style" string
literals in 2.x

Python 2.7.3 (default, May 29 2012, 14:54:22)
>>> from __future__ import unicode_literals
>>> print(r"\u03b3")
γ
>>> print("\u03b3")
γ

Python 3.2.1 (default, Jul 11 2011, 18:54:42)
>>> print(r"\u03b3")
\u03b3
>>> print("\u03b3")
γ

So, perhaps the answer is to leave this as is, and try to make 2to3
smart enough to detect such escapes and replace them with their
properly encoded (according to the source code encoding) Unicode
equivalent? After all, that's already the way to include such
characters in a forward compatible way when using the future import:

Python 2.7.3 (default, May 29 2012, 14:54:22)
>>> from __future__ import unicode_literals
>>> print("γ")
γ
>>> print(r"γ\n")
γ\n

Python 3.2.1 (default, Jul 11 2011, 18:54:42)
>>> print("γ")
γ
>>> print(r"γ\n")
γ\n

So, rather than going ahead with reverting "ur" support as I first
suggested (since it turns out that's not a *new* problem, but just a
different way of spelling an *existing* problem), how about I do the
following:

1. Add a note to PEP 414 and the Py3k porting guide regarding the
discrepancy in escaping semantics for raw Unicode strings between 2.x
and 3.x
2. Reject the tracker issue for reverting the ur support (the semantic
problem already exists, and any solution we come up with for
__future__.unicode_literals should handle the ur prefix as well)
3. Create a new feature request for 2to3 to see if it can
automatically handle the problem of translating "\u" and "\U" escapes
into properly encoded Unicode characters

The scope of the problem is really quite small: you have to be using a
raw Unicode string in 2.x (either via the string prefix, or the future
import) *and* using a "\u" or "\U" escape within that string.
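
The translation in item 3 above could look roughly like the following. This is a simplified sketch, not the 2to3 implementation: expand_raw_unicode is a hypothetical helper, and it ignores the Python 2 rule that an escape is only recognised after an odd number of backslashes.

```python
import re

# Matches \uXXXX or \UXXXXXXXX as they appear in the body of a raw literal.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})|\\U([0-9a-fA-F]{8})")

def expand_raw_unicode(body):
    """Replace \\u and \\U escapes in a raw literal's body with the characters
    they name, leaving every other backslash sequence untouched."""
    def replace(match):
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))
    return _ESCAPE.sub(replace, body)

# The body of a 2.x literal such as ur"\u03b3\n": the \u escape is expanded,
# while the trailing backslash-n stays as two literal characters.
print(expand_raw_unicode(r"\u03b3\n"))
```

A real fixer would also have to check that the resulting characters can be written in the file's declared source encoding.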

Regards,
Nick.


Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Guido van Rossum
On Sun, Jun 17, 2012 at 4:55 PM, Nick Coghlan  wrote:

> On Mon, Jun 18, 2012 at 6:41 AM, Guido van Rossum wrote:
> > Would it make sense to detect and reject these in 3.3 if the 2.7 syntax is
> > used?
>
> Possibly - I'm trying not to actually *change* any of the internals of
> the string literal processing, though. (If I recall the way we
> implemented the change correctly, by the time we get to processing the
> string contents, we've forgotten which specific prefix was used)
>
> However, this question did remind me of another detail I wanted to
> check after realising this discrepancy existed: it turns out this
> semantic inconsistency already arises if you use "from __future__
> import unicode_literals" to get supposedly "Python 3 style" string
> literals in 2.x
>
> Python 2.7.3 (default, May 29 2012, 14:54:22)
> >>> from __future__ import unicode_literals
> >>> print(r"\u03b3")
> γ
> >>> print("\u03b3")
> γ
>
> Python 3.2.1 (default, Jul 11 2011, 18:54:42)
> >>> print(r"\u03b3")
> \u03b3
> >>> print("\u03b3")
> γ
>
> So, perhaps the answer is to leave this as is, and try to make 2to3
> smart enough to detect such escapes and replace them with their
> properly encoded (according to the source code encoding) Unicode
> equivalent?


But the whole point of the reintroduction of u"..." is to support code that
isn't run through 2to3. Frankly, I don't care how it's done, but I'd say
it's important not to silently have different behavior for the same
notation in the two versions. If that means we have to add an extra step to
the compiler to reject r"\u03b3", so be it.


> After all, that's already the way to include such
> characters in a forward compatible way when using the future import:
>
> Python 2.7.3 (default, May 29 2012, 14:54:22)
> >>> from __future__ import unicode_literals
> >>> print("γ")
> γ
> >>> print(r"γ\n")
> γ\n
>
> Python 3.2.1 (default, Jul 11 2011, 18:54:42)
> >>> print("γ")
> γ
> >>> print(r"γ\n")
> γ\n
>

Hm. I still encounter enough environments that don't know how to display
such characters that I would prefer to have a rock solid \u escape
mechanism. I can think of two ways to support "expanded" unicode characters
in raw strings a la Python 2; (a) let the re module interpret the escapes
(like it does for \r and \n); (b) the user can write r"someblah" "\u03b3"
r"moreblah".
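
Option (b) relies on adjacent string literals being concatenated at compile time, so raw and non-raw pieces can be mixed. A small Python 3 sketch, with a made-up pattern for illustration:

```python
# Raw pieces keep their backslashes; the middle, non-raw piece has its
# \u escape expanded to a single character.
pattern = r"\d+" "\u03b3" r"\s*"

print(len(pattern))      # 3 + 1 + 3 characters
print(pattern)
```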


> So, rather than going ahead with reverting "ur" support as I first
> suggested (since it turns out that's not a *new* problem, but just a
> different way of spelling an *existing* problem), how about I do the
> following:
>
> 1. Add a note to PEP 414 and the Py3k porting guide regarding the
> discrepancy in escaping semantics for raw Unicode strings between 2.x
> and 3.x
> 2. Reject the tracker issue for reverting the ur support (the semantic
> problem already exists, and any solution we come up with for
> __future__.unicode_literals should handle the ur prefix as well)
> 3. Create a new feature request for 2to3 to see if it can
> automatically handle the problem of translating "\u" and "\U" escapes
> into properly encoded Unicode characters
>
> The scope of the problem is really quite small: you have to be using a
> raw Unicode string in 2.x (either via the string prefix, or the future
> import) *and* using a "\u" or "\U" escape within that string.
>

Yeah, but if you do this and it breaks you likely won't notice until way
late in your QA cycle, when it may be tough to track down the origin. I'd
rather make ru"\u03b3" a syntax error if we can't give it the same meaning
as in Python 2.
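
Short of changing the compiler, the risky literals can at least be located with the standard tokenize module. The sketch below is a hypothetical lint-style helper (find_risky_raw_literals is not an existing tool) that reports raw, non-bytes string literals containing a "\u" or "\U" escape.

```python
import io
import tokenize

def find_risky_raw_literals(source):
    """Return (line, literal) pairs for raw, non-bytes string literals
    whose text contains a \\u or \\U escape."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue
        text = tok.string
        # The prefix is everything before the first quote character.
        quote_at = min(i for i in (text.find("'"), text.find('"')) if i != -1)
        prefix = text[:quote_at].lower()
        body = text[quote_at:]
        if "r" in prefix and "b" not in prefix and ("\\u" in body or "\\U" in body):
            hits.append((tok.start[0], text))
    return hits

sample = 'a = r"\\u03b3"\nb = "\\u03b3"\nc = rb"\\u03b3"\n'
for line, literal in find_risky_raw_literals(sample):
    print(line, literal)
```

Only the first line of the sample is reported: the second literal is not raw, and the third is a bytes literal.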

(I'm not sure what to do about the same bug with __future__. Maybe we
should declare that a bug and "fix" it in a future 2.7 bugfix release?)



Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread MRAB

On 18/06/2012 00:55, Nick Coghlan wrote:

> On Mon, Jun 18, 2012 at 6:41 AM, Guido van Rossum wrote:
>> Would it make sense to detect and reject these in 3.3 if the 2.7 syntax is
>> used?
>
> Possibly - I'm trying not to actually *change* any of the internals of
> the string literal processing, though. (If I recall the way we
> implemented the change correctly, by the time we get to processing the
> string contents, we've forgotten which specific prefix was used)
>
> However, this question did remind me of another detail I wanted to
> check after realising this discrepancy existed: it turns out this
> semantic inconsistency already arises if you use "from __future__
> import unicode_literals" to get supposedly "Python 3 style" string
> literals in 2.x
>
> Python 2.7.3 (default, May 29 2012, 14:54:22)
> >>> from __future__ import unicode_literals
> >>> print(r"\u03b3")
> γ
> >>> print("\u03b3")
> γ
>
> Python 3.2.1 (default, Jul 11 2011, 18:54:42)
> >>> print(r"\u03b3")
> \u03b3
> >>> print("\u03b3")
> γ
>
> So, perhaps the answer is to leave this as is, and try to make 2to3
> smart enough to detect such escapes and replace them with their
> properly encoded (according to the source code encoding) Unicode
> equivalent?

What if it's not possible to encode that character? I suppose that it
could be expanded into a string expression so that a non-raw string
literal could be used, possibly using implicit concatenation,
parenthesised, if necessary (or always?).

> After all, that's already the way to include such characters in a
> forward compatible way when using the future import:
>
> Python 2.7.3 (default, May 29 2012, 14:54:22)
> >>> from __future__ import unicode_literals
> >>> print("γ")
> γ
> >>> print(r"γ\n")
> γ\n
>
> Python 3.2.1 (default, Jul 11 2011, 18:54:42)
> >>> print("γ")
> γ
> >>> print(r"γ\n")
> γ\n
>
> So, rather than going ahead with reverting "ur" support as I first
> suggested (since it turns out that's not a *new* problem, but just a
> different way of spelling an *existing* problem), how about I do the
> following:
>
> 1. Add a note to PEP 414 and the Py3k porting guide regarding the
> discrepancy in escaping semantics for raw Unicode strings between 2.x
> and 3.x
> 2. Reject the tracker issue for reverting the ur support (the semantic
> problem already exists, and any solution we come up with for
> __future__.unicode_literals should handle the ur prefix as well)
> 3. Create a new feature request for 2to3 to see if it can
> automatically handle the problem of translating "\u" and "\U" escapes
> into properly encoded Unicode characters
>
> The scope of the problem is really quite small: you have to be using a
> raw Unicode string in 2.x (either via the string prefix, or the future
> import) *and* using a "\u" or "\U" escape within that string.


[snip]


[Python-Dev] What's the best way to debug python3 source code?

2012-06-17 Thread gmspro
Hi,

What's the best way to debug the python3 source code?
To fix a bug I need to debug the source code (C files).
I use gdb to debug.
But how can I find the exact file/point where the bug needs to be fixed?
How can I quickly find out where the bug is?
How can I run the python >>> prompt from gdb, give it input there, and
then test and debug to fix a bug?

Could someone please explain the process you usually follow, with examples?

Thanks.


Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Stephen J. Turnbull
"Martin v. Löwis" writes:

 > (this reminds me of Germany's path wrt. nuclear power

Yeah, except presumably Python won't be buying cheap "raw Unicode"
support from Perl. ;-)



Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Martin v. Löwis
On 17.06.2012 22:41, Guido van Rossum wrote:
> Would it make sense to detect and reject these in 3.3 if the 2.7 syntax
> is used?

Maybe we are talking about different things: The (new) proposal is that
the ur prefix in 3.3 is a syntax error (again, as it was before PEP
414). So, yes: the raw unicode literals will be rejected (not by
explicitly detecting them, though).

Regards,
Martin



Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Terry Reedy

On 6/17/2012 9:07 PM, Guido van Rossum wrote:

> On Sun, Jun 17, 2012 at 4:55 PM, Nick Coghlan wrote:
>
>> So, perhaps the answer is to leave this as is, and try to make 2to3
>> smart enough to detect such escapes and replace them with their
>> properly encoded (according to the source code encoding) Unicode
>> equivalent?
>
> But the whole point of the reintroduction of u"..." is to support code
> that isn't run through 2to3.


People writing 2&3 code sometimes use 2to3 once (or a few times) on 
their 2.6/7 version during development to find things they must pay 
attention to. So Nick's idea could be helpful to people who do not want 
to use 2to3 routinely either in development or deployment.


> Frankly, I don't care how it's done, but I'd say it's important not to
> silently have different behavior for the same notation in the two versions.


The fundamental problem was giving the 'u' prefix two different meanings 
in 2.x: 'change the storage type from bytes to unicode', and 'change the 
contents by partially cooking the literal even when raw processing is 
requested'*. The only way to silently have the same behavior is to 
re-introduce the second meaning of partial cooking. (But I would rather 
make it unnecessary.) But that would freeze the 'u' prefix, or at least 
'ur' ('un-raw') forever. It would be better to introduce a new, separate 
'p' prefix, to mean partially raw, partially cooked. (But I am opposed to


*I think this non-orthogonal interaction effect was a design mistake and 
that it would have been better to have re do all the cooking needed by 
also interpreting \u and \U sequences. I also think we should add this 
now for 3.3 if possible, to make partial cooking at the parsing stage 
unnecessary. Putting the processing in re makes it work for all strings, 
not just those given as literals.
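
A minimal sketch of what "let re do the cooking" looks like in practice. It assumes Python 3.3 or later, where the re module accepts \u and \U escapes in patterns; the sample text and variable names are made up for illustration.

```python
import re

# Raw pattern: the parser leaves the backslashes alone, so the pattern
# text really contains the six characters \, u, 0, 3, b, 3 followed by \d+.
pattern = r"\u03b3\d+"

# re itself interprets \u03b3 as GREEK SMALL LETTER GAMMA (Python 3.3+),
# so no partial cooking is needed when the literal is parsed.
match = re.search(pattern, "x = \u03b342")
matched = match.group(0)
print(ascii(matched))
```

Because the escape is handled by re rather than by the parser, the same pattern works when it is read from a file or built at run time, not only when it is written as a literal.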


> If that means we have to add an extra
> step to the compiler to reject r"\u03b3", so be it.


I do not get this. Surely you cannot mean to suddenly start rejecting, 
in 3.3, a large set of perfectly legal and sensible 6 and 10 character 
sequences when embedded in literals?




> Hm. I still encounter enough environments that don't know how to display
> such characters that I would prefer to have a rock solid \u escape
> mechanism. I can think of two ways to support "expanded" unicode
> characters in raw strings a la Python 2;
>
> (a) let the re module interpret the escapes (like it does for \r and \n);

As said above, I favor this. The 2.x partial cooking (with 'ur' prefix) 
was primarily a substitute for this.


> (b) the user can write r"someblah" "\u03b3" r"moreblah".

This is somewhat orthogonal to (a). Users can do this whenever they want 
partial processing of backslashes without doubling those they want left 
as is. A generic example is r'someraw' 'somecooked' r'moreraw' 
'morecooked'.
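
A small sketch of the concatenation approach, assuming Python 3.3+; the pieces and names are only illustrative.

```python
# Adjacent literals are joined at compile time, so each piece chooses
# independently whether its backslashes are raw or cooked.
mixed = r"\d+" "\u03b3" r"\n"

# The equivalent single cooked literal, with the kept backslashes doubled.
doubled = "\\d+\u03b3\\n"

print(mixed == doubled)
print(len(mixed))
```

Only the middle piece is cooked, so \u03b3 becomes a single character while \d and \n stay as two characters each.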


--
Terry Jan Reedy





Re: [Python-Dev] What's the best way to debug python3 source code?

2012-06-17 Thread Terry Reedy

On 6/18/2012 12:43 AM, gmspro wrote:


> What's the best way to debug python3 source code?

...

The pydev list is for development *of* future python releases. For 
questions about development *with* current releases, please ask on 
python-list or other user oriented forums.


--
Terry Jan Reedy





Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Martin v. Löwis
> But the whole point of the reintroduction of u"..." is to support code
> that isn't run through 2to3. Frankly, I don't care how it's done, but
> I'd say it's important not to silently have different behavior for the
> same notation in the two versions. If that means we have to add an extra
> step to the compiler to reject r"\u03b3", so be it.

It's actually ur"\u03b3" that will be rejected, and that falls out
easily by just not being able to parse it. The 2.x r"\u03b3" denotes
a 6-character (byte) string, which continues to be understood as a
6-character Unicode string in 3.3.
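
To illustrate, a rough sketch assuming Python 3.3 or later. The helper name compiles is hypothetical, and compile() is used only so that the rejected literal does not stop the example itself from parsing.

```python
# r"\u03b3" is six characters: backslash, u, 0, 3, b, 3.
raw = r"\u03b3"
raw_length = len(raw)


def compiles(source):
    """Return True if source parses as an expression, False on SyntaxError."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# The ur prefix is not recognized, so the literal fails to parse;
# the plain u prefix is accepted again under PEP 414.
ur_accepted = compiles('ur"\\u03b3"')
u_accepted = compiles('u"\\u03b3"')
print(raw_length, ur_accepted, u_accepted)
```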

> Hm. I still encounter enough environments that don't know how to display
> such characters that I would prefer to have a rock solid \u escape
> mechanism.

If you want to use them under the revised PEP 414, you will have to
avoid making them raw, and just use a plain u prefix. IOW, you need
to double all backslashes that you want to stand on their own, and
then use \u escapes to denote non-typable characters.
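
As a sketch of that spelling (the pattern and sample text are invented for illustration, and Python 3.3+ is assumed for running it as written):

```python
import re

# Plain u prefix: the backslash that must survive is doubled, and
# \u03b3 is cooked by the parser into a single character.
portable = u"\\w+\u03b3"

# The resulting pattern text is \w+ followed by the gamma character.
found = re.match(portable, u"abc\u03b3")
print(len(portable), found is not None)
```

The literal contains no raw prefix, so there is nothing for the two versions to disagree about.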


> Yeah, but if you do this and it breaks you likely won't notice until way
> late in your QA cycle, when it may be tough to track down the origin.
> I'd rather make ru"\u03b3" a syntax error if we can't give it the same
> meaning as in Python 2.

That's exactly the proposal, see

http://bugs.python.org/issue15096
http://bugs.python.org/file26036/issue15096-1.patch

Regards,
Martin


Re: [Python-Dev] What's the best way to debug python3 source code?

2012-06-17 Thread Martin v. Löwis
> What's the best way to debug python3 source code?
> To fix a bug i need to debug source code(C files).
> I use gdb to debug.

If the bug is presumably in C, then using gdb works fine for me.

> But how can i get the exact file/point to fix the bug?

As usual: set breakpoints and watch points. In some cases, also augment
the C code to print out trace messages. Use the excellent python-gdb.py,
which requires a recent gdb version.

> How can i know quickly where the bug is?

The fastest way is probably Linus Torvalds's approach: just look at the
code for 20 seconds, and see the bug without running the code. YMMV.

> How can i run python>>> from gdb and giving input there how can i test
> and debug to fix a bug?

If you start Python in gdb, then do "r", it will automatically start
interactive mode.

> Someone please explain/elaborate the process you use/do as usual with
> examples.

I wasn't quite sure whether your question is off-topic for python-dev:
this last request definitely is. python-dev is not a place to get free
education. Instead, it is a place where *you* contribute to Python. If
you work on a specific bug and have a specific question about it, feel
free to ask. However, "teach me how to debug" is best asked on other
mailing lists.

Regards,
Martin



Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Terry Reedy

On 6/18/2012 2:06 AM, "Martin v. Löwis" wrote:


>> Hm. I still encounter enough environments that don't know how to display
>> such characters that I would prefer to have a rock solid \u escape
>> mechanism.
>
> If you want to use them under the revised PEP 414, you will have to
> avoid making them raw, and just use a plain u prefix. IOW, you need
> to double all backslashes that you want to stand on their own, and
> then use \u escapes to denote non-typable characters.


And such literals will mean the same thing in 2.x and 3.3+.

--
Terry Jan Reedy






Re: [Python-Dev] Raw string syntax inconsistency

2012-06-17 Thread Nick Coghlan
On Mon, Jun 18, 2012 at 3:59 PM, "Martin v. Löwis"  wrote:
> On 17.06.2012 22:41, Guido van Rossum wrote:
>> Would it make sense to detect and reject these in 3.3 if the 2.7 syntax
>> is used?
>
> Maybe we are talking about different things: The (new) proposal is that
> the ur prefix in 3.3 is a syntax error (again, as it was before PEP
> 414). So, yes: the raw unicode literals will be rejected (not by
> explicitly detecting them, though).

I think GvR was replying to my email where I was briefly reconsidering
the idea of keeping them around (because the unicode_literals future
import already suffers from this problem of literals that don't mean
the same things in 2.x and in 3.x). However, that was flawed reasoning
on my part - simply banning them altogether in 3.x is the simplest
option to ensure this particular error doesn't pass silently,
especially since there are alternate forward compatible ways to write
them, such as:

Python 2.7.3 (default, May 29 2012, 14:54:22)
>>> from __future__ import unicode_literals
>>> print(u"\u03b3" r"\n")
γ\n
>>> print(u"\u03b3\\n")
γ\n

Python 3.3.0a4 (default:f1dd70bfb4c5, May 31 2012, 09:47:51)
>>> print(u"\u03b3" r"\n")
γ\n
>>> print(u"\u03b3\\n")
γ\n

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia