Re: [Python-Dev] What does a double coding cookie mean?

2016-03-16 Thread Serhiy Storchaka

On 16.03.16 08:34, Glenn Linderman wrote:

From PEP 263:


More precisely, the first or second line must match the regular
expression "coding[:=]\s*([-\w.]+)". The first group of this
expression is then interpreted as encoding name. If the encoding
is unknown to Python, an error is raised during compilation. There
must not be any Python statement on the line that contains the
encoding declaration.


Clearly the regular expression would only match the first of multiple
cookies on the same line, so the first one should always win... but
there should only be one, from the first PEP quote "a magic comment".


"The first group of this expression" means the first regular expression
group. Only the part between parentheses, "([-\w.]+)", is interpreted as
the encoding name, not the whole expression.
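
As a minimal illustration (the sample line is made up for the example),
the difference between the whole match and the first group can be checked
with Python's re module:

```python
import re

# Regular expression quoted from PEP 263.
PEP263_RE = re.compile(r"coding[:=]\s*([-\w.]+)")

# Hypothetical magic comment, used only for illustration.
line = "# -*- coding: latin-1 -*-"

m = PEP263_RE.search(line)
whole_match = m.group(0)    # everything the expression matched
encoding_name = m.group(1)  # only the part in parentheses

print(repr(whole_match))    # 'coding: latin-1'
print(repr(encoding_name))  # 'latin-1'
```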



___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What does a double coding cookie mean?

2016-03-16 Thread Serhiy Storchaka

On 16.03.16 02:28, Guido van Rossum wrote:

I agree that the spirit of the PEP is to stop at the first coding
cookie found. Would it be okay if I updated the PEP to clarify this?
I'll definitely also update the docs.


Could you please also update the regular expression in PEP 263 to
"^[ \t\v]*#.*?coding[:=][ \t]*([-.a-zA-Z0-9]+)"?

The coding cookie must be in a comment, only the first occurrence on the
line must be taken into account (CPython has a bug here), the encoding
name must be ASCII, and there must not be any Python statement on the
line that contains the encoding declaration. [1]


[1] https://bugs.python.org/issue18873
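
A rough sketch comparing the expression currently in the PEP with the one
proposed above. The helper name find() and the sample lines are invented
for the example, and the output only shows what the two regular
expressions match; it says nothing about what CPython's tokenizer does.

```python
import re

# Expression currently quoted in PEP 263 (unanchored).
OLD_RE = re.compile(r"coding[:=]\s*([-\w.]+)")
# Expression proposed in this message (anchored at the start of the line).
NEW_RE = re.compile(r"^[ \t\v]*#.*?coding[:=][ \t]*([-.a-zA-Z0-9]+)")


def find(regex, line):
    """Return the captured encoding name, or None if there is no match."""
    m = regex.search(line)
    return m.group(1) if m else None


# Hypothetical sample lines.
samples = [
    "# -*- coding: utf-8 -*-",          # ordinary magic comment
    "# coding: latin-1 coding: utf-8",  # two cookies on one line
    'x = "coding: utf-8"',              # a statement, not a comment
]

results = [(find(OLD_RE, s), find(NEW_RE, s)) for s in samples]
for s, (old, new) in zip(samples, results):
    print(f"{s!r}: old={old!r} new={new!r}")
```

With these samples both expressions pick the first cookie on a line, but
only the old one accepts the string literal on the last line, because the
proposed expression requires the line to start with a comment.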



Re: [Python-Dev] What does a double coding cookie mean?

2016-03-16 Thread Glenn Linderman

On 3/16/2016 12:09 AM, Serhiy Storchaka wrote:

On 16.03.16 08:34, Glenn Linderman wrote:

From PEP 263:


More precisely, the first or second line must match the regular
expression "coding[:=]\s*([-\w.]+)". The first group of this
expression is then interpreted as encoding name. If the encoding
is unknown to Python, an error is raised during compilation. There
must not be any Python statement on the line that contains the
encoding declaration.


Clearly the regular expression would only match the first of multiple
cookies on the same line, so the first one should always win... but
there should only be one, from the first PEP quote "a magic comment".


"The first group of this expression" means the first regular
expression group. Only the part between parentheses, "([-\w.]+)", is
interpreted as the encoding name, not the whole expression.


Sure.  But there is no mention anywhere in the PEP of more than one 
being legal: just more than one position for it, EITHER line 1 or line 
2. So while the regular expression mentioned is not anchored, to allow 
variation in syntax between emacs and vim, "must match the regular 
expression" doesn't imply "several times", and when searching for a 
regular expression that might not be anchored, one typically expects to 
find the first.
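
To make the "first match" point concrete, here is a small sketch using a
made-up line with two cookies. search() reports only the leftmost match;
one has to ask for all matches explicitly (findall()) to see the second.

```python
import re

# Regular expression quoted from PEP 263.
PEP263_RE = re.compile(r"coding[:=]\s*([-\w.]+)")

# Hypothetical line carrying two cookies.
line = "# coding: latin-1 -*- coding: utf-8"

first = PEP263_RE.search(line).group(1)  # leftmost match only
every = PEP263_RE.findall(line)          # all non-overlapping matches

print(first)  # latin-1
print(every)  # ['latin-1', 'utf-8']
```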


Glenn


Re: [Python-Dev] What does a double coding cookie mean?

2016-03-16 Thread M.-A. Lemburg
On 16.03.2016 01:28, Guido van Rossum wrote:
> I agree that the spirit of the PEP is to stop at the first coding
> cookie found. Would it be okay if I updated the PEP to clarify this?
> I'll definitely also update the docs.

+1

The only reason to read up to two lines was to address the use of
the shebang on Unix, not to be able to define two competing
source code encodings :-)
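
A short sketch of the shebang case, using tokenize.detect_encoding() from
the standard library on an invented three-line source. The cookie sits on
line 2 because line 1 is taken by the shebang.

```python
import io
import tokenize

# Hypothetical source: shebang on line 1, coding cookie on line 2.
source = (
    b"#!/usr/bin/env python\n"
    b"# -*- coding: latin-1 -*-\n"
    b"print('hello')\n"
)

encoding, lines_read = tokenize.detect_encoding(io.BytesIO(source).readline)

print(encoding)         # iso-8859-1 (normalized name for latin-1)
print(len(lines_read))  # 2, both lines had to be read to find the cookie
```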

> On Tue, Mar 15, 2016 at 2:04 PM, Brett Cannon wrote:
>>
>>
>> On Tue, 15 Mar 2016 at 13:31 Guido van Rossum wrote:
>>>
>>> I came across a file that had two different coding cookies -- one on
>>> the first line and one on the second. CPython uses the first, but mypy
>>> happens to use the second. I couldn't find anything in the spec or
>>> docs ruling out the second interpretation. Does anyone have a
>>> suggestion (apart from following CPython)?
>>>
>>> Reference: https://github.com/python/mypy/issues/1281
>>
>>
>> I think the spirit of PEP 263 is for the first specified encoding to win as
>> the support of two lines is to support shebangs and not multiple encodings
>> :) . I also think the fact that tokenize.detect_encoding() doesn't
>> automatically read two lines from its input also suggests the intent is
>> "first encoding wins" (and that is the semantics of the function).
> 
> 
> 
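
The "first encoding wins" behaviour of tokenize.detect_encoding() mentioned
in the quote can be checked with a sketch like the following. The source
bytes are invented for the example; they mimic a file with one cookie on
each of the first two lines.

```python
import io
import tokenize

# Hypothetical source with two competing coding cookies.
source = (
    b"# -*- coding: latin-1 -*-\n"
    b"# -*- coding: utf-8 -*-\n"
    b"x = 1\n"
)

encoding, lines_read = tokenize.detect_encoding(io.BytesIO(source).readline)

print(encoding)    # iso-8859-1, taken from the first line
print(lines_read)  # only the first line was consumed
```

Since the first line already yields an encoding, the function returns
without reading the second line at all.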

-- 
Marc-Andre Lemburg
eGenix.com




Re: [Python-Dev] What does a double coding cookie mean?

2016-03-16 Thread Serhiy Storchaka

On 16.03.16 09:46, Glenn Linderman wrote:

On 3/16/2016 12:09 AM, Serhiy Storchaka wrote:

On 16.03.16 08:34, Glenn Linderman wrote:

From PEP 263:


More precisely, the first or second line must match the regular
expression "coding[:=]\s*([-\w.]+)". The first group of this
expression is then interpreted as encoding name. If the encoding
is unknown to Python, an error is raised during compilation. There
must not be any Python statement on the line that contains the
encoding declaration.


Clearly the regular expression would only match the first of multiple
cookies on the same line, so the first one should always win... but
there should only be one, from the first PEP quote "a magic comment".


"The first group of this expression" means the first regular
expression group. Only the part between parentheses, "([-\w.]+)", is
interpreted as the encoding name, not the whole expression.


Sure.  But there is no mention anywhere in the PEP of more than one
being legal: just more than one position for it, EITHER line 1 or line
2. So while the regular expression mentioned is not anchored, to allow
variation in syntax between emacs and vim, "must match the regular
expression" doesn't imply "several times", and when searching for a
regular expression that might not be anchored, one typically expects to
find the first.


Actually, "must match the regular expression" is not correct, because
re.match() implies anchoring at the start. I have proposed a more correct
regular expression in another branch of this thread.
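
For illustration, a minimal sketch of the difference on a typical magic
comment (the sample line is made up): with the PEP's expression,
re.match() fails because the line starts with "#", while re.search()
finds the cookie.

```python
import re

# Regular expression quoted from PEP 263.
PEP263_RE = re.compile(r"coding[:=]\s*([-\w.]+)")

# Hypothetical magic comment.
line = "# -*- coding: utf-8 -*-"

anchored = PEP263_RE.match(line)     # must match at position 0
unanchored = PEP263_RE.search(line)  # may match anywhere in the line

print(anchored)             # None
print(unanchored.group(1))  # utf-8
```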

