Re: [Python-Dev] What does a double coding cookie mean?
On 16.03.16 08:34, Glenn Linderman wrote:
> From the PEP 263:
>
>> More precisely, the first or second line must match the regular
>> expression "coding[:=]\s*([-\w.]+)". The first group of this
>> expression is then interpreted as encoding name. If the encoding is
>> unknown to Python, an error is raised during compilation. There must
>> not be any Python statement on the line that contains the encoding
>> declaration.
>
> Clearly the regular expression would only match the first of multiple
> cookies on the same line, so the first one should always win... but
> there should only be one, from the first PEP quote "a magic comment".

"The first group of this expression" means the first regular expression
group. Only the part between parentheses, "([-\w.]+)", is interpreted as
the encoding name, not the whole expression.

___
Python-Dev mailing list
https://mail.python.org/mailman/listinfo/python-dev
Re: [Python-Dev] What does a double coding cookie mean?
On 16.03.16 02:28, Guido van Rossum wrote:
> I agree that the spirit of the PEP is to stop at the first coding
> cookie found. Would it be okay if I updated the PEP to clarify this?
> I'll definitely also update the docs.

Could you please also update the regular expression in PEP 263 to
"^[ \t\v]*#.*?coding[:=][ \t]*([-.a-zA-Z0-9]+)"?

The coding cookie must be in a comment, only the first occurrence in the
line must be taken into account (there is a bug in CPython here), the
encoding name must be ASCII, and there must not be any Python statement
on the line that contains the encoding declaration. [1]

[1] https://bugs.python.org/issue18873
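As a rough illustration of the difference between the two patterns, the sketch below runs both over a few header lines. The helper `find_cookie` and the sample lines are made up for this example, and both patterns are applied with `re.search()`; it is not meant to show how CPython's tokenizer is implemented.

```python
import re

# Pattern quoted from PEP 263 earlier in the thread.
PEP_PATTERN = re.compile(r"coding[:=]\s*([-\w.]+)")
# Pattern proposed in the message above.
PROPOSED_PATTERN = re.compile(r"^[ \t\v]*#.*?coding[:=][ \t]*([-.a-zA-Z0-9]+)")


def find_cookie(pattern, line):
    """Return the encoding name captured by group 1, or None (hypothetical helper)."""
    match = pattern.search(line)
    return match.group(1) if match else None


# Sample lines invented for illustration.
samples = [
    "# -*- coding: latin-1 -*-",        # Emacs style
    "# vim: set fileencoding=utf-8 :",  # Vim style
    'x = "coding: ascii"',              # cookie-like text inside a statement
    "# coding: latin-1 coding: utf-8",  # two cookies on one line
]

for line in samples:
    print(repr(line),
          find_cookie(PEP_PATTERN, line),
          find_cookie(PROPOSED_PATTERN, line))
```

With these samples, the proposed pattern rejects the statement line because it requires the line to start with a comment, while the original pattern still picks up the text inside the string literal. Both patterns report the first cookie when two appear on one line.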
Re: [Python-Dev] What does a double coding cookie mean?
On 3/16/2016 12:09 AM, Serhiy Storchaka wrote:
> On 16.03.16 08:34, Glenn Linderman wrote:
>> From the PEP 263:
>>
>>> More precisely, the first or second line must match the regular
>>> expression "coding[:=]\s*([-\w.]+)". The first group of this
>>> expression is then interpreted as encoding name. If the encoding is
>>> unknown to Python, an error is raised during compilation. There must
>>> not be any Python statement on the line that contains the encoding
>>> declaration.
>>
>> Clearly the regular expression would only match the first of multiple
>> cookies on the same line, so the first one should always win... but
>> there should only be one, from the first PEP quote "a magic comment".
>
> "The first group of this expression" means the first regular expression
> group. Only the part between parentheses, "([-\w.]+)", is interpreted as
> the encoding name, not the whole expression.

Sure. But there is no mention anywhere in the PEP of more than one being
legal: just more than one position for it, EITHER line 1 or line 2.

So while the regular expression mentioned is not anchored, to allow
variation in syntax between emacs and vim, "must match the regular
expression" doesn't imply "several times", and when searching for a
regular expression that might not be anchored, one typically expects to
find the first.

Glenn
Re: [Python-Dev] What does a double coding cookie mean?
On 16.03.2016 01:28, Guido van Rossum wrote:
> I agree that the spirit of the PEP is to stop at the first coding
> cookie found. Would it be okay if I updated the PEP to clarify this?
> I'll definitely also update the docs.

+1

The only reason to read up to two lines was to address the use of the
shebang on Unix, not to be able to define two competing source code
encodings :-)

> On Tue, Mar 15, 2016 at 2:04 PM, Brett Cannon wrote:
>> On Tue, 15 Mar 2016 at 13:31 Guido van Rossum wrote:
>>> I came across a file that had two different coding cookies -- one on
>>> the first line and one on the second. CPython uses the first, but mypy
>>> happens to use the second. I couldn't find anything in the spec or
>>> docs ruling out the second interpretation. Does anyone have a
>>> suggestion (apart from following CPython)?
>>>
>>> Reference: https://github.com/python/mypy/issues/1281
>>
>> I think the spirit of PEP 263 is for the first specified encoding to win as
>> the support of two lines is to support shebangs and not multiple encodings
>> :) . I also think the fact that tokenize.detect_encoding() doesn't
>> automatically read two lines from its input also suggests the intent is
>> "first encoding wins" (and that is the semantics of the function).

-- 
Marc-Andre Lemburg
eGenix.com
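The "first encoding wins" behaviour attributed to tokenize.detect_encoding() above can be checked with a short sketch. The two byte strings are hypothetical file contents invented for this example, and the snippet assumes Python 3.

```python
import io
import tokenize

# Hypothetical file contents: conflicting cookies on lines 1 and 2.
two_cookies = b"# coding: latin-1\n# coding: utf-8\nx = 1\n"
# Hypothetical file contents: shebang on line 1, cookie on line 2.
shebang_then_cookie = b"#!/usr/bin/env python\n# coding: latin-1\nx = 1\n"

# detect_encoding() takes a readline callable that yields bytes and
# returns the encoding name plus the list of lines it consumed.
encoding, lines_read = tokenize.detect_encoding(io.BytesIO(two_cookies).readline)
print(encoding, len(lines_read))

encoding2, lines_read2 = tokenize.detect_encoding(
    io.BytesIO(shebang_then_cookie).readline
)
print(encoding2, len(lines_read2))
```

In the first case the function stops after the first line, so the cookie on the second line is never read. In the second case the first line carries no cookie, so the second line is consulted. The encoding name is reported in a normalized spelling rather than exactly as written in the cookie.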
Re: [Python-Dev] What does a double coding cookie mean?
On 16.03.16 09:46, Glenn Linderman wrote:
> On 3/16/2016 12:09 AM, Serhiy Storchaka wrote:
>> On 16.03.16 08:34, Glenn Linderman wrote:
>>> From the PEP 263:
>>>
>>>> More precisely, the first or second line must match the regular
>>>> expression "coding[:=]\s*([-\w.]+)". The first group of this
>>>> expression is then interpreted as encoding name. If the encoding is
>>>> unknown to Python, an error is raised during compilation. There must
>>>> not be any Python statement on the line that contains the encoding
>>>> declaration.
>>>
>>> Clearly the regular expression would only match the first of multiple
>>> cookies on the same line, so the first one should always win... but
>>> there should only be one, from the first PEP quote "a magic comment".
>>
>> "The first group of this expression" means the first regular expression
>> group. Only the part between parentheses, "([-\w.]+)", is interpreted as
>> the encoding name, not the whole expression.
>
> Sure. But there is no mention anywhere in the PEP of more than one being
> legal: just more than one position for it, EITHER line 1 or line 2.
>
> So while the regular expression mentioned is not anchored, to allow
> variation in syntax between emacs and vim, "must match the regular
> expression" doesn't imply "several times", and when searching for a
> regular expression that might not be anchored, one typically expects to
> find the first.

Actually "must match the regular expression" is not correct, because
re.match() implies anchoring at the start. I have proposed a more correct
regular expression in another branch of this thread.
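The distinction between matching and searching can be seen in a minimal sketch using the unanchored pattern quoted from the PEP. The sample line is made up for this example.

```python
import re

# Unanchored pattern as quoted from PEP 263 in this thread.
pattern = re.compile(r"coding[:=]\s*([-\w.]+)")
line = "# -*- coding: utf-8 -*-"  # sample line, invented for illustration

# re.match() only tries the pattern at position 0, where the line has "#".
anchored = pattern.match(line)
# re.search() scans forward and stops at the first position that matches.
scanned = pattern.search(line)

print(anchored)
# group(0) is the whole matched text; group(1) is the encoding name only.
print(scanned.group(0), scanned.group(1))
```

Read literally as re.match(), the PEP wording would reject an ordinary Emacs-style cookie, which is why the pattern only works when applied as a search or when it is rewritten to cover the start of the line.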
