[Python-Dev] Re: Inline links in Misc/NEWS entries

2019-08-14 Thread Terry Reedy

On 8/13/2019 6:31 PM, Kyle Stanley wrote:

I created this topic mainly because there seems to be some sentiment 
that it's perfectly fine to use plaintext exclusively in every news 
entry, especially in cases where authors have rejected suggestions to 
add Sphinx markup in their PRs. Personally, I think it's a bit more 
nuanced, and that the links can sometimes be very helpful for readers.


Beyond what Ned said (news markup is relatively new), people may be 
uncertain about what is allowed and when it is appropriate.  Also, there 
are some situations where markup seems to me to be a nuisance and looks 
like it is introducing an error.  So I have changed unicode quotes and 
removed some rst markup.  Also, for IDLE, news entries go to 
idlelib/NEWS.txt, where markup, as opposed to unicode, is noise.


Bottom line: I would rather a knowledgeable editor prettify the blurbs 
in a consistent manner after I am done with them.  To me, this is a 
place where specialization pays.


--
Terry Jan Reedy
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/53QDMBQHW2S6F7JI4YSGAYYKJOVOIFQF/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-14 Thread Random832
On Mon, Aug 12, 2019, at 15:15, Terry Reedy wrote:
> Please no more combinations. The presence of both legal and illegal 
> combinations is already a mild nightmare for processing and testing. 
> idlelib.colorizer has the following re to detect legal combinations
> 
>  stringprefix = r"(?i:r|u|f|fr|rf|b|br|rb)?"

More advanced syntax highlighting editors have to handle each string type 
separately anyway, because they highlight (valid) backslash-escapes and 
f-string formatters. The proposed 'v-string' type would need separate handling 
even in a simplistic editor like IDLE, because it's different at the basic 
level of \" not ending the string (whereas, for better or worse, all current 
string types have exactly the same rules for how to find the end delimiter).
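
As a quick check of the quoted idlelib pattern, the sketch below applies it to a few candidate prefixes. Only the `stringprefix` pattern comes from the message above; the helper name `is_legal_prefix` and the sample prefixes are illustrative assumptions and are not part of idlelib.

```python
import re

# Pattern quoted from idlelib.colorizer in the message above.
stringprefix = r"(?i:r|u|f|fr|rf|b|br|rb)?"


def is_legal_prefix(prefix):
    """Return True if `prefix` is empty or one of the legal string prefixes.

    Hypothetical helper for illustration only; it is not part of idlelib.
    """
    return re.fullmatch(stringprefix, prefix) is not None


# Try a mix of legal and illegal prefixes (matching is case-insensitive).
for candidate in ["", "r", "Rb", "fr", "fb", "ur", "x"]:
    print(repr(candidate), is_legal_prefix(candidate))
```

Every additional prefix letter multiplies the alternatives such a pattern has to list, which is the processing and testing burden Terry describes.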
___
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/H77NNYCZI37JCHGSMIHMTKNQVK5SGCWY/


[Python-Dev] Re: An f-string issue [Was: Re: Re: What to do about invalid escape sequences]

2019-08-14 Thread Random832
On Sat, Aug 10, 2019, at 19:54, Glenn Linderman wrote:
> Because of the "invalid escape sequence" and "raw string" discussion, 
> when looking at the documentation, I also noticed the following 
> description for f-strings:
> 
> > Escape sequences are decoded like in ordinary string literals (except when 
> > a literal is also marked as a raw string). After decoding, the grammar for 
> > the contents of the string is:
> 
> followed by lots of stuff, followed by
> 
> > Backslashes are not allowed in format expressions and will raise an 
> > error: f"newline: {ord('\n')}"  # raises SyntaxError 
> 
>  What I don't understand is how, if f-strings are processed AS 
> DESCRIBED, the \n is ever seen by the format expression.
> 
>  The description is that they are first decoded like ordinary strings, 
> and then parsed for the internal grammar containing {} expressions to 
> be expanded. If that were true, the \n in the above example would 
> already be a newline character, and the parsing of the format 
> expression would not see the backslash. And if it were true, that would 
> actually be far more useful for this situation.
> 
>  So given that it is not true, why not? And why go to the extra work of 
> prohibiting \ in the format expressions?

AIUI there were strong objections to the "AS DESCRIBED" process (which would 
require almost all valid uses of backslashes inside to be doubled, and would 
incidentally leave your example *still* a syntax error), and disallowing 
backslashes is a way to pretend that it doesn't work that way and leave open 
the possibility of changing how it works in the future without breaking 
compatibility.
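
A minimal sketch of the restriction under discussion, assuming the interpreter behaviour at the time of this thread (a backslash inside the `{}` expression is rejected; as far as I know, CPython 3.12 and newer accept it). The helper name `expression_backslash_allowed` is hypothetical. The documentation's example is compiled from source text so that the snippet itself stays valid on every version.

```python
def expression_backslash_allowed():
    """Report whether this interpreter accepts a backslash inside {}.

    Hypothetical helper: it compiles the documentation's example,
    f"newline: {ord('\\n')}", from a source string.
    """
    source = 'f"newline: {ord(\'\\n\')}"'
    try:
        compile(source, "<f-string example>", "eval")
    except SyntaxError:
        return False
    return True


# Portable workaround: keep the backslash outside the format expression.
newline = "\n"
workaround = f"newline: {ord(newline)}"

print(expression_backslash_allowed(), workaround)
```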

The only dubious benefit to the described process with backslashes allowed 
would be that f-strings (or other strings, in the innermost level) could be 
infinitely nested as f'{f\'{f\\\'{...}\\\'}\'}', rather than being hard-limited 
to four levels as f'''{f"""{f'{"..."}'}"""}'''
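
The four-level form above can be run as written. This is only an illustration of the quoting restriction as it stood at the time of the thread: each level has to use a quote style that no enclosing level uses, and there are only four styles to choose from.

```python
# Quote styles, outermost to innermost: ''' then """ then ' then ".
nested = f'''{f"""{f'{"..."}'}"""}'''
print(nested)
```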
___
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/7NR4XYRTUAWFKGZEDTLODEWXV7YML6TZ/


[Python-Dev] Re: Snapshot formats in tracemalloc vs profiler

2019-08-14 Thread Yonatan Zunger
Update: Thanks to Victor's advice and the PEP445 hooks, I put together a
pretty comprehensive logging/sampling heap profiler for Python, and it
works great. The package is now available via pip for anyone who needs it!

On Thu, Jun 27, 2019 at 4:21 PM Yonatan Zunger wrote:

> Well, then. I think I'm going to have some fun with this. :)
>
> Thank you!
>
> On Thu, Jun 27, 2019 at 4:17 PM Victor Stinner wrote:
>
>> On Fri, Jun 28, 2019 at 01:03, Yonatan Zunger wrote:
>> > Although while I have you here, I do have a further question about how
>> > tracemalloc works: If I'm reading the code correctly, traces get removed
>> > by tracemalloc when objects are freed, which means that at equilibrium
>> > (e.g. at the end of a function) the trace would show just the data which
>> > leaked. That's very useful in most cases, but I'm trying to hunt down a
>> > situation where memory usage is transiently spiking -- which might be due
>> > to something being actively used, or to something building up and
>> > overwhelming the GC, or to evil elves in the CPU for all I can tell so
>> > far. Would it be completely insane for tracemalloc to have a mode where
>> > it either records frees separately (e.g. as a malloc of negative space,
>> > at the trace where the free is happening), or where it simply ignores
>> > frees altogether?
>>
>> My very first implementation of tracemalloc produced a log of malloc
>> and free calls. Problem: transferring the log from a slow set-top box
>> to a desktop computer was slow, and parsing the log was very slow.
>> Parsing complexity is O(n), where n is the number of malloc or free
>> calls, knowing that Python calls malloc(), realloc() or free()
>> 270,000 times per second on average:
>>
>> https://www.python.org/dev/peps/pep-0454/#log-calls-to-the-memory-allocator
>>
>> tracemalloc is built on top of PEP 445 -- Add new APIs to customize
>> Python memory allocators:
>> https://www.python.org/dev/peps/pep-0445/
>>
>> Using these PEP 445 hooks, you should be able to do whatever you want
>> on Python memory allocations and free :-)
>>
>> Example of toy project to inject memory allocation failures:
>> https://github.com/vstinner/pyfailmalloc
>>
>> Victor
>>
>
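
To illustrate the behaviour described in the quoted exchange, here is a minimal sketch using only the documented `tracemalloc` module. It is not the profiler mentioned above and it does not use the PEP 445 C hooks; the helper name `transient_spike` and the buffer size are arbitrary assumptions. Once the buffer is freed its trace is gone, so only the peak counter still shows that the spike happened.

```python
import tracemalloc


def transient_spike(n_bytes=5_000_000):
    """Allocate and free a buffer while tracing; return (current, peak).

    Illustrative helper; the name and the buffer size are arbitrary.
    """
    tracemalloc.start()
    try:
        buffer = bytearray(n_bytes)  # transient allocation
        del buffer                   # freed, so its trace is dropped
        return tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()


current, peak = transient_spike()
print(f"current={current} bytes, peak={peak} bytes")
```

The peak value says that a spike occurred but not where it came from, which is the gap a log-based or sampling profiler is meant to fill.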
___
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/IFWYZFSL5LTMJ4LRY2LICDGAMLR3FNTR/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-14 Thread Eric V. Smith



On 8/14/2019 11:02 AM, Random832 wrote:

On Mon, Aug 12, 2019, at 15:15, Terry Reedy wrote:

Please no more combinations. The presence of both legal and illegal
combinations is already a mild nightmare for processing and testing.
idlelib.colorizer has the following re to detect legal combinations

  stringprefix = r"(?i:r|u|f|fr|rf|b|br|rb)?"


More advanced syntax highlighting editors have to handle each string type separately 
anyway, because they highlight (valid) backslash-escapes and f-string formatters. 
The proposed 'v-string' type would need separate handling even in a simplistic 
editor like IDLE, because it's different at the basic level of \" not ending 
the string (whereas, for better or worse, all current string types have exactly the 
same rules for how to find the end delimiter)


The reason I defined f-strings as I did is so that lexer/parsers 
(editors, syntax highlighters, other implementations, etc.) could easily 
ignore them, at least as a first pass. They're literally like all other 
strings to the lexer. Python's lexer/parser says that a string is:


- some optional letters, making the string prefix
- an opening quote or triple quote
- some optional chars, with \ escaping
- a matching closing quote or triple quote

The parser then validates the string prefix ('f' is okay, 'b' is okay, 
'fb' isn't okay, 'x' isn't okay, etc.). It then operates on the contents 
of the string, based on what the string prefix tells it to do.


So all an alternate lexer/parser has to do is add 'f' to the valid 
string prefixes, and it could then at least skip over f-strings. 
Somewhere in my notes I have 3 or 4 examples of projects that did this, 
and voila: they "supported" f-strings. Imagine a syntax highlighter that 
didn't want to highlight the inside of an f-string.
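
The four-step description above can be sketched as a regular expression. This is a simplified, hypothetical pattern for illustration, not CPython's tokenizer: it accepts any letters as a prefix (validation is left to a later step, as described), and it lets every quote style span lines. The names `STRING_RE` and `string_literals` are assumptions.

```python
import re

# Hypothetical, simplified pattern following the four steps listed above.
PREFIX = r"(?P<prefix>[A-Za-z]*)"             # optional prefix letters
QUOTE = r"(?P<quote>'{3}|\"{3}|'|\")"         # opening quote or triple quote
BODY = r"(?P<body>(?:\\.|(?!(?P=quote)).)*)"  # chars, with \ escaping
CLOSE = r"(?P=quote)"                         # matching closing quote
STRING_RE = re.compile(PREFIX + QUOTE + BODY + CLOSE, re.DOTALL)


def string_literals(source):
    """Return (prefix, body) pairs for the string literals in `source`."""
    return [(m.group("prefix"), m.group("body"))
            for m in STRING_RE.finditer(source)]


# An f-string and a raw bytes literal are skipped over by the same rule.
sample = "x = f'some {value!r} text' + rb\"a\\\"b\""
print(string_literals(sample))
```

Because the body rule never looks at the prefix, adding 'f' to the accepted prefixes is enough to skip over f-strings. A string type in which \" ends the string would need a second body rule.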


The proposed v-strings would indeed break this. I'm opposed to them for 
this reason, among others.


That all said, I am considering moving f-string parsing into the CPython 
parser. That would let you say things like:


f'some text {ord('a')}'

I'm not sure that's a great idea, but I've discussed it with several 
alternate implementations, and with authors of several editors, and they 
seem okay with it. I'm following Guido's parser experiment with some 
interest, to see how it might interact with this proposal. Might they 
also be okay with v-strings? Maybe. But it seems like a lot of hassle 
for a very minor feature.


Eric
___
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/BDZCXGRW5KTUOGMRT6OHH6S3UD4BV5ZV/


[Python-Dev] Re: Snapshot formats in tracemalloc vs profiler

2019-08-14 Thread Victor Stinner
That looks pretty cool! I'm really happy that PEP 445 hooks are reused
for something different than tracemalloc ;-)

Victor

On Wed, Aug 14, 2019 at 20:12, Yonatan Zunger wrote:
>
> Update: Thanks to Victor's advice and the PEP445 hooks, I put together a 
> pretty comprehensive logging/sampling heap profiler for Python, and it works 
> great. The package is now available via pip for anyone who needs it!
>
> On Thu, Jun 27, 2019 at 4:21 PM Yonatan Zunger wrote:
>>
>> Well, then. I think I'm going to have some fun with this. :)
>>
>> Thank you!
>>
>> On Thu, Jun 27, 2019 at 4:17 PM Victor Stinner wrote:
>>>
>>> On Fri, Jun 28, 2019 at 01:03, Yonatan Zunger wrote:
>>> > Although while I have you here, I do have a further question about how 
>>> > tracemalloc works: If I'm reading the code correctly, traces get removed 
>>> > by tracemalloc when objects are freed, which means that at equilibrium 
>>> > (e.g. at the end of a function) the trace would show just the data which 
>>> > leaked. That's very useful in most cases, but I'm trying to hunt down a 
>>> > situation where memory usage is transiently spiking -- which might be due 
>>> > to something being actively used, or to something building up and 
>>> > overwhelming the GC, or to evil elves in the CPU for all I can tell so 
>>> > far. Would it be completely insane for tracemalloc to have a mode where 
>>> > it either records frees separately (e.g. as a malloc of negative space, 
>>> > at the trace where the free is happening), or where it simply ignores 
>>> > frees altogether?
>>>
>>> My very first implementation of tracemalloc produced a log of malloc
>>> and free calls. Problem: transferring the log from a slow set-top box
>>> to a desktop computer was slow, and parsing the log was very slow.
>>> Parsing complexity is O(n), where n is the number of malloc or free
>>> calls, knowing that Python calls malloc(), realloc() or free()
>>> 270,000 times per second on average:
>>> https://www.python.org/dev/peps/pep-0454/#log-calls-to-the-memory-allocator
>>>
>>> tracemalloc is built on top of PEP 445 -- Add new APIs to customize
>>> Python memory allocators:
>>> https://www.python.org/dev/peps/pep-0445/
>>>
>>> Using these PEP 445 hooks, you should be able to do whatever you want
>>> on Python memory allocations and free :-)
>>>
>>> Example of toy project to inject memory allocation failures:
>>> https://github.com/vstinner/pyfailmalloc
>>>
>>> Victor



-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/N3ZAEFYF64Q7WH7LL26E2T2NUV7ENGCY/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-14 Thread Glenn Linderman

On 8/14/2019 8:02 AM, Random832 wrote:

On Mon, Aug 12, 2019, at 15:15, Terry Reedy wrote:

Please no more combinations. The presence of both legal and illegal
combinations is already a mild nightmare for processing and testing.
idlelib.colorizer has the following re to detect legal combinations

  stringprefix = r"(?i:r|u|f|fr|rf|b|br|rb)?"

More advanced syntax highlighting editors have to handle each string type separately 
anyway, because they highlight (valid) backslash-escapes and f-string formatters. 
The proposed 'v-string' type would need separate handling even in a simplistic 
editor like IDLE, because it's different at the basic level of \" not ending 
the string (whereas, for better or worse, all current string types have exactly the 
same rules for how to find the end delimiter)

I had to read this several times, and then only after reading Eric's 
reply, it finally hit me that what you are saying is that \" doesn't end 
the string in any other form of string, but that sequence would end a 
v-string.


It seems that also explains why Serhiy, in describing his experiment 
with really raw string literals, mentioned having to change the tokenizer 
as well as the parser (proving that it isn't impossible to deal with 
truly raw strings).


\" not ending a raw string was certainly a gotcha for me when I started 
using Python (with a background in C and Perl among other languages), 
and it convinced me not to use raw strings, since that gotcha was not 
worth the other benefits of raw strings. Serhiy said:

> Currently a raw literal cannot end in a single backslash (e.g. in 
> r"C:\User\"). Although there are reasons for this, it is an old 
> gotcha, and there are many closed issues about it. This question is 
> even included in FAQ. 

which indicates that I am not the only one that has been tripped up by 
that over the years.
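
A short sketch of the gotcha, for readers who have not run into it. The literals are evaluated from source text so that the failing case can be shown without breaking the snippet itself; the helper name `compiles` is an assumption.

```python
import ast

# In a raw string, \" does not end the literal, and the backslash is kept.
value = ast.literal_eval(r'''r"a\"b"''')
print(len(value), value)


def compiles(source):
    """Return True if `source` is a syntactically valid expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# The example from the quoted text: the final \" is read as part of the
# string, so the literal is never terminated.
trailing_backslash = r'''r"C:\User\"'''
print(compiles(trailing_backslash))
```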


Trying to look at it from the eyes of a beginning programmer, the whole 
idea of backslash being an escape character is an unnatural artifice. 
I'm unaware (but willing to be educated) of any natural language, when 
using quotations, that has such a  concept. Nested quotations exist, in 
various forms:  use of a different quotation mark for the inner and 
outer quotations, and block quotations (which in English, have increased 
margin on both sides, and have a blank line before and after).


Python actually supports constructs very similar to the natural language 
formats, allowing both  " and ' for quotations and nested quotations, 
and the triple-quoted string with either " or ' is very similar in 
concept to a block quotation. But _all_ the string forms are burdened 
with surprises for the beginning programmer: escape sequences of one 
sort or another must be learned and understood to avoid surprises when 
using the \ character.


Programming languages certainly need an escape character mechanism to 
deal with characters that cannot easily be typed on a keyboard (such as 
¤ ¶ etc.), or which are visually indistinguishable from other characters 
or character sequences (various widths of white space), or which would 
be disruptive to the flow of code or syntax if represented by the usual 
character (newline, carriage return, formfeed, maybe others). But these 
are programming concepts, not natural language concepts.  The basic 
concept of a quoted string should best be borrowed directly from natural 
language, and then enhancements to that made to deal with programming 
concepts.


In Python, as in C, the escape characters are built in the basic string 
syntax, one must learn the quirks of the escaping mechanism in order to 
write


In Perl, " strings include escapes, and ' strings do not. So there is a 
basic string syntax that is similar to natural language, and one that is 
extended to include programming concepts. [N.B. There are lots of 
reasons I switched from Perl to Python, and don't have any desire to go 
back, but I have to admit, that the lack of a truly raw string in Python 
was a disappointment.]


So that, together with the desire for new escape sequences, and the 
creation of a new escape mechanism in the f-string {} (which adds both { 
and } as escape characters by requiring them to be doubled to be treated 
as literal inside an f-string, instead of using \{ and \} as the escapes 
[which would have been possible, due to the addition of the f prefix]), 
and the issue that because every current \-escape is defined to do 
something, is why I suggested elsewhere in this thread 
that perhaps the whole irregular string syntax should be rebooted with a 
future import, and it seems it could both be simpler, more regular, and 
more powerful as a result. And by using a future import, there are no 
backward incompatibility issues, and migration can be module by module.
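
For reference, a small sketch of the brace doubling mentioned above. The variable names are illustrative only.

```python
name = "x"
value = 3

# Doubled braces produce literal braces; single braces delimit an expression.
literal_only = f"{{literal}} and {value}"
mixed = f"{{{name}}} = {value}"

print(literal_only)
print(mixed)
```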


The more I think about this, the more tempting it is to attempt to fork 
Python just to have a better string syntax

[Python-Dev] Re: An f-string issue [Was: Re: Re: What to do about invalid escape sequences]

2019-08-14 Thread Glenn Linderman

On 8/14/2019 8:09 AM, Random832 wrote:

On Sat, Aug 10, 2019, at 19:54, Glenn Linderman wrote:

Because of the "invalid escape sequence" and "raw string" discussion,
when looking at the documentation, I also noticed the following
description for f-strings:


    Escape sequences are decoded like in ordinary string literals (except when a
    literal is also marked as a raw string). After decoding, the grammar for the
    contents of the string is:

followed by lots of stuff, followed by

    Backslashes are not allowed in format expressions and will raise an
    error: f"newline: {ord('\n')}"  # raises SyntaxError

What I don't understand is how, if f-strings are processed AS
DESCRIBED, the \n is ever seen by the format expression.

  The description is that they are first decoded like ordinary strings,
and then parsed for the internal grammar containing {} expressions to
be expanded. If that were true, the \n in the above example would
already be a newline character, and the parsing of the format
expression would not see the backslash. And if it were true, that would
actually be far more useful for this situation.

  So given that it is not true, why not? And why go to the extra work of
prohibiting \ in the format expressions?

AIUI there were strong objections to the "AS DESCRIBED" process (which would 
require almost all valid uses of backslashes inside to be doubled, and would incidentally 
leave your example *still* a syntax error), and disallowing backslashes is a way to 
pretend that it doesn't work that way and leave open the possibility of changing how it 
works in the future without breaking compatibility.

The only dubious benefit to the described process with backslashes allowed would be that f-strings (or other strings, 
in the innermost level) could be infinitely nested as f'{f\'{f\\\'{...}\\\'}\'}', rather than being hard-limited to 
four levels as f'''{f"""{f'{"..."}'}"""}'''


Sure. I am just pointing out (and did so in the issue I created for 
documentation as well), that the documentation does not currently 
correctly describe the implementation, which is misleading to the user.


While I have opinions on how things could work better, my even stronger 
opinion is that documentation should *accurately* describe how things 
work, even if how it works is more complex than it should be.
___
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/B2HI27XRCA43GVV2D2MF5IZOUX5NG2PW/


[Python-Dev] Re: Inline links in Misc/NEWS entries

2019-08-14 Thread Kyle Stanley
> Also, for IDLE, news entries go to idlelib/NEWS.txt,
> where markup, as opposed to unicode, is noise.

Interesting, I actually wasn't aware of the distinction for idlelib/NEWS. I
can imagine that Sphinx constructs such as :func:, :meth:, and :class:
would not be nearly as useful there. However, I can imagine reST being
occasionally useful.

Based on a recent feature that was added to IDLE, I could imagine
explicit inline reST links being somewhat useful for something like this:

Add support for displaying line numbers. See `IDLE Options menu
<https://docs.python.org/3/library/idle.html#options-menu-shell-and-editor>`_.

The inline link would appear to readers as "IDLE Options menu" and allow
them to click on it to navigate to the corresponding documentation for more
information on the feature.

> Bottom line: I would rather a knowledgeable editor prettify the blurbs
> in a consistent manner after I am done with them.  To me, this is a
> place where specialization pays.

I completely agree. I was mainly addressing situations where PR authors
were rejecting or disregarding suggestions to add the markup in the news
entry from those who are knowledgeable of the Sphinx constructs/roles. It
wouldn't be reasonable to expect all PR authors to be able to properly
utilize every supported markup feature.

This is a rare occurrence, but I've had it happen a couple of times
recently. Based on the responses so far, it likely occurred due to some
developers not being aware that Misc/NEWS supported Sphinx roles and some
of the basic features of reST. That's completely understandable if it's a
newer feature.

Hopefully this discussion and any potential updates to the devguide will
improve awareness of the feature, and provide instructions on when it's
appropriate to use it.

Thanks,
Kyle Stanley

On Wed, Aug 14, 2019 at 3:31 AM Terry Reedy wrote:

> On 8/13/2019 6:31 PM, Kyle Stanley wrote:
>
> > I created this topic mainly because there seems to be some sentiment
> > that it's perfectly fine to use plaintext exclusively in every news
> > entry, especially in cases where authors have rejected suggestions to
> > add Sphinx markup in their PRs. Personally, I think it's a bit more
> > nuanced, and that the links can sometimes be very helpful for readers.
>
> Beyond what Ned said (news markup is relatively new), people may be
> uncertain about what is allowed and when it is appropriate.  Also, there
> are some situations where markup seems to me to be a nuisance and looks
> like it is introducing an error.  So I have changed unicode quotes and
> removed some rst markup.  Also, for IDLE, news entries go to
> idlelib/NEWS.txt, where markup, as opposed to unicode, is noise.
>
> Bottom line: I would rather a knowledgeable editor prettify the blurbs
> in a consistent manner after I am done with them.  To me, this is a
> place where specialization pays.
>
> --
> Terry Jan Reedy
___
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/K5TXWQ7ARKDV73MNEIED35PZDAL6T2L2/