Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog
A welcome article. One correction should be made, I believe: the area of code point space used for the smuggling of bytes under PEP-383 is not a "Unicode Private Use Area", but a portion of the trailing surrogate range. This is a code violation, which I imagine is why "surrogateescape" is an error handler, not a codec. http://www.unicode.org/faq/private_use.html I believe the private use area was considered and rejected for PEP-383. In an implementation of the type unicode based on UTF-16 (Jython), lone surrogates preclude a naive use of the platform string library. This is on my mind at the moment as I'm working several bugs in Jython's unicode type, and can see why it has been too difficult. Jeff On 10/09/2014 08:17, Nick Coghlan wrote: Since it may come in handy when discussing "Why was Python 3 necessary?" with folks, I wanted to point out that my article on the transition to multilingual programming has now been reposted on the Red Hat developer blog: http://developerblog.redhat.com/2014/09/09/transition-to-multilingual-programming-python/ I wouldn't normally bring the Red Hat brand into an upstream discussion like that, but this myth that Python 3 is killing the language, and that Python 2 could have continued as a viable development platform indefinitely "if only Guido and the core development team hadn't decided to go ahead and create Python 3", is just plain wrong, and it really needs to die. I'm hoping that borrowing a bit of Red Hat's enterprise credibility will finally get people to understand that we really do have some idea what we're doing, which is why most of our redistributors and many of our key users are helping to push the migration forward, while we also continue to support existing Python 2 users :) Cheers, Nick. ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
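To make the correction above concrete, here is a minimal Python 3 sketch of where PEP 383 puts a smuggled byte. The sample bytes are made up for illustration and are not from the thread; the point is only that an undecodable byte 0xNN surfaces as the lone trailing surrogate U+DCNN, which lies outside the BMP Private Use Area (U+E000..U+F8FF).

```python
# Illustrative only: the sample bytes are an assumption, not from the thread.
raw = b"caf\xe9"  # Latin-1 encoded, so not valid UTF-8

# PEP 383: the undecodable byte 0xE9 is smuggled as code point U+DCE9.
text = raw.decode("utf-8", errors="surrogateescape")
smuggled = ord(text[-1])

in_trailing_surrogates = 0xDC00 <= smuggled <= 0xDFFF
in_private_use_area = 0xE000 <= smuggled <= 0xF8FF

# Encoding with the same error handler restores the original bytes.
round_trip = text.encode("utf-8", errors="surrogateescape")

print(hex(smuggled), in_trailing_surrogates, in_private_use_area)
```

The round trip only works because the same error handler is named on both the decode and the encode side, which fits the observation that "surrogateescape" is an error handler and not a codec.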
Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog
On 12/09/2014 04:28, Stephen J. Turnbull wrote:
Jeff Allen writes:
> A welcome article. One correction should be made, I believe: the area of
> code point space used for the smuggling of bytes under PEP-383 is not a
> "Unicode Private Use Area", but a portion of the trailing surrogate
> range.
Nice catch. Note that the surrogate range was originally part of the Private Use Area, but it was carved out with the adoption of UTF-16 in about 1993. In practice, I doubt that there are any current implementations claiming compatibility with Unicode 1.0 (IIRC, UTF-16 was made mandatory in Unicode 1.1).
That's a helpful bit of history that explains the uncharacteristic inaccuracy. Most I can do to keep the current position clear in my head.
I've always thought that the "right" way to handle the private use area for "platforms" like Python and Emacs, which may need to use it for their own purposes (such as "undecodable bytes") but want to respect its use by applications, is to create an auxiliary table mapping the private use area to objects describing the characters represented by the private use code points. These objects would have attributes such as external representation for text I/O, glyph (for GUI display), repr (for TTY display), various Unicode properties, etc.
Simply having a block "for private use" seems to create an unmanaged space for conflict, reminiscent of the "other 128 characters" in bilingual programming. I wondered if the way to respect use by applications might be to make it private to a particular sub-class of str, idly however.
Jeff Allen
Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog
Jim, Stephen:
It seems like we're off topic here, but to answer all as briefly as possible:
1. Java does not really have a Unicode type, therefore not one that validates. It has a String type that is a sequence of UTF-16 code units. There are some String methods and Character methods that deal with code points represented as int. I can put any 16-bit values I like in a String.
2. With proper accounting for indices, and as long as surrogates appear in pairs, I believe operations like find or endswith give correct answers about the unicode, when applied to the UTF-16. This is an attractive implementation option, and mostly what we do.
3. I'm fixing some bugs where we get it wrong beyond the BMP, and the fix involves banning lone surrogates (completely). At present you can't type them in literals but you can sneak them in from Java.
4. I think (with Antoine) if Jython supported PEP-383 byte smuggling, it would have to do it the same way as CPython, as it is visible. It's not impossible (I think), but is messy. Some are strongly against.
Jeff Allen
On 12/09/2014 16:37, Jim J. Jewett wrote:
On September 11, 2014, Jeff Allen wrote:
... "surrogateescape" is an error handler, not a codec.
True, but I believe that is a CPython implementation detail. Other implementations (including jython) should implement the surrogateescape API, but I don't think it is important to use the same internal representation for the invalid bytes.
lone surrogates preclude a naive use of the platform string library
Invalid input often causes problems. Are you saying that there are situations where the platform string library could easily handle invalid characters in general, but has a problem with the specific case of lone surrogates?
[Python-Dev] Smuggling bytes in a UTF-16 implementation of str/unicode (was: Multilingual programming article on the Red Hat Developer blog)
This feels like a jython-dev discussion. But anyway ...
On 17/09/2014 00:57, Stephen J. Turnbull wrote:
The CPython representation uses trailing surrogates only[1], so it's never possible to interpret them as anything but non-characters -- as soon as you encounter them you know that it's a lone surrogate. Surely you can do the same. As long as the Java string manipulation functions don't check for surrogates, you should be fine with this representation. Of course I suppose your matching functions (etc) don't check for them either, so you will be somewhat vulnerable to bugs due to treating them as characters. But the same is true for CPython, AFAIK.
They don't check. I agree that since only the trailing surrogate code points are allowed, you can tell that you have one, even in the UTF-16 form. The problem is that, if strings containing lone trailing surrogates are allowed, then:
    u'\udc83' in u'abc\U00010083xyz'
    u'abc\U00010083xyz'.endswith(u'\udc83xyz')
are both True, if implemented in the obvious way on the UTF-16 representation. And this should not be so in Jython, which claims to be a wide build. (I can't actually type the second one, but I can get the same effect in Jython 2.7b3 via a java.lang.StringBuilder.)
I believe that the usual string operations work correctly on the UTF-16 version of the string, as long as indexes are adjusted correctly. If we think it is ok that code using such methods give the wrong answer when fed strings containing smuggled bytes, then isolated (trailing) surrogates could be allowed. It's the user's fault for calling the method on that data. But I think it kinder that our implementation defend users from these wrong answers.
In the latest state of Jython, we do this by rigorously preventing the construction of a PyUnicode containing a lone surrogate, so we can just use UTF-16 operations without further checks. I'm not sure that rigour will be universally welcomed, and clearly it precludes PEP-383 byte smuggling.
Jeff
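The false positives described in the message above can be reproduced outside Jython with a small simulation. This is only a sketch under stated assumptions: `utf16_units`, `contains_units` and `endswith_units` are hypothetical helpers that stand in for "the obvious way" of searching a UTF-16 representation, and are not Jython's actual code.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s, passing lone surrogates through."""
    units = []
    for ch in s:
        cp = ord(ch)
        if cp > 0xFFFF:
            cp -= 0x10000
            units.append(0xD800 + (cp >> 10))    # leading surrogate
            units.append(0xDC00 + (cp & 0x3FF))  # trailing surrogate
        else:
            units.append(cp)
    return units


def contains_units(haystack, needle):
    """Naive substring search on code units (hypothetical helper)."""
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


def endswith_units(haystack, suffix):
    """Naive suffix test on code units (hypothetical helper)."""
    return len(suffix) <= len(haystack) and haystack[len(haystack) - len(suffix):] == suffix


text = 'abc\U00010083xyz'

# On code units, the lone trailing surrogate matches half of the pair.
naive_in = contains_units(utf16_units(text), utf16_units('\udc83'))
naive_endswith = endswith_units(utf16_units(text), utf16_units('\udc83xyz'))

# On code points (CPython 3 str), neither test succeeds.
true_in = '\udc83' in text
true_endswith = text.endswith('\udc83xyz')

print(naive_in, naive_endswith, true_in, true_endswith)
```

Banning lone surrogates at construction time, as the message describes, removes the inputs for which the two answers can differ.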
Re: [Python-Dev] PEPs: ``.. code:: python`` or ``::`` (syntax highlighting)
The way this is expressed to docutils is slightly different from the way it would be expressed to Sphinx. I expected someone would mention this in relation to a possible move to RTD and Sphinx for PEPs and the potential need to re-work the ReST. Sorry if this was obvious, and the re-work simply too trivial to mention.
Both use pygments, but the directive to Sphinx is ".. code-block:: ". The "::" shorthand works, meaning to take the language from the last ".. highlight:: " directive, or conf.py (usually "python"). This may be got from the references [1] vs [2] and [3] in Wes' original post, but in addition there's a little section in the devguide [6].
In my experience, when browsing a .rst file, GitHub recognises my code blocks (Sphinx "code-block::") and it colours Python (and Java) but not Python console. It does not use the scheme chosen in conf.py (but nor does RTD [7]). There are other limitations. Browsing the devguide source [8] there gives a good idea what GitHub can and cannot represent in this view.
[6] https://devguide.python.org/documenting/#showing-code-examples
[7] https://docs.readthedocs.io/en/latest/faq.html#i-want-to-use-the-blue-default-sphinx-theme
[8] https://github.com/python/devguide
Jeff Allen
On 03/12/2017 04:49, Wes Turner wrote:
Add pygments for ``.. code::`` directive PEP syntax highlighting #1206
https://github.com/python/pythondotorg/issues/1206
Syntax highlighting is an advantage for writers, editors, and readers. reStructuredText PEPs are rendered into HTML with docutils. Syntax highlighting in Docutils 0.9+ is powered by Pygments. If Pygments is not installed, or there is a syntax error, syntax highlighting is absent. Docutils renders ``.. code::`` blocks with Python syntax highlighting by default. You can specify ``.. code:: python`` or ``.. code:: python3``.
- GitHub shows Pygments syntax highlighting for ``.. code::`` directives for .rst and .restructuredtext documents
- PEPs may eventually be hosted on ReadTheDocs with Sphinx (which installs docutils and pygments as install_requires in setup.py).
https://github.com/python/peps/issues/2
https://github.com/python/core-workflow/issues/5
In order to use pygments with pythondotorg-hosted PEPs, a few things need to happen:
- [ ] Include ``pygments`` in ``base-requirements.txt``
- [ ] Pick a pygments theme
  - Should we use the sphinx_rtd_theme default for consistency with the eventual RTD-hosted PEPs?
- [ ] Include the necessary pygments CSS in the PEPs django template
- [ ] rebuild the PEPs
- Start using code directives in new PEPs
- Manually review existing PEPs after adding code directives
PEPs may use ``.. code::`` blocks instead of ``::`` so that code is syntax highlighted.
On Saturday, December 2, 2017, Nick Coghlan wrote:
On 3 December 2017 at 12:32, Wes Turner wrote:
> Pending a transition of PEPs to ReadTheDocs (with HTTPS on a custom domain?
> and redirects?) (is there a gh issue for this task?),
See https://github.com/python/peps/projects/1 and https://github.com/python/core-workflow/issues/5
Cheers, Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Is object the most base type? (bpo-20285)
On 02/02/2018 07:25, Steven D'Aprano wrote:
How about: "the base class of the class heirarchy" "the root of the class heirarchy"
Java ... now says: "Class Object is the root of the class hierarchy. Every class has Object as a superclass. All objects, including arrays, implement the methods of this class."
Either for me, but I feel I should draw attention to the spelling. (Java is right.) Ironically, the word derives from priesthood (hieratic), not from inheritance (heir).
Jeff Allen
Re: [Python-Dev] Subtle difference between f-strings and str.format()
My credentials for this are that I re-worked str.format in Jython quite
extensively, and I followed the design of f-strings a bit when they were
introduced, but I haven't used them to write anything.
On 29/03/2018 00:48, Tim Peters wrote:
[Tim Delaney ]
...
I also assumed (not having actually used an f-string) that all its
formatting arguments were evaluated before formatting.
It's a string - it doesn't have "arguments" as such. For example:
    def f(a, b, n):
        return f"{a+b:0{n}b}"  # the leading "f" makes it an f-string
Agreed "argument" is the wrong word, but so is "string". It's an
expression returning a string, in which a, b and n are free variables. I
think we can understand it best as a string-display
(https://docs.python.org/3/reference/expressions.html#list-displays), or
a sort of eval() call.
The difference Serhiy identifies emerges (I think) because in the
conventional interpretation of a format call, the arguments of format
are evaluated left-to-right (all of them) and then formatted in the
order references are encountered to these values in a tuple or
dictionary. In an f-string, expressions are evaluated as they are
encountered. A more testing example is therefore perhaps:
'{1} {0}'.format(a(), b()) # E1
f'{b()}{a()}' # E2
I think I would be very surprised to find b called before a in E1
because of the general contract on the meaning of method calls. I'm
assuming that's what an AST-based optimisation would do? There's no
reason in E2 to call them in any other order than b then a and the
documentation tells me they are.
But do I expect a() to be called before the results of b() are
formatted? In E1 I definitely expect that. In E2 I don't think I'd be
surprised either way. Forced to guess, I would guess that b() would be
formatted and in the output buffer before a() was called, since it gives
the implementation fewer things to remember. Then I hope I would not
depend on this guesswork. Strictly speaking, the documentation doesn't
say when the result is formatted in relation to the evaluation of other
expressions, so there is permission for Serhiy's idea #2.
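
One way to observe the distinction between E1 and E2 is to log both the calls and the formatting. The sketch below is an illustration only: `Loud`, `a` and `b` are hypothetical names invented for it, and the interleaving it prints for the f-string is whatever the running interpreter happens to do, not something the documentation promises.

```python
events = []


class Loud:
    """Value that records the moment it is formatted (hypothetical helper)."""

    def __init__(self, name):
        self.name = name

    def __format__(self, spec):
        events.append("format " + self.name)
        return self.name


def a():
    events.append("call a")
    return Loud("a")


def b():
    events.append("call b")
    return Loud("b")


events.clear()
e1 = '{1} {0}'.format(a(), b())   # E1: an ordinary method call
e1_events = list(events)

events.clear()
e2 = f'{b()}{a()}'                # E2: expressions evaluated as encountered
e2_events = list(events)

print(e1_events)
print(e2_events)
```

For E1 both calls must finish before any formatting starts, because the arguments of a method call are evaluated first. For E2 the only ordering to rely on is that b is called before a; where the "format" entries fall in the second list is the point under discussion.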
I think the (internal) AST change implied in Serhiy's idea #1 is the
price one has to pay *if* one insists on optimising str.format().
str.format is just a method like any other. The reasons would have to be
very strong to give it special-case semantics. I agree that the cases
are rare in which one would notice a difference. (Mostly I think it
would be a surprise during debugging.) But I think users should be able
to rely on the semantics of call. Easier optimisation doesn't seem to me
a strong enough argument.
This leaves me at:
1: +1
2a, 2b: +0
3: -1
Jeff Allen
Re: [Python-Dev] PEP 572: Assignment Expressions
On 24/04/2018 02:42, Chris Jerdonek wrote:
On Mon, Apr 23, 2018 at 4:54 PM, Greg Ewing wrote:
Tim Peters wrote:
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        return g
My problem with this is -- how do you read such code out loud?
It could be... "if diff, which we define as x - x_base, and g, which ." etc.
That's good. It also makes it natural to expect only a simple name. One can "define" a name, but assignment to a complex left-side expression is not definition (binding).
Jeff Allen
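For readers coming to this later: PEP 572 as eventually accepted (Python 3.8 and later) does restrict the target of := to a simple name, which matches the "define a name" reading. A minimal sketch, assuming Python 3.8+; `find_factor` and `compiles` are hypothetical wrappers added here so the fragment from the thread can be run:

```python
from math import gcd


def find_factor(x, x_base, n):
    """Tim Peters' fragment from the thread, wrapped in a hypothetical function."""
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        return g
    return None


def compiles(source):
    """Report whether source is accepted as an expression (hypothetical helper)."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


simple_name_ok = compiles("(y := 1)")     # binding a simple name
subscript_ok = compiles("(a[0] := 1)")    # complex left side: subscript
attribute_ok = compiles("(a.b := 1)")     # complex left side: attribute

print(find_factor(10, 4, 9), simple_name_ok, subscript_ok, attribute_ok)
```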
Re: [Python-Dev] Every Release Can Be a Mini "Python 4000", Within Reason (was (name := expression) doesn't fit the narrative of PEP 20)
On 27/04/2018 08:38, Greg Ewing wrote:
How would you complete the following sentence? "The ':='
symbol is a much better symbol for assignment than '=',
because..."
... users new to programming but with a scientific background expect '='
to be a statement of an algebraic relationship between mathematical
quantities, not an instruction to the machine to do something.
That's easy to answer. (I can remember this particular light bulb
moment in a fellow student, who had been using a different name in every
assignment statement, and had found loops impossible to understand.)
Also it frees up '=' to be used with something like its expected meaning
in conditional statements, without making parsing hard/impossible. There
are arguments the other way, like brevity and familiarity to other
constituencies. But I feel we all know this.
Having chosen to go the '=', '==' route, the cost is large to change,
especially to get the other half of the benefit ('=' as a predicate). So
I think the question might be who is it better for and how much do we care.
And whether the days are gone when anyone learns algebra before programming.
I speculate this all goes back to some pre-iteration version of FORmula
TRANslation, where to its inventors '=' was definition and these really
were "statements" in the normal sense of stating a truth.
Jeff Allen
Re: [Python-Dev] Every Release Can Be a Mini "Python 4000", Within Reason (was (name := expression) doesn't fit the narrative of PEP 20)
On 30/04/2018 07:22, Greg Ewing wrote:
Jeff Allen wrote:
I speculate this all goes back to some pre-iteration version of FORmula TRANslation, where to its inventors '=' was definition and these really were "statements" in the normal sense of stating a truth.
Yeah, also the earliest FORTRAN didn't even *have* comparison operators. A conditional branch was something like
I should have known that would turn out to be the most interesting part in my message. Not to take us further off topic, I'll just say thanks to Eitan's reply, I found this:
http://www.softwarepreservation.org/projects/FORTRAN/BackusEtAl-Preliminary%20Report-1954.pdf
They were not "statements", but "formulas" while '=' was assignment (sec 8) *and* comparison (sec 10B). So conversely to our worry, they actually wanted users to think of assignment initially as a mathematical formula (page 2) in order to exploit the similarity to a familiar concept, albeit a=a+i makes no sense from this perspective.
Jeff Allen
Re: [Python-Dev] Support of UTF-16 and UTF-32 source encodings
I'm approaching this from the premise that we would like to avoid
needless surprises for users not versed in text encoding. I did a simple
experiment with notepad on Windows 7 as if a naïve user. If I write the
one-line program:
print("Hello world.") # by Jeff
It runs, no surprise.
We may legitimately encounter Unicode in string literals and comments.
If I write:
print("j't'kif Anaïs!") # par Hervé
and try to save it, notepad tells me this file "contains characters in
Unicode format which will be lost if you save this as an ANSI encoded
text file." To keep the Unicode information I should cancel and choose a
Unicode option. In the "Save as" dialogue the default encoding is ANSI.
The second option "Unicode" is clearly right as the warning said
"Unicode" 3 times and I don't know what big-endian or UTF-8 mean. Good,
that worked. Closed and reopened, it looks exactly as I typed it.
But the bytes I actually wrote on disk consist of a BOM and UTF-16-LE.
And running it I get:
File "bonjour.py", line 1
SyntaxError: Non-UTF-8 code starting with '\xff' in file bonjour.py on
line 1, but no encoding declared; see
http://python.org/dev/peps/pep-0263/ for details
If I take the hint here and save as UTF-8, then it works, including
printing the accent. Inspection of the bytes shows it starts with a
UTF-8 BOM.
In Jython I get the same results (choking on UTF-16), but saved as
UTF-8, it works. I just have to make sure that's a Unicode constant if I
want it to print correctly, as we're at 2.7. Jython has a checkered past
with encodings, but tries to do exactly the same as CPython 2.7.x.
Now, a fact I haven't mentioned is that my machine was localised to
simplified Chinese (to diagnose some bug) during this test. If I
re-localise to my usual English (UK), I do not get the guidance from
notepad: instead it quietly saves as Latin-1 (cp1252), perhaps because
I'm in Western Europe. Python baulks at this, at the first accented
character. If I save from notepad as Unicode or UTF-8 the results are as
before, including the BOM.
In some circumstances, then, the natural result of using notepad and not
sticking to ASCII may be UTF-16-LE with a BOM, or Latin-1 depending on
localisation, it seems. The Python error message provides a clue what a
user should do, but they would need some background, a helpful teacher,
or the Internet to sort it out.
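
The three outcomes of the notepad experiment can be reconstructed in a few lines. This is a sketch, not a record of what notepad wrote: the byte strings are built from the description above, `detect` is a hypothetical wrapper, and `tokenize.detect_encoding` is used as the standard-library stand-in for the interpreter's own source-encoding detection.

```python
import codecs
import io
import tokenize

# The second program from the experiment, with the accents written as
# escapes so this example itself stays pure ASCII.
source = 'print("j\'t\'kif Ana\u00efs!") # par Herv\u00e9\n'

# "Unicode" option: a BOM followed by UTF-16-LE.
as_unicode = codecs.BOM_UTF16_LE + source.encode("utf-16-le")
# "UTF-8" option: a UTF-8 BOM followed by UTF-8.
as_utf8 = codecs.BOM_UTF8 + source.encode("utf-8")
# "ANSI" on a Western European machine: cp1252.
as_ansi = source.encode("cp1252")


def detect(data):
    """Return the encoding tokenize would choose, or None if it gives up."""
    try:
        encoding, _ = tokenize.detect_encoding(io.BytesIO(data).readline)
        return encoding
    except SyntaxError:
        return None


for label, data in [("Unicode", as_unicode), ("UTF-8", as_utf8), ("ANSI", as_ansi)]:
    print(label, data[:3], detect(data))
```

Only the UTF-8 form is accepted, and it is reported as "utf-8-sig" because of the BOM; the other two fail for want of an encoding declaration that could be read as ASCII-compatible text.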
Jeff Allen
On 15/11/2015 07:23, Stephen J. Turnbull wrote:
Steve Dower writes:
> Saying [UTF-16] is rarely used is rather exposing your own
> unawareness though - it could arguably be the most commonly used
> encoding (depending on how you define "used").
Because we're discussing the storage of .py files, the relevant
definition is the one used by the Unicode Standard, of course: a
text/plain stream intended to be manipulated by any conformant Unicode
processor that claims to handle text/plain. File formats with in-band
formatting codes and allowing embedded non-text content like Word, or
operating system or stdlib APIs, don't count. Nor have I seen UTF-16
used in email or HTML since the unregretted days of Win2k betas[1]
(but I don't frequent Windows- or Java-oriented sites, so I have to
admit my experience is limited in a possibly relevant way).
In Japan my impression is that modern versions of Windows have
Memopad[sic] configured to emit UTF-8-with-signature by default for
new files, and if not, the abomination known as Shift JIS (I'm not
sure if that is a user or OEM option, though). Never a widechar
encoding (after all, the whole point of Shift JIS was to use an 8-bit
encoding for the katakana syllabary to save space or bandwidth).
I think if anyone wants to use UTF-16 or UTF-32 for exchange of Python
programs, they probably already know how to convert them to UTF-8. As
somebody already suggested, this can be delegated to the py.exe
launcher, if necessary, AFAICS.
I don't see any good reason for allowing non-ASCII-compatible
encodings in the reference CPython interpreter.
However, having mentioned Windows and Java, I have to wonder about
IronPython and Jython, respectively. Having never lived in either of
those environments, I don't know what text encoding their users might
prefer (or even occasionally encounter) in Python program source.
Steve
Footnotes:
[1] The version of Outlook Express shipped with them would emit
"HTML" mail with ASCII tags and UTF-8-encoded text (even if it was
encodable in pure ASCII). No, it wasn't spam, either, so it probably
really was Outlook Express as it claimed to be in one of the headers.
Re: [Python-Dev] Using more specific methods in Python unit tests
On 16/02/2014 00:22, Nick Coghlan wrote:
On 16 February 2014 09:20, Ned Deily wrote:
In article <[email protected]>, Benjamin Peterson wrote:
On Sat, Feb 15, 2014, at 10:12 AM, Serhiy Storchaka wrote:
Although Raymond approved a patch for test_bigmem [2], he expressed the insistent recommendation not to do this. So I stop committing new reviewed patches. Terry recommended to discuss this in Python-Dev. What are your thoughts?
I tend to agree with Raymond. I think such changes are very welcome when the module or tests are otherwise being changed, but on their own constitute unnecessary churn.
Right, there are a few key problems with large scale style changes to the test suite:
1. The worst case scenario is where we subtly change a test so that it is no longer testing what it is supposed to be testing, allowing the future introduction of an undetected regression. This isn't particularly *likely*, but a serious problem if it happens.
2. If there are pending patches for that module that include new tests, then the style change may cause the patches to no longer apply cleanly, requiring rework of bug fix and new feature patches to accommodate the style change.
3. Merges between branches may become more complicated (for reasons similar to 2), unless the style change is also applied to the maintenance branches (which is problematic due to 1).
I spend a *lot* of time working with the Python test suite on behalf of Jython, so I appreciate the care CPython puts into its testing. To a large extent, tests written for CPython drive Jython development: I suspect I work with a lot more failing tests than anyone here. Where we have a custom test, I often update it from the latest CPython tests. Often a test failure is not caused by code I just wrote, but by adding a CPython test or removing a "skip if Jython", and not having written anything yet. While the irrefutable "False is not true" always raises a smile, I'd welcome something more informative.
It's more than a "style" issue. What Nick says above is also not false, as general guidance, but taken as an iron rule seems to argue against concurrent development at all. Don't we manage this change pretty well already? I see little risk of problems 1-3 in the actual proposal, as the changes themselves are 99% of the "drop-in replacement" type:
    -self.assertTrue(isinstance(x, int))
    +self.assertIsInstance(x, int)
I found few places, on a quick scan, that risked changing the meaning: they introduce an if-statement, or refactor the expression -- I don't mean they're actually wrong. The point about breaking Serhiy's patch into independent parts will help manage merging and this risk.
The tests are not library code, but their other use is as an example of good practice in unit testing. I pretty much get my idea of Python's test facilities from this work. It was a while before I realised more expressive methods were available.
Jeff Allen
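The difference in diagnostic value between the two spellings is easy to see side by side. A small sketch for Python 3, using a bare `unittest.TestCase` instance only to borrow its assert methods; the value `x` and the helper `failure_message` are made up for the illustration:

```python
import unittest

case = unittest.TestCase()  # used only for its assert* methods
x = "3"                     # hypothetical value of the wrong type


def failure_message(check):
    """Run check() and return the AssertionError text, or None if it passes."""
    try:
        check()
    except AssertionError as exc:
        return str(exc)
    return None


generic = failure_message(lambda: case.assertTrue(isinstance(x, int)))
specific = failure_message(lambda: case.assertIsInstance(x, int))

print(generic)
print(specific)
```

The first message says only that something was false; the second names the offending value and the expected type, which is what one wants when the failing test was written by someone else.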
Re: [Python-Dev] PEP 463: Exception-catching expressions
On 22/02/2014 16:36, Brett Cannon wrote:
On Sat, Feb 22, 2014 at 4:13 AM, Antoine Pitrou wrote:
On Fri, 21 Feb 2014 09:37:29 -0800 Guido van Rossum wrote:
> I'm put off by the ':' syntax myself (it looks to me as if someone forgot a
> newline somewhere) but 'then' feels even weirder (it's been hard-coded in
> my brain as meaning the first branch of an 'if').
Would 'else' work rather than 'then'?
    thing = stuff['key'] except KeyError else None
That reads to me like the exception was silenced and only if there is no exception the None is returned, just like an 'else' clause on a 'try' statement. I personally don't mind the 'then' as my brain has been hard-coded to mean "the first branch of a statement" so it's looser than being explicitly associated with 'if' but with any multi-clause statement.
I read *except* as 'except if', and *:* as 'then' (often), so the main proposal reads naturally to me. I'm surprised to find others don't also, as that's the (only?) pronunciation that makes the familiar if-else and try-except constructs approximate English.
Isn't adding a new keyword (*then*) likely to be a big deal? There is the odd example of its use as an identifier, just in our test code:
http://hg.python.org/cpython/file/0695e465affe/Lib/test/test_epoll.py#l168
http://hg.python.org/cpython/file/0695e465affe/Lib/test/test_xmlrpc.py#l310
Jeff Allen
Re: [Python-Dev] Python 3.3.4150
Jason:
I get that too, now I try it. The place to report bugs is: http://bugs.python.org/ However, please take a look at http://bugs.python.org/issue14512 before you file a new one.
Jeff Allen
On 28/02/2014 17:05, Burgoon, Jason wrote:
Good day Python Dev Team --
One of our users has reported the following:
I have installed the given msi on 64 bit as per the install instructions document. One of the shortcuts 'Start Menu\Programs\Python 3.3\ Module Docs' is not getting launched. When I launch this shortcut, it is not opening any window. I have tried with admin user and non-admin user. Is this expected behavior?
Please advise and thanks for your help.
Jason
Re: [Python-Dev] hg branching + log question
I have also found hg difficult to get to grips with from cold (but I like it). The hg command and its help are good, as Antoine says, but if I'm doing something complex, the visualisation of the change sets that TortoiseHG provides is invaluable (and of other invisible structures, such as the MQ patch stack). The context menus are also a clue to what you might want to do next when you can't guess what word comes after hg help ... . I found it helpful to practice extensively on something that doesn't matter.
The gap for me is still examples of what I want "done well". Clearly the Python repos represent complex work, but even accepting it is all done well, they are without much commentary. This is very good: http://hgbook.red-bean.com/read/ , but there are hints it has not kept up. This also: http://legacy.python.org/dev/peps/pep-0385/
Jeff Allen
On 17/03/2014 23:53, Sean Felipe Wolfe wrote:
Ah well, ok. That seems pretty counterintuitive to me though. I suppose Hg has its quirks just like ... that other DVCS system ... :P
On Mon, Mar 17, 2014 at 1:07 PM, Antoine Pitrou wrote:
On Mon, 17 Mar 2014 13:02:23 -0700 Sean Felipe Wolfe wrote:
I'm getting my feet wet with the cpython sources and Mercurial. I'm a bit confused -- when I checkout a branch, eg. 3.3, and I do an 'hg log', why do I see log messages for other branches?
This is a classic hg question, you would get the answer by asking Mercurial for help: hg log --help :) Basically, to restrict the log to a given branch, just use the -b option: hg log -b 3.3.
Regards
Antoine.
Re: [Python-Dev] List vs Tuple / Homogeneous vs Heterogeneous / Mutable vs Immutable
The "think of tuples like a struct in C" explanation immediately reminded me that ...

On 16/04/2014 21:42, Taavi Burns wrote (in his excellent notes from the language summit):

The demographics have changed. How do we change the docs and ecosystem to avoid the assumption that Python programmers already know how to program in C?

Good question. My version was going to be that if you are dealing with tuples of mixed data like (name, age, shoesize), inserting something or sorting, in the way a list can, would confuse your code. A list, you almost always iterate over, to do the same thing with each member, and that only works if they are the same type of thing.

Then I realised David Beazley had explained this (but better), starting in the Tuples section of his "Python Essential Reference". With permission, this could perhaps be adopted wherever it best fits in the documentation.

Jeff Allen

On 17/04/2014 20:49, Leandro Pereira de Lima e Silva wrote:

This looks like an issue to be addressed at PEP-8 since it looks like a styling issue. I haven't seen any other recommendations there on how to use a certain data structure, though.

Cheers, Leandro

On 17/04/2014 16:24, "Guido van Rossum" <[email protected]> wrote:

It's definitely something that should be put in some documentation, probably at the point when people have learned enough to be designing their own programs where this issue comes up -- before they're wizards but well after they have learned the semantic differences between lists and tuples.

On Thu, Apr 17, 2014 at 11:49 AM, Brett Cannon <[email protected]> wrote:

On Thu Apr 17 2014 at 2:43:35 PM, Leandro Pereira de Lima e Silva <[email protected]> wrote:

Hello there! I've stumbled upon this discussion on python-dev about what the choice between using a list or a tuple is all about in 2003:

1. https://mail.python.org/pipermail/python-dev/2003-March/033962.html
2. https://mail.python.org/pipermail/python-dev/2003-March/034029.html

There's a vague comment about it on python documentation but afaik the discussion hasn't made it into any PEPs. Is there an understanding about it?

Think of tuples like a struct in C, lists like an array. That's just out of Guido's head so I don't think we have ever bothered to write it down somewhere as an important distinction of the initial design that should be emphasized.

-- --Guido van Rossum (python.org/~guido)
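For readers who have never met a C struct, the distinction in this thread can perhaps be shown in Python alone. The following is only a sketch: the `Person` record and its field values are invented for illustration, borrowing the (name, age, shoesize) example from the message above.

```python
from collections import namedtuple

# Hypothetical record type: each position has its own meaning and type.
Person = namedtuple("Person", ["name", "age", "shoesize"])

# A list holds many things of the same kind.
people = [
    Person("Ada", 36, 5.5),
    Person("Grace", 45, 7.0),
]

# Because the list is homogeneous, one loop body suits every member.
names = [p.name for p in people]

# Sorting the list of records is meaningful ...
by_age = sorted(people, key=lambda p: p.age)

# ... whereas sorting the fields of one record is not: the values are of
# different kinds and, in Python 3, str and int do not even compare.
try:
    sorted(people[0])
    fields_sortable = True
except TypeError:
    fields_sortable = False
```

The tuple here plays the part of the struct (fixed positions, mixed types) and the list the part of the array (any length, one type of member).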
Re: [Python-Dev] Internal representation of strings and Micropython
Jython uses UTF-16 internally -- probably the only sensible choice in a Python that can call Java. Indexing is O(N), fundamentally. By "fundamentally", I mean for those strings that have not yet noticed that they contain no supplementary (>0xFFFF) characters.

I've toyed with making this O(1) universally. Like Steven, I understand this to be a freedom afforded to implementers, rather than an issue of conformity.

Jeff Allen

On 04/06/2014 02:17, Steven D'Aprano wrote:

There is a discussion over at MicroPython about the internal representation of Unicode strings. ...

My own feeling is that O(1) string indexing operations are a quality of implementation issue, not a deal breaker to call it a Python. I can't see any requirement in the docs that str[n] must take O(1) time, but perhaps I have missed something.
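To make the O(N) claim concrete, here is a rough sketch of indexing by code point over UTF-16 code units. It is not Jython's code: the helper names `utf16_units` and `code_point_at` are invented for this illustration, and the input is assumed to be well-formed UTF-16.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s as a list of ints."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def code_point_at(units, index):
    """Return the code point at character position `index`.

    The scan starts at the beginning every time, hence O(N).
    """
    i = 0
    for _ in range(index):
        # A lead (high) surrogate means this character occupies two units.
        i += 2 if 0xD800 <= units[i] <= 0xDBFF else 1
    u = units[i]
    if 0xD800 <= u <= 0xDBFF:
        low = units[i + 1]
        return 0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00)
    return u


text = "a\U0001F40Db"        # 'a', a supplementary character, 'b'
units = utf16_units(text)    # four code units for three characters
```

A string known to contain no surrogate pairs can skip the scan and use `units[index]` directly, which is the O(1) case the message alludes to.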
Re: [Python-Dev] Arbitrary non-identifier string keys when using **kwargs
On 10/10/2018 00:06, Steven D'Aprano wrote: On Tue, Oct 09, 2018 at 09:37:48AM -0700, Jeff Hardy wrote: ... From an alternative implementation point of view, CPython's behaviour *is* the spec. Practicality beats purity and all that. Are you speaking on behalf of all authors of alternate implementations, or even of some of them? It certainly is not true that CPython's behaviour "is" the spec. PyPy keeps a list of CPython behaviour they don't match, either because they choose not to for other reasons, or because they believe that the CPython behaviour is buggy. I daresay IronPython and Jython have similar. While agreeing with the principle, unless it is one of the fundamental differences (GC, GIL), Jython usually lets practicality beat purity. When faced with a certain combination of objects, one has to do something, and it is least surprising to do what CPython does. It's also easier than keeping a record. Rarely, we manage to exceed CPython (in consistency or coverage) by a tiny amount. Jeff Allen ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Need discussion for a PR about memory and objects
I found this (very good) summary ended in a surprising conclusion. On 18/11/2018 12:32, Nick Coghlan wrote: On Sun, 4 Nov 2018 at 23:33, Steven D'Aprano wrote: On Sun, Nov 04, 2018 at 11:43:50AM +0100, Stephane Wirtel wrote: In this PR [https://github.com/python/cpython/pull/3382] "Remove reference to address from the docs, as it only causes confusion", opened by Chris Angelico, there is a discussion about the right term to use for the address of an object in memory. Why do we need to refer to the address of objects in memory? ... Chris's initial suggestion was to use "license number" or "social security number" (i.e. numbers governments assign to people), but I'm thinking a better comparison might be to vehicle registration numbers, ... On the other hand, we're talking about the language reference here, not the tutorial, and understanding memory addressing seems like a reasonable assumed pre-requisite in that context. Cheers, Nick. It is a good point that this is in the language reference, not a tutorial. Could we not expect readers of that to be prepared for a notion of object identity as the abstraction of what we mean by "the same object" vs "a distinct object"? If it were necessary to be explicit about what Python means by it, one could unpack the idea by its properties: distinct names may be given to the same object (is-operator); distinct objects may have the same value (==-operator); an object may change in value (if allowed by its type) while keeping its identity. And then there is the id() function. That is an imperfect reflection of the identity. id() guarantees that for a given object (identity) it will always return the same integer during the life of that object, and a different integer for any distinct object (distinct identity) with an overlapping lifetime. We note that, in an implementation of Python where objects are fixed in memory for life, a conformant id() may return the object's address. 
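The properties listed above can be checked in a few lines. This is only an illustration of those statements, using ordinary lists; nothing in it depends on where objects are stored.

```python
a = [1, 2, 3]
b = a            # a second name for the same object
c = [1, 2, 3]    # a distinct object with the same value

same_object = a is b                              # one object, two names
equal_but_distinct = (a == c) and (a is not c)    # same value, two objects

before = id(a)
a.append(4)      # the value changes; the identity does not
identity_kept = (id(a) == before) and (b == [1, 2, 3, 4])

# Objects with overlapping lifetimes must have different id() values.
ids_differ = id(a) != id(c)
```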
Jeff Allen
Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
On 19/11/2018 15:08, Victor Stinner wrote: ... For me, the limited API should be functions available on all Python implementations. Does it make sense to provide PyFloat_Pack4() in ..., Jython, ... ? Or is it something more specific to CPython? I don't know the answer. I'd say it's a CPython thing. It is helpful to copy a lot of things from the reference implementation, but generally the lexical conventions of the C-API would seem ludicrous in Java, where scope is already provided by a class. And then there's the impossibility of a C-like pointer to byte. Names related to C-API have mnemonic value, though, in translation. Maybe "static void PyFloat.pack4(double, ByteBuffer, boolean)" would do the trick. It makes sense for JyNI to supply it by the exact C API name, and all other API that C extensions are likely to use. Jeff Allen ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Need discussion for a PR about memory and objects
On 20/11/2018 00:14, Chris Barker via Python-Dev wrote:

On Mon, Nov 19, 2018 at 1:41 AM Antoine Pitrou <[email protected]> wrote:

I'd rather keep the reference to memory addressing than start doing car analogies in the reference documentation.

I agree -- and any of the car analogies will probably be only valid in some jurisdictions, anyway. I think if we are a bit more explicit about what properties an ID has, and how the id() function works, we may not need an analogy at all; it's not that difficult a concept. And mentioning that in CPython the id is (currently) the memory address is a good idea for those that will wonder about it, and if there is enough explanation, folks that don't know about memory addresses will not get confused. ...

I suggest something like the following:

"""
Every object has an identity, a type and a value. An object’s identity uniquely identifies the object. It will remain the same as long as that object exists. No two different objects will have the same id at the same time, but the same id may be re-used for future objects once one has been deleted. The ‘is’ operator compares the identity of two objects; the id() function returns an integer representing its identity. ``id(object_a) == id(object_b)`` if and only if they are the same object.

**CPython implementation detail:** For CPython, id(x) is the memory address where x is stored.
"""

I agree that readers of a language reference should be able to manage without the analogies.

I want to suggest that identity and id() are different things. The notion of identity in Python is what we access in phrases like "two different objects" and "the same object" in the text above. For me it defies definition, although one may make statements about it. A new object, wherever stored, is identically different from all objects preceding it.

Any Python has to implement the concept of identity in order to refer, without confusion, to objects in structures and bound by names. In practice, Python need only identify an object while the object exists in the interpreter, and the object exists as long as something refers to it in this way. To this extent, the identifier in the implementation need not be unique for all time.

The id() function returns an integer approximating this second idea. There is no mechanism to reach the object itself from the result, since it does not keep the object in existence, and worse, it may now be the id of a different object. In defining id(), I think it is confusing to imply that this number *is* the identity of the object that provided it.

Jeff Allen
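The point that the result of id() does not keep its object in existence can be observed with a weak reference. This is a sketch only: `Thing` is a made-up class (needed because plain `object()` instances cannot be weakly referenced), and the `gc.collect()` call is there for implementations that do not use reference counting.

```python
import gc
import weakref


class Thing:
    """Hypothetical class used only for this illustration."""


t = Thing()
number = id(t)             # just an int: it does not keep t alive
watcher = weakref.ref(t)   # observes t without keeping it alive either

del t                      # drop the only strong reference
gc.collect()               # for implementations without reference counting

object_gone = watcher() is None
still_an_int = isinstance(number, int)
```

After this, `number` is an integer that identifies nothing in particular, and a later object may legitimately be given the same id.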
Re: [Python-Dev] drop jython support in mock backport?
Cross-posting to jython-users for obvious reasons. Jeff Allen On 30/04/2019 10:26, Chris Withers wrote: [resending to python-dev in case there are Jython users here...] Hi All, If you need Jython support in the mock backport, please shout now: https://github.com/testing-cabal/mock/issues/453 cheers, Chris ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Online Devguide mostly not working
Is it suspicious that in the detailed log we see: 'canonical_url': 'http://devguide.python.org/', ? I guess this comes from project admin configuration at RTD, additional to your conf.py. https://docs.readthedocs.io/en/stable/canonical.html (Just guessing really.) Jeff Allen On 12/05/2019 08:10, Wes Turner wrote: https://cpython-devguide.readthedocs.io/ seems to work but https://devguide.python.org <https://devguide.python.org/>/* does not https://readthedocs.org/projects/cpython-devguide/ lists maintainers, who I've cc'd AFAIU, there's no reason that the HTTP STS custom domain CNAME support would've broken this: https://github.com/rtfd/readthedocs.org/issues/4135 On Saturday, May 11, 2019, Jonathan Goble <mailto:[email protected]>> wrote: Confirming that I also cannot access the Getting Started page. I'm in Ohio, if it matters. On Sat, May 11, 2019 at 6:26 PM Terry Reedy mailto:[email protected]>> wrote: > > https://devguide.python.org gives the intro page with TOC on sidebar and > at end. Clicking anything, such as Getting Started, which tries to > display https://devguide.python.org/setup/ <https://devguide.python.org/setup/>, returns a Read the Docs page > "Sorry This page does not exist yet." 'Down for everyone' site also > cannot access. 
> > -- > Terry Jan Reedy > > ___ > Python-Dev mailing list > [email protected] <mailto:[email protected]> > https://mail.python.org/mailman/listinfo/python-dev <https://mail.python.org/mailman/listinfo/python-dev> > Unsubscribe: https://mail.python.org/mailman/options/python-dev/jcgoble3%40gmail.com <https://mail.python.org/mailman/options/python-dev/jcgoble3%40gmail.com> ___ Python-Dev mailing list [email protected] <mailto:[email protected]> https://mail.python.org/mailman/listinfo/python-dev <https://mail.python.org/mailman/listinfo/python-dev> Unsubscribe: https://mail.python.org/mailman/options/python-dev/wes.turner%40gmail.com <https://mail.python.org/mailman/options/python-dev/wes.turner%40gmail.com> ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/ja.py%40farowl.co.uk ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Online Devguide mostly not working
Hiccups cured thanks to Mariatta and Carol https://github.com/python/devguide/pull/484. (And my guess was wrong.) Jeff Allen On 14/05/2019 04:49, Brett Cannon wrote: It's working for me, so it was probably just a hiccup. On Sun, May 12, 2019 at 6:19 AM Jeff Allen <mailto:[email protected]>> wrote: Is it suspicious that in the detailed log we see: 'canonical_url': 'http://devguide.python.org/', ? I guess this comes from project admin configuration at RTD, additional to your conf.py. https://docs.readthedocs.io/en/stable/canonical.html (Just guessing really.) Jeff Allen On 12/05/2019 08:10, Wes Turner wrote: https://cpython-devguide.readthedocs.io/ seems to work but https://devguide.python.org <https://devguide.python.org/>/* does not https://readthedocs.org/projects/cpython-devguide/ lists maintainers, who I've cc'd AFAIU, there's no reason that the HTTP STS custom domain CNAME support would've broken this: https://github.com/rtfd/readthedocs.org/issues/4135 On Saturday, May 11, 2019, Jonathan Goble mailto:[email protected]>> wrote: Confirming that I also cannot access the Getting Started page. I'm in Ohio, if it matters. On Sat, May 11, 2019 at 6:26 PM Terry Reedy mailto:[email protected]>> wrote: > > https://devguide.python.org gives the intro page with TOC on sidebar and > at end. Clicking anything, such as Getting Started, which tries to > display https://devguide.python.org/setup/, returns a Read the Docs page > "Sorry This page does not exist yet." 'Down for everyone' site also > cannot access. 
> > -- > Terry Jan Reedy > > ___ > Python-Dev mailing list > [email protected] <mailto:[email protected]> > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/jcgoble3%40gmail.com ___ Python-Dev mailing list [email protected] <mailto:[email protected]> https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/wes.turner%40gmail.com ___ Python-Dev mailing list [email protected] <mailto:[email protected]> https://mail.python.org/mailman/listinfo/python-dev Unsubscribe:https://mail.python.org/mailman/options/python-dev/ja.py%40farowl.co.uk ___ Python-Dev mailing list [email protected] <mailto:[email protected]> https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/brett%40python.org ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Re: Comparing dict.values()
On 23/07/2019 21:59, Kristian Klette wrote:
Hi!
During the sprints after EuroPython, I made an attempt at adding support for
comparing the results from `.values()` of two dicts.
Currently the following works as expected:
```
d = {'a': 1234}
d.keys() == d.keys()
d.items() == d.items()
```
but `d.values() == d.values()` does not return the expected
results. It always returns `False`. The symmetry is a bit off.
In the bug trackers[0] and the Github PR[1], I was asked
to raise the issue on the python-dev mailing list to find
a consensus on what comparing `.values()` should do.
The request was to establish a consensus on a reasonable semantic. I
don't think that can be adequately addressed by such a simple example
and the criterion "works as expected". What is expected of:
>>> x = dict(a=1, b=2)
>>> y = dict(b=2, a=1)
>>> x == y
True
Two superficially reasonable semantics are to compare the list or the
set of the values:
>>> set(x.values()) == set(y.values())
True
>>> list(x.values()) == list(y.values())
False
Terry points out some implementation and definitional problems
(unhashable values) with set semantics. Steven proposes (essentially)
list semantics, but isn't it surprising that equal dictionaries should
not have equal .values()?
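The candidate semantics can be put side by side as follows. The sorted comparison is not something proposed in the thread; it is included here only as one possible multiset-style reading, and it assumes the values can be ordered.

```python
x = dict(a=1, b=2)
y = dict(b=2, a=1)

dicts_equal = (x == y)                                 # True
as_sets = set(x.values()) == set(y.values())           # True, needs hashable values
as_lists = list(x.values()) == list(y.values())        # False: insertion order differs
as_sorted = sorted(x.values()) == sorted(y.values())   # True, needs orderable values

# With no __eq__ of its own, a values view falls back to identity, and each
# call to .values() creates a new view object.
views_equal = (x.values() == x.values())               # False
```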
Jeff Allen
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at
https://mail.python.org/archives/list/[email protected]/message/4GKPJHRMV5QKSVZ3T44MG4NOUOUISNST/
[Python-Dev] Re: unittest of sequence equality
On 22/12/2020 19:08, Brett Cannon wrote:

... The fact that numpy chooses to implement __eq__ in such a way that its result would be surprising if used in an `if` guard I think is more a design choice/issue of numpy than a suggestion that you can't trust `==` in testing because it _can_ be something other than True/False.

+1

In addition to NumPy's regularly surprising interpretation of operators, it is evident from Ivan Pozdeev's investigation (other branch) that part of the problem lies with bool(np.array) being an error. I can see why that might be sensible. You can have one or the other, but not both.

I wondered if Python had become stricter here after NumPy made its choices, but a little mining turns up:

"New in version 2.1. These are the so-called ``rich comparison'' methods, and are called for comparison operators in preference to __cmp__() below. The correspondence between operator symbols and method names is as follows: |x<y| calls |x.__lt__(y)|, |x<=y| calls |x.__le__(y)|, |x==y| calls |x.__eq__(y)|, |x!=y| and |x<>y| call |x.__ne__(y)|, |x>y| calls |x.__gt__(y)|, and |x>=y| calls |x.__ge__(y)|. These methods can return any value, but if the comparison operator is used in a Boolean context, the return value should be interpretable as a Boolean value, else a TypeError will be raised. By convention, |0| is used for false and |1| for true."

https://docs.python.org/release/2.1/ref/customization.html

The combination of choices makes the result of a comparison, about which there is some freedom, not interpretable as a boolean value. We are warned that this should not be expected to work.

Later docs (from v2.6) refer explicitly to calling bool() as a definition of "interpretable". bool() is there from v2.3.

Jeff Allen
Message archived at https://mail.python.org/archives/list/[email protected]/message/WZWT6NZXM4EILAVVJRJQ2A2LK7BBNMFV/
Code of Conduct: http://python.org/psf/codeofconduct/
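The combination of choices discussed in this message can be reproduced without NumPy. In the sketch below, `Elementwise` and `Mask` are invented stand-ins, not NumPy's implementation: `__eq__` returns an element-wise result, and that result refuses to be interpreted as a single boolean.

```python
class Mask:
    """Result of an element-wise comparison, with no single truth value."""

    def __init__(self, flags):
        self.flags = list(flags)

    def __bool__(self):
        raise ValueError("truth value of an element-wise result is ambiguous")

    def all(self):
        return all(self.flags)


class Elementwise:
    """Hypothetical stand-in for an array type."""

    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        # A rich comparison may return any object, here a Mask.
        return Mask(a == b for a, b in zip(self.items, other.items))


p = Elementwise([1, 2, 3])
q = Elementwise([1, 2, 3])

result = p == q              # fine: returns a Mask
try:
    if p == q:               # bool(Mask) is called here and raises
        pass
    usable_in_if = True
except ValueError:
    usable_in_if = False
```

Code that wants a single answer has to ask for it explicitly, as with `result.all()`, which is the situation a generic sequence-equality assertion cannot anticipate.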
[Python-Dev] Re: Steering Council update for February
On 10/03/2021 01:30, Inada Naoki wrote: On Wed, Mar 10, 2021 at 10:10 AM Ivan Pozdeev via Python-Dev wrote: Anyway, this is yet another SJW non-issue (countries other than US don't have a modern history of slavery) so this change is a political statement rather than has any technical merit. Yes. If we don't change the name, we need to pay our energy to same discussion every year. It is not productive. Let's change the name and stop further discussion. +1 for this analysis. It is a modern shibboleth, but let's not invite people to leave who are reluctant to make the right noises -- that's an unfortunate response. A bit off topic ... It is surprising to read that slavery is unique to US history. Maybe institutionally, amongst large democracies, the US was late to abolish it. But even in the UK, where we are proud that decency overcame self-interest, peacefully and slightly ahead of the US, in practice it remains a problem today. (https://www.gov.uk/government/collections/modern-slavery) Anything directly helpful is likely to be done outside the framework of the PSF, and not because we changed a branch name. However, it's odds on that those tackling it here are using Python for data science. Jeff Allen ___ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/Y5B5SD2X7PYR5NAIHDKW5LB2GXOEHPN3/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: In what tense should the changelog be written?
On 30/04/2021 13:15, Steven D'Aprano wrote:
On Thu, Apr 29, 2021 at 09:52:14PM -0700, Larry Hastings wrote:
D'oh! I have a second draft already.
Your NEWS entry should be written in the /present tense,/ and should
start with a verb:
Without a subject of the sentence, that's not present tense, it is the
imperative mood.
Tense and mood are different dimensions. This is the present tense, and
the imperative mood:
"Fix buffalo.spam ..."
is a command or suggestion. The imperative is suitable for a list of
things which should be done, a TODO list, not a list of things which
have already been done.
https://grammar.collinsdictionary.com/easy-learning/the-imperative
The reference is not very good at explaining this, perhaps because the
mood in English is not obvious. (Sometimes you have to be French.)
The instruction "in the present tense, and should start with a verb"
doesn't pin it down, at least if you consider context and are liberal
about punctuation:
# present tense, indicative mood
"bpo-41056: Fixes a reference to deallocated stack ..." -> "bpo-41056 fixes
...
# present tense, imperative mood
"bpo-41094: Fix decoding errors with audit ... "
The preference for the imperative mood probably begins with the title of
a change *request*, where the imperative is the one obvious choice,
don't you think? I think I prefer it, but if blurb does not mine the
commit/PR for text, it's not a constraint. If the trail starts with a
bug report ("buffalo.spam borks the weeble when x is negative" (present
indicative)) then that makes a confusing commit or news message as Guido
points out.
Jeff
Message archived at
https://mail.python.org/archives/list/[email protected]/message/KZPKYFWTFOFL5EXRLY7KOEDO576AXF7A/
[Python-Dev] Re: Anyone else gotten bizarre personal replies to mailing list posts?
Yes, I got one from the same address today. Thanks for pointing out these are individual performances: it was annoying when I thought it was spam to the list.

Although Hoi Lam Poon is a real (female) name, it may signify a generated lampoon.

Jeff Allen

On 23/04/2021 16:38, Nathaniel Smith wrote:

I just got the reply below sent directly to my personal account, and I'm confused about what's going on. If it's just a one off I'll chalk it up to random internet weirdness, but if other folks are getting these too it might be something the list admins should look into? Or... something?

-- Forwarded message -
From: *Hoi lam Poon* <[email protected]>

Message archived at https://mail.python.org/archives/list/[email protected]/message/3EA53I3Y72ACEPDVG467NMNTXHRL3NXL/
[Python-Dev] Re: name for new Enum decorator
On 28/05/2021 04:24, Ethan Furman wrote:

The flags RED, GREEN, and BLUE are all canonical, while PURPLE and WHITE are aliases for certain flag combinations. But what if we have something like:

    class Color(Flag):
        RED = 1    # 0001
        BLUE = 4   # 0100
        WHITE = 7  # 0111

... So, like the enum.unique decorator that can be used when duplicate names should be an error, I'm adding a new decorator to verify that a Flag has no missing aliased values that can be used when the programmer thinks it's appropriate... but I have no idea what to call it. Any nominations?

The property you are looking for IIUC is that if a bit position is 1 in any member, then there is a member with only that bit set. I am seeing these members as sets of elements (bit positions) and therefore you want optionally to ensure that your enumeration has a name for every singleton set, into which any member could be analysed.

Words like "basis", "complete", "analytic", or "indicator" (as in indicator function) come to mind. I find "singletonian" attractive, but no-one will know what it means, and I just made it up.

Jeff
Message archived at https://mail.python.org/archives/list/[email protected]/message/7VN5Z5FSN3CH33KKQELX63L7JW6WEB2L/
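The property described in this message (every bit used by any member also has a member of its own) is easy to state as a check. The function below is a sketch of that reading only: `unnamed_bits` is a name invented here, not the decorator under discussion, and `FullColor` is an extra class added for contrast.

```python
from enum import Flag


def unnamed_bits(flag_class):
    """Return the bits used by some member but lacking a single-bit member."""
    used = 0
    named = 0
    for member in flag_class.__members__.values():   # includes aliases
        value = member.value
        used |= value
        if value and value & (value - 1) == 0:       # exactly one bit set
            named |= value
    return used & ~named


class Color(Flag):
    RED = 1      # 0001
    BLUE = 4     # 0100
    WHITE = 7    # 0111, but bit 0010 has no name of its own


class FullColor(Flag):
    RED = 1
    GREEN = 2
    BLUE = 4
    WHITE = 7
```

Here `unnamed_bits(Color)` reports bit 2 as missing a name, while `unnamed_bits(FullColor)` reports nothing missing.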
[Python-Dev] Re: Problems with dict subclassing performance
On 16/08/2021 08:41, Federico Salerno wrote:

"Pretendere" in Italian means "to demand", it's a false friend with the English "pretend". I don't know whether Marco is Italian (the false friend might also be there between Spanish or whatever other romance language he speaks and English, for all I know). From a native Italian speaker's perspective, what he meant was very clear to me, but it's also clear that an English speaker with no experience of Italian would not be expected to understand the meaning necessarily.

I reached the same conclusion with the help of: https://it.wiktionary.org/wiki/pretendere#Etimologia_/_Derivazione . Pop open the translations (Traduzioni). English has diverged from the Latin, as explained elsewhere on the thread. It's all easily explained by a misunderstanding of Steven's remark about upvotes (to agree with the answer provided by Monica) and subsequent frustration in being unable to make oneself understood.

Either way, from an outsider's perspective this whole bickering over such a small thing seems unfit for a list where adults talk about technical details. It seems to me that someone should swallow their pride and let this thread drop once and for all, it's not bringing anything useful or relevant.

Quite possibly. But I've offered a grown-up response in a separate post.

-- 
Jeff Allen
Message archived at https://mail.python.org/archives/list/[email protected]/message/ZYRNFVFCF2AJQUMUHXHWYBDZMZGLTB4I/
[Python-Dev] Re: Problems with dict subclassing performance
On 06/08/2021 20:29, Marco Sulla wrote:

I've done an answer on SO about why subclassing `dict` makes the subclass so much slower than `dict`. The answer is interesting: https://stackoverflow.com/questions/59912147/why-does-subclassing-in-python-slow-things-down-so-much What do you think about?

I have spent a lot of time reading typeobject.c over the years I've been looking at an alternative implementation. It's quite difficult to follow, and full of tweaks for special circumstances. So I'm impressed with the understanding that "user2357112 supports Monica" brings to the subject. (Yes, I want to call them Monica too, but I don't think that's their actual name.)

I don't think I understand it better than they do, but here's my reading of that, informed by my reading of typeobject.c, in case it helps.

When a built-in type like dict is defined in C, pointers to its C implementation functions are hard-coded into slots in the type object. In order to make each appear as a method to Python, a descriptor is created when building the type that delegates to the slot (so sq_contains generates a descriptor __contains__ in the dictionary of the type). Conversely, if in a sub-class you define __contains__, then the type builder will insert a function pointer in the slot of the new type that arranges a call to __contains__. This will overwrite whatever was in the slot.

In a C implementation, you can also define methods (by creating a PyMethodDef in the tp_methods table) that become descriptors in the dictionary of the type. You would not normally define both a C function to place in the slot *and* the corresponding method via a PyMethodDef. If you do, the version from the dictionary of the type will win the slot, *unless* you mark the method definition (in its PyMethodDef) as METH_COEXIST.

This exception is used in the special case of dict (and hardly anywhere else but set I think). I assume this is because some important code calls __contains__ via the descriptor, rather than via the slot (which would be quicker), and because an explicit definition is faster than a descriptor created automatically to wrap the slot.

Now, when you create a sub-class, the table of slots is copied first, then the type is checked for definitions of special methods, and these are allowed to overwrite the slot, unless they are slot wrappers on the same function pointer the slot already contains. I think at this point the slot is re-written to contain a wrapper on __contains__, which has been inherited from dict.__contains__, because it isn't a *slot wrapper* on the same function. For example:

>>> dict.__contains__
<method '__contains__' of 'dict' objects>
>>> str.__contains__
<slot wrapper '__contains__' of 'str' objects>
>>> class S(str): pass
>>> S.__contains__
<slot wrapper '__contains__' of 'str' objects>
>>> class D(dict): pass
>>> D.__contains__
<method '__contains__' of 'dict' objects>

I think that when filling the slots of a sub-class, one could check for the METH_COEXIST flag at the point one checks to see whether the definition from look-up on the type is a PyWrapperDescr on the same pointer. One might have to know that the slot and descriptor come from the same base. I'm not suggesting this would be worthwhile.

FYI, in the approach I am toying with, the slot wrapper descriptor is always created from the function definition, then the slot is filled from the available definitions by lookup. Defining __contains__ twice would be impossible or an error. I think this has the semantics required by Python, but we'll have to wait for proof.

-- 
Jeff Allen
Message archived at https://mail.python.org/archives/list/[email protected]/message/ZKMVZ5M3V76SOZH7FOURQ66VFZQY2BTG/
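The difference between the two kinds of descriptor described in this message can be seen from Python by looking at the type of what attribute lookup finds. This sketch is specific to CPython (other implementations may name or structure these differently), and the helper `kind` is invented for the illustration.

```python
class S(str):
    pass


class D(dict):
    pass


def kind(cls):
    """Name of the descriptor type found for __contains__ on cls."""
    return type(cls.__contains__).__name__


kinds = {cls.__name__: kind(cls) for cls in (str, S, dict, D)}
# On CPython this is expected to give 'wrapper_descriptor' for str and S
# (a slot wrapper made automatically from sq_contains) and
# 'method_descriptor' for dict and D (an explicit PyMethodDef entry).
```

So the sub-class of dict inherits something that is not a slot wrapper, which is the condition under which, on the reading above, its slot gets re-written to go through a lookup.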
[Python-Dev] Should I care what -3.14 // inf produces?
Is an implementation of Python free to make up its own answers to
division and modulus operations in the case of inf and nan arguments?
The standard library documentation says only that // is "rounded towards
minus infinity". The language reference says that:
1. |x == (x//y)*y + (x%y)|,
2. the modulus has the same sign as y, and
3. division by (either kind of) zero raises |ZeroDivisionError| .
It's consistent, but it doesn't define the operator over the full range
of potential arguments. In Python 3.8 and 3.9:
>>> from decimal import Decimal
>>> Decimal('-3.14') // Decimal('Infinity')
Decimal('-0')
>>> -3.14 // float("inf")
-1.0
>>> import math
>>> math.floor(-3.14 / float("inf"))
0
I can see sense in all three answers, as possible interpretations of
"rounded towards minus infinity", but I quite like decimal's. There seem
to be no regression tests for floor division of floats, and for modulus
only with finite arguments, perhaps intentionally.
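The three results quoted above, and the first property from the language reference, can be checked together. The values in the comments are those reported in this message for CPython 3.8 and 3.9; another implementation might give different ones, which is the question being asked.

```python
import math
from decimal import Decimal

inf = float("inf")

float_result = -3.14 // inf                      # -1.0 as reported above
floor_of_quotient = math.floor(-3.14 / inf)      # 0, since -3.14 / inf is -0.0
decimal_result = Decimal("-3.14") // Decimal("Infinity")   # Decimal('-0')

# Property 1 from the language reference, tried on the same operands.
quotient, remainder = divmod(-3.14, inf)
identity_holds = (-3.14 == quotient * inf + remainder)
```

With a quotient of -1.0 the term `quotient * inf` is already minus infinity, so `identity_holds` comes out False here: the identity is one of the things that does not extend to these arguments.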
--
Jeff Allen
Message archived at
https://mail.python.org/archives/list/[email protected]/message/BXYBSUMNSP6AAAS6OL23ANSML4IOARVB/
[Python-Dev] Re: Should I care what -3.14 // inf produces?
On 30/09/2021 08:57, Serhiy Storchaka wrote: Decimals use a different rule than integers and floats: the modulus has the same sign as the dividend. It was discussed using this rule for floats (perhaps there is even FAQ or HOWTO for this), there are advantages of using it for floats (the result is more accurate). But the current rule (the modulus has the same sign as the divisor) is much much more convenient for integers, and having different rules for integers and floats is a source of bugs. Thanks Serhiy and Victor. I hadn't realised decimal was so different from float. So decimal is not useful as a comparator. It's not an idealisation of intended float behaviour. The question is about floor-division of two Python built-in floats, involving non-finite operands, and whether this is standardised in Python the language. I couldn't find a FAQ/HOW-TO and nothing in the IEEE standard bears directly on floor division. I found an interesting discussion (https://mail.python.org/pipermail/python-dev/2007-January/070707.html) but it is having so much trouble with finite arguments that it barely mentions extended values a float might take. Tim makes good sense as always. Observing behaviour (Windows and Linux), it is consistent now but was divergent in the past. In Python 2.7.16 (Windows): >>> -3.14 // inf nan In 3.8 (Windows and Linux) and 2.7 (Linux): >>> -3.14 // inf -1.0 I would put the change down to improving fmod conformance in MSC, rather than a Python language change. But the cause doesn't matter. The fact that both were acceptable suggests that floor division is not standardised for non-finite operands. Pragmatically, however, it is seldom a good idea to differ from CPython. A bit of extra work at run-time, to check the divsor, is not a big penalty. 
-- Jeff Allen ___ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/S62T3SHVDEVW4ZWDDKSE76KFYWK5TAQT/ Code of Conduct: http://python.org/psf/codeofconduct/
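A minimal sketch of the run-time divisor check mentioned above, assuming the aim is to reproduce what CPython 3 was observed to return for a finite dividend and an infinite divisor. The helper name floordiv_checked is hypothetical and is not taken from CPython or Jython; every other case simply defers to the built-in operator.

```python
import math


def floordiv_checked(x: float, y: float) -> float:
    """Floor-divide two floats, treating an infinite divisor explicitly.

    Hypothetical helper: it mirrors the results observed from CPython 3
    for finite x and infinite y, and defers to ``//`` for everything else.
    """
    if math.isinf(y) and math.isfinite(x):
        if x == 0.0 or (x > 0.0) == (y > 0.0):
            # Zero dividend or matching signs: the quotient is a signed zero,
            # which ordinary division already produces.
            return x / y
        # Opposite signs: the exact quotient is just below zero, so its
        # floor is -1.
        return -1.0
    return x // y


print(floordiv_checked(-3.14, math.inf))
print(floordiv_checked(3.14, math.inf))
```

An implementation that already matches CPython here gains nothing from the check; the point is only that the extra test on the divisor is cheap.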
[Python-Dev] Recent changes to TextIOWrapper and its tests
I'm pulling recent changes in the io module across to Jython. I am looking for help understanding the changes in http://hg.python.org/cpython/rev/19a33ef3821d That change set is about what should happen if the underlying buffer does not return bytes when read, but instead, for example, unicode characters. The test test_read_nonbytes() constructs a pathological text stream reader t where the usual BytesIO or BufferedReader is replaced with a StringIO. It then checks that t.read(1) and t.readlines() raise a TypeError, that is, it tests that TextIOWrapper checks the type of what it reads from the buffer. The puzzle is that it requires t.read() to succeed. When I insert a check for bytes type in all the places it seems necessary in my code, I pass the first two conditions, but since t.read() also raises TypeError, the overall test fails. Is reading the stream with read() intended to succeed? Why is this desired? Jeff Allen ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Recent changes to TextIOWrapper and its tests
On 19/03/2013 08:03, Serhiy Storchaka wrote: On 18.03.13 22:26, Jeff Allen wrote: The puzzle is that it requires t.read() to succeed. When I insert a check for bytes type in all the places it seems necessary in my code, I pass the first two conditions, but since t.read() also raises TypeError, the overall test fails. Is reading the stream with read() intended to succeed? Why is this desired? An alternative option is to change C implementation of TextIOWrapper.read() to raise an exception in this case. However I worry that it can break backward compatibility. Thanks for this and the previous note. It is good to get it from the horse's mouth. I was surprised that the Python 3 version of the test was different here. I'd looked at the source of textio.c and found no test for bytes type in the n<0 branch of textiowrapper_read. Having tested it just now, I see that the TypeError is raised by the decoder in Py3k, because the input (when it is a unicode string) does not bear the buffer API, and not by a type test in TextIOWrapper.read() at all. For Jython, I shall make TextIOWrapper raise TypeError and our version of test_io will check for it. Any incompatibility relates only to whether a particular mistake sometimes goes undetected, so I feel pretty compatible. Added to which, this is the behaviour of Python 3 and we feel safe anticipating Py3k in small ways. Are there other tests (in other test files) which fail with a new Jython TextIOWrapper? I don't think there is anything else specific to TextIOWrapper, but if there is I will first treat that as a fault in our implementation. This is the general approach, to emulate the CPython implementation unless a test is clearly specific to arbitrary implementation choices. (There's a general exclusion for garbage collection.) In this case the test appeared to reflect an accident of implementation, but might just have been deliberate. 
Parts of the Jython implementation of io not yet implemented in Java are supplied by a Python module _jyio. This is essentially a copy of the corresponding parts of _pyio, except that it has to pass the C* tests, not the Py* tests. In places _jyio is therefore closer to _io than is _pyio. For example, it makes the type tests just discussed, and passes CTextIOWrapperTest.test_illegal_decoder and test_initialization. _jyio.StringIO has __getstate__ and __setstate__ methods lacking in _pyio counterparts to pass pickling tests in test_memoryio. This might be of interest to CPython for _pyio. Jeff Allen ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
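The Python 3 behaviour referred to above can be checked with a short script. This is a sketch under stated assumptions: it uses the Python 3 io module, the helper name read_error is made up for illustration, and the encoding is passed explicitly only to keep the example independent of the locale.

```python
import io


def read_error(method_name, *args):
    """Return the type of exception raised when reading text through a
    TextIOWrapper whose underlying "buffer" yields str instead of bytes."""
    # Pathological stack: a StringIO where a bytes buffer is expected.
    wrapper = io.TextIOWrapper(io.StringIO("a"), encoding="utf-8")
    try:
        getattr(wrapper, method_name)(*args)
    except Exception as exc:
        return type(exc)
    return None


for call in (("read", 1), ("readline",), ("read",)):
    print(call, read_error(*call))
```

In Python 3 all three calls are expected to fail with TypeError, including the unbounded read() that the 2.7 test required to succeed.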
Re: [Python-Dev] Destructors and Closing of File Objects
On 29/04/2013 15:42, Armin Rigo wrote: On Sat, Apr 27, 2013 at 4:39 AM, Nikolaus Rath wrote: It's indeed very informative, but it doesn't fully address the question because of the _pyio module which certainly can't use any custom C code. Does that mean that when I'm using x = _pyio.BufferedWriter(), I could lose data in the write buffer when the interpreter exits without me calling x.close(), but when using x = io.BufferedWriter(), the buffer is guaranteed to get flushed? I actually described the behavior of CPython 2 while not realizing that CPython 3 silently dropped this guarantee. (I also never realized that Jython/IronPython don't have the same guarantee; they could, if they implement 'atexit', like we did in PyPy.) On 29/04/2013 17:02, Antoine Pitrou wrote: It is dropped in the case of reference cycles, since there's no general way to decide in which order the tp_clear calls have to be done. Thus in the following layered situation: a TextIOWrapper on top of a BufferedWriter on top of a FileIO, if BufferedWriter.tp_clear is called first, it will flush and then close itself, closing the FileIO at the same time, and when TextIOWrapper.tp_clear will be called it will be too late to flush its own buffer. (I have to investigate a bit to confirm it is what happens) I will try to think of a scheme to make flushing more reliable, but nothing springs to my mind right now. In Jython, objects are not "cleared" immediately they become unreachable and if the JVM does not collect them before it shuts down, no programmed finalization may be called. To get round this, files in need of closing are hooked to a list that is worked off as the JVM shuts down, the equivalent of atexit (I assume). It has the unfortunate effect that forgotten files may live even longer, making it even more desirable that the user remember to close them. (The io tests themselves are not good at this!) But at least the close comes eventually. 
After discussion on jython-dev, I recently changed this mechanism (aiming at v2.7) so that every layer e.g. TextIOWrapper, BufferedWriter, FileIO is separately hooked to the list, and these are closed in reverse order of creation. Since close invokes flush when it matters, this will nearly always mean data is flushed down the stack before the path to disk gets severed, and always if you used open() to create the stack. I couldn't think of a perfect solution that didn't mean change to the API. This idea, and some tidying up I did in the io tests, might be of use in CPython. Jeff ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
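The idea of closing every layer in reverse order of creation can be sketched in Python. This is not the Jython code: ShutdownCloser and RecordingRaw are hypothetical names, an in-memory BytesIO subclass stands in for FileIO, and in real use close_all would be registered with atexit rather than called directly.

```python
import io


class ShutdownCloser:
    """Hypothetical registry that closes registered objects newest-first."""

    def __init__(self):
        self._pending = []

    def register(self, closeable):
        self._pending.append(closeable)
        return closeable

    def close_all(self):
        # Newest first, so the upper layers flush down the stack before
        # the lower layers are closed.
        while self._pending:
            self._pending.pop().close()


class RecordingRaw(io.BytesIO):
    """In-memory stand-in for FileIO that keeps what it held when closed."""

    def close(self):
        if not self.closed:
            self.final = self.getvalue()
        super().close()


closer = ShutdownCloser()
raw = closer.register(RecordingRaw())
buffered = closer.register(io.BufferedWriter(raw))
text = closer.register(io.TextIOWrapper(buffered, encoding="utf-8"))

text.write("hello")   # sits in the TextIOWrapper's buffer, never flushed
closer.close_all()    # in real use: atexit.register(closer.close_all)
print(raw.final)
```

Closing the TextIOWrapper first pushes its pending text into the BufferedWriter, which in turn writes to the raw layer before that is closed; the later close calls on layers that are already closed do nothing.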
Re: [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)
On 13/10/2016 11:41, Serhiy Storchaka wrote: On 13.10.16 00:14, Nathaniel Smith wrote: AFAIK basically the only project that would be affected by this is PyPy, And MicroPython. And Jython, except that from the start its implementation of bytearray deferred resizing until the proportion of unused space reaches some limit. I think that should make it O(log N) on average to delete (or add) a byte, at either end of a buffer of size N. However, observations with timeit() look constant up to the point I run out of heap. Jeff Allen ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
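Deferred resizing of this kind might look roughly like the following. It is a sketch only, not Jython's bytearray: the class name FrontTrimBuffer and the max_waste threshold are invented for illustration, and only deletion from the front is shown.

```python
class FrontTrimBuffer:
    """Hypothetical byte buffer that defers compaction after front deletes."""

    def __init__(self, data=b"", max_waste=0.5):
        self._store = bytearray(data)
        self._start = 0              # index of the first live byte
        self._max_waste = max_waste  # tolerated fraction of unused space

    def __len__(self):
        return len(self._store) - self._start

    def pop_front(self):
        if len(self) == 0:
            raise IndexError("pop from empty buffer")
        value = self._store[self._start]
        self._start += 1             # O(1): just step over the byte
        if self._start > self._max_waste * len(self._store):
            # Too much dead space at the front: compact in one go.
            del self._store[:self._start]
            self._start = 0
        return value

    def tobytes(self):
        return bytes(self._store[self._start:])


buf = FrontTrimBuffer(b"abcdef")
first = buf.pop_front()
print(first, buf.tobytes())
```

Each compaction copies the surviving bytes once, but it happens only after a proportionate number of cheap deletes, which is why the cost per delete stays small on average.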
Re: [Python-Dev] Removing memoryview object patch from Python 2.7
Hi Sesha: memoryview is part of the language. Even if you could hide or remove the feature, you would be running a specially broken version of Python, which can't be good. There is surely a better way to fix the code. If it helps any, you're landing here: https://hg.python.org/cpython/file/v2.7.12/Objects/stringobject.c#l819 in a function used to convert strings to an array of bytes within built-in functions. So something that expected a string is being given a memoryview object. But it's not possible to guess what or why, and this isn't the place to explore your code. Python-dev is about developing the language. Python-list is the place to ask questions about using the language. However, good hunting! Jeff Allen On 14/12/2016 12:09, Sesha Narayanan Subbiah wrote: Thanks Rob. I will try upgrade to 2.7.12. Any idea of this memory view object that has been back ported to 2.7 can be disabled in any way? Thanks Regards Sesha ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "Micro-optimisations can speed up CPython"
On 30/05/2017 16:38, Guido van Rossum wrote: On Mon, May 29, 2017 at 11:16 PM, Serhiy Storchaka wrote: 30.05.17 09:06, Greg Ewing wrote: Steven D'Aprano wrote: What does "tp" stand for? Type something, I guess. I think it's just short for "type". There's an old tradition in C of giving member names a short prefix reminiscent of the type they belong to. Not sure why, maybe someone thought it helped readability. In early ages of C structures didn't create namespaces, and member names were globals. That's nonsense. The reason is greppability. It does seem that far enough back, struct member names were all one space, standing for little more than their offset and type: "Two structures may share a common initial sequence of members; that is, the same member may appear in two different structures if it has the same type in both and if all previous members are the same in both. (Actually, the compiler checks only that a name in two different structures has the same type and offset in both, ... )" -- The C Programming Language, K&R 1978 (p197). With these Python name spaces, you're really spoiling us, Mr BDFL. Jeff Allen ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Re: Missing license for file Modules/ossaudiodev.c
This is undoubtedly the right answer for someone wanting to *use* code *from* CPython. When one signs up to contribute code to the PSF, one is asked to write on contributed software that it has been "Licensed to the PSF under a Contributor Agreement" (see https://www.python.org/psf/contrib/contrib-form/). The XXX comment may signal an intention to return and insert such words. You cannot find many instances of those words in CPython (and Jython is worse). Many of the files pre-date the formula, most contributions are a change to existing code, and adding it later to someone else's work doesn't feel right. (The situation is clear for pristine code.) I have wondered if it's an issue. Jeff Jeff Allen On 19/08/2019 15:35, Guido van Rossum wrote: The LICENSE file at the top level of the repo covers everything. On Mon, Aug 19, 2019 at 7:33 AM mihaela olteanu via Python-Dev wrote: Hello, Could you please let me know what is the license for the file Modules/ossaudiodev.c ? Inside the description there is a statement :"XXX need a license statement" which creates some confusion. Can we simply disregard that statement as for the other .c files which do not have such statements? Please note that we need this information for our OSS clearance report. 
Thank you, Mihaela ___ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/BOG2EPE77PZPM52JCXMPZVYQRNL2XXN7/ -- --Guido van Rossum (python.org/~guido) Pronouns: he/him/his (why is my pronoun here? http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/) ___ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/O2VUG4CFGAKQGMVBLEOEFKZCJD3KSIAI/ ___ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/G6KTZD375L4M3W7HMJGBPBP3XN7VPAOT/
[Python-Dev] Re: Missing license for file Modules/ossaudiodev.c
On 19/08/2019 21:30, Terry Reedy wrote: On 8/19/2019 3:19 PM, Jeff Allen wrote: When one signs up to contribute code to the PSF, one is asked to write on contributed software that it has been "Licensed to the PSF under a Contributor Agreement" (see https://www.python.org/psf/contrib/contrib-form/). The XXX comment may signal an intention to return and insert such words. The form says specifically "adjacent to Contributor's valid copyright notice". *If* the contribution comes with a separate explicit copyright notice (most do not), then it should be followed by the contribution notice. Ah, ok. I hadn't read it that way: rather a request to add both. That is useful, thanks: occasionally, I have to guide contributors to Jython, who sign the same form. Jeff ___ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/3CE5UQCOLKJ2GI3EZPBIGDJAECI6J77K/
[Python-Dev] Re: Adding a scarier warning to object.__del__?
On 02/01/2020 02:53, Yonatan Zunger wrote: Oh, I'm absolutely thinking about clarity. ... Could any revision also be clear what is *required of Python the language* vs. what is a CPython implementation detail? I always appreciate this care. There is good practice here and elsewhere in the existing documentation, but drift is easy for those steeped in CPython implementation. In the present case, it's a matter of avoiding an explicit requirement for a reference-counting approach to lifecycle management. Here are the places in the proposal where a change could achieve that. The precise semantics of when __del__ is called on an object are implementation-dependent. For example: * It might be invoked during the normal interpreter flow at a moment like function return, ... We should continue here "... immediately the object is no longer referenced;" (It might not be called immediately, but that's implied by your implementation-dependent "might be".) Note that del x doesn’t directly call x.__del__() — the former decrements the reference count for x by one, and the latter is only called when x’s reference count reaches zero. Depending on the implementation, it is possible for a reference cycle to prevent the reference count of an object from going to zero. (e.g., in CPython, a common cause of reference cycles is when an exception is caught and stored in a local variable; the exception contains a reference to the traceback, which in turn references the locals of all frames caught in the traceback.) In this case, the cycle will be later detected and deleted by the cyclic garbage collector. I realise that most of this paragraph is existing text rearranged, and currently it fails to make the distinction I'm looking for in the "note" part. But it is clear in the next paragraph. I think it better to say, closer to the current text: """Note:: ``del x`` does not call ``x.__del__()`` directly. After ``del x``, variable ``x`` is undefined (or unbound). 
If the object it referenced is now no longer referenced at all, that object's ``__del__()`` might be called, immediately or later, subject to the caveats already given. *CPython implementation detail:* ``del x`` decrements the reference count of the object by one, and if that makes it zero, ``x.__del__()`` will be called immediately. It is possible for a reference cycle to prevent the reference count of any object in it from going to zero. A common cause of reference cycles is when an exception is caught and stored in a local variable; the exception contains a reference to the traceback, which in turn references the locals of all frames caught in the traceback. In this case, the cycle will be detected later and its objects deleted by the cyclic garbage collector.""" If a base class has a __del__() method, the derived class’s __del__() method, if any, must explicitly call it to ensure proper deletion of the base class part of the instance. Possibly this thought belongs with the "implementations of __del__(): * Must ..." paragraph. But also, while I think there is scope for better guidance, this is getting a bit long. Should there be a "HOW TO write a __del__ method (and how to avoid it)" to contain the advisory points being made? In-lining advice here, on how to survive the infernal circles of __del__, dilutes the scariness of the warning not to enter at all. --- Jeff Allen ___ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/A32QEUP4R6XTY5LQW56LKWJ3XBUZCHOR/ Code of Conduct: http://python.org/psf/codeofconduct/
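The distinction drawn in the suggested note can be seen in a few lines. This is a sketch in which the class name Noisy and the events list are illustrative only: the first observation holds in any implementation, because the object is still referenced, while the timing of the second is an implementation detail, so the example calls gc.collect() to give a collector-based implementation the chance to finalize.

```python
import gc

events = []


class Noisy:
    def __del__(self):
        events.append("finalized")


x = Noisy()
y = x                     # a second reference to the same object
del x                     # unbinds the name x; the object is still reachable
after_first_del = list(events)

del y                     # now nothing references the object
gc.collect()              # not needed in CPython; helps other implementations
after_second_del = list(events)

print(after_first_del, after_second_del)
```

In CPython the finalizer runs as soon as the last reference goes, so the collect() call is redundant there; elsewhere the call may still not be enough to guarantee finalization at that point.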
[Python-Dev] Re: Questions about CPython's behavior on addition operator -- binary_op1 (abstract.c) and nb_add (typeobject.c)
On 16/01/2020 11:27, Raphaël Monat wrote: Hi all, I'm looking at CPython's behavior when an addition is called. From what I understand, binary_op1 <https://github.com/python/cpython/blob/3.8/Objects/abstract.c#L786> is eventually called, and it calls either slotv or slotw, which seems to be the binaryfunc defined as nb_add in the field tp_as_number of respectively v / w. I'm also keen to understand this and think I can elucidate a little, from my study of the code. The perspective of someone who *didn't* write it might help, but I defer to the authors' version when it appears. A crucial observation is that there is only one nb_add slot in a type definition. Think about adding an int(1) + float(2). Where it lands in long_add (v->ob_type->tp_as_number->nb_add) will return NotImplemented, because it does not understand the float right-hand argument, but float_add (w->ob_type->tp_as_number->nb_add) is able to give an answer, since it can float the int, behaving as float.__radd__(f, i). So the slots have to implement both __add__ and __radd__. I have a few questions: 1) In the default case, tp_as_number->nb_add is defined by the function slot_nb_add <https://github.com/python/cpython/blob/3.8/Objects/typeobject.c#L6312> itself stemming from the macro expansion SLOT1BINFULL <https://github.com/python/cpython/blob/3.8/Objects/typeobject.c#L6140> defined in typeobject.c. Both binary_op1(v, w) and slot_nb_add(v, w) appear to perform similar checks (if their second argument is a subtype of the first, etc), to decide if v's add or w's reverse add must be called and in which order. I find this repetition weird, and I guess I'm missing something... Any ideas? The logic is the same as binary_op1, but the function has to deal with the possibility that one or other type may already provide a special wrapper function in its nb_add slot, which is what the test tp_as_number->SLOTNAME == TESTFUNC is about. TESTFUNC is nearly always the same as FUNCNAME. 
Then it also deals with quite a complex decision in method_is_overloaded(). The partial repetition of the logic, which I think is now nested (because binary_op1() may have called slot_nb_add) is necessary to insert the more complex version into the decision tree. But this is roughly where my ability to visualise the paths runs out. 2) From the SLOT1BINFULL macro, both __add__ and __radd__ are defined by the slot_nb_add function (with some argument swapping done by wrap_binaryfunc_l / wrap_binaryfunc_r). If I want to define a different behavior for the reverse operator during a definition with a PyTypeObject, I guess I should add an "__radd__" method? I would not say they are "defined" by the slot_nb_add function. Rather, if one or other has been defined (in Python), v.__add__ or w.__radd__ is called by the single function slot_nb_add. They cannot be called directly from C, but call_maybe() supplies the mechanism. A confusing factor is that for types defined in C and filling the nb_add slot, the slot function (float_add, or whatever) has to be wrapped by two descriptors that can then sit in the type's dictionary as "__add__" and "__radd__". In that case these *are* defined by a wrapped C function, but that function is the function in the implementation of the type (float_add, say), not slot_nb_add. There are two kinds of wrapper: one used "Python-side out", so Python-calling "__add__" leads to nb_add's behaviour, and one "C-side out" so C-calling via nb_add leads to "__add__". 3) If I create a user-defined class A, having different methods __add__ and __radd__, these methods are added in A's dictionary. From what I understand, the function update_one_slot <https://github.com/python/cpython/blob/3.8/Objects/typeobject.c#L7209> is then called to change A's tp_as_number->nb_add to point to the methods defined by __add__ and __radd__? From the code documentation, I think that "a wrapper for the special methods is installed". 
Where exactly is this wrapper applied, and how does it know when to dispatch to __add__ or __radd__? I *think* it is changed to contain slot_nb_add as defined by the macro. I hope that is somewhere near accurate. Jeff Allen ___ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/DZQX3LILNVMKXR2UHFBBQN5ENPATYKFY/ Code of Conduct: http://python.org/psf/codeofconduct/
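Seen from Python, the dispatch described in this thread looks like the following. It is a sketch of the language-level behaviour only, not of the C slot machinery; the class Metres is invented for the example.

```python
class Metres:
    """Toy class defining __add__ and __radd__ with different behaviour."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        return NotImplemented       # let the other operand try

    def __radd__(self, other):
        if isinstance(other, (int, float)):
            return Metres(other + self.value)
        return NotImplemented


# int's add does not understand a float on the right ...
int_result = (1).__add__(2.0)
# ... but float's reflected add can handle an int on the left.
float_result = (2.0).__radd__(1)

# int.__add__ returns NotImplemented, so Metres.__radd__ is tried next.
total = 1 + Metres(2)

# Metres.__add__ declines and int has no answer either: TypeError.
try:
    Metres(2) + 1
except TypeError:
    failed = True
else:
    failed = False

print(int_result, float_result, total.value, failed)
```

For a class defined in Python, both methods are reached through the single nb_add slot, which is why the slot function has to choose between the forward and the reflected method.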
[Python-Dev] Re: Are PyObject_RichCompareBool shortcuts part of Python or just CPython quirks?
On 24/01/2020 21:01, Terry Reedy wrote: On 1/24/2020 7:09 AM, Steven D'Aprano wrote: On Fri, Jan 24, 2020 at 05:45:35AM -0500, Terry Reedy wrote: On 1/24/2020 3:36 AM, Victor Stinner wrote: CPython current behavior rely on the fact that it's possible to get the memory address of an object. No, this behavior relies on the language specification that all objects have temporally unique integer ids that can be compared with 'is'. In particular, Jython and IronPython have to implement Python's 'id' and 'is'. I believe 'id' is a bit of a nuisance. One way is a permanent index into a list of mutable addresses. Once 'id' is done, 'is' should be easy. Just for interest, in Jython, making id() an integer is enough of a nuisance that we don't invent an answer unless you ask for one. It involves a custom weak map from the abstract (Java) identity to integers allocated one up. "is" is relatively easy (with a bit of delicacy about proxies to the same Java object), and is based on Java ==. It is not implemented using id(), but behaves as if it might be. The statement "An Object’s identity is determined using the id() function" is a little misleading but it is difficult to express the abstract concept of object identity. Jeff Allen ___ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/65JITJFSG2MU4624N4TE4DMZJ4H6GEKO/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: PEP 617: New PEG parser for CPython
The PEP gives a good exposition of the problem and proposed solution, thanks. If I understand correctly, the proposal is that the PEG grammar should become the definitive grammar for Python at some point, probably for Python 3.10, so it may evolve without the LL(1) restrictions. I'd like to raise some points with respect to that, which perhaps the migration section could answer. When definitive, the grammar would not then just be for CPython, and would also appear as user documentation of the language. Whether that change leaves Python with a more useful (readable) grammar seems an important test of the idea. I'm looking at https://github.com/we-like-parsers/cpython/blob/pegen/Grammar/python.gram , and assuming that is indicative of a future definitive grammar. That may be incorrect, as it has these issues in my view: 1. It is decorated with actions in C. If a decorated grammar is offered as definitive, one with Python actions (operations on the AST) is preferable, as implementation neutral, although still hostage to AST changes that are not language changes. Maybe one stripped of actions is best. 2. It's quite long, and not at first glance more readable than the LL(1) grammar. I had understood ugliness in the LL(1) grammar to result from skirting limitations that PEG eliminates. The PEG one is twice as long, but recognising about half of it is actions, let's just say that as a grammar it's no shorter. 3. There is some manual guidance by means of &-guards, only necessary (I think) as a speed-up or to force out meaningful syntax errors. That would be noise to the reader. (This goes away if the PEG parser generator generates guards from the first set at a simple "no backtracking" marker.) 4. In some places, expansive alternatives seem to be motivated by the difference between actions, for a start, wherever async pops up. Maybe it is also why the definition of lambda is so long. That could go away with different support code (e.g. 
is_async as an argument), but if improvements to the support change grammar rules, when the language has not changed, that's a danger sign too. All that I think means that the "operational" grammar from which you build the parser is going to be quite unlike the one with which you communicate the language. At present ~/Grammar/Grammar both generates the parser (I thought) and appears as documentation. I take it to be the ideal that we use a single, human-readable definition. For example ANTLR 4 has worked hard to facilitate a grammar in which actions are implicit, and the generation of an AST from the parse tree/events can be elsewhere. (I'm not plugging ANTLR specifically as a solution.) Jeff Allen On 02/04/2020 19:10, Guido van Rossum wrote: Since last fall's core sprint in London, Pablo Galindo Salgado, Lysandros Nikolaou and myself have been working on a new parser for CPython. We are now far enough along that we present a PEP we've written: https://www.python.org/dev/peps/pep-0617/ Hopefully the PEP speaks for itself. We are hoping for a speedy resolution so we can land the code we've written before 3.9 beta 1. If people insist I can post a copy of the entire PEP here on the list, but since a lot of it is just background information on the old LL(1) and the new PEG parsing algorithms, I figure I'd spare everyone the need of reading through that. Below is a copy of the most relevant section from the PEP. I'd also like to point out the section on performance (which you can find through the above link) -- basically performance is on a par with that of the old parser. == Migration plan == This section describes the migration plan when porting to the new PEG-based parser if this PEP is accepted. The migration will be executed in a series of steps that allow initially to fallback to the previous parser if needed: 1. 
Before Python 3.9 beta 1, include the new PEG-based parser machinery in CPython with a command-line flag and environment variable that allows switching between the new and the old parsers together with explicit APIs that allow invoking the new and the old parsers independently. At this step, all Python APIs like ``ast.parse`` and ``compile`` will use the parser set by the flags or the environment variable and the default parser will be the current parser. 2. After Python 3.9 Beta 1 the default parser will be the new parser. 3. Between Python 3.9 and Python 3.10, the old parser and related code (like the "parser" module) will be kept until a new Python release happens (Python 3.10). In the meanwhile and until the old parser is removed, **no new Python Grammar addition will be added that requires the peg parser**. This means that the grammar will be kept LL(1) until the old parser is removed. 4. In Python 3.10, remove the old parser,
[Python-Dev] Re: Latest PEP 554 updates.
On 05/05/2020 16:45, Eric Snow wrote: On Mon, May 4, 2020 at 11:30 AM Eric Snow wrote: Further feedback is welcome, though I feel like the PR is ready (or very close to ready) for pronouncement. Thanks again to all. FYI, after consulting with the steering council I've decided to change the target release to 3.10, when we expect to have per-interpreter GIL landed. That will help maximize the impact of the module and avoid any confusion. I'm undecided on releasing a 3.9-only module on PyPI. If I do it will only be for folks to try it out early and I probably won't advertise it much. -eric Eric: Many thanks for working on this so carefully for so long. I'm happy to see the per-interpreter GIL will now be studied fully before final commitment to subinterpreters in the stdlib. I would have chipped in in those terms to the review, but others successfully argued for "provisional" inclusion, and I was content with that. My reason for worrying about this is that, while the C-API has been there for some time, it has not had heavy use in taxing cases AFAIK, and I think there is room for it to be incorrect. I am thinking more about Jython than CPython, but ideally they are the same structures. When I put the structures to taxing use cases on paper, they don't seem quite to work. Jython has been used in environments with thread-pools, concurrency, and multiple interpreters, and this aspect has had to be "fixed" several times. My use cases include sharing objects between interpreters, which I know the PEP doesn't. The C-API docs acknowledge that object sharing can't be prevented, but do their best to discourage it because of the hazards around allocation. Trouble is, I think it can happen unawares. The fact that Java takes on lifecycle management suggests it shouldn't be a fundamental problem in Jython. I know from other discussion it's where many would like to end up, even in CPython. This is all theory: I don't have even a model implementation, so I won't pontificate. 
However, I do have pictures, without which I find it impossible to think about this subject. I couldn't find your pictures, so share mine here (WiP): https://the-very-slow-jython-project.readthedocs.io/en/latest/architecture/interpreter-structure.html#runtime-thread-and-interpreter-cpython I would be interested in how you solve the problem of finding the current interpreter, discussed in the article. My preferred answer is: https://the-very-slow-jython-project.readthedocs.io/en/latest/architecture/interpreter-structure.html#critical-structures-revisited That's the API change I think is needed. It might not have a visible effect on the PEP, but it's worth bearing in mind the risk of exposing a thing you might shortly find you want to change. Jeff Allen ___ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/E2BMM2IVKMDJGWOWQWCSDZCNPZOKEJMJ/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Latest PEP 554 updates.
On 06/05/2020 21:52, Eric Snow wrote: On Wed, May 6, 2020 at 2:25 PM Jeff Allen wrote: ... My reason for worrying about this is that, while the C-API has been there for some time, it has not had heavy use in taxing cases AFAIK, and I think there is room for it to be incorrect. I am thinking more about Jython than CPython, but ideally they are the same structures. When I put the structures to taxing use cases on paper, they don't seem quite to work. Jython has been used in environments with thread-pools, concurrency, and multiple interpreters, and this aspect has had to be "fixed" several times. That insight would be super helpful and much appreciated. :) Is that all on the docs you've linked? As far as it goes. I intended to (will eventually) elaborate the more complex cases, such as concurrency and application server, where I think a Thread may have "history" in a runtime that should be ignored. There's more on my local repo, but not about this yet. I have linked you into one page of a large and rambling (at times) account of experiments I'm doing. Outside be dragons. The other thing I might point to would be Jython bugs that may be clues something is still wrong conceptually, or at least justify getting those concepts clear (https://bugs.jython.org issues 2642, 2507, 2513, 2846, 2465, 2107 to name a few). This is great stuff, Jeff! Thanks for sharing it. I was able to skim through but don't have time to dig in at the moment. I'll reply in detail as soon as I can. Thanks. I hope it's a positive contribution. Isn't PlantUML awesome? The key argument (or where I'm mistaken) is that, once you start sharing objects, only the function you call knows the right Interpreter (import context) to use, so in principle, it is different in every frame. You can't get to it from the current thread. 
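The key argument above can be sketched as a toy model in Python. This is only an illustration under stated assumptions: the names Interpreter, Function, which_config and _thread_state are hypothetical, invented here, and are not CPython or Jython API. The sketch assumes that a function records the interpreter (import context) it was defined in, and shows that once the function is shared, the calling thread's idea of the "current interpreter" gives a different import context.

```python
import threading


class Interpreter:
    """Hypothetical stand-in for an interpreter: just an import context."""

    def __init__(self, name, modules):
        self.name = name
        self.modules = dict(modules)  # plays the role of this interpreter's sys.modules

    def import_module(self, module_name):
        return self.modules[module_name]

    def define(self, body):
        """Define a function here; it remembers its defining interpreter."""
        return Function(self, body)


class Function:
    """Hypothetical function object carrying its interpreter."""

    def __init__(self, interpreter, body):
        self.interpreter = interpreter
        self.body = body

    def __call__(self, *args):
        # The "frame" takes its interpreter from the function, not the thread.
        return self.body(self.interpreter, *args)


# A per-thread "current interpreter", as a thread-based design would keep it.
_thread_state = threading.local()


def which_config(interp):
    """Body of a function that does the equivalent of 'import config'."""
    return interp.import_module("config")


interp_a = Interpreter("A", {"config": "config as imported in A"})
interp_b = Interpreter("B", {"config": "config as imported in B"})

f = interp_a.define(which_config)      # created in interpreter A ...
_thread_state.interpreter = interp_b   # ... but called by a thread "in" B

via_function = f()                                     # uses A's import context
via_thread = which_config(_thread_state.interpreter)   # uses B's import context
print(via_function)
print(via_thread)
```

In this model the two lookups disagree, which is the sense in which the right interpreter is a property of the function (and so of each frame) rather than of the thread.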
Jeff
[Python-Dev] Re: My take on multiple interpreters (Was: Should we be making so many changes in pursuit of PEP 554?)
On 12/06/2020 12:55, Eric V. Smith wrote:

On 6/11/2020 6:59 AM, Mark Shannon wrote: Different interpreters need to operate in their own isolated address space, or there will be horrible race conditions. Regardless of whether that separation is done in software or hardware, it has to be done.

I realize this is true now, but why must it always be true? Can't we fix this? At least one solution has been proposed: passing around a pointer to the current interpreter. I realize there are issues here, like callbacks and signals that will need to be worked out. But I don't think it's axiomatically true that we'll always have race conditions with multiple interpreters in the same address space. Eric

Axiomatically? No, but let me rise to the challenge. If
(1) interpreters manage the life-cycle of objects, and
(2) a race condition arises when the life-cycle or state of an object is accessed by the interpreter that did not create it, and
(3) an object will sometimes be passed to an interpreter that did not create it, and
(4) an interpreter with a reference to an object will sometimes access its life-cycle or state,
then
(5) a race condition will sometimes arise.
This seems to be true (as a deduction) if all the premises hold.

(1) and (2) are true in CPython as we know it. (3) is prevented (completely?) by the Python API, but not at all by the C API. (4) is implicit in an interpreter having access to an object, the way CPython and its extensions are written, so (5) follows in the case that the C API is used. You could change (1) and/or (2), maybe (4).

"Passing around a pointer to the current interpreter" sounds like an attempt to break (2) or maybe (4). But I don't understand "current". What you need at any time is the interpreter (state and life-cycle manager) for the object you're about to handle, so that the receiving interpreter can delegate the action, instead of crashing ahead itself.
This suggests a reference to the interpreter must be embedded in each object, but it could be implicit in the memory address. There is then still an issue that the owning interpreter has to be thread-safe (if there are threads) in the sense that it can serialise access to object state or life-cycle. If serialisation is by a GIL, the receiving interpreter must take the GIL of the owning interpreter, and we are somewhat back where we started. Note that the "current interpreter" is not a function of the current thread (or vice-versa). The current thread is running in both interpreters, and by hypothesis, so are the competing threads. Can I just point out that, while most of this argument concerns a particular implementation, we have a reason in Python (the language) for an interpreter construct: it holds the current module context, so that whenever code is executing, we can give definite meaning to the 'import' statement. Here "current interpreter" does have a meaning, and I suggest it needs to be made a property of every function object as it is defined, and picked up when the execution frame is created. This *may* help with the other, internal, use of interpreter, for life-cycle and state management, because it provides a recognisable point (function call) where one may police object ownership, but that isn't why you need it. Jeff Allen ___ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/GACVQJNCZLT4P3YX5IISRBOQTXXTJVMB/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: My take on multiple interpreters (Was: Should we be making so many changes in pursuit of PEP 554?)
On 17/06/2020 19:28, Eric V. Smith wrote: On 6/17/2020 12:07 PM, Jeff Allen wrote: If (1) interpreters manage the life-cycle of objects, and (2) a race condition arises when the life-cycle or state of an object is accessed by the interpreter that did not create it, and (3) an object will sometimes be passed to an interpreter that did not create it, and (4) an interpreter with a reference to an object will sometimes access its life-cycle or state, then (5) a race condition will sometimes arise. This seems to be true (as a deduction) if all the premises hold. I'm assuming that passing an object between interpreters would not be supported. It would require that the object somehow be marshalled between interpreters, so that no object would be operated on outside the interpreter that created it. So 2-5 couldn't happen in valid code. The Python level doesn't support it, prevents it I think, and perhaps the implementation doesn't support it, but nothing can stop C actually doing it. I would agree that with sufficient discipline in the code it should be possible to prevent the worlds from colliding. But it is difficult, so I think that is why Mark is arguing for a separate address space. Marshalling the value across is supported, but that's just the value, not a shared object. Sorry for being loose with terms. If I want to create an interpreter and execute it, then I'd allocate and initialize an interpreter state object, then call it, passing the interpreter state object in to whatever Python functions I want to call. They would in turn pass that pointer to whatever they call, or access the state through it directly. That pointer is the "current interpreter". I think that can work if you have disciplined separation, which you are assuming. I think you would pass the function to the interpreter, not the other way around. I'm assuming this is described from the perspective of some C code and your Python functions are PyFunction objects, not just text? 
What, however, prevents you creating that function in one interpreter and giving it to another? The function, and any closure or defaults are owned by the creating interpreter. There's a lot of state per interpreter, including the module state. See "struct _is" in Include/internal/pycore_interp.h. So much more than when I last looked! Look back in time and interpreter state mostly contains the module context (in a broad sense that includes shortcuts to sys, builtins, codec state, importlib). Ok, there's some stuff about exit handling and debugging too. The recent huge growth is to shelter previously singleton object allocation mechanisms, a consequence of the implementation choice that gives the interpreter object that responsibility too. I'm not saying this is wrong, just that it's not a concept in Python-the-language, while the module state is. Jeff ___ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/ESNK7A5UFBQOQXKUDWCUMS2372AL7ZPU/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: [Python-ideas] Re: Amend PEP-8 to require clear, understandable comments instead of Strunk & White Standard English comments
On 01/07/2020 21:01, Ethan Furman wrote:

A not-great article, White Fears of Dispossession: Dreyer's English, The Elements of Style, and the Racial Mapping of English Discourse, here: http://radicalteacher.library.pitt.edu/ojs/radicalteacher/issue/view/19/25

Thanks for posting this. (What a lot of work you must've done to find it.) As a result I feel I have a much better understanding of the environment in which these thought processes (those displayed in the commit message) would be considered rational, even admirable. Food for thought.

E. B. White was born in New York -- I believe that's in the northern part of the United States, otherwise known as "The North" or the side that fought to end slavery. E. B. White was educated at Cornell.

We should acknowledge that he famously showed an interest in web development and invented a sort of mouse. ;-)

Jeff Allen
[Python-Dev] Re: Memory address vs serial number in reprs
On 19/07/2020 16:38, Serhiy Storchaka wrote:

I have a problem with the location of the hexadecimal memory address in custom reprs.

I agree they are mostly "noise", and difficult to distinguish when you need to.

What if we use serial numbers to differentiate instances? where the serial number starts with 1 and is increased for every new instance of that type.

What would happen at a __class__ assignment? IIUC class assignability is an equivalence relation amongst types: serial numbers would have to be unique within the equivalence class, not within the type. Otherwise, they would have to change (unlike id()), and might not round-trip if __class__ were assigned there and back.

Jeff Allen
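The __class__ question in the message above can be tried out with a minimal sketch. It assumes one possible reading of the proposal, namely a separate counter per type starting at 1; the Serialised base class and its _counters table are hypothetical names made up for this illustration and are not part of any actual proposal or of CPython.

```python
import itertools


class Serialised:
    """Hypothetical base class: each type numbers its own instances from 1."""

    _counters = {}  # maps a type to its own itertools.count

    def __new__(cls, *args, **kwargs):
        self = super().__new__(cls)
        counter = Serialised._counters.setdefault(cls, itertools.count(1))
        self._serial = next(counter)
        return self

    def __repr__(self):
        return f"<{type(self).__name__} #{self._serial}>"


class A(Serialised):
    pass


class B(Serialised):
    pass


a1 = A()
b1 = B()
print(repr(a1), repr(b1))  # distinct: the type names differ

# A and B have compatible layouts, so this assignment is allowed.
b1.__class__ = A
print(repr(a1), repr(b1))  # now indistinguishable, though a1 is not b1
```

With per-type counters the two distinct objects end up with the same repr after the assignment, which is why uniqueness would be needed across the whole class of mutually assignable types, or else the number would have to change.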
[Python-Dev] Re: Procedure for trivial PRs
On 13/08/2020 21:32, Facundo Batista wrote:

On Thu, 13 Aug 2020 at 16:55, Mariatta ([email protected]) wrote:

On Thu, Aug 13, 2020 at 12:51 PM Facundo Batista wrote: It's waiting for a "core review", which is a good thing (and by all means welcomed). But as we're saturated with PRs, the fix is small, and I'm a core developer myself... shall I wait for a review from another core developer, or should I just land it?

As a core dev you can still merge it yourself without needing to wait for review, even when it has the "awaiting core review" label.

Awesome, thanks for the help.

This link to the relevant part of the dev-guide may be helpful: https://devguide.python.org/devcycle/#beta . As I read it, peer reviews become mandatory at rc1 and are at your discretion before that.

It will seem odd that I should know or care about this, not having any relevant rights over CPython, but I'm trying to adopt it in the projects where I do.

Jeff Allen
[Python-Dev] Re: Understanding why object defines rich comparison methods
On 22/09/2020 12:13, Serhiy Storchaka wrote:

Because object.__eq__ and object.__ne__ exist.

I didn't quite get the logic of this. In case anyone (not Steven) is still puzzled as I was, I think one could say: ... because tp_richcompare is filled, *so that* object.__eq__ and object.__ne__ will exist to support default (identity) object comparison. And then __lt__, etc. get defined because, as Serhiy says,

... If you define slot tp_richcompare in C, it is exposed as 6 methods __eq__, __ne__, __lt__, __le__, __gt__ and __ge__.

By "exposed" we mean each descriptor (PyWrapperDescrObject) points to a different C function (generated by the RICHCMP_WRAPPER macro). The function summons the tp_richcompare slot function in the left object's type, with arguments (left, right, op). Simple (not).

Jeff Allen
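The effect described in the message above is visible from Python. This is a small sketch of observable behaviour only; it does not show the C-level wrappers, and the class name Plain is just an example.

```python
class Plain:
    """A class that defines no comparison methods of its own."""


a, b = Plain(), Plain()

# All six rich comparison methods are present on object itself.
comparison_names = ["__eq__", "__ne__", "__lt__", "__le__", "__gt__", "__ge__"]
defined_on_object = [name for name in comparison_names if name in vars(object)]
print(defined_on_object)

# __eq__ implements the default identity comparison ...
identity_eq = object.__eq__(a, a)
other_eq = object.__eq__(a, b)

# ... while the ordering methods exist but decline to answer.
ordering = object.__lt__(a, b)
print(identity_eq, other_eq, ordering)

# Because both operands decline, the < operator itself fails.
try:
    a < b
    ordering_error = None
except TypeError as exc:
    ordering_error = exc
print(ordering_error)
```

So the ordering methods are there only as a by-product of filling the one slot: they return NotImplemented, and the operator then raises TypeError.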
[Python-Dev] Re: [python-committers] Resignation from Stefan Krah
On 10/10/2020 00:56, Brett Cannon wrote:

On Fri, Oct 9, 2020 at 2:55 PM Toshio Kuratomi wrote: One thing I would suggest, though, is documenting and, in general, following a sequence of progressively more strict interventions by the steering committee. I think that just as it is harmful to the community to let bad behavior slide, it is also harmful to the community to not know that the steering committee's enforcement is in measured steps which will telegraph the committee's intentions and the member's responsibilities well in advance.

Documenting exact steps is really hard when it comes to a Code of Conduct. Every case is unique and so rigid rules don't typically work well, e.g. requiring everyone to get a warning first would mean I could [...] way more and still be here without technical ramifications because we said, "you always get a warning first".

This is so painful I'm reluctant to add to the pile, so I'll be succinct (at risk of sounding brusque). Personally I find it a weak argument that the SC should not codify a system of warnings because some cases go bad so quickly that you have to act immediately. This may be necessary for drive-by trolls with a point to make. It would be rare in anyone with significant standing in the PSF. Anyway, you can have both.

I realise that core developer status is not employment, but I think there is a model worth considering in this: https://www.gov.uk/dismiss-staff/dismissals-on-capability-or-conduct-grounds#disciplinary-procedures . This is guidance, not law over here, but an employment tribunal would take it as a definition of reasonable, so most decent employers adopt it as a policy.

I have been asked personally and privately multiple times over the years to step in and mediate conduct issues with Stefan.
Tack on a Conduct WG warning from just earlier this year and the multiple incidents subsequently and that's how I at least reached my decision that this was a reasonable approach to take. Sounds like you were doing roughly as Toshio recommends anyway (the decent thing as I'd expect), but maybe explicit is better? Jeff Allen ___ Python-Dev mailing list -- [email protected] To unsubscribe send an email to [email protected] https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/[email protected]/message/FBUJDYPYTCTJDZUQTY2WX3PW4OEZ2XZR/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Understanding the buffer API
I'm implementing the buffer API and some of memoryview for Jython. I have read with interest, and mostly understood, the discussion in Issue #10181 that led to the v3.3 re-implementation of memoryview and much-improved documentation of the buffer API. Although Jython is targeting v2.7 at the moment, and 1-D bytes (there's no Jython NumPy), I'd like to lay a solid foundation that benefits from the recent CPython work. I hope that some of the complexity in memoryview stems from legacy considerations I don't have to deal with in Jython.

I am puzzled that PEP 3118 makes some specifications that seem unnecessary and complicate the implementation. Would those who know the API inside out answer a few questions?

My understanding is this: When a consumer requests a buffer from the exporter it specifies using flags how it intends to navigate it. If the buffer actually needs more apparatus than the consumer proposes, this raises an exception. If the buffer needs less apparatus than the consumer proposes, the exporter has to supply what was asked for. For example, if the consumer sets PyBUF_STRIDES, and the buffer can only be navigated by using suboffsets (PIL-style) this raises an exception. Alternatively, if the consumer sets PyBUF_STRIDES, and the buffer is just a simple byte array, the exporter has to supply shape and strides arrays (with trivial values), since the consumer is going to use those arrays.

Is there any harm in supplying shape and strides when they were not requested? The PEP says: "PyBUF_ND ... If this is not given then shape will be NULL". It doesn't stipulate that strides will be null if PyBUF_STRIDES is not given, but the library documentation says so. suboffsets is different since even when requested, it will be null if not needed.

Similar, but simpler, the PEP says "PyBUF_FORMAT ... If format is not explicitly requested then the format must be returned as NULL (which means "B", or unsigned bytes)". What would be the harm in returning "B"?
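For orientation, the "trivial values" in question can be seen from Python in CPython 3.3 or later, where memoryview reports full navigation information even for a plain bytes object. This sketch only shows the Python-level attributes; it does not show what a C consumer passing PyBUF_SIMPLE would find in the Py_buffer struct, which is the case the questions above are about.

```python
data = b"abc"
view = memoryview(data)

# Full description of a simple, read-only, contiguous byte buffer.
print(view.format)      # item type code
print(view.ndim)        # number of dimensions
print(view.shape)       # length in each dimension
print(view.strides)     # bytes to step in each dimension
print(view.suboffsets)  # empty: no indirect (PIL-style) addressing
print(view.readonly, view.c_contiguous)
```

Here the shape, strides and format are exactly what a consumer would have assumed had it received none of them.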
One place where this really matters is in the implementation of memoryview. PyMemoryView requests a buffer with the flags PyBUF_FULL_RO, so even a simple byte buffer export will come with shape, strides and format. A consumer (of the memoryview's buffer API) might specify PyBUF_SIMPLE: according to the PEP I can't simply give it the original buffer since required fields (that the consumer will presumably not access) are not NULL. In practice, I'd like to: what could possibly go wrong? Jeff Allen ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Understanding the buffer API
Thanks for a swift reply: you're just the person I hoped would do so.

On 04/08/2012 10:11, Stefan Krah wrote:

You are right that the PEP does not explicitly state that rule for strides. However, NULL always has an implied meaning:
format=NULL -> treat the buffer as unsigned bytes.
shape=NULL -> one-dimensional AND treat the buffer as unsigned bytes.
strides=NULL -> C-contiguous
I think relaxing the NULL rule for strides would complicate things, since it would introduce yet another special case. ...

Ok, I think I see how the absence of certain arrays is used to deduce structural simplicity, over and above their straightforward use in navigating the data. So although no shape array is (sort of) equivalent to ndim==1, shape[0]==len, it also means I can call simpler code instead of using the arrays for navigation.

I still don't see why, if the consumer says "I'm assuming 1-D unsigned bytes", and that's what the data is, memoryview_getbuf could not provide a shape and strides that agree with the data. Is the catch perhaps that there is code (in abstract.c etc.) that does not know what the consumer promised not to use/look at? Would it actually break, e.g. not treat it as bytes, or just be inefficient?

Because of all the implied meanings of NULL, I think the safest way is to implement memoryview_getbuf() for Jython. After all the PEP describes a protocol, so everyone should really be doing the same thing.

I'll look carefully at what you've written (snipped here) because it is these "consumer expectations" that are most important. The Jython buffer API is necessarily a lot different from the C one: some things are not possible in Java (pointer arithmetic) and some are just un-Javan activities (allocate a struct and have the library fill it in). I'm only going for a logical conformance to the PEP: the same navigational and other attributes, that mean the same things for the consumer. When you say such-and-such is disallowed, but the PEP or the data structures seem to provide for it, you mean memoryview_getbuf() disallows it, since you've concluded it is not sensible?

I think the protocol would benefit from changing the getbuffer rules to:
a) The buffer gets a 'flags' field that can store properties like PyBUF_SIMPLE, PyBUF_C_CONTIGUOUS etc.
b) The exporter must *always* provide full information.
c) If a buffer can be exported as unsigned bytes but has a different layout, the exporter must perform a full cast so that the above mentioned invariants are kept.

Just like PyManagedBuffer mbuf and its sister view in memoryview? I've thought the same things, but the tricky part is to do it compatibly.

a) I think I can achieve this. As I have interfaces and polymorphism on my side, and a commitment only to logical equivalence to CPython, I can have the preserved flags stashed away inside to affect behaviour. But it's not as simple as saving the consumer's request, and I'm still trying to work out what to do, e.g. when the consumer didn't ask for C-contiguity, but in this case it happens to be true. In the same way, functions you have in abstract.c etc. can be methods that, rather than work out by inspection of a struct how to navigate the data on this call, already know what kind of buffer they are in. So SimpleBuffer.isContiguous(char order) can simply return true.

b) What I'm hoping can work, but maybe not.

c) Java will not of course give you raw memory it thinks is one thing, to treat as another, so this aspect is immature in my thinking. I got as far as accommodating multi-byte items, but have no use for them as yet.

Thanks again for the chance to test my ideas.

Jeff Allen
Re: [Python-Dev] Understanding the buffer API
- Summary:
The PEP, or sometimes just the documentation, definitely requires that
features not requested shall be NULL.
The API would benefit from:
a. stored flags that tell you the actual structural features.
b. requiring exporters to provide full information (e.g. strides =
{1}, format = "B") even when trivial.
It could and possibly should work this way in Python 4.0.
Nick thinks we could *allow* exporters to behave this way (PEP change)
in Python 3.x. Stefan thinks not, because "Perhaps there is code that
tests for shape==NULL to determine C-contiguity."
Jython exporters should return full information unconditionally from the
start: "any implementation that doesn't use the Py_buffer struct
directly in a C-API should just always return a full buffer" (Stefan);
"I think that's the way Jython should go: *require* that those fields be
populated appropriately" (Nick).
- But what I now think is:
_If the only problem really is_ "code that tests for shape==NULL to
determine C-contiguity", or makes similar deductions, I agree that
providing unasked-for information is _safe_. I think the stipulation in
PEP/documentation has some efficiency value: on finding shape!=NULL the
code has to do a more complicated test, as in PyBuffer_IsContiguous(). I
have the option to provide an isContiguous that has the answer written
down already, so the risk is only from/to ported code. If it is only a
risk to the efficiency of ported code, I'm relaxed: I hesitate only to
check that there's no circumstance that logically requires nullity for
correctness. Whether it was safe was the key question.
In the hypothetical Python 4.0 buffer API (and in Jython) where feature
flags are provided, the efficiency is still useful, but complicated
deductive logic in the consumer should be deprecated in favour of
(functions for) interrogating the flags.
An example illustrating the semantics would then be:
1. consumer requests a buffer, saying "I can cope with strided arrays"
(PyBUF_STRIDED);
2. exporter provides a strides array, but in the feature flags
STRIDED=0, meaning "you don't need the strides array";
3. consumer (optionally) uses efficient, non-strided access.
_I do not think_ that full provision by the exporter has to be
_mandatory_, as the discussion has gone on to suggest. I know your
experience is that you have often had to regenerate the missing
information to write generic code, but I think this does not continue
once you have the feature flags. An example would be:
1. consumer requests a buffer, saying "I can cope with N-dimensional
but not strided arrays" (PyBUF_ND);
2. exporter sets strides=NULL, and the feature flag STRIDED=0;
3. consumer accesses the data, without reference to the strides array,
as it planned;
4. new generic code that respects the feature flag STRIDED=0, does not
reference the strides array;
5. old generic code, ignorant of the feature flags, finds the
strides=NULL and so does not dereference strides.
Insofar as it is not necessary, there is some efficiency in not
providing it. There would only be a problem with broken code that both
ignores the feature flag and uses the strides array unchecked. But this
code was always broken.
Really useful discussion this.
Jeff
Re: [Python-Dev] Jython roadmap
On 21/08/2012 06:34, [email protected] wrote:

Quoting "Juancarlo Añez (Apalala)": It seems that Jython is under the Python Foundation, but I can't find a roadmap, a plan, or instructions about how to contribute to it reaching 2.7 and 3.3. Are there any pages that describe the process?

Hi Juanca, These questions are best asked on the jython-dev mailing list, see

Hi Juancarlo: I'm cross-posting this for you on jython-dev as Martin is right. Let's continue there.

Jython does need new helpers and I agree it isn't very easy to get started. And we could do with a published roadmap. I began by fixing a few bugs (about a year ago now), as that seemed to be the suggestion on-line and patches can be offered unilaterally. (After a bit of nagging) some of these got reviewed and I'd won my spurs.

I found the main difficulty to be understanding the source, or rather the architecture: there is too little documentation and some of what you can find is out of date (svn?). A lot of basic stuff is still a complete mystery to me. As I've discovered things I've put them on the Jython Wiki ( http://wiki.python.org/jython/JythonDeveloperGuide ) in the hope of speeding others' entry, including up-to-date description of how to get the code to build in Eclipse.

One place to look, that may not occur to you immediately, is Frank Wierzbicki's blog ( http://fwierzbicki.blogspot.co.uk/ ). Frank is the project manager for Jython, an author of the Jython book, and has worked like a Trojan (the good kind, not the horse) over the last 6 months. Although Frank has shared inklings of a roadmap, it must be difficult to put dates to things that depend on a small pool of volunteers working in their spare time -- especially perfectionist volunteers who write more Javadoc than actual code, then delete it all because they've had a better idea :-). Direction of travel is easier: 2.5.3 is out, we're trying to get to 2.7b, but with an eye on 3.3.

I haven't seen anything systematic on what's still to do, who's doing it, and where the gaps are, which is probably what you're looking for. ... Frank?

Jeff Allen
Re: [Python-Dev] Py_buffer.obj documentation
On 29/08/2012 22:28, Alexander Belopolsky wrote: I am trying to reconcile this section in 3.3 documentation: """ void *obj A new reference to the exporting object. The reference is owned by the consumer and automatically decremented and set to NULL by PyBuffer_Release(). with the following comment in the code (Objects/memoryobject.c:762): /* info->obj is either NULL or a borrowed reference. This reference should not be decremented in PyBuffer_Release(). */ I've studied this code in the interests of reproducing something similar for Jython. The comment is in the context of PyMemoryView_FromBuffer(Py_buffer *info), at a point where the whole info struct is being copied to mbuf->master, then the code sets mbuf->master.obj = NULL. I think the comment means that the caller, which is in the role of consumer to the original exporter, owns the info struct and therefore the reference info.obj. That caller will eventually call PyBuffer_Release(info), which will result in a DECREF(obj) matching the INCREF(obj) that happened during bf_getbuffer(info). In this sense obj is a borrowed reference as far as the memoryview is concerned. mbuf->master must not also keep a reference, or it risks making a second call to DECREF(obj). Jeff Allen ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: Issue #16592: stringlib_bytes_join doesn't raise MemoryError on allocation
On 02/12/2012 07:08, Nick Coghlan wrote:

On Sun, Dec 2, 2012 at 4:56 PM, christian.heimes wrote: ... <http://hg.python.org/cpython/rev/9af5a2611202> diff --git a/Misc/NEWS b/Misc/NEWS ... +- Issue #16592: stringlib_bytes_join doesn't raise MemoryError on allocation + failure.

Please don't write NEWS entries in past tense like this - they're annoyingly ambiguous, as it isn't clear whether the entry is describing the reported problem or the fix for the problem. Describing just the new behaviour or the original problem and the fix is much easier to follow. For example:
- Issue #16592: stringlib_bytes_join now correctly raises MemoryError on allocation failure.
- Issue #16592: stringlib_bytes_join was triggering SystemError on allocation failure. It now correctly raises MemoryError.
Issue titles for actual bugs generally don't make good NEWS entries, as they're typically a summary of the problem rather than the solution (RFE's are different, as there the issue title is often a good summary of the proposed change)

You mean please do (re-)write such statements in the past tense, when the news is that the statement is no longer true. I agree about the ambiguity that arises here, but there's a simple alternative to re-writing. Surely all that has been forgotten here is an enclosing "The following issues have been resolved:"? I think there's a lot to be said for cut and paste of actual titles on grounds of accuracy and speed (and perhaps scriptability). E.g. http://hg.python.org/jython/file/661a6baa10da/NEWS

Jeff Allen
Re: [Python-Dev] Understanding the buffer API
On 08/08/2012 11:47, Stefan Krah wrote:
Nick Coghlan wrote:
It does place a constraint on consumers that they can't assume those fields will be NULL just because they didn't ask for them, but I'm struggling to think of any reason why a client would actually *check* that instead of just assuming it.

Can we continue this discussion some other time, perhaps after 3.3 is out? I'd like to respond, but need a bit more time to think about it than I have right now (for this issue).

Those who contributed to the design of it through discussion here may be interested in how this has turned out in Jython. Although Jython is still at a 2.7 alpha, the buffer API has proved itself in a few parts of the core now and feels reasonably solid. It works for bytes in one dimension. There's a bit of description here: http://wiki.python.org/jython/BufferProtocol

Long story short, I took the route of providing all information, which makes the navigational parts of the flags argument unequivocally a statement of what navigation the client is assuming will be sufficient. (The exception, if thrown, says explicitly that it won't be enough.) It follows that if two clients want a view on the same object, an exporter can safely give them the same one. Buffers take care of export counting for the exporter (as in the bytearray resize lock), and buffers can give you a sliced view of themselves without help from the exporter. The innards of memoryview are much simpler for all this and enable it to implement slicing (as in CPython 3.3) in one dimension. There may be ideas worth stealing here if the CPython buffer is revisited.

N-dimensional arrays and indirect addressing, while supported in principle, have no implementation. I'm fairly sure multi-byte items, as a way to export arrays of other types, make no sense in Java, where type security is strict and a parallel but type-safe approach will be needed.
Jeff Allen
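The two behaviours mentioned above, the bytearray resize lock and a view slicing itself, can be seen with CPython 3's own bytearray and memoryview. This is a minimal sketch of the CPython behaviour the message refers to, not of the Jython implementation; all variable names are illustrative.

```python
data = bytearray(b"hello world")
view = memoryview(data)

# While a buffer is exported, the exporter refuses to change size.
try:
    data.extend(b"!")
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

# A view can hand out a slice of itself; no bytes are copied.
tail = view[6:]
data[6] = ord("W")          # same-size write is allowed while exported
tail_after = bytes(tail)    # the slice sees the change in the exporter

# Once every view is released, the exporter may be resized again.
tail.release()
view.release()
data.extend(b"!")
```

The same-size assignment succeeds because only operations that would move or resize the underlying storage are blocked while exports exist.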
Re: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors
On 13/01/2013 00:41, Victor Stinner wrote:
PEP: 433
Title: Add cloexec argument to functions creating file descriptors
Status: Draft

The PEP is still a draft. I'm sending it to python-dev to get a first review. The main question is the choice between the 3 different options:
* don't set close-on-exec flag by default
* always set close-on-exec flag
* add sys.setdefaultcloexec() to leave the choice to the application
Victor

Nice clear explanation. I think io, meaning _io and _pyio really, would be amongst the impacted modules, and should perhaps be in the examples. (I am currently working on the Jython implementation of the _io module.) It seems to me that io.open, and probably all the constructors, such as _io.FileIO, would need the extra information as a mode or a boolean argument like closefd. This may be a factor in your choice above.

Other things I noticed were minor, and I infer that they should wait until principles are settled.

Jeff Allen
Re: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors
On 17/01/2013 13:02, Victor Stinner wrote:
2013/1/13 Jeff Allen:
I think io, meaning _io and _pyio really, would be amongst the impacted
modules, and should perhaps be in the examples. (I am currently working on
the Jython implementation of the _io module.) It seems to me that io.open,
and probably all the constructors, such as _io.FileIO, would need the extra
information as a mode or a boolean argument like closefd. This may be a
factor in your choice above.
open() is listed in the PEP: open is io.open. I plan to add cloexec
parameter to open() and FileIO constructor (and so io.open and
_pyio.open). Examples:
rawfile = io.FileIO("test.txt", "r", cloexec=True)
textfile = open("text.txt", "r", cloexec=True)
OK, it fell under "too obvious to mention". And under my ignorance of v3.x,
sorry. (We're still working on 2.7 in Jython, where open is from
__builtin__.) Thanks for the reply.
Jeff Allen
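For readers unfamiliar with the flag under discussion, close-on-exec can already be set on an existing descriptor through the standard fcntl module on Unix-like systems. The following is a hedged sketch only: it does not use the cloexec keyword proposed in the draft PEP, and the helper names set_cloexec and get_cloexec are hypothetical.

```python
import fcntl
import tempfile

def set_cloexec(fd, cloexec=True):
    """Set or clear the close-on-exec flag on an open file descriptor."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if cloexec:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)

def get_cloexec(fd):
    """Report whether the close-on-exec flag is currently set."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)

with tempfile.TemporaryFile() as f:
    set_cloexec(f.fileno(), True)
    flag_set = get_cloexec(f.fileno())
    set_cloexec(f.fileno(), False)
    flag_cleared = get_cloexec(f.fileno())
```

Setting the flag in a separate call after the descriptor is created leaves a window in which another thread could fork and exec, which is part of why an argument at creation time is attractive.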
