Re: [Python-Dev] Add a new "locale" codec?
2012/2/8 Simon Cross :
> Is the idea to have:
>
> b"foo".decode("locale")
>
> be roughly equivalent to
>
> encoding = locale.getpreferredencoding(False)
> b"foo".decode(encoding)
>
> ?
Yes. Whereas:
b"foo".decode(sys.getfilesystemencoding())
is equivalent to
encoding = locale.getpreferredencoding(True)
b"foo".decode(encoding)
Victor
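[Editor's sketch] The equivalence described above can be tried out with a small helper. This is only an illustration of the proposed behaviour, not the codec implementation; the name decode_with_locale is hypothetical and is not part of any proposal in this thread.

```python
import locale


def decode_with_locale(data, do_setlocale=False):
    """Decode bytes using the encoding reported for the current locale.

    Hypothetical helper: with do_setlocale=False it mirrors what
    b"...".decode("locale") is described as doing in this thread.
    """
    encoding = locale.getpreferredencoding(do_setlocale)
    return data.decode(encoding)


# The result depends on the locale of the process running this code.
print(locale.getpreferredencoding(False), decode_with_locale(b"foo"))
```

The ASCII input is chosen so that the result is the same under any ASCII-compatible locale encoding.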
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Add a new "locale" codec?
I think I'm -1 on a "locale" encoding because it refers to different
actual encodings depending on where and when it's run, which seems
surprising, and there's already a more explicit way to achieve the
same effect.
The documentation on .getpreferredencoding() says some scary things
about needing to call .setlocale() sometimes but doesn't really say
when or why. Could any of those cases make "locale" do weird things
because it doesn't call setlocale()?
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On Wed, Feb 8, 2012 at 08:37, Stefan Behnel wrote:
> I didn't get a response from him to my e-mails since early 2010. Maybe
> others have more luck if they try, but I don't have the impression that
> waiting another two years gets us anywhere interesting.
>
> Given that it was two months ago that I started the "Fixing the XML
> batteries" thread (and years since I brought up the topic for the first
> time), it seems to be hard enough already to get anyone on python-dev
> actually do something for Python's XML support, instead of just actively
> discouraging those who invest time and work into it.
I concur. It's important that we consider Fredrik's ownership of the
modules, but if he fails to reply to email and doesn't update his
repositories, there should be enough cause for python-dev to go on and
appropriate the stdlib versions of those modules.
Cheers,
Dirkjan
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On Wed, Feb 8, 2012 at 11:36, Dirkjan Ochtman wrote:
> On Wed, Feb 8, 2012 at 08:37, Stefan Behnel wrote:
>> I didn't get a response from him to my e-mails since early 2010. Maybe
>> others have more luck if they try, but I don't have the impression that
>> waiting another two years gets us anywhere interesting.
>>
>> Given that it was two months ago that I started the "Fixing the XML
>> batteries" thread (and years since I brought up the topic for the first
>> time), it seems to be hard enough already to get anyone on python-dev
>> actually do something for Python's XML support, instead of just actively
>> discouraging those who invest time and work into it.
>
> I concur. It's important that we consider Fredrik's ownership of the
> modules, but if he fails to reply to email and doesn't update his
> repositories, there should be enough cause for python-dev to go on and
> appropriate the stdlib versions of those modules.
>
+1.
That said, I think that the particular change discussed in this thread
can be made anyway, since it doesn't really modify ET's APIs or
functionality, just the way it gets imported from stdlib.
Eli
Re: [Python-Dev] Add a new "locale" codec?
On 2012-02-08 09:28, Simon Cross wrote:
> I think I'm -1 on a "locale" encoding because it refers to different
> actual encodings depending on where and when it's run, which seems
> surprising, and there's already a more explicit way to achieve the
> same effect.
I'd agree that this is undesirable, and I don't really want
locale-specific behaviour to leak out in other places that accept an
encoding name (eg ), but we already have this behaviour with the "mbcs"
encoding on Windows which refers to the locale-specific 'ANSI' code
page.
--
And Clover
mailto:[email protected]
http://www.doxdesk.com/
gtalk:[email protected]
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On 8 February 2012 09:49, Eli Bendersky wrote:
>> I concur. It's important that we consider Fredrik's ownership of the
>> modules, but if he fails to reply to email and doesn't update his
>> repositories, there should be enough cause for python-dev to go on and
>> appropriate the stdlib versions of those modules.
>
> +1.
>
> That said, I think that the particular change discussed in this thread
> can be made anyway, since it doesn't really modify ET's APIs or
> functionality, just the way it gets imported from stdlib.
I would suggest that, assuming python-dev want to take ownership of
the module, one last-ditch attempt be made to contact Fredrik. We
should email him, and copy python-dev (and maybe even python-list)
asking for his view, and ideally his blessing on the stdlib version
being forked and maintained independently going forward. Put a time
limit on responses ("if we don't hear by XXX, we'll assume Fredrik is
either uncontactable or not interested, and therefore we can go ahead
with maintaining the stdlib version independently").
It's important to respect Fredrik's wishes and ownership, but we can't
leave part of the stdlib frozen and abandoned just because he's not
available any longer.
Paul.
PS The only other options I can see are to remove elementtree from the
stdlib altogether, or explicitly document it as frozen and no longer
maintained.
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On Wed, 8 Feb 2012 11:11:07 +
Paul Moore wrote:
> On 8 February 2012 09:49, Eli Bendersky wrote:
> >> I concur. It's important that we consider Fredrik's ownership of the
> >> modules, but if he fails to reply to email and doesn't update his
> >> repositories, there should be enough cause for python-dev to go on and
> >> appropriate the stdlib versions of those modules.
> >
> > +1.
> >
> > That said, I think that the particular change discussed in this thread
> > can be made anyway, since it doesn't really modify ET's APIs or
> > functionality, just the way it gets imported from stdlib.
>
> I would suggest that, assuming python-dev want to take ownership of
> the module, one last-ditch attempt be made to contact Fredrik. We
> should email him, and copy python-dev (and maybe even python-list)
> asking for his view, and ideally his blessing on the stdlib version
> being forked and maintained independently going forward. Put a time
> limit on responses ("if we don't hear by XXX, we'll assume Fredrik is
> either uncontactable or not interested, and therefore we can go ahead
> with maintaining the stdlib version independently").
>
> It's important to respect Fredrik's wishes and ownership, but we can't
> leave part of the stdlib frozen and abandoned just because he's not
> available any longer.
It's not frozen, it's actually maintained.
Regards
Antoine.
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On Wed, Feb 8, 2012 at 10:04 PM, Antoine Pitrou wrote:
> On Wed, 8 Feb 2012 11:11:07 +
> Paul Moore wrote:
>> It's important to respect Fredrik's wishes and ownership, but we can't
>> leave part of the stdlib frozen and abandoned just because he's not
>> available any longer.
>
> It's not frozen, it's actually maintained.
Indeed, it sounds like the most appropriate course (if we don't hear
otherwise from Fredrik) may be to just update PEP 360 to acknowledge
current reality (i.e. the most current release of ElementTree is
actually the one maintained by Florent in the stdlib).
I'll note that this change isn't *quite* as simple as Eli's
description earlier in the thread may suggest, though - the test suite
also needs to be updated to ensure that the Python version is still
fully exercised without the C acceleration applied.
And such an alteration would definitely be an explicit fork, even
though the user facing API doesn't change - we're changing the
structure of the code in a way that means some upstream deltas (if
they happen to occur) may not apply cleanly.
Regards,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
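[Editor's sketch] One way a test could exercise a pure-Python module with its C accelerator hidden is to block the accelerator in sys.modules before a fresh import. The helper name import_without_accelerator is hypothetical, and heapq/_heapq merely stand in for ElementTree/_elementtree here because heapq already follows the "try the C module, fall back to Python" pattern on CPython; the stdlib test suite has its own helpers for this.

```python
import importlib
import sys
import types

_MISSING = object()


def import_without_accelerator(name, blocked):
    """Return a fresh copy of module `name`, imported while `blocked` is hidden.

    Hypothetical helper. Setting sys.modules[blocked] = None makes any
    "import blocked" raise ImportError, so `name` falls back to its
    pure-Python code. The previous sys.modules entries are restored.
    """
    saved = {key: sys.modules.get(key, _MISSING) for key in (name, blocked)}
    sys.modules.pop(name, None)
    sys.modules[blocked] = None
    try:
        return importlib.import_module(name)
    finally:
        for key, module in saved.items():
            if module is _MISSING:
                sys.modules.pop(key, None)
            else:
                sys.modules[key] = module


py_heapq = import_without_accelerator("heapq", "_heapq")
# On CPython the fallback heappush is a plain Python function.
print(isinstance(py_heapq.heappush, types.FunctionType))
```

A test class could then run the same assertions once against the normally imported module and once against the copy returned here.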
Re: [Python-Dev] [Python-checkins] Daily reference leaks (140f7de4d2a5): sum=888
On Tue, Feb 7, 2012 at 2:34 PM, wrote:
> results for 140f7de4d2a5 on branch "default"
>
> test_capi leaked [296, 296, 296] references, sum=888
This appears to have started shortly after Benjamin's _PyExc_Init
bltinmod refcounting change to fix Brett's crash when bootstrapping
importlib. Perhaps we have a leak in import.c that was being masked by
the DECREF in _PyExc_Init?
Cheers,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
>> It's not frozen, it's actually maintained.
>
> Indeed, it sounds like the most appropriate course (if we don't hear
> otherwise from Fredrik) may be to just update PEP 360 to acknowledge
> current reality (i.e. the most current release of ElementTree is
> actually the one maintained by Florent in the stdlib).
>
> I'll note that this change isn't *quite* as simple as Eli's
> description earlier in the thread may suggest, though - the test suite
> also needs to be updated to ensure that the Python version is still
> fully exercised without the C acceleration applied.
Sure thing. I suppose similar machinery already exists for things like
pickle / cPickle. I still maintain that it's a simple change :-)
> And such an alteration would definitely be an explicit fork, even
> though the user facing API doesn't change - we're changing the
> structure of the code in a way that means some upstream deltas (if
> they happen to occur) may not apply cleanly.
This is a very minimal delta, however. I think it can even be made
simpler by replacing ElementTree with a facade module that either
imports _elementtree or the Python ElementTree. So the delta vs.
upstream would only be in file placement.
But these are two conflicting discussions - if changes were made in
stdlib *already* that were not propagated upstream, what use is a
clean delta?
Eli
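[Editor's sketch] The facade idea can be illustrated with a helper that prefers an accelerated module and falls back to a pure-Python one. The function load_with_fallback and the module name _hypothetical_fast_etree are made up for this example; the real change would live inside the stdlib package layout rather than in a helper like this.

```python
import importlib


def load_with_fallback(accelerated, pure_python):
    """Import `accelerated` if it exists, otherwise import `pure_python`.

    Hypothetical helper showing the facade pattern: callers always get
    a module with the same API, whichever implementation is available.
    """
    try:
        return importlib.import_module(accelerated)
    except ImportError:
        return importlib.import_module(pure_python)


# "_hypothetical_fast_etree" does not exist, so the fallback is used.
etree = load_with_fallback("_hypothetical_fast_etree", "xml.etree.ElementTree")
print(etree.__name__, etree.fromstring("<a/>").tag)
```

User code imports only the facade, so the choice of implementation stays invisible to it.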
Re: [Python-Dev] [Python-checkins] Daily reference leaks (140f7de4d2a5): sum=888
2012/2/8 Nick Coghlan :
> On Tue, Feb 7, 2012 at 2:34 PM, wrote:
>> results for 140f7de4d2a5 on branch "default"
>>
>> test_capi leaked [296, 296, 296] references, sum=888
>
> This appears to have started shortly after Benjamin's _PyExc_Init
> bltinmod refcounting change to fix Brett's crash when bootstrapping
> importlib. Perhaps we have a leak in import.c that was being masked by
> the DECREF in _PyExc_Init?
According to test_capi, it's expected to leak?
--
Regards,
Benjamin
Re: [Python-Dev] Add a new "locale" codec?
2012/2/8 Simon Cross :
> I think I'm -1 on a "locale" encoding because it refers to different
> actual encodings depending on where and when it's run, which seems
> surprising, and there's already a more explicit way to achieve the
> same effect.
The following code is just an example to explain how locale is
supposed to work, but the implementation is completely different:
encoding = locale.getpreferredencoding(False)
... execute some code ...
text = bytes.decode(encoding)
bytes = text.encode(encoding)
The current locale is process-wide: if a thread changes the locale,
all threads are affected. Some functions have to use the current
locale encoding, and not the locale encoding read at startup. Examples
with C functions: strerror(), strftime(), tzname, etc.
My codec implementation uses mbstowcs() and wcstombs() which don't
touch the current locale, but just use it. Said differently, the
locale codec would just give access to these functions.
> The documentation on .getpreferredencoding() says some scary things
> about needing to call .setlocale() sometimes but doesn't really say
> when or why.
locale.getpreferredencoding() always calls setlocale() by default.
locale.getpreferredencoding(False) doesn't call setlocale().
setlocale() is not called on Windows or if locale.CODESET is not
available (it is available on FreeBSD, Mac OS X, Linux, etc.).
> Could any of those cases make "locale" do weird things because it doesn't
> call setlocale()?
Sorry, I don't understand what you mean by "weird things". The
"locale" codec doesn't touch the locale.
Re: [Python-Dev] Add a new "locale" codec?
On Wed, Feb 8, 2012 at 3:25 PM, Victor Stinner wrote:
> Sorry, I don't understand what you mean by "weird things". The
> "locale" codec doesn't touch the locale.
Sorry for being unclear. My question was about the following lines
from
http://docs.python.org/library/locale.html#locale.getpreferredencoding:
"""On some systems, it is necessary to invoke setlocale() to obtain
the user preferences, so this function is not thread-safe. If invoking
setlocale is not necessary or desired, do_setlocale should be set to
False."""
So my question was about what happens on such systems where invoking
setlocale is necessary to obtain the user preferences?
Schiavo
Simon
Re: [Python-Dev] Add a new "locale" codec?
On Wed, Feb 8, 2012 at 3:25 PM, Victor Stinner wrote:
> The current locale is process-wide: if a thread changes the locale,
> all threads are affected. Some functions have to use the current
> locale encoding, and not the locale encoding read at startup. Examples
> with C functions: strerror(), strftime(), tzname, etc.
Could a core part of Python break because of a sequence like:
1) Encode unicode to bytes using locale codec.
2) Silly third-party library code changes the locale codec.
3) Attempt to decode bytes back to unicode using the locale codec
(which is now a different underlying codec).
?
Schiavo
Simon
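[Editor's sketch] The three-step scenario above can be simulated without touching the real locale by passing two explicit encodings, one for step 1 and one for step 3. The helper round_trip is hypothetical, and "utf-8" and "latin-1" are arbitrary stand-ins for "the locale encoding before and after the change".

```python
def round_trip(text, encode_with, decode_with):
    """Encode with one codec and decode with another (hypothetical helper).

    The two names stand in for the locale encoding in effect before and
    after some other code changed the process-wide locale.
    """
    data = text.encode(encode_with)
    return data.decode(decode_with, errors="replace")


original = "caf\u00e9"
unchanged = round_trip(original, "utf-8", "utf-8")
changed = round_trip(original, "utf-8", "latin-1")

print(unchanged == original)  # locale stayed the same between the steps
print(changed == original)    # locale changed between the steps
```

With these stand-ins the second round trip silently yields different text instead of raising, which is the kind of surprise the question is about.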
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On 8 February 2012 12:21, Nick Coghlan wrote:
> On Wed, Feb 8, 2012 at 10:04 PM, Antoine Pitrou wrote:
>> On Wed, 8 Feb 2012 11:11:07 +
>> Paul Moore wrote:
>>> It's important to respect Fredrik's wishes and ownership, but we can't
>>> leave part of the stdlib frozen and abandoned just because he's not
>>> available any longer.
>>
>> It's not frozen, it's actually maintained.
>
> Indeed, it sounds like the most appropriate course (if we don't hear
> otherwise from Fredrik) may be to just update PEP 360 to acknowledge
> current reality (i.e. the most current release of ElementTree is
> actually the one maintained by Florent in the stdlib).
Ah, OK. My apologies, I had misunderstood the previous discussion.
In which case I agree with Nick, let's update PEP 360 and move forward.
On that basis, +1 to Eli's suggestion of making cElementTree a
transparent accelerator.
Paul
Re: [Python-Dev] A new dictionary implementation
Hi,
Version 2 is now available.
Version 2 makes as few changes to tunable constants as possible, and
generally does not change iteration order (so repr() is unchanged).
All tests pass (the only changes to tests are for sys.getsizeof() ).
Repository: https://bitbucket.org/markshannon/cpython_new_dict
Issue http://bugs.python.org/issue13903
Performance changes are basically zero for non-OO code.
Average -0.5% speed change on 2n3 benchmarks, a few benchmarks show
a small reduction in memory use. (see notes below)
GCbench uses 47% less memory and is 12% faster.
2to3, which seems to be the only "realistic" benchmark that runs on Py3,
shows no change in speed and uses 10% less memory.
All benchmarks and tests were performed on an old, slow 32-bit machine
running Linux.
Do please try it on your machine(s).
If accepted, the new dict implementation will allow a useful
optimisation of the LOAD_GLOBAL (and possibly LOAD_ATTR) bytecode:
By testing to see if the (immutable) keys-table is the expected table,
the value can be accessed directly by index, rather than by name.
Cheers,
Mark.
Notes:
All benchmarks from http://hg.python.org/benchmarks/
using the -m flag to get memory usage data.
I've ignored the json benchmarks, which show unstable behaviour on my
machine.
Tiny changes to the dict being serialized or to the random seed can
change the relative speed of my implementation vs CPython from -25% to
+10%.
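[Editor's sketch] The guarded lookup described above can be modelled in a few lines of Python. This is a toy model only, not the C implementation in the linked repository; the classes KeysTable, SplitDict and CachedLoad are invented for illustration.

```python
class KeysTable:
    """Toy immutable table mapping each key to a slot index."""

    def __init__(self, names):
        self._index = {name: i for i, name in enumerate(names)}

    def index_of(self, name):
        return self._index[name]


class SplitDict:
    """Toy dict made of a shared keys table plus its own values list."""

    def __init__(self, keys, values):
        self.keys = keys
        self.values = list(values)


class CachedLoad:
    """Toy stand-in for a bytecode that caches (keys table, index)."""

    def __init__(self, name):
        self.name = name
        self.expected_keys = None
        self.index = None
        self.fast_hits = 0

    def load(self, d):
        if d.keys is self.expected_keys:
            # Guard passed: same keys table, so the cached index is valid.
            self.fast_hits += 1
            return d.values[self.index]
        # Slow path: look the name up, then remember table and index.
        self.index = d.keys.index_of(self.name)
        self.expected_keys = d.keys
        return d.values[self.index]


shared = KeysTable(["x", "y"])
a = SplitDict(shared, [1, 2])
b = SplitDict(shared, [10, 20])
load_y = CachedLoad("y")
print(load_y.load(a), load_y.load(b), load_y.fast_hits)
```

The second load skips the by-name lookup because both toy dicts share one keys table, which is the identity check the message describes.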
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 17:42, Antoine Pitrou wrote:
> On Tue, 7 Feb 2012 17:24:21 -0500
> Brett Cannon wrote:
> >
> > IOW you want the sys.modules case fast, which I will never be able to
> > match compared to C code since that is pure execution with no I/O.
>
> Why wouldn't we continue using C code for that? It's trivial (just a
> dict lookup).
>
Sure, but it's all the code between the function call and hitting
sys.modules which would also need to get shoved into the C code. As I
said, I have not tried to optimize anything yet (and unfortunately a lot
of the upfront costs are over stupid things like checking if __import__
is being called with a string for the module name).
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 18:08, Antoine Pitrou wrote:
> On Tue, 7 Feb 2012 17:16:18 -0500
> Brett Cannon wrote:
> >
> > > > IOW I really do not look forward to someone saying "importlib is so
> > > > much slower at importing a module containing ``pass``" when (a) that
> > > > never happens, and (b) most programs do not spend their time importing
> > > > but instead doing interesting work.
> > >
> > > Well, import time is so important that the Mercurial developers have
> > > written an on-demand import mechanism, to reduce the latency of
> > > command-line operations.
> > >
> >
> > Sure, but they are a somewhat extreme case.
>
> I don't think Mercurial is extreme. Any command-line tool written in
> Python applies. For example, yum (Fedora's apt-get) is written in
> Python. And I'm sure many people do small administration scripts in
> Python. These tools may then be run in a loop by whatever other script.
>
> > > But it's not only important for Mercurial and the like. Even if you're
> > > developing a Web app, making imports slower will make restarts slower,
> > > and development more tedious in the first place.
> > >
> > >
> > Fine, startup cost from a hard crash I can buy when you are getting 1000
> > QPS, but development more tedious?
>
> Well, waiting several seconds when reloading a development server is
> tedious. Anyway, my point was that other cases (than command-line
> tools) can be negatively impacted by import time.
>
> > > > So, if there is going to be some baseline performance target I need
> > > > to hit to make people happy I would prefer to know what that
> > > > (real-world) benchmark is and what the performance target is going to
> > > > be on a non-debug build.
> > >
> > > - No significant slowdown in startup time.
> > >
> >
> > What's significant and measuring what exactly? I mean startup already
> > has a ton of imports as it is, so this would wash out the point of
> > measuring practically anything else for anything small.
>
> I don't understand your sentence. Yes, startup has a ton of imports and
> that's why I'm fearing it may be negatively impacted :)
>
> ("a ton" being a bit less than 50 currently)
>
So you want less than a 50% startup cost on the standard startup benchmarks?
>
> > This is why I said I want a
> > benchmark to target which does actual work since flat-out startup time
> > measures nothing meaningful but busy work.
>
> "Actual work" can be very small in some cases. For example, if you run
> "hg branch" I'm quite sure it doesn't do a lot of work except importing
> many modules and then reading a single file in .hg (the one named
> ".hg/branch" probably, but I'm not a Mercurial dev).
>
> In the absence of more "real world" benchmarks, I think the startup
> benchmarks in the benchmarks repo are a good baseline.
>
> That said you could also install my 3.x port of Twisted here:
> https://bitbucket.org/pitrou/t3k/
>
> and then run e.g. "python3 bin/trial -h".
>
> > I would get more out of code
> > that just stat'ed every file in Lib since at least that did some work.
>
> stat()ing files is not really representative of import work. There are
> many indirections in the import machinery.
> (actually, even import.c appears quite slower than a bunch of stat()
> calls would imply)
>
> > > - Within 25% of current performance when importing, say, the "struct"
> > > module (Lib/struct.py) from bytecode.
> > >
> >
> > Why struct? It's such a small module that it isn't really a typical
> > module.
>
> Precisely to measure the overhead. Typical module size will vary
> depending on development style. Some people may prefer writing many
> small modules. Or they may be using many small libraries, or using
> libraries that have adopted such a development style.
>
> Measuring the overhead on small modules will make sure we aren't overly
> confident.
>
> > The median file size of Lib is 11K (e.g. tabnanny.py), not 238 bytes
> > (which is barely past Hello World). And is this just importing struct or
> > is this from startup, e.g. ``python -c "import struct"``?
>
> Just importing struct, as with the timeit snippets in the other thread.
OK, so less than 25% slowdown when importing a module with pre-existing
bytecode that is very small.
And here I was worrying you were going to suggest easy goals to reach for.
;)
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 21:27, PJ Eby wrote:
> On Tue, Feb 7, 2012 at 5:24 PM, Brett Cannon wrote:
>> On Tue, Feb 7, 2012 at 16:51, PJ Eby wrote:
>>> On Tue, Feb 7, 2012 at 3:07 PM, Brett Cannon wrote:
>>>> So, if there is going to be some baseline performance target I need
>>>> to hit to make people happy I would prefer to know what that
>>>> (real-world) benchmark is and what the performance target is going to
>>>> be on a non-debug build. And if people are not worried about the
>>>> performance then I'm happy with that as well. =)
>>>
>>> One thing I'm a bit worried about is repeated imports, especially ones
>>> that are inside frequently-called functions. In today's versions of
>>> Python, this is a performance win for "command-line tool platform" systems
>>> like Mercurial and PEAK, where you want to delay importing as long as
>>> possible, in case the code that needs the import is never called at all...
>>> but, if it *is* used, you may still need to use it a lot of times.
>>>
>>> When writing that kind of code, I usually just unconditionally import
>>> inside the function, because the C code check for an already-imported
>>> module is faster than the Python "if" statement I'd have to clutter up my
>>> otherwise-clean function with.
>>>
>>> So, in addition to the things other people have mentioned as performance
>>> targets, I'd like to keep the slowdown factor low for this type of scenario
>>> as well. Specifically, the slowdown shouldn't be so much as to motivate
>>> lazy importers like Mercurial and PEAK to need to rewrite in-function
>>> imports to do the already-imported check ourselves. ;-)
>>>
>>> (Disclaimer: I haven't actually seen Mercurial's delayed/dynamic import
>>> code, so I can't say for 100% sure if they'd be affected the same way.)
>>>
>>
>> IOW you want the sys.modules case fast, which I will never be able to
>> match compared to C code since that is pure execution with no I/O.
>>
>
> Couldn't you just prefix the __import__ function with something like this:
>
> ...
> try:
>     module = sys.modules[name]
> except KeyError:
>     # slow code path
>
> (Admittedly, the import lock is still a problem; initially I thought you
> could just skip it for this case, but the problem is that another thread
> could be in the middle of executing the module.)
>
I practically do already. As of right now there are some 'if' checks
that come ahead of it that I could shift around to fast path this even
more (since who cares about types and such if the module name happens to
be in sys.modules), but it isn't that far off as-is.
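[Editor's sketch] The fast path suggested in the quoted snippet can be written out as a runnable function. The name fast_import is hypothetical; the sketch deliberately ignores the import lock, relative imports and fromlist handling that a real __import__ has to deal with, and it is not importlib's actual code.

```python
import importlib
import sys


def fast_import(name):
    """Return an already-imported module without running the full import.

    Hypothetical sketch: only an absolute module name is supported, and
    the import lock concern raised in the thread is not addressed.
    """
    try:
        module = sys.modules[name]
    except KeyError:
        # Slow code path: defer to the regular import machinery.
        return importlib.import_module(name)
    if module is None:
        # A None entry in sys.modules means the import is blocked.
        raise ImportError("import of %s halted; None in sys.modules" % name)
    return module


first = fast_import("json")   # may take the slow path
second = fast_import("json")  # served straight from sys.modules
print(first is second)
```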
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Wednesday, 8 February 2012 at 11:01 -0500, Brett Cannon wrote:
> On Tue, Feb 7, 2012 at 17:42, Antoine Pitrou wrote:
> > On Tue, 7 Feb 2012 17:24:21 -0500
> > Brett Cannon wrote:
> > > IOW you want the sys.modules case fast, which I will never
> > > be able to match compared to C code since that is pure
> > > execution with no I/O.
> >
> > Why wouldn't we continue using C code for that? It's trivial
> > (just a dict lookup).
>
> Sure, but it's all the code between the function call and hitting
> sys.modules which would also need to get shoved into the C code. As I
> said, I have not tried to optimize anything yet (and unfortunately a
> lot of the upfront costs are over stupid things like checking if
> __import__ is being called with a string for the module name).
I guess my point was: why is there a function call in that case? The
"import" statement could look up sys.modules directly.
Or the built-in __import__ could still be written in C, and only defer
to importlib when the module isn't found in sys.modules.
Practicality beats purity.
Regards
Antoine.
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 22:47, Nick Coghlan wrote:
> On Wed, Feb 8, 2012 at 12:54 PM, Terry Reedy wrote:
> > On 2/7/2012 9:35 PM, PJ Eby wrote:
> >> It's just that not everything I write can depend on Importing.
> >> Throw an equivalent into the stdlib, though, and I guess I wouldn't have
> >> to worry about dependencies...
> >
> > And that is what I think (agree?) should be done to counteract the likely
> > slowdown from using importlib.
>
> Yeah, this is one frequently reinvented wheel that could definitely do
> with a standard implementation. Christian Heimes made an initial
> attempt at such a thing years ago with PEP 369, but an importlib based
> __import__ would let the implementation largely be pure Python (with
> all the increase in power and flexibility that implies).
>
I'll see if I can come up with a pure Python way to handle setting
attributes on the module since that is the one case that my importers
project code can't handle.
> I'm not sure such an addition would help much with the base
> interpreter start up time though - most of the modules we bring in are
> because we're actually using them for some reason.
>
It wouldn't. This would be for third-parties only.
> The other thing that shouldn't be underrated here is the value in
> making the builtin import system PEP 302 compliant from a
> *documentation* perspective. I've made occasional attempts at fully
> documenting the import system over the years, and I always end up
> giving up because the combination of the pre-PEP 302 builtin
> mechanisms in import.c and the PEP 302 compliant mechanisms for things
> like zipimport just degenerate into a mess of special cases that are
> impossible to justify beyond "nobody got around to fixing this yet".
> The fact that we have an undocumented PEP 302 based reimplementation
> of imports squirrelled away in pkgutil to make pkgutil and runpy work
> is sheer insanity (replacing *that* with importlib might actually be a
> good first step towards full integration).
>
I actually have never bothered to explain import as it is currently
implemented in any of my PyCon import talks precisely because it is such
a mess. It's just easier to explain from a PEP 302 perspective since you
can actually comprehend that.
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 22:47, Nick Coghlan wrote
[SNIP]
> The fact that we have an undocumented PEP 302 based reimplementation
> of imports squirrelled away in pkgutil to make pkgutil and runpy work
> is sheer insanity (replacing *that* with importlib might actually be a
> good first step towards full integration).
>
It easily goes beyond runpy. You could ditch much of imp's C code (e.g.
load_module()), you could write py_compile and compileall using
importlib, you could rewrite zipimport, etc. Anything that touches
import could be refactored to (a) use just Python code, and (b) reshare
code so as to not re-invent the wheel constantly.
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 18:26, Alex Gaynor wrote:
> Brett Cannon writes:
> > IOW you want the sys.modules case fast, which I will never be able to
> > match compared to C code since that is pure execution with no I/O.
>
> Sure you can: have a really fast Python VM.
>
> Constructive: if you can run this code under PyPy it'd be easy to just:
>
> $ pypy -mtimeit "import struct"
> $ pypy -mtimeit -s "import importlib" "importlib.import_module('struct')"
>
> Or whatever the right API is.
I'm not worried about PyPy. =) I assume you will just flat-out use
importlib regardless of what happens with CPython since it is/will be fully
compatible and is already written for you.
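
The two command lines Alex quotes can also be run from inside one interpreter. This is only a rough sketch of that comparison: the repeat count N is an arbitrary choice made here, and the numbers depend entirely on the interpreter and machine.

```python
import timeit

N = 100_000  # arbitrary repeat count chosen for this sketch

# After the first execution, "import struct" only hits the sys.modules cache.
statement_time = timeit.timeit("import struct", number=N)

# The same cached lookup, routed through importlib's public API.
api_time = timeit.timeit(
    "importlib.import_module('struct')",
    setup="import importlib",
    number=N,
)

print(f"import statement:        {statement_time / N * 1e9:.0f} ns per import")
print(f"importlib.import_module: {api_time / N * 1e9:.0f} ns per import")
```

Both statements measure the already-imported case, which is the sys.modules fast path under discussion, not the cost of a first import.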
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Wed, Feb 8, 2012 at 11:09, Antoine Pitrou wrote:
> On Wednesday, 08 February 2012 at 11:01 -0500, Brett Cannon wrote:
> > On Tue, Feb 7, 2012 at 17:42, Antoine Pitrou wrote:
> > > On Tue, 7 Feb 2012 17:24:21 -0500
> > > Brett Cannon wrote:
> > > > IOW you want the sys.modules case fast, which I will never
> > > > be able to match compared to C code since that is pure
> > > > execution with no I/O.
> > >
> > > Why wouldn't continue using C code for that? It's trivial
> > > (just a dict lookup).
> >
> > Sure, but it's all the code between the function call and hitting
> > sys.modules which would also need to get shoved into the C code. As I
> > said, I have not tried to optimize anything yet (and unfortunately a
> > lot of the upfront costs are over stupid things like checking if
> > __import__ is being called with a string for the module name).
>
> I guess my point was: why is there a function call in that case? The
> "import" statement could look up sys.modules directly.

Because people like to do wacky stuff with their imports and so fully
bypassing __import__ would be bad.

> Or the built-in __import__ could still be written in C, and only defer
> to importlib when the module isn't found in sys.modules.
> Practicality beats purity.

It's a possibility, although that would require every function call to
fetch the PyInterpreterState to get at the cached __import__ (so the proper
sys and imp modules are used) and I don't know how expensive that would be
(probably not as expensive as calling out to Python code but I'm thinking
out loud).
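
The "check sys.modules first, defer to importlib only on a miss" idea can be sketched in pure Python. This is only an illustration of the control flow: the real proposal concerns C code inside the interpreter, and the helper name cached_or_import is made up for this example.

```python
import importlib
import sys


def cached_or_import(name):
    """Return an already-imported module, else fall back to importlib.

    Hypothetical helper: it ignores packages' fromlist handling and the
    relative-import cases that the real __import__ has to deal with.
    """
    module = sys.modules.get(name)
    if module is not None:
        # Fast path: a plain dict lookup, no import machinery involved.
        return module
    # Slow path: let importlib find, load and cache the module.
    return importlib.import_module(name)


print(cached_or_import("json"))
```

The first call for a given name pays the full import cost; later calls for the same name never reach importlib.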
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Wed, Feb 8, 2012 at 11:15, Brett Cannon wrote:
> On Tue, Feb 7, 2012 at 22:47, Nick Coghlan wrote
>
> [SNIP]
>
>> The fact that we have an undocumented PEP 302 based reimplementation
>> of imports squirrelled away in pkgutil to make pkgutil and runpy work
>> is sheer insanity (replacing *that* with importlib might actually be a
>> good first step towards full integration).
>
> It easily goes beyond runpy. You could ditch much of imp's C code (e.g.
> load_module()), you could write py_compile and compileall using importlib,
> you could rewrite zipimport, etc. Anything that touches import could be
> refactored to (a) use just Python code, and (b) reshare code so as to not
> re-invent the wheel constantly.

And taking it even farther, all of the blackbox aspects of import go away.
For instance, the implicit, hidden importers for built-in modules, frozen
modules, extensions, and source could actually be set on sys.path_hooks.
The meta path importer that handles sys.path could actually exist on
sys.meta_path.
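
For readers on a later CPython (3.3 or newer, where import is importlib-based), the importers can be inspected directly. A small sketch, assuming a default interpreter whose sys.meta_path has not been emptied by the application:

```python
import sys
from importlib import machinery

# Every finder consulted by the import statement, in order.
for finder in sys.meta_path:
    print("meta_path: ", finder)

# Factories that build a finder for each sys.path entry.
for hook in sys.path_hooks:
    print("path_hooks:", hook)

# The standard importers that are now visible instead of hidden.
explicit = [
    finder
    for finder in (
        machinery.BuiltinImporter,
        machinery.FrozenImporter,
        machinery.PathFinder,
    )
    if finder in sys.meta_path
]
print("explicit standard importers:", explicit)
```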
Re: [Python-Dev] Add a new "locale" codec?
>> The current locale is process-wide: if a thread changes the locale,
>> all threads are affected. Some functions have to use the current
>> locale encoding, and not the locale encoding read at startup. Examples
>> with C functions: strerror(), strftime(), tzname, etc.
>
> Could a core part of Python breaking because of a sequence like:
>
> 1) Encode unicode to bytes using locale codec.
> 2) Silly third-party library code changes the locale codec.
> 3) Attempt to decode bytes back to unicode using the locale codec
> (which is now a different underlying codec).

When you decode data from the OS, you have to use the current locale
encoding. If you use a variable to store the encoding and the locale
is changed, you have to update your variable or you get mojibake.

Example with Python 2:

lisa$ python2.7
Python 2.7.2+ (default, Oct 4 2011, 20:06:09)
>>> import locale, os
>>> encoding=locale.getpreferredencoding(False)
>>> encoding
'ANSI_X3.4-1968'
>>> os.strerror(23).decode(encoding)
u'Too many open files in system'
>>> locale.setlocale(locale.LC_ALL, '') # set the locale
'fr_FR.UTF-8'
>>> os.strerror(23).decode(encoding)
Traceback (most recent call last):
  ...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 37: ordinal not in range(128)
>>> encoding=locale.getpreferredencoding(False)
>>> encoding
'UTF-8'
>>> os.strerror(23).decode(encoding)
u'Trop de fichiers ouverts dans le syst\xe8me'

You have to update encoding manually because setlocale() changed the
LC_MESSAGES locale category (message language) but also the LC_CTYPE
locale category (encoding).

Using the "locale" encoding, you always get the current locale encoding.

In some cases, you must use sys.getfilesystemencoding() (e.g. write into
the console or encode/decode filenames), in other cases, you must use the
current locale encoding (e.g. strerror() or strftime()).

Python 3 does most of the work for you, so you don't have to care about
the locale encoding (you just manipulate Unicode, it decodes bytes or
encodes back to bytes for you). But in some cases, you have to decode or
encode manually using the right encoding. In this case, the "locale" codec
can help you.

The documentation will have to explain exactly what this new codec is,
because as expected, it is confusing :-)

Victor
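
The "ask for the encoding each time instead of caching it" point can be sketched in Python 3. The helper name decode_os_bytes is hypothetical and only approximates what the proposed codec would do. On recent interpreters (UTF-8 mode, locale coercion) the cached and current values may well be identical.

```python
import locale


def decode_os_bytes(data, errors="strict"):
    """Decode bytes with the locale encoding in effect right now.

    Hypothetical stand-in for data.decode("locale"): the encoding is
    looked up on every call, so it cannot go stale.
    """
    return data.decode(locale.getpreferredencoding(False), errors)


cached = locale.getpreferredencoding(False)  # snapshot that may go stale

try:
    locale.setlocale(locale.LC_ALL, "")  # adopt the user's locale
except locale.Error:
    pass  # the environment names a locale that is not installed

current = locale.getpreferredencoding(False)
print("cached:", cached, "current:", current)
print(decode_os_bytes(b"Too many open files in system"))
```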
[Python-Dev] Code review tool uses my old email address
Hi,

I changed my email address (about a year ago) and updated my bug tracker
settings to my new address (late last year). However, the code review tool
still shows my old email address. How do I change it?

Cheers,
Mark.
Re: [Python-Dev] Code review tool uses my old email address
This may be a bug in the tracker, possibly related to
http://psf.upfronthosting.co.za/roundup/meta/issue402 - it seems like
changes to a user's details on bugs.python.org are not propagated to the
review tool.

Cheers,
Nadeem
[Python-Dev] PEP for new dictionary implementation
Proposed PEP for new dictionary implementation, PEP 410?
is attached.
Cheers,
Mark.
PEP: XXX
Title: Key-Sharing Dictionary
Version: $Revision$
Last-Modified: $Date$
Author: Mark Shannon
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 08-Feb-2012
Python-Version: 3.3 or 3.4
Post-History: 08-Feb-2012
Abstract
========
This PEP proposes a change in the implementation of the builtin dictionary
type ``dict``. The new implementation allows dictionaries which are used as
attribute dictionaries (the ``__dict__`` attribute of an object) to share
keys with other attribute dictionaries of instances of the same class.
Motivation
==========
The current dictionary implementation uses more memory than is necessary
when used as a container for object attributes as the keys are
replicated for each instance rather than being shared across many instances
of the same class.
Despite this, the current dictionary implementation is finely tuned and
performs very well as a general-purpose mapping object.
By separating the keys (and hashes) from the values it is possible to share
the keys between multiple dictionaries and improve memory use.
By ensuring that keys are separated from the values only when beneficial,
it is possible to retain the high-performance of the current dictionary
implementation when used as a general-purpose mapping object.
Behaviour
=========
The new dictionary behaves in the same way as the old implementation.
It fully conforms to the Python API, the C API and the ABI.
Performance
===========
Memory Usage
------------
Reduction in memory use is directly related to the number of dictionaries
with shared keys in existence at any time. These dictionaries are typically
half the size of the current dictionary implementation.
Benchmarking shows that memory use is reduced by 10% to 20% for
object-oriented programs with no significant change in memory use
for other programs.
Speed
-----
The performance of the new implementation is dominated by memory locality
effects. When keys are not shared (for example in module dictionaries
and dictionaries explicitly created by dict() or {}) then performance is
unchanged (within a percent or two) from the current implementation.
For the shared keys case, the new implementation tends to separate keys
from values, but reduces total memory usage. This will improve performance
in many cases as the effects of reduced memory usage outweigh the loss of
locality, but some programs may show a small slow down.
Benchmarking shows no significant change of speed for most benchmarks.
Object-oriented benchmarks show small speed ups when they create large
numbers of objects of the same class (the gcbench benchmark shows a 10%
speed up; this is likely to be an upper limit).
Implementation
==============
Both the old and new dictionaries consist of a fixed-sized dict struct and
a re-sizeable table.
In the new dictionary the table can be further split into a keys table and
values array.
The keys table holds the keys and hashes and (for non-split tables) the
values as well. It differs from the original implementation only in that it
contains a number of fields that were previously in the dict struct.
If a table is split, the values in the keys table are ignored; instead, the
values are held in a separate array.
Split-Table dictionaries
------------------------
When dictionaries are created to fill the __dict__ slot of an object, they are
created in split form. The keys table is cached in the type, potentially
allowing all attribute dictionaries of instances of one class to share keys.
In the event of the keys of these dictionaries starting to diverge,
individual dictionaries will lazily convert to the combined-table form.
This ensures good memory use in the common case, and correctness in all cases.
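
The split-table idea can be illustrated with a toy model in pure Python. This is not the C implementation referenced in this PEP: the names SharedKeys and SplitDict are invented for the sketch, and the rule used here for when a dictionary "diverges" (any key outside the shared table) is a simplifying assumption.

```python
_MISSING = object()  # marks a slot that holds no value yet


class SharedKeys:
    """Toy keys table: maps each key to a slot index, shared by many dicts."""

    def __init__(self, keys):
        self.slots = {key: index for index, key in enumerate(keys)}


class SplitDict:
    """Toy split-table dict: shared keys, private values array."""

    def __init__(self, shared):
        self._shared = shared
        self._values = [_MISSING] * len(shared.slots)
        self._combined = None  # becomes a plain dict after divergence

    @property
    def is_split(self):
        return self._combined is None

    def _convert_to_combined(self):
        # Lazily copy the live entries into a private key/value table.
        self._combined = {
            key: self._values[index]
            for key, index in self._shared.slots.items()
            if self._values[index] is not _MISSING
        }
        self._shared = None
        self._values = None

    def __setitem__(self, key, value):
        if self._combined is None:
            index = self._shared.slots.get(key)
            if index is not None:
                self._values[index] = value
                return
            # Key is not in the shared table: this dict has diverged.
            self._convert_to_combined()
        self._combined[key] = value

    def __getitem__(self, key):
        if self._combined is not None:
            return self._combined[key]
        index = self._shared.slots.get(key)
        if index is None or self._values[index] is _MISSING:
            raise KeyError(key)
        return self._values[index]


keys = SharedKeys(["x", "y"])
a, b = SplitDict(keys), SplitDict(keys)
a["x"], b["x"], b["y"] = 1, 2, 3
a["z"] = 9  # diverges: only `a` converts to the combined form
print(a.is_split, b.is_split)
```

In the toy model the memory saving comes from storing the key-to-slot mapping once per class rather than once per instance; each instance keeps only a values list.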
Combined-Table dictionaries
---------------------------
Explicit dictionaries (dict() or {}), module dictionaries and most other
dictionaries are created as combined-table dictionaries.
A combined-table dictionary never becomes a split-table dictionary.
Combined tables are laid out in much the same way as the tables in the old
dictionary, resulting in very similar performance.
Implementation
==============
The new dictionary implementation is available at [1]_.
Pros and Cons
=============
Pros
----
Significant memory savings for object-oriented applications.
Small improvement to speed for programs which create lots of objects.
Cons
----
Change to data structures:
Third party modules which meddle with the internals of the dictionary
implementation will break.
Changes to repr() output and iteration order:
For most cases, this will be unchanged.
However for some split-table dictionaries the iteration order will
change.
Neither of these cons should be a problem.
Modules which meddle with the internals of the dictionary
implementation are already broken and should be fixed to use the API.
The iteration order of dictionaries was never defined and has always
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On 2/8/2012 11:13 AM, Brett Cannon wrote:
> On Tue, Feb 7, 2012 at 22:47, Nick Coghlan wrote:
>> I'm not sure such an addition would help much with the base
>> interpreter start up time though - most of the modules we bring in are
>> because we're actually using them for some reason.
>
> It wouldn't. This would be for third-parties only.

such as hg. That is what I had in mind.

Would the following work? Treat a function as a 'loop' in that it may be
executed repeatedly. Treat 'import x' in a function as what it is, an
__import__ call plus a local assignment. Apply a version of the usual
optimization: put a sys.modules-based lazy import outside of the function
(at the top of the module?) and leave the local assignment
"x = sys.modules['x']" in the function. Change sys.modules.__delattr__ to
replace a module with a dummy, so the function will still work after a
deletion, as it does now.

--
Terry Jan Reedy
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Wed, 8 Feb 2012 11:07:10 -0500
Brett Cannon wrote:
> > > > > So, if there is going to be some baseline performance target I
> > > > > need to hit to make people happy I would prefer to know what
> > > > > that (real-world) benchmark is and what the performance target
> > > > > is going to be on a non-debug build.
> > > >
> > > > - No significant slowdown in startup time.
> > >
> > > What's significant and measuring what exactly? I mean startup already
> > > has a ton of imports as it is, so this would wash out the point of
> > > measuring practically anything else for anything small.
> >
> > I don't understand your sentence. Yes, startup has a ton of imports and
> > that's why I'm fearing it may be negatively impacted :)
> >
> > ("a ton" being a bit less than 50 currently)
>
> So you want less than a 50% startup cost on the standard startup benchmarks?
No, ~50 is the number of imports at startup.
I think startup time should grow by less than 10%.
(even better if it shrinks of course :))
> And here I was worrying you were going to suggest easy goals to reach for.
> ;)
He. Well, if importlib enabled user-level functionality, I guess it
could be attractive to trade a slice of performance against it. But
from a user's point of view, bootstrapping importlib is mostly an
implementation detail with not much of a positive impact.
Regards
Antoine.
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Wed, Feb 8, 2012 at 14:57, Terry Reedy wrote: > On 2/8/2012 11:13 AM, Brett Cannon wrote: > >> On Tue, Feb 7, 2012 at 22:47, Nick Coghlan > > > I'm not sure such an addition would help much with the base >>interpreter start up time though - most of the modules we bring in are >>because we're actually using them for some reason. >> > > It wouldn't. This would be for third-parties only. >> > > such as hg. That is what I had in mind. > > Would the following work? Treat a function as a 'loop' in that it may be > executed repeatedly. Treat 'import x' in a function as what it is, an > __import__ call plus a local assignment. Apply a version of the usual > optimization: put a sys.modules-based lazy import outside of the function > (at the top of the module?) and leave the local assignment "x = > sys.modules['x']" in the function. Change sys.modules.__delattr__ to > replace a module with a dummy, so the function will still work after a > deletion, as it does now. Probably, but I would hate to force people to code in a specific way for it to work. ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On 2/8/2012 3:16 PM, Brett Cannon wrote:
> On Wed, Feb 8, 2012 at 14:57, Terry Reedy wrote:

The intent of what I proposed is to be transparent for imports within
functions. It would be a minor optimization if anything, but it would mean
that there is a lazy mechanism in place.

For top-level imports, unless *all* are made lazy, then there *must* be
some indication in the code of whether to make it lazy or not.

--
Terry Jan Reedy
Re: [Python-Dev] peps: Update with bugfix releases.
Am 05.02.2012 21:34, schrieb Ned Deily: > In article > <[email protected]>, > [email protected] wrote: > >>> I understand that but, to me, it makes no sense to send out truly >>> broken releases. Besides, the hash collision attack is not exactly >>> new either. Another few weeks can't make that much of a difference. >> >> Why would the release be truly broken? It surely can't be worse than >> the current releases (which apparently aren't truly broken, else >> there would have been no point in releasing them back then). > > They were broken by the release of OS X 10.7 and Xcode 4.2 which were > subsequent to the previous releases. None of the currently available > python.org installers provide a fully working system on OS X 10.7, or on > OS X 10.6 if the user has installed Xcode 4.2 for 10.6. In what way are the current releases not fully working? Are you referring to issues with building extension modules? If it's that, I wouldn't call that "truly broken". Plus, the releases continue to work fine on older OS X releases. So when you build a bug fix release, just build it with the same tool chain as the previous bug fix release, and all is fine. Regards, Martin ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] which C language standard CPython must conform to
> Some quick searching shows that there is at least hope Microsoft is on > board with C++11x (not so surprising, their crown jewels are written > in C++). We should at some point demand a C++ compiler for CPython > and pick of subset of C++ features to allow use of but that is likely > reserved for the Python 4 timeframe (a topic for another thread and > time entirely, it isn't feasible for today's codebase). See my earlier post on building Python as a Windows 8 Metro App. As one strategy, I tried compiling Python as C++ code (as it wasn't clear whether C is fully supported; this is now resolved). It is actually feasible to change Python so that it compiles with a C++ compiler and still continues to compile as C also, with just a few ifdefs. This is, of course, off-topic wrt. the original question: even C++11 compilers often don't support non-ASCII identifiers. Regards, Martin ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Wed, Feb 8, 2012 at 15:31, Terry Reedy wrote:
> On 2/8/2012 3:16 PM, Brett Cannon wrote:
>> On Wed, Feb 8, 2012 at 14:57, Terry Reedy wrote:
>>> Would the following work? Treat a function as a 'loop' in that it
>>> may be executed repeatedly. Treat 'import x' in a function as what
>>> it is, an __import__ call plus a local assignment. Apply a version
>>> of the usual optimization: put a sys.modules-based lazy import
>>> outside of the function (at the top of the module?) and leave the
>>> local assignment "x = sys.modules['x']" in the function. Change
>>> sys.modules.__delattr__ to replace a module with a dummy, so the
>>> function will still work after a deletion, as it does now.
>>
>> Probably, but I would hate to force people to code in a specific way for
>> it to work.
>
> The intent of what I proposed it to be transparent for imports within
> functions. It would be a minor optimization if anything, but it would mean
> that there is a lazy mechanism in place.
>
> For top-level imports, unless *all* are made lazy, then there *must* be
> some indication in the code of whether to make it lazy or not.

Not true; importlib would make it dead-simple to whitelist what modules to
make lazy (e.g. your app code lazy but all stdlib stuff not, etc.).
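
A whitelist of lazily imported modules can be sketched with importlib.util.LazyLoader, which was added to the standard library (Python 3.5) well after this thread, so it is not what anyone here had available. The LAZY_MODULES set and the import_maybe_lazy helper are hypothetical names for this example, and the sketch does not handle a module that cannot be found.

```python
import importlib
import importlib.util
import sys

LAZY_MODULES = {"colorsys"}  # hypothetical whitelist of modules to defer


def import_maybe_lazy(name):
    """Import eagerly unless `name` is whitelisted for lazy loading."""
    if name in sys.modules:
        return sys.modules[name]
    if name not in LAZY_MODULES:
        return importlib.import_module(name)
    # Wrap the real loader so the module body runs on first attribute access.
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # only arms the lazy module, does not run it
    return module


colorsys = import_maybe_lazy("colorsys")
# The attribute access below is what triggers the real import.
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))
```

Code using the module needs no changes, which is the property Brett is after: the decision about what is lazy lives in one place rather than in each import statement.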
Re: [Python-Dev] peps: Update with bugfix releases.
In article <[email protected]>, "Martin v. Lowis" wrote: > Am 05.02.2012 21:34, schrieb Ned Deily: > > In article > > <[email protected]>, > > [email protected] wrote: > > > >>> I understand that but, to me, it makes no sense to send out truly > >>> broken releases. Besides, the hash collision attack is not exactly > >>> new either. Another few weeks can't make that much of a difference. > >> > >> Why would the release be truly broken? It surely can't be worse than > >> the current releases (which apparently aren't truly broken, else > >> there would have been no point in releasing them back then). > > > > They were broken by the release of OS X 10.7 and Xcode 4.2 which were > > subsequent to the previous releases. None of the currently available > > python.org installers provide a fully working system on OS X 10.7, or on > > OS X 10.6 if the user has installed Xcode 4.2 for 10.6. > > In what way are the current releases not fully working? Are you > referring to issues with building extension modules? Yes > If it's that, I wouldn't call that "truly broken". Plus, the releases > continue to work fine on older OS X releases. If not "truly", then how about "seriously broken"? And it's not quite the case that the releases work fine on older OS X releases. The installers in question, the 64-/32-bit installer variants, work only on OS X 10.6 and above. If the user installed the optional Xcode 4.2 for 10.6, then they have the same problem with building extension modules as 10.7 users do. > So when you build a bug fix release, just build it with the same tool > chain as the previous bug fix release, and all is fine. I am not proposing changing the build tool chain for 3.2.x and 2.7.x bug fix releases. But, users not being able to build extension modules out of the box with the default vendor-supplied build tools as they have in the past is not a case of of all is fine, IMO. 
However, this may all be a moot point now as I've subsequently proposed a patch to Distutils to smooth over the problem by checking for the case of gcc-4.2 being required but not available and, if so, automatically substituting clang instead. (http://bugs.python.org/issue13590) This trades off a certain risk of using clang for extension modules against the 100% certainty of users being unable to build extension modules. -- Ned Deily, [email protected] ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP for new dictionary implementation
On 2/8/2012 2:18 PM, Mark Shannon wrote:
A pretty clear draft PEP.
> Changes to repr() output and iteration order:
> For most cases, this will be unchanged.
> However for some split-table dictionaries the iteration order will
> change.
>
> Neither of these cons should be a problem.
> Modules which meddle with the internals of the dictionary
> implementation are already broken and should be fixed to use the API.

So are modules that depend on set and dict iteration order and the
consequent representations.

> The iteration order of dictionaries was never defined and has always been
> arbitrary; it is different for Jython and PyPy.

I am pretty sure iteration order has changed between CPython versions in
the past (and that when it did, people got caught). The documentation
for doctest has section 25.2.3.6. Warnings. It starts with this very issue!
'''
doctest is serious about requiring exact matches in expected output. If
even a single character doesn’t match, the test fails. This will
probably surprise you a few times, as you learn exactly what Python does
and doesn’t guarantee about output. For example, when printing a dict,
Python doesn’t guarantee that the key-value pairs will be printed in any
particular order, so a test like
>>> foo()
{"Hermione": "hippogryph", "Harry": "broomstick"}
is vulnerable! One workaround is to do
>>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
True
instead. Another is to do
>>> d = sorted(foo().items())
>>> d
[('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
'''
(Object addresses and full-precision float representations are also
discussed.)
--
Terry Jan Reedy
Re: [Python-Dev] PEP for new dictionary implementation
Terry Reedy wrote:
> On 2/8/2012 2:18 PM, Mark Shannon wrote:
>
> A pretty clear draft PEP.
>
>> Changes to repr() output and iteration order:
>> For most cases, this will be unchanged.
>> However for some split-table dictionaries the iteration order will
>> change.
>>
>> Neither of these cons should be a problem.
>> Modules which meddle with the internals of the dictionary
>> implementation are already broken and should be fixed to use the API.
>
> So are modules that depend on set and dict iteration order and the
> consequent representations.
>
>> The iteration order of dictionaries was never defined and has always been
>> arbitrary; it is different for Jython and PyPy.
>
> I am pretty sure iteration order has changed between CPython versions in
> the past (and that when it did, people got caught). The documentation
> for doctest has section 25.2.3.6. Warnings. It starts with this very issue!
> '''
> doctest is serious about requiring exact matches in expected output. If
> even a single character doesn’t match, the test fails. This will
> probably surprise you a few times, as you learn exactly what Python does
> and doesn’t guarantee about output. For example, when printing a dict,
> Python doesn’t guarantee that the key-value pairs will be printed in any
> particular order, so a test like
> >>> foo()
> {"Hermione": "hippogryph", "Harry": "broomstick"}
> is vulnerable! One workaround is to do
> >>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
> True
> instead. Another is to do
> >>> d = sorted(foo().items())
> >>> d
> [('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
> '''
> (Object addresses and full-precision float representations are also
> discussed.)

There are a few things in the standard lib that rely on dict repr ordering:
http://bugs.python.org/issue13907
http://bugs.python.org/issue13909
I expect that the long-awaited fix to the hash-collision security issue
will expose a few more.
Version 2 of the new dict passes all these tests,
but that doesn't mean the tests are correct.
Cheers,
Mark.
Re: [Python-Dev] A new dictionary implementation
Just more info: changeset is: 74843:20702d1acf17

Cheers,
francis
[Python-Dev] ctypes/utils.py problem
Hi everyone,
I'm working with the LTTng (Linux Tracing) team and we came across a problem
with our user-space tracer and Python's default behavior. We provide a libc
wrapper that instruments free() and malloc() with a simple LD_PRELOAD of that
lib.
This lib *was* named "liblttng-ust-libc.so" and we came across Python software
registering to our trace registry daemon (meaning that somehow the python binary
is using our in-process library). We dug a bit and found this:
Lib/ctypes/util.py:

def _findLib_ldconfig(name):
    # XXX assuming GLIBC's ldconfig (with option -p)
    expr = r'/[^\(\)\s]*lib%s\.[^\(\)\s]*' % re.escape(name)
    res = re.search(expr,
                    os.popen('/sbin/ldconfig -p 2>/dev/null').read())

and, at least, also found in _findLib_gcc(name) and _findSoname_ldconfig(name).
This causes Python to load any library whose name ends with "libc.so".
I don't know the reasons behind this but we are concerned about "future issues"
that can occur with this kind of behavior.
For now, we renamed our lib so everything is fine.
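
The matching behaviour can be reproduced without calling ldconfig. In the sketch below the sample text imitates `ldconfig -p` output and is made up for illustration. The second, stricter pattern is only one possible way to anchor the match at a path separator, not necessarily the fix CPython adopted.

```python
import re

# Invented sample imitating two lines of `/sbin/ldconfig -p` output.
SAMPLE_LDCONFIG_OUTPUT = (
    "\tliblttng-ust-libc.so.0 (libc6,x86-64) => /usr/lib/liblttng-ust-libc.so.0\n"
    "\tlibc.so.6 (libc6,x86-64) => /lib/x86_64-linux-gnu/libc.so.6\n"
)


def find_loose(name, text):
    """Same pattern as the quoted ctypes code: 'lib<name>.' anywhere in a path."""
    expr = r'/[^\(\)\s]*lib%s\.[^\(\)\s]*' % re.escape(name)
    match = re.search(expr, text)
    return match.group(0) if match else None


def find_anchored(name, text):
    """Hypothetical stricter variant: 'lib<name>.' must directly follow a '/'."""
    expr = r'/[^\(\)\s]*/lib%s\.[^\(\)\s]*' % re.escape(name)
    match = re.search(expr, text)
    return match.group(0) if match else None


print(find_loose("c", SAMPLE_LDCONFIG_OUTPUT))
print(find_anchored("c", SAMPLE_LDCONFIG_OUTPUT))
```

With this sample, the loose pattern returns the path of the tracing wrapper because its file name merely ends in "libc.so.0", while the anchored pattern returns the real C library.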
Thanks a lot guys.
David
Re: [Python-Dev] A new dictionary implementation
Hi Mark,

I've just cloned:

> Repository: https://bitbucket.org/markshannon/cpython_new_dict
> Do please try it on your machine(s).

that's a:

Linux random 3.1.0-1-amd64 #1 SMP Tue Jan 10 05:01:58 UTC 2012 x86_64 GNU/Linux

and I'm getting:

gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Objects/dictobject.o Objects/dictobject.c
gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Objects/memoryobject.o Objects/memoryobject.c
Objects/dictobject.c: In function ‘dict_popitem’:
Objects/dictobject.c:2208:5: error: ‘PyDictKeyEntry’ has no member named ‘me_value’
make: *** [Objects/dictobject.o] Error 1
make: *** Waiting for unfinished jobs

Cheers
francis
Re: [Python-Dev] Add a new "locale" codec?
Simon Cross wrote:
> I think I'm -1 on a "locale" encoding because it refers to different
> actual encodings depending on where and when it's run, which seems
> surprising

Why is it surprising? Surely that's the whole point of a locale encoding:
to use the locale encoding, whatever that happens to be at the time.

Perhaps I'm missing something, but I don't see how this proposal is any
more surprising than the fact that (say) Decimal uses a global context if
you don't specify one explicitly. Only this should be *less* surprising,
because Decimal uses the global context by default, while this will use the
global locale encoding only if you explicitly tell it to.

--
Steven
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
Paul Moore wrote:
> I would suggest that, assuming python-dev want to take ownership of the
> module, one last-ditch attempt be made to contact Fredrik. We should
> email him,

I wouldn't call email "last-ditch". I call email "first-ditch".

I would expect that a last-ditch attempt would include trying to call him
by phone, sending him a dead-tree letter by post, and if important enough,
actually driving out to his home or place of work and trying to see him
face to face. (All depending on the importance of making contact,
naturally.)

--
Steven
Re: [Python-Dev] ctypes/utils.py problem
Could you file a bug at bugs.python.org, David, so we don't lose track of
this?
On Wed, Feb 8, 2012 at 18:03, David Goulet wrote:
> Hi everyone,
>
> I'm working with the LTTng (Linux Tracing) team and we came across a problem
> with our user-space tracer and Python's default behavior. We provide a libc
> wrapper that instruments free() and malloc() with a simple LD_PRELOAD of
> that lib.
>
> This lib *was* named "liblttng-ust-libc.so" and we came across Python software
> registering to our trace registry daemon (meaning that somehow the python
> binary is using our in-process library). We dug a bit and found this:
>
> Lib/ctypes/util.py:
>
> def _findLib_ldconfig(name):
>     # XXX assuming GLIBC's ldconfig (with option -p)
>     expr = r'/[^\(\)\s]*lib%s\.[^\(\)\s]*' % re.escape(name)
>     res = re.search(expr,
>                     os.popen('/sbin/ldconfig -p 2>/dev/null').read())
>
> and, at least, also found in _findLib_gcc(name) and
> _findSoname_ldconfig(name).
>
> This causes Python to load any library whose name ends with "libc.so".
>
> I don't know the reasons behind this but we are concerned about "future
> issues" that can occur with this kind of behavior.
>
> For now, we renamed our lib so everything is fine.
>
> Thanks a lot guys.
> David
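The behavior David describes can be reproduced with the quoted expression alone. The sketch below is only an illustration: the ldconfig-style sample text is made up, and strict_pattern is a hypothetical tightening, not a fix taken from CPython.

```python
import re

def loose_pattern(name):
    # Same expression as in the quoted _findLib_ldconfig snippet.
    return r'/[^\(\)\s]*lib%s\.[^\(\)\s]*' % re.escape(name)

def strict_pattern(name):
    # Hypothetical variant: "lib<name>." must directly follow a "/".
    return r'/(?:[^\(\)\s]*/)?lib%s\.[^\(\)\s]*' % re.escape(name)

# Made-up "ldconfig -p" style output, for illustration only.
SAMPLE = (
    "\tliblttng-ust-libc.so (libc6,x86-64) => /usr/lib/liblttng-ust-libc.so\n"
    "\tlibc.so.6 (libc6,x86-64) => /lib/x86_64-linux-gnu/libc.so.6\n"
)

loose_hit = re.search(loose_pattern("c"), SAMPLE).group(0)
strict_hit = re.search(strict_pattern("c"), SAMPLE).group(0)

print(loose_hit)   # the tracer wrapper, not the C library
print(strict_hit)  # the real libc path
```

The loose expression allows any characters between the "/" and "lib<name>.", so a file whose name merely ends in "libc.so" is accepted when it is listed first.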
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Wed, Feb 8, 2012 at 4:08 PM, Brett Cannon wrote:
> On Wed, Feb 8, 2012 at 15:31, Terry Reedy wrote:
>
>> For top-level imports, unless *all* are made lazy, then there *must* be
>> some indication in the code of whether to make it lazy or not.
>
> Not true; importlib would make it dead-simple to whitelist what modules to
> make lazy (e.g. your app code lazy but all stdlib stuff not, etc.).
There are actually only a few things stopping all imports from being lazy.
"from x import y" immediately de-lazies them, after all. ;-)
The main two reasons you wouldn't want imports to *always* be lazy are:
1. Changing sys.path or other parameters between the import statement and
the actual import
2. ImportErrors are likewise deferred until point-of-use, so conditional
importing with try/except would break.
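Point 2 above can be seen with a small sketch. LazyModule is a hypothetical stand-in for whatever proxy a lazy importer would hand back, and the module name is deliberately one that does not exist. Neither comes from importlib itself.

```python
import importlib

class LazyModule:
    """Hypothetical proxy that imports the real module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only reached for attributes that were not set in __init__.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)

# The usual "optional dependency" idiom, but with a lazy import.
try:
    accelerator = LazyModule("no_such_accelerator_module")  # hypothetical name
    have_accelerator = True    # reached: nothing has been imported yet
except ImportError:
    have_accelerator = False   # never reached with a lazy import

# The failure only surfaces at the point of use.
try:
    accelerator.dumps
    error_at_use = False
except ImportError:
    error_at_use = True

print(have_accelerator, error_at_use)
```

The try/except around the import reports success, so the fallback branch is never taken even though the module is missing.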
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Thu, Feb 9, 2012 at 2:09 AM, Antoine Pitrou wrote:
> I guess my point was: why is there a function call in that case? The
> "import" statement could look up sys.modules directly.
> Or the built-in __import__ could still be written in C, and only defer
> to importlib when the module isn't found in sys.modules.
> Practicality beats purity.
I quite like the idea of having builtin __import__ be a *very* thin
veneer around importlib that just does the "is this in sys.modules
already so we can just return it from there?" checks and delegates
other more complex cases to Python code in importlib.
Poking around in importlib.__import__ [1] (as well as
importlib._gcd_import), I'm thinking what we may want to do is break
up the logic a bit so that there are multiple helper functions that a
C version can call back into so that we can optimise certain simple
code paths to not call back into Python at all, and others to only do
so selectively.
Step 1: separate out the "fromlist" processing from __import__ into a
separate helper function
    def _process_fromlist(module, fromlist):
        # Perform any required imports as per existing code:
        # http://hg.python.org/cpython/file/aba513307f78/Lib/importlib/_bootstrap.py#l987
Step 2: separate out the relative import resolution from _gcd_import
into a separate helper function.
    def _resolve_relative_name(name, package, level):
        assert hasattr(name, 'rpartition')
        assert hasattr(package, 'rpartition')
        assert level > 0
        name = # Recalculate as per the existing code:
        # http://hg.python.org/cpython/file/aba513307f78/Lib/importlib/_bootstrap.py#l889
        return name
Step 3: Implement builtin __import__ in C (pseudo-code below):
    def __import__(name, globals={}, locals={}, fromlist=[], level=0):
        if level > 0:
            # 'package' is worked out from globals, as in the existing code
            name = importlib._resolve_relative_name(name, package, level)
        try:
            module = sys.modules[name]
        except KeyError:
            # Not cached yet, need to invoke the full import machinery
            # We already resolved any relative imports though, so
            # treat it as an absolute import
            return importlib.__import__(name, globals, locals, fromlist, 0)
        # Got a hit in the cache, see if there's any more work to do
        if not fromlist:
            # Duplicate relevant importlib.__import__ logic as C code
            # to find the right module to return from sys.modules
        elif hasattr(module, "__path__"):
            importlib._process_fromlist(module, fromlist)
        return module
This would then be similar to the way main.c already works when it
interacts with runpy - simple cases are handled directly in C, more
complex cases get handed over to the Python module.
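As a rough, runnable illustration of the cache-first idea (written in Python rather than C), the sketch below answers plain absolute imports from sys.modules and hands everything else to importlib.__import__. The name cached_import is hypothetical. Relative imports and fromlist handling are deliberately left to importlib rather than split into the helpers proposed above.

```python
import importlib
import sys

def cached_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Sketch of a thin veneer: answer from sys.modules when possible."""
    if level == 0 and not fromlist:
        module = sys.modules.get(name)
        top_level = sys.modules.get(name.partition(".")[0])
        if module is not None and top_level is not None:
            # "import a.b" binds the top-level package "a".
            return top_level
    # Everything else goes through the full machinery in importlib.
    return importlib.__import__(name, globals, locals, fromlist, level)

import json
import os.path

print(cached_import("json") is json)      # served from the cache
print(cached_import("os.path") is os)     # top-level package is returned
print(cached_import("json", fromlist=("dumps",)) is json)  # delegated
```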
Cheers,
Nick.
[1] http://hg.python.org/cpython/file/default/Lib/importlib/_bootstrap.py#l950
--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Thu, Feb 9, 2012 at 11:28 AM, PJ Eby wrote:
> The main two reasons you wouldn't want imports to *always* be lazy are:
>
> 1. Changing sys.path or other parameters between the import statement and
> the actual import
> 2. ImportErrors are likewise deferred until point-of-use, so conditional
> importing with try/except would break.
3. Module level code may have non-local side effects (e.g. installing
codecs, pickle handlers, atexit handlers)
A white-listing based approach to lazy imports would let you manage
all those issues without having to change all the code that actually
*does* the imports.
Cheers,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] [Python-checkins] cpython: PEP 410
On Thu, Feb 9, 2012 at 7:52 AM, victor.stinner wrote:
> http://hg.python.org/cpython/rev/f8409b3d6449
> changeset: 74832:f8409b3d6449
> user: Victor Stinner
> date: Wed Feb 08 14:31:50 2012 +0100
> summary:
> PEP 410
Ah, even when written by a core dev, a PEP should still be at Accepted
before we check anything in. PEP 410 is still at Draft.
Did Guido accept this one by private email? (He never made me his
delegate, and without that, my agreement doesn't count as acceptance
of the PEP).
Cheers,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] [Python-checkins] cpython: PEP 410
On Thu, Feb 9, 2012 at 2:48 PM, Nick Coghlan wrote:
> On Thu, Feb 9, 2012 at 7:52 AM, victor.stinner wrote:
>> http://hg.python.org/cpython/rev/f8409b3d6449
>> changeset: 74832:f8409b3d6449
>> user: Victor Stinner
>> date: Wed Feb 08 14:31:50 2012 +0100
>> summary:
>> PEP 410
>
> Ah, even when written by a core dev, a PEP should still be at Accepted
> before we check anything in. PEP 410 is still at Draft.
Never mind, I just saw the checkin that reverted the change.
Cheers,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Add a new "locale" codec?
On Thu, Feb 9, 2012 at 2:35 AM, Steven D'Aprano wrote:
> Simon Cross wrote:
>>
>> I think I'm -1 on a "locale" encoding because it refers to different
>> actual encodings depending on where and when it's run, which seems
>> surprising
>
> Why is it surprising? Surely that's the whole point of a locale encoding: to
> use the locale encoding, whatever that happens to be at the time.
I think there's a general expectation that if you encode something
with one codec you will be able to decode it with the same codec.
That's not necessarily true for the locale encoding.
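The round-trip concern can be sketched without touching the real locale. The two encoding names below are assumptions standing in for what a "locale" codec might resolve to on two differently configured machines.

```python
# Hypothetical: what a "locale" codec resolves to on two different machines.
WRITER_ENCODING = "latin-1"
READER_ENCODING = "utf-8"

text = "caf\u00e9"
data = text.encode(WRITER_ENCODING)

# A fixed codec name round-trips.
same_codec = data.decode(WRITER_ENCODING)

# The "same" locale codec on the other machine means a different encoding.
try:
    round_trip_ok = data.decode(READER_ENCODING) == text
except UnicodeDecodeError:
    round_trip_ok = False

print(same_codec == text, round_trip_ok)
```

Both sides would have written "locale" in their code, yet the bytes written on one machine cannot be decoded on the other.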
