Re: [Python-Dev] Add a new "locale" codec?

2012-02-08 Thread Victor Stinner
2012/2/8 Simon Cross :
> Is the idea to have:
>
>  b"foo".decode("locale")
>
> be roughly equivalent to
>
>  encoding = locale.getpreferredencoding(False)
>  b"foo".decode(encoding)
>
> ?

Yes. Whereas:

b"foo".decode(sys.getfilesystemencoding())

is equivalent to

encoding = locale.getpreferredencoding(True)
b"foo".decode(encoding)
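
As a rough illustration of the equivalence described above, the explicit
spelling can be wrapped in a small helper. This is only a sketch: the name
decode_with_locale is hypothetical, and it is not how the proposed codec
would be implemented.

```python
import locale


def decode_with_locale(data: bytes, do_setlocale: bool = False) -> str:
    """Decode bytes using the encoding reported by the locale module.

    Hypothetical helper. With do_setlocale=False the current locale is
    only read, never modified.
    """
    encoding = locale.getpreferredencoding(do_setlocale)
    return data.decode(encoding)


# ASCII-only input decodes the same way under any ASCII-compatible locale.
text = decode_with_locale(b"foo")
```

The result for non-ASCII input depends on the locale of the process that
runs it, which is the behaviour being debated in this thread.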

Victor
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Add a new "locale" codec?

2012-02-08 Thread Simon Cross
I think I'm -1 on a "locale" encoding because it refers to different
actual encodings depending on where and when it's run, which seems
surprising, and there's already a more explicit way to achieve the
same effect.

The documentation on .getpreferredencoding() says some scary things
about needing to call .setlocale() sometimes but doesn't really say
when or why. Could any of those cases make "locale" do weird things
because it doesn't call setlocale()?


Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3

2012-02-08 Thread Dirkjan Ochtman
On Wed, Feb 8, 2012 at 08:37, Stefan Behnel  wrote:
> I didn't get a response from him to my e-mails since early 2010. Maybe
> others have more luck if they try, but I don't have the impression that
> waiting another two years gets us anywhere interesting.
>
> Given that it was two months ago that I started the "Fixing the XML
> batteries" thread (and years since I brought up the topic for the first
> time), it seems to be hard enough already to get anyone on python-dev
> actually do something for Python's XML support, instead of just actively
> discouraging those who invest time and work into it.

I concur. It's important that we consider Fredrik's ownership of the
modules, but if he fails to reply to email and doesn't update his
repositories, there should be enough cause for python-dev to go on and
appropriate the stdlib versions of those modules.

Cheers,

Dirkjan


Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3

2012-02-08 Thread Eli Bendersky
On Wed, Feb 8, 2012 at 11:36, Dirkjan Ochtman  wrote:
> On Wed, Feb 8, 2012 at 08:37, Stefan Behnel  wrote:
>> I didn't get a response from him to my e-mails since early 2010. Maybe
>> others have more luck if they try, but I don't have the impression that
>> waiting another two years gets us anywhere interesting.
>>
>> Given that it was two months ago that I started the "Fixing the XML
>> batteries" thread (and years since I brought up the topic for the first
>> time), it seems to be hard enough already to get anyone on python-dev
>> actually do something for Python's XML support, instead of just actively
>> discouraging those who invest time and work into it.
>
> I concur. It's important that we consider Fredrik's ownership of the
> modules, but if he fails to reply to email and doesn't update his
> repositories, there should be enough cause for python-dev to go on and
> appropriate the stdlib versions of those modules.
>

+1.

That said, I think that the particular change discussed in this thread
can be made anyway, since it doesn't really modify ET's APIs or
functionality, just the way it gets imported from stdlib.

Eli


Re: [Python-Dev] Add a new "locale" codec?

2012-02-08 Thread And Clover

On 2012-02-08 09:28, Simon Cross wrote:
> I think I'm -1 on a "locale" encoding because it refers to different
> actual encodings depending on where and when it's run, which seems
> surprising, and there's already a more explicit way to achieve the
> same effect.


I'd agree that this is undesirable, and I don't really want 
locale-specific behaviour to leak out in other places that accept an
encoding name (eg ), but we already have this
behaviour with the "mbcs" encoding on Windows which refers to the 
locale-specific 'ANSI' code page.


--
And Clover
mailto:[email protected]
http://www.doxdesk.com/
gtalk:[email protected]


Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3

2012-02-08 Thread Paul Moore
On 8 February 2012 09:49, Eli Bendersky  wrote:
>> I concur. It's important that we consider Fredrik's ownership of the
>> modules, but if he fails to reply to email and doesn't update his
>> repositories, there should be enough cause for python-dev to go on and
>> appropriate the stdlib versions of those modules.
>
> +1.
>
> That said, I think that the particular change discussed in this thread
> can be made anyway, since it doesn't really modify ET's APIs or
> functionality, just the way it gets imported from stdlib.

I would suggest that, assuming python-dev want to take ownership of
the module, one last-ditch attempt be made to contact Fredrik. We
should email him, and copy python-dev (and maybe even python-list)
asking for his view, and ideally his blessing on the stdlib version
being forked and maintained independently going forward. Put a time
limit on responses ("if we don't hear by XXX, we'll assume Fredrik is
either uncontactable or not interested, and therefore we can go ahead
with maintaining the stdlib version independently").

It's important to respect Fredrik's wishes and ownership, but we can't
leave part of the stdlib frozen and abandoned just because he's not
available any longer.

Paul.

PS The only other options I can see are to remove elementtree from the
stdlib altogether, or explicitly document it as frozen and no longer
maintained.


Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3

2012-02-08 Thread Antoine Pitrou
On Wed, 8 Feb 2012 11:11:07 +
Paul Moore  wrote:
> On 8 February 2012 09:49, Eli Bendersky  wrote:
> >> I concur. It's important that we consider Fredrik's ownership of the
> >> modules, but if he fails to reply to email and doesn't update his
> >> repositories, there should be enough cause for python-dev to go on and
> >> appropriate the stdlib versions of those modules.
> >
> > +1.
> >
> > That said, I think that the particular change discussed in this thread
> > can be made anyway, since it doesn't really modify ET's APIs or
> > functionality, just the way it gets imported from stdlib.
> 
> I would suggest that, assuming python-dev want to take ownership of
> the module, one last-ditch attempt be made to contact Fredrik. We
> should email him, and copy python-dev (and maybe even python-list)
> asking for his view, and ideally his blessing on the stdlib version
> being forked and maintained independently going forward. Put a time
> limit on responses ("if we don't hear by XXX, we'll assume Fredrik is
> either uncontactable or not interested, and therefore we can go ahead
> with maintaining the stdlib version independently").
> 
> It's important to respect Fredrik's wishes and ownership, but we can't
> leave part of the stdlib frozen and abandoned just because he's not
> available any longer.

It's not frozen, it's actually maintained.

Regards

Antoine.




Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3

2012-02-08 Thread Nick Coghlan
On Wed, Feb 8, 2012 at 10:04 PM, Antoine Pitrou  wrote:
> On Wed, 8 Feb 2012 11:11:07 +
> Paul Moore  wrote:
>> It's important to respect Fredrik's wishes and ownership, but we can't
>> leave part of the stdlib frozen and abandoned just because he's not
>> available any longer.
>
> It's not frozen, it's actually maintained.

Indeed, it sounds like the most appropriate course (if we don't hear
otherwise from Fredrik) may be to just update PEP 360 to acknowledge
current reality (i.e. the most current release of ElementTree is
actually the one maintained by Florent in the stdlib).

I'll note that this change isn't *quite* as simple as Eli's
description earlier in the thread may suggest, though - the test suite
also needs to be updated to ensure that the Python version is still
fully exercised without the C acceleration applied. And such an
alteration would definitely be an explicit fork, even though the user
facing API doesn't change - we're changing the structure of the code
in a way that means some upstream deltas (if they happen to occur) may
not apply cleanly.
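
One way a test could make sure the pure-Python code path is exercised is to
hide the C accelerator for the duration of the test. The sketch below is an
assumption about how that might be done, not the stdlib test suite's actual
machinery; the helper names blocked_modules and accelerator_available are
hypothetical.

```python
import importlib
import sys
from contextlib import contextmanager

_MISSING = object()


@contextmanager
def blocked_modules(*names):
    """Temporarily make importing each of the given module names fail."""
    saved = {name: sys.modules.get(name, _MISSING) for name in names}
    try:
        for name in names:
            # A None entry in sys.modules makes the import system raise
            # ImportError instead of searching for the module.
            sys.modules[name] = None
        yield
    finally:
        # Restore the previous state of sys.modules exactly.
        for name, module in saved.items():
            if module is _MISSING:
                sys.modules.pop(name, None)
            else:
                sys.modules[name] = module


def accelerator_available(name):
    """Report whether the named module can currently be imported."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


with blocked_modules("_elementtree"):
    inside = accelerator_available("_elementtree")
```

Inside the with block the accelerator looks absent, so code that falls back
to a Python implementation on ImportError takes the fallback branch.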

Regards,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] [Python-checkins] Daily reference leaks (140f7de4d2a5): sum=888

2012-02-08 Thread Nick Coghlan
On Tue, Feb 7, 2012 at 2:34 PM,   wrote:
> results for 140f7de4d2a5 on branch "default"
> 
>
> test_capi leaked [296, 296, 296] references, sum=888

This appears to have started shortly after Benjamin's _PyExc_Init
bltinmod refcounting change to fix Brett's crash when bootstrapping
importlib. Perhaps we have a leak in import.c that was being masked by
the DECREF in _PyExc_Init?

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3

2012-02-08 Thread Eli Bendersky
>> It's not frozen, it's actually maintained.
>
> Indeed, it sounds like the most appropriate course (if we don't hear
> otherwise from Fredrik) may be to just update PEP 360 to acknowledge
> current reality (i.e. the most current release of ElementTree is
> actually the one maintained by Florent in the stdlib).
>
> I'll note that this change isn't *quite* as simple as Eli's
> description earlier in the thread may suggest, though - the test suite
> also needs to be updated to ensure that the Python version is still
> fully exercised without the C acceleration applied.

Sure thing. I suppose similar machinery already exists for things like
pickle / cPickle. I still maintain that it's a simple change :-)

> And such an
> alteration would definitely be an explicit fork, even though the user
> facing API doesn't change - we're changing the structure of the code
> in a way that means some upstream deltas (if they happen to occur) may
> not apply cleanly.

This is a very minimal delta, however. I think it can even be made
simpler by replacing ElementTree with a facade module that either
imports _elementtree or the Python ElementTree. So the delta vs.
upstream would only be in file placement.
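
The facade idea can be sketched roughly as follows. This is an illustration
under assumptions, not the change that was committed: load_with_fallback and
the module name _hypothetical_accel are made up for the example.

```python
import importlib


def load_with_fallback(accelerated_name, pure_name):
    """Return the accelerated module if importable, else the pure-Python one."""
    try:
        return importlib.import_module(accelerated_name)
    except ImportError:
        return importlib.import_module(pure_name)


# "_hypothetical_accel" does not exist, so the pure-Python module is used.
module = load_with_fallback("_hypothetical_accel", "xml.etree.ElementTree")
```

Callers only ever see one module object, so the user-facing API stays the
same whichever implementation ends up being loaded.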

But these are two conflicting discussions - if changes were made in
stdlib *already* that were not propagated upstream, what use is a
clean delta?

Eli


Re: [Python-Dev] [Python-checkins] Daily reference leaks (140f7de4d2a5): sum=888

2012-02-08 Thread Benjamin Peterson
2012/2/8 Nick Coghlan :
> On Tue, Feb 7, 2012 at 2:34 PM,   wrote:
>> results for 140f7de4d2a5 on branch "default"
>> 
>>
>> test_capi leaked [296, 296, 296] references, sum=888
>
> This appears to have started shortly after Benjamin's _PyExc_Init
> bltinmod refcounting change to fix Brett's crash when bootstrapping
> importlib. Perhaps we have a leak in import.c that was being masked by
> the DECREF in _PyExc_Init?

According to test_capi, it's expected to leak?



-- 
Regards,
Benjamin


Re: [Python-Dev] Add a new "locale" codec?

2012-02-08 Thread Victor Stinner
2012/2/8 Simon Cross :
> I think I'm -1 on a "locale" encoding because it refers to different
> actual encodings depending on where and when it's run, which seems
> surprising, and there's already a more explicit way to achieve the
> same effect.

The following code is just an example to explain how locale is
supposed to work, but the implementation is completely different:

import locale

encoding = locale.getpreferredencoding(False)
# ... execute some code ...
text = data.decode(encoding)   # data is a bytes object
data = text.encode(encoding)

The current locale is process-wide: if a thread changes the locale,
all threads are affected. Some functions have to use the current
locale encoding, and not the locale encoding read at startup. Examples
with C functions: strerror(), strftime(), tzname, etc.

My codec implementation uses mbstowcs() and wcstombs() which don't
touch the current locale, but just use it. Said differently, the
locale codec would just give access to these functions.

> The documentation on .getpreferredencoding() says some scary things
> about needing to call .setlocale() sometimes but doesn't really say
> when or why.

locale.getpreferredencoding() calls setlocale() by default.
locale.getpreferredencoding(False) doesn't call setlocale().
setlocale() is not called on Windows or if locale.CODESET is not
available (it is available on FreeBSD, Mac OS X, Linux, etc.).

> Could any of those cases make "locale" do weird things because it doesn't 
> call setlocale()?

Sorry, I don't understand what you mean by "weird things". The
"locale" codec doesn't touch the locale.


Re: [Python-Dev] Add a new "locale" codec?

2012-02-08 Thread Simon Cross
On Wed, Feb 8, 2012 at 3:25 PM, Victor Stinner
 wrote:
> Sorry, I don't understand what you mean by "weird things". The
> "locale" codec doesn't touch the locale.

Sorry for being unclear. My question was about the following lines
from http://docs.python.org/library/locale.html#locale.getpreferredencoding:

"""On some systems, it is necessary to invoke setlocale() to obtain
the user preferences, so this function is not thread-safe. If invoking
setlocale is not necessary or desired, do_setlocale should be set to
False."""

So my question was about what happens on such systems where invoking
setlocale is necessary to obtain the user preferences?

Schiavo
Simon


Re: [Python-Dev] Add a new "locale" codec?

2012-02-08 Thread Simon Cross
On Wed, Feb 8, 2012 at 3:25 PM, Victor Stinner
 wrote:
> The current locale is process-wide: if a thread changes the locale,
> all threads are affected. Some functions have to use the current
> locale encoding, and not the locale encoding read at startup. Examples
> with C functions: strerror(), strftime(), tzname, etc.

Could a core part of Python break because of a sequence like:

1) Encode unicode to bytes using locale codec.
2) Silly third-party library code changes the locale codec.
3) Attempt to decode bytes back to unicode using the locale codec
(which is now a different underlying codec).

?
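
The sequence above can be simulated without touching the real locale by
naming the two encodings explicitly. This is only a sketch: the encodings
latin-1 and utf-8 stand in for "the locale encoding before and after the
change" and are assumptions chosen for the example.

```python
def roundtrip(text, encode_with, decode_with):
    """Encode with one encoding, then decode with a possibly different one."""
    data = text.encode(encode_with)
    # errors="replace" keeps the example from raising; undecodable bytes
    # become U+FFFD replacement characters instead.
    return data.decode(decode_with, errors="replace")


original = "caf\u00e9"

# Step 1 and step 3 agree on the encoding: the text survives.
same = roundtrip(original, "latin-1", "latin-1")

# The encoding changed between step 1 and step 3: the text is damaged.
changed = roundtrip(original, "latin-1", "utf-8")
```

With strict error handling the second call would raise UnicodeDecodeError
instead of silently producing different text.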

Schiavo
Simon


Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3

2012-02-08 Thread Paul Moore
On 8 February 2012 12:21, Nick Coghlan  wrote:
> On Wed, Feb 8, 2012 at 10:04 PM, Antoine Pitrou  wrote:
>> On Wed, 8 Feb 2012 11:11:07 +
>> Paul Moore  wrote:
>>> It's important to respect Fredrik's wishes and ownership, but we can't
>>> leave part of the stdlib frozen and abandoned just because he's not
>>> available any longer.
>>
>> It's not frozen, it's actually maintained.
>
> Indeed, it sounds like the most appropriate course (if we don't hear
> otherwise from Fredrik) may be to just update PEP 360 to acknowledge
> current reality (i.e. the most current release of ElementTree is
> actually the one maintained by Florent in the stdlib).

Ah, OK. My apologies, I had misunderstood the previous discussion. In
which case I agree with Nick, let's update PEP 360 and move forward.

On that basis, +1 to Eli's suggestion of making cElementTree a
transparent accelerator.
Paul


Re: [Python-Dev] A new dictionary implementation

2012-02-08 Thread Mark Shannon

Hi,

Version 2 is now available.

Version 2 makes as few changes to tunable constants as possible, and 
generally does not change iteration order (so repr() is unchanged).

All tests pass (the only changes to tests are for sys.getsizeof() ).

Repository: https://bitbucket.org/markshannon/cpython_new_dict
Issue http://bugs.python.org/issue13903

Performance changes are basically zero for non-OO code.
Average -0.5% speed change on 2n3 benchmarks, a few benchmarks show
a small reduction in memory use. (see notes below)

GCbench uses 47% less memory and is 12% faster.
2to3, which seems to be the only "realistic" benchmark that runs on Py3,
shows no change in speed and uses 10% less memory.

All benchmarks and tests performed on old, slow 32bit machine
with linux.
Do please try it on your machine(s).

If accepted, the new dict implementation will allow a useful 
optimisation of the LOAD_GLOBAL (and possibly LOAD_ATTR) bytecode:

By testing to see if the (immutable) keys-table is the expected table,
the value can be accessed directly by index, rather than by name.
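
A toy model may help picture that optimisation. The classes below are
hypothetical and written in Python purely for illustration; they are not
taken from the patch, which works on the C dict implementation.

```python
class KeysTable:
    """Immutable mapping from names to slot indexes, shared between dicts."""

    def __init__(self, names):
        self.index = {name: i for i, name in enumerate(names)}


class ToyDict:
    """A dict split into a shared keys table and its own list of values."""

    def __init__(self, keys, values):
        self.keys = keys
        self.values = list(values)


class CachedLoad:
    """One load site that remembers which slot a name had last time."""

    def __init__(self, name):
        self.name = name
        self.expected_keys = None
        self.slot = None
        self.hits = 0

    def load(self, d):
        if d.keys is self.expected_keys:
            # Fast path: same keys table as before, so the slot is still valid.
            self.hits += 1
            return d.values[self.slot]
        # Slow path: look the name up and remember the table and the slot.
        self.expected_keys = d.keys
        self.slot = d.keys.index[self.name]
        return d.values[self.slot]


keys = KeysTable(["x", "y"])
first = ToyDict(keys, [1, 2])
second = ToyDict(keys, [10, 20])
site = CachedLoad("y")
```

The identity check on the keys table replaces a hash lookup by name, which
is where the saving would come from.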

Cheers,
Mark.


Notes:
All benchmarks from http://hg.python.org/benchmarks/
using the -m flag to get memory usage data.

I've ignored the json benchmarks, which show unstable behaviour
on my machine.
Tiny changes to the dict being serialized or to the random seed can 
change the relative speed of my implementation vs CPython from -25% to +10%.



Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Brett Cannon
On Tue, Feb 7, 2012 at 17:42, Antoine Pitrou  wrote:

> On Tue, 7 Feb 2012 17:24:21 -0500
> Brett Cannon  wrote:
> >
> > IOW you want the sys.modules case fast, which I will never be able to match
> > compared to C code since that is pure execution with no I/O.
>
> Why wouldn't you continue using C code for that? It's trivial (just a dict
> lookup).
>

 Sure, but it's all the code between the function call and hitting
sys.modules which would also need to get shoved into the C code. As I said,
I have not tried to optimize anything yet (and unfortunately a lot of the
upfront costs are over stupid things like checking if  __import__ is being
called with a string for the module name).


Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Brett Cannon
On Tue, Feb 7, 2012 at 18:08, Antoine Pitrou  wrote:

> On Tue, 7 Feb 2012 17:16:18 -0500
> Brett Cannon  wrote:
> >
> > > > IOW I really do not look forward to someone saying "importlib is so
> > > > much slower at importing a module containing ``pass``" when (a) that never
> > > > happens, and (b) most programs do not spend their time importing but
> > > > instead doing interesting work.
> > >
> > > Well, import time is so important that the Mercurial developers have
> > > written an on-demand import mechanism, to reduce the latency of
> > > command-line operations.
> > >
> >
> > Sure, but they are a somewhat extreme case.
>
> I don't think Mercurial is extreme. Any command-line tool written in
> Python applies. For example, yum (Fedora's apt-get) is written in
> Python. And I'm sure many people do small administration scripts in
> Python. These tools may then be run in a loop by whatever other script.
>
> > > But it's not only important for Mercurial and the like. Even if you're
> > > developing a Web app, making imports slower will make restarts slower,
> > > and development more tedious in the first place.
> > >
> > >
> > Fine, startup cost from a hard crash I can buy when you are getting 1000
> > QPS, but development more tedious?
>
> Well, waiting several seconds when reloading a development server is
> tedious. Anyway, my point was that other cases (than command-line
> tools) can be negatively impacted by import time.
>
> > > > So, if there is going to be some baseline performance target I need to
> > > > hit to make people happy I would prefer to know what that (real-world)
> > > > benchmark is and what the performance target is going to be on a
> > > > non-debug build.
> > >
> > > - No significant slowdown in startup time.
> > >
> >
> > What's significant and measuring what exactly? I mean startup already has
> > a ton of imports as it is, so this would wash out the point of measuring
> > practically anything else for anything small.
>
> I don't understand your sentence. Yes, startup has a ton of imports and
> that's why I'm fearing it may be negatively impacted :)
>
> ("a ton" being a bit less than 50 currently)
>

So you want less than a 50% startup cost on the standard startup benchmarks?


>
> > This is why I said I want a
> > benchmark to target which does actual work since flat-out startup time
> > measures nothing meaningful but busy work.
>
> "Actual work" can be very small in some cases. For example, if you run
> "hg branch" I'm quite sure it doesn't do a lot of work except importing
> many modules and then reading a single file in .hg (the one named
> ".hg/branch" probably, but I'm not a Mercurial dev).
>
> In the absence of more "real world" benchmarks, I think the startup
> benchmarks in the benchmarks repo are a good baseline.
>
> That said you could also install my 3.x port of Twisted here:
> https://bitbucket.org/pitrou/t3k/
>
> and then run e.g. "python3 bin/trial -h".
>
> > I would get more out of code
> > that just stat'ed every file in Lib since at least that did some work.
>
> stat()ing files is not really representative of import work. There are
> many indirections in the import machinery.
> (actually, even import.c appears quite slower than a bunch of stat()
> calls would imply)
>
> > > - Within 25% of current performance when importing, say, the "struct"
> > >  module (Lib/struct.py) from bytecode.
> > >
> >
> > Why struct? It's such a small module that it isn't really a typical
> > module.
>
> Precisely to measure the overhead. Typical module size will vary
> depending on development style. Some people may prefer writing many
> small modules. Or they may be using many small libraries, or using
> libraries that have adopted such a development style.
>
> Measuring the overhead on small modules will make sure we aren't overly
> confident.
>
> > The median file size of Lib is 11K (e.g. tabnanny.py), not 238 bytes
> > (which is barely past Hello World). And is this just importing struct or
> > is this from startup, e.g. ``python -c "import struct"``?
>
> Just importing struct, as with the timeit snippets in the other thread.


 OK, so less than 25% slowdown when importing a module with pre-existing
bytecode that is very small.

And here I was worrying you were going to suggest easy goals to reach for.
;)


Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Brett Cannon
On Tue, Feb 7, 2012 at 21:27, PJ Eby  wrote:

>
>
> On Tue, Feb 7, 2012 at 5:24 PM, Brett Cannon  wrote:
>
>>
>> On Tue, Feb 7, 2012 at 16:51, PJ Eby  wrote:
>>
>>> On Tue, Feb 7, 2012 at 3:07 PM, Brett Cannon  wrote:
>>>
>>>> So, if there is going to be some baseline performance target I need to
>>>> hit to make people happy I would prefer to know what that (real-world)
>>>> benchmark is and what the performance target is going to be on a non-debug
>>>> build. And if people are not worried about the performance then I'm happy
>>>> with that as well. =)
>>>>
>>>
>>> One thing I'm a bit worried about is repeated imports, especially ones
>>> that are inside frequently-called functions.  In today's versions of
>>> Python, this is a performance win for "command-line tool platform" systems
>>> like Mercurial and PEAK, where you want to delay importing as long as
>>> possible, in case the code that needs the import is never called at all...
>>>  but, if it *is* used, you may still need to use it a lot of times.
>>>
>>> When writing that kind of code, I usually just unconditionally import
>>> inside the function, because the C code check for an already-imported
>>> module is faster than the Python "if" statement I'd have to clutter up my
>>> otherwise-clean function with.
>>>
>>> So, in addition to the things other people have mentioned as performance
>>> targets, I'd like to keep the slowdown factor low for this type of scenario
>>> as well.  Specifically, the slowdown shouldn't be so much as to motivate
>>> lazy importers like Mercurial and PEAK to need to rewrite in-function
>>> imports to do the already-imported check ourselves.  ;-)
>>>
>>> (Disclaimer: I haven't actually seen Mercurial's delayed/dynamic import
>>> code, so I can't say for 100% sure if they'd be affected the same way.)
>>>
>>
>> IOW you want the sys.modules case fast, which I will never be able to
>> match compared to C code since that is pure execution with no I/O.
>>
>
> Couldn't you just prefix the __import__ function with something like this:
>
>  ...
>  try:
>   module = sys.modules[name]
>  except KeyError:
>   # slow code path
>
> (Admittedly, the import lock is still a problem; initially I thought you
> could just skip it for this case, but the problem is that another thread
> could be in the middle of executing the module.)
>

I practically do already. As of right now there are some 'if' checks that
come ahead of it that I could shift around to fast path this even more
(since who cares about types and such if the module name happens to be in
sys.modules), but  it isn't that far off as-is.
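
For readers following along, the fast path being discussed amounts to
something like the sketch below. It is an illustration only: fast_import is
a hypothetical name, the code is not importlib's actual implementation, and
it deliberately ignores the import lock issue raised in the quoted text as
well as relative imports.

```python
import importlib
import sys


def fast_import(name):
    """Return an already-imported module cheaply, else do a full import."""
    try:
        # Fast path: a single dict lookup, no argument checks, no I/O.
        return sys.modules[name]
    except KeyError:
        # Slow path: run the full import machinery.
        return importlib.import_module(name)


first = fast_import("struct")   # may take the slow path
second = fast_import("struct")  # served from sys.modules
```

Any checks placed ahead of the dict lookup are paid on every call, which is
why their ordering matters for in-function imports.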


Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Antoine Pitrou
Le mercredi 08 février 2012 à 11:01 -0500, Brett Cannon a écrit :
> 
> 
> On Tue, Feb 7, 2012 at 17:42, Antoine Pitrou  wrote:
> > On Tue, 7 Feb 2012 17:24:21 -0500
> > Brett Cannon  wrote:
> > >
> > > IOW you want the sys.modules case fast, which I will never be able to
> > > match compared to C code since that is pure execution with no I/O.
> >
> > Why wouldn't you continue using C code for that? It's trivial (just a
> > dict lookup).
>
> Sure, but it's all the code between the function call and hitting
> sys.modules which would also need to get shoved into the C code. As I
> said, I have not tried to optimize anything yet (and unfortunately a
> lot of the upfront costs are over stupid things like checking if
> __import__ is being called with a string for the module name).

I guess my point was: why is there a function call in that case? The
"import" statement could look up sys.modules directly.
Or the built-in __import__ could still be written in C, and only defer
to importlib when the module isn't found in sys.modules.
Practicality beats purity.

Regards

Antoine.




Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Brett Cannon
On Tue, Feb 7, 2012 at 22:47, Nick Coghlan  wrote:

> On Wed, Feb 8, 2012 at 12:54 PM, Terry Reedy  wrote:
> > On 2/7/2012 9:35 PM, PJ Eby wrote:
> >>  It's just that not everything I write can depend on Importing.
> >> Throw an equivalent into the stdlib, though, and I guess I wouldn't have
> >> to worry about dependencies...
> >
> > And that is what I think (agree?) should be done to counteract the likely
> > slowdown from using importlib.
>
> Yeah, this is one frequently reinvented wheel that could definitely do
> with a standard implementation. Christian Heimes made an initial
> attempt at such a thing years ago with PEP 369, but an importlib based
> __import__ would let the implementation largely be pure Python (with
> all the increase in power and flexibility that implies).
>
>
I'll see if I can come up with a pure Python way to handle setting
attributes on the module since that is the one case that my importers
project code can't handle.


> I'm not sure such an addition would help much with the base
> interpreter start up time though - most of the modules we bring in are
> because we're actually using them for some reason.
>

It wouldn't. This would be for third-parties only.


>
> The other thing that shouldn't be underrated here is the value in
> making the builtin import system PEP 302 compliant from a
> *documentation* perspective. I've made occasional attempts at fully
> documenting the import system over the years, and I always end up
> giving up because the combination of the pre-PEP 302 builtin
> mechanisms in import.c and the PEP 302 compliant mechanisms for things
> like zipimport just degenerate into a mess of special cases that are
> impossible to justify beyond "nobody got around to fixing this yet".
> The fact that we have an undocumented PEP 302 based reimplementation
> of imports squirrelled away in pkgutil to make pkgutil and runpy work
> is sheer insanity (replacing *that* with importlib might actually be a
> good first step towards full integration).
>

I actually have never bothered to explain import as it is currently
implemented in any of my PyCon import talks precisely because it is such a
mess. It's just easier to explain from a PEP 302 perspective since you can
actually comprehend that.


Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Brett Cannon
On Tue, Feb 7, 2012 at 22:47, Nick Coghlan  wrote

[SNIP]


> The fact that we have an undocumented PEP 302 based reimplementation
> of imports squirrelled away in pkgutil to make pkgutil and runpy work
> is sheer insanity (replacing *that* with importlib might actually be a
> good first step towards full integration).
>

It easily goes beyond runpy. You could ditch much of imp's C code (e.g.
load_module()), you could write py_compile and compileall using importlib,
you could rewrite zipimport, etc. Anything that touches import could be
refactored to (a) use just Python code, and (b) reshare code so as to not
re-invent the wheel constantly.


Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Brett Cannon
On Tue, Feb 7, 2012 at 18:26, Alex Gaynor  wrote:

> Brett Cannon  python.org> writes:
>
>
> > IOW you want the sys.modules case fast, which I will never be able to
> match
> compared to C code since that is pure execution with no I/O.
> >
>
>
> Sure you can: have a really fast Python VM.
>
> Constructive: if you can run this code under PyPy it'd be easy to just:
>
> $ pypy -mtimeit "import struct"
> $ pypy -mtimeit -s "import importlib" "importlib.import_module('struct')"
>
> Or whatever the right API is.


I'm not worried about PyPy. =) I assume you will just  flat-out use
importlib regardless of what happens with CPython since it is/will be fully
compatible and is already written for you.


Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Brett Cannon
On Wed, Feb 8, 2012 at 11:09, Antoine Pitrou  wrote:

> Le mercredi 08 février 2012 à 11:01 -0500, Brett Cannon a écrit :
> >
> >
> > On Tue, Feb 7, 2012 at 17:42, Antoine Pitrou 
> > wrote:
> > On Tue, 7 Feb 2012 17:24:21 -0500
> > Brett Cannon  wrote:
> > >
> > > IOW you want the sys.modules case fast, which I will never
> > be able to match
> > > compared to C code since that is pure execution with no I/O.
> >
> >
> > > Why wouldn't you continue using C code for that? It's trivial
> > (just a dict
> > lookup).
> >
> >
> >  Sure, but it's all the code between the function call and hitting
> > sys.modules which would also need to get shoved into the C code. As I
> > said, I have not tried to optimize anything yet (and unfortunately a
> > lot of the upfront costs are over stupid things like checking if
> > __import__ is being called with a string for the module name).
>
> I guess my point was: why is there a function call in that case? The
> "import" statement could look up sys.modules directly.
>

Because people like to do wacky stuff  with their imports and so fully
bypassing __import__ would be bad.


> Or the built-in __import__ could still be written in C, and only defer
> to importlib when the module isn't found in sys.modules.
> Practicality beats purity.


 It's a possibility, although that would require every function call to
fetch the PyInterpreterState to get at the cached __import__ (so the proper
sys and imp modules are used) and I don't know how expensive that would be
(probably not as expensive as calling out to Python code but I'm
thinking out loud).
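
The fast-path idea can be sketched in pure Python. This is only an illustration of the control flow being discussed, not how CPython implements it; the function name import_with_fast_path is made up for the example, and the real proposal would do the dictionary lookup in C.

```python
import importlib
import sys


def import_with_fast_path(name):
    """Return an already-imported module straight from sys.modules.

    Only when the module is missing do we pay for the full import
    machinery (here importlib.import_module stands in for it).
    """
    try:
        # Fast path: a plain dict lookup, no I/O.
        return sys.modules[name]
    except KeyError:
        # Slow path: defer to importlib.
        return importlib.import_module(name)


import json

print(import_with_fast_path("json") is json)
print(import_with_fast_path("colorsys").__name__)
```

Note that this sketch skips everything __import__ does before reaching sys.modules (argument checks, relative imports, fromlist handling), which is exactly the overhead under discussion.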


Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Brett Cannon
On Wed, Feb 8, 2012 at 11:15, Brett Cannon  wrote:

>
>
> On Tue, Feb 7, 2012 at 22:47, Nick Coghlan  wrote
>
> [SNIP]
>
>
>> The fact that we have an undocumented PEP 302 based reimplementation
>> of imports squirrelled away in pkgutil to make pkgutil and runpy work
>> is sheer insanity (replacing *that* with importlib might actually be a
>> good first step towards full integration).
>>
>
> It easily goes beyond runpy. You could ditch much of imp's C code (e.g.
> load_module()), you could write py_compile and compileall using importlib,
> you could rewrite zipimport, etc. Anything that touches import could be
> refactored to (a) use just Python code, and (b) reshare code so as to not
> re-invent the wheel constantly.
>

And taking it even farther, all of the blackbox aspects of import go away.
For instance, the implicit, hidden importers for built-in modules, frozen
modules, extensions, and source could actually be set on sys.path_hooks.
The Meta path importer that handles sys.path could actually exist on
sys.meta_path.
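
For illustration, here is a small sketch of what inspecting those hooks looks like on an interpreter whose import system is importlib-based (CPython 3.3 and later). The helper name describe is made up for the example; the exact entries printed depend on the interpreter and on anything the environment has installed.

```python
import sys


def describe(hooks):
    """Return a readable name for each importer or hook in the list."""
    return [getattr(hook, "__name__", type(hook).__name__) for hook in hooks]


# Meta path finders: consulted first for every import.
print("sys.meta_path: ", describe(sys.meta_path))

# Path hooks: factories that build a finder for each sys.path entry.
print("sys.path_hooks:", describe(sys.path_hooks))
```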


Re: [Python-Dev] Add a new "locale" codec?

2012-02-08 Thread Victor Stinner
>> The current locale is process-wide: if a thread changes the locale,
>> all threads are affected. Some functions have to use the current
>> locale encoding, and not the locale encoding read at startup. Examples
>> with C functions: strerror(), strftime(), tzname, etc.
>
> Could a core part of Python break because of a sequence like:
>
> 1) Encode unicode to bytes using locale codec.
> 2) Silly third-party library code changes the locale codec.
> 3) Attempt to decode bytes back to unicode using the locale codec
> (which is now a different underlying codec).

When you decode data from the OS, you have to use the current locale
encoding. If you use a variable to store the encoding and the locale
is changed, you have to update your variable or you get mojibake.

Example with Python 2:

lisa$ python2.7
Python 2.7.2+ (default, Oct  4 2011, 20:06:09)
>>> import locale
>>> import os
>>> encoding=locale.getpreferredencoding(False)
>>> encoding
'ANSI_X3.4-1968'
>>> os.strerror(23).decode(encoding)
u'Too many open files in system'
>>> locale.setlocale(locale.LC_ALL, '') # set the locale
'fr_FR.UTF-8'
>>> os.strerror(23).decode(encoding)
Traceback (most recent call last):
  ...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
37: ordinal not in range(128)
>>> encoding=locale.getpreferredencoding(False)
>>> encoding
'UTF-8'
>>> os.strerror(23).decode(encoding)
u'Trop de fichiers ouverts dans le syst\xe8me'

You have to update encoding manually because setlocale() changed not only
the LC_MESSAGES locale category (message language) but also the LC_CTYPE
locale category (encoding).

Using the "locale" encoding, you always get the current locale encoding.

In some cases, you must use sys.getfilesystemencoding() (e.g. write
into the console or encode/decode filenames), in other cases, you must
use the current locale encoding (e.g. strerror() or strftime()). Python
3 does most of the work for you, so you don't have to care about the
locale encoding (you just manipulate Unicode, it decodes bytes or
encodes back to bytes for you). But in some cases, you have to decode
or encode manually using the right encoding. In this case, the
"locale" codec can help you.

The documentation will have to explain exactly what this new codec is,
because as expected, it is confusing :-)
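
Until such a codec exists, the same effect can be approximated with a helper that looks the encoding up on every call instead of caching it. A minimal Python 3 sketch; the function name decode_with_current_locale is made up for the example and is not part of the proposal:

```python
import locale


def decode_with_current_locale(data, errors="strict"):
    """Decode bytes using the locale encoding in effect right now.

    The encoding is queried on each call, so a setlocale() made by
    other code between two calls is picked up automatically.
    """
    encoding = locale.getpreferredencoding(False)
    return data.decode(encoding, errors)


print(decode_with_current_locale(b"foo"))
```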

Victor


[Python-Dev] Code review tool uses my old email address

2012-02-08 Thread Mark Shannon

Hi,

I changed my email address (about a year ago) and updated my bug tracker 
settings to my new address (late last year).

However, the code review tool still shows my old email address.
How do I change it?

Cheers,
Mark.


Re: [Python-Dev] Code review tool uses my old email address

2012-02-08 Thread Nadeem Vawda
This may be a bug in the tracker, possibly related to
http://psf.upfronthosting.co.za/roundup/meta/issue402 - it
seems like changes to a user's details on bugs.python.org
are not propagated to the review tool.

Cheers,
Nadeem


[Python-Dev] PEP for new dictionary implementation

2012-02-08 Thread Mark Shannon

Proposed PEP for new dictionary implementation, PEP 410?
is attached.

Cheers,
Mark.
PEP: XXX
Title: Key-Sharing Dictionary
Version: $Revision$
Last-Modified: $Date$
Author: Mark Shannon 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 08-Feb-2012
Python-Version: 3.3 or 3.4
Post-History: 08-Feb-2012


Abstract
========

This PEP proposes a change in the implementation of the builtin dictionary
type ``dict``. The new implementation allows dictionaries which are used as
attribute dictionaries (the ``__dict__`` attribute of an object) to share
keys with other attribute dictionaries of instances of the same class.

Motivation
==========

The current dictionary implementation uses more memory than is necessary
when used as a container for object attributes as the keys are
replicated for each instance rather than being shared across many instances
of the same class.
Despite this, the current dictionary implementation is finely tuned and
performs very well as a general-purpose mapping object.

By separating the keys (and hashes) from the values it is possible to share
the keys between multiple dictionaries and improve memory use.
By ensuring that keys are separated from the values only when beneficial,
it is possible to retain the high-performance of the current dictionary
implementation when used as a general-purpose mapping object.

Behaviour
=========

The new dictionary behaves in the same way as the old implementation.
It fully conforms to the Python API, the C API and the ABI.

Performance
===========

Memory Usage
------------

Reduction in memory use is directly related to the number of dictionaries
with shared keys in existence at any time. These dictionaries are typically
half the size of the current dictionary implementation.

Benchmarking shows that memory use is reduced by 10% to 20% for
object-oriented programs with no significant change in memory use
for other programs.

Speed
-----

The performance of the new implementation is dominated by memory locality
effects. When keys are not shared (for example in module dictionaries
and dictionaries explicitly created by dict() or {}) then performance is
unchanged (within a percent or two) from the current implementation.

For the shared keys case, the new implementation tends to separate keys
from values, but reduces total memory usage. This will improve performance
in many cases as the effects of reduced memory usage outweigh the loss of
locality, but some programs may show a small slow down.

Benchmarking shows no significant change of speed for most benchmarks.
Object-oriented benchmarks show small speed ups when they create large
numbers of objects of the same class (the gcbench benchmark shows a 10%
speed up; this is likely to be an upper limit).

Implementation
==============

Both the old and new dictionaries consist of a fixed-sized dict struct and
a re-sizeable table.
In the new dictionary the table can be further split into a keys table and
values array.
The keys table holds the keys and hashes and (for non-split tables) the
values as well. It differs only from the original implementation in that it
contains a number of fields that were previously in the dict struct.
If a table is split, the values in the keys table are ignored; instead the
values are held in a separate array.

Split-Table dictionaries
------------------------

When dictionaries are created to fill the __dict__ slot of an object, they are
created in split form. The keys table is cached in the type, potentially
allowing all attribute dictionaries of instances of one class to share keys.
In the event of the keys of these dictionaries starting to diverge,
individual dictionaries will lazily convert to the combined-table form.
This ensures good memory use in the common case, and correctness in all cases.

Combined-Table dictionaries
---------------------------

Explicit dictionaries (dict() or {}), module dictionaries and most other
dictionaries are created as combined-table dictionaries.
A combined-table dictionary never becomes a split-table dictionary.
Combined tables are laid out in much the same way as the tables in the old
dictionary, resulting in very similar performance.
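
As a rough illustration of the split-table idea, the following toy model keeps one shared key table and gives each dictionary only a values array. It is a sketch under simplifying assumptions, not the C implementation: the class names SharedKeys and SplitDict are invented here, there is no hashing or resizing, and the lazy conversion to combined-table form is not modelled.

```python
class SharedKeys:
    """Toy key table shared by all instance dictionaries of one class."""

    def __init__(self):
        self.index = {}  # key -> slot number used in every values array

    def slot_for(self, key):
        # Assign the next free slot the first time a key is seen.
        if key not in self.index:
            self.index[key] = len(self.index)
        return self.index[key]


class SplitDict:
    """Toy split-table dict: stores values only, keys live in SharedKeys."""

    _MISSING = object()

    def __init__(self, shared):
        self.shared = shared
        self.values = []

    def __setitem__(self, key, value):
        slot = self.shared.slot_for(key)
        # Grow the values array so that the slot exists.
        while len(self.values) <= slot:
            self.values.append(self._MISSING)
        self.values[slot] = value

    def __getitem__(self, key):
        slot = self.shared.index.get(key)
        if (slot is None or slot >= len(self.values)
                or self.values[slot] is self._MISSING):
            raise KeyError(key)
        return self.values[slot]


keys = SharedKeys()
a, b = SplitDict(keys), SplitDict(keys)
a["x"], a["y"] = 1, 2
b["x"], b["y"] = 10, 20
print(a["y"], b["x"], len(keys.index))
```

Both dictionaries hold two values each, but the keys "x" and "y" are stored only once, which is where the memory saving comes from.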

Implementation
==============

The new dictionary implementation is available at [1]_.

Pros and Cons
=============

Pros
----

Significant memory savings for object-oriented applications.
Small improvement to speed for programs which create lots of objects.

Cons
----

Change to data structures:
Third party modules which meddle with the internals of the dictionary
implementation will break.
Changes to repr() output and iteration order:
For most cases, this will be unchanged.
However for some split-table dictionaries the iteration order will
change.

Neither of these cons should be a problem.
Modules which meddle with the internals of the dictionary
implementation are already broken and should be fixed to use the API.
The iteration order of dictionaries was never defined and has always been
arbitrary; it is different for Jython and PyPy.

Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Terry Reedy

On 2/8/2012 11:13 AM, Brett Cannon wrote:

> On Tue, Feb 7, 2012 at 22:47, Nick Coghlan wrote:
>
>> I'm not sure such an addition would help much with the base
>> interpreter start up time though - most of the modules we bring in are
>> because we're actually using them for some reason.
>
> It wouldn't. This would be for third-parties only.

such as hg. That is what I had in mind.

Would the following work? Treat a function as a 'loop' in that it may be 
executed repeatedly. Treat 'import x' in a function as what it is, an 
__import__ call plus a local assignment. Apply a version of the usual 
optimization: put a sys.modules-based lazy import outside of the 
function (at the top of the module?) and leave the local assignment "x = 
sys.modules['x']" in the function. Change sys.modules.__delattr__ to 
replace a module with a dummy, so the function will still work after a 
deletion, as it does now.


--
Terry Jan Reedy



Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Antoine Pitrou
On Wed, 8 Feb 2012 11:07:10 -0500
Brett Cannon  wrote:
> >
> > > >  > So, if there is going to be some baseline performance target I need
> > to
> > > > hit
> > > > > to make people happy I would prefer to know what that (real-world)
> > > > > benchmark is and what the performance target is going to be on a
> > > > non-debug
> > > > > build.
> > > >
> > > > - No significant slowdown in startup time.
> > > >
> > >
> > > What's significant and measuring what exactly? I mean startup already
> > has a
> > > ton of imports as it is, so this would wash out the point of measuring
> > > practically anything else for anything small.
> >
> > I don't understand your sentence. Yes, startup has a ton of imports and
> > that's why I'm fearing it may be negatively impacted :)
> >
> > ("a ton" being a bit less than 50 currently)
> >
> 
> So you want less than a 50% startup cost on the standard startup benchmarks?

No, ~50 is the number of imports at startup.
I think startup time should grow by less than 10%.
(even better if it shrinks of course :))
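
One quick way to see how many modules a bare interpreter start pulls in is to count sys.modules in a fresh subprocess. This is just a measuring sketch (the helper name count_startup_modules is made up); the number varies with the Python version, the platform and the site configuration, and it counts loaded modules rather than measuring time.

```python
import subprocess
import sys


def count_startup_modules():
    """Start a fresh interpreter and report how many modules it has loaded."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(len(sys.modules))"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


print(count_startup_modules())
```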

> And here I was worrying you were going to suggest easy goals to reach for.
> ;)

He. Well, if importlib enabled user-level functionality, I guess it
could be attractive to trade a slice of performance against it. But
from a user's point of view, bootstrapping importlib is mostly an
implementation detail with not much of a positive impact.

Regards

Antoine.


Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Brett Cannon
On Wed, Feb 8, 2012 at 14:57, Terry Reedy  wrote:

> On 2/8/2012 11:13 AM, Brett Cannon wrote:
>
>> On Tue, Feb 7, 2012 at 22:47, Nick Coghlan wrote:
>>
>>> I'm not sure such an addition would help much with the base
>>> interpreter start up time though - most of the modules we bring in are
>>> because we're actually using them for some reason.
>>
>> It wouldn't. This would be for third-parties only.
>
> such as hg. That is what I had in mind.
>
> Would the following work? Treat a function as a 'loop' in that it may be
> executed repeatedly. Treat 'import x' in a function as what it is, an
> __import__ call plus a local assignment. Apply a version of the usual
> optimization: put a sys.modules-based lazy import outside of the function
> (at the top of the module?) and leave the local assignment "x =
> sys.modules['x']" in the function. Change sys.modules.__delattr__ to
> replace a module with a dummy, so the function will still work after a
> deletion, as it does now.


Probably, but I would hate to force people to code in a specific way for it
to work.


Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Terry Reedy

On 2/8/2012 3:16 PM, Brett Cannon wrote:

On Wed, Feb 8, 2012 at 14:57, Terry Reedy 

The intent of what I proposed is to be transparent for imports within 
functions. It would be a minor optimization if anything, but it would 
mean that there is a lazy mechanism in place.


For top-level imports, unless *all* are made lazy, then there *must* be 
some indication in the code of whether to make it lazy or not.


--
Terry Jan Reedy



Re: [Python-Dev] peps: Update with bugfix releases.

2012-02-08 Thread Martin v. Löwis
Am 05.02.2012 21:34, schrieb Ned Deily:
> In article 
> <[email protected]>,
>  [email protected] wrote:
> 
>>> I understand that but, to me, it makes no sense to send out truly  
>>> broken releases.  Besides, the hash collision attack is not exactly  
>>> new either.  Another few weeks can't make that much of a difference.
>>
>> Why would the release be truly broken? It surely can't be worse than
>> the current releases (which apparently aren't truly broken, else
>> there would have been no point in releasing them back then).
> 
> They were broken by the release of OS X 10.7 and Xcode 4.2 which were 
> subsequent to the previous releases.  None of the currently available 
> python.org installers provide a fully working system on OS X 10.7, or on 
> OS X 10.6 if the user has installed Xcode 4.2 for 10.6.

In what way are the current releases not fully working? Are you
referring to issues with building extension modules?

If it's that, I wouldn't call that "truly broken". Plus, the releases
continue to work fine on older OS X releases.

So when you build a bug fix release, just build it with the same tool
chain as the previous bug fix release, and all is fine.

Regards,
Martin


Re: [Python-Dev] which C language standard CPython must conform to

2012-02-08 Thread Martin v. Löwis
> Some quick searching shows that there is at least hope Microsoft is on
> board with C++11x (not so surprising, their crown jewels are written
> in C++).  We should at some point demand a C++ compiler for CPython
> and pick a subset of C++ features to allow use of but that is likely
> reserved for the Python 4 timeframe (a topic for another thread and
> time entirely, it isn't feasible for today's codebase).

See my earlier post on building Python as a Windows 8 Metro App.
As one strategy, I tried compiling Python as C++ code (as it wasn't
clear whether C is fully supported; this is now resolved). It is
actually feasible to change Python so that it compiles with a C++
compiler and still continues to compile as C also, with just
a few ifdefs.

This is, of course, off-topic wrt. the original question: even
C++11 compilers often don't support non-ASCII identifiers.

Regards,
Martin


Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Brett Cannon
On Wed, Feb 8, 2012 at 15:31, Terry Reedy  wrote:

> On 2/8/2012 3:16 PM, Brett Cannon wrote:
>
>> On Wed, Feb 8, 2012 at 14:57, Terry Reedy wrote:
>>
>>> Would the following work? Treat a function as a 'loop' in that it
>>> may be executed repeatedly. Treat 'import x' in a function as what
>>> it is, an __import__ call plus a local assignment. Apply a version
>>> of the usual optimization: put a sys.modules-based lazy import
>>> outside of the function (at the top of the module?) and leave the
>>> local assignment "x = sys.modules['x']" in the function. Change
>>> sys.modules.__delattr__ to replace a module with a dummy, so the
>>> function will still work after a deletion, as it does now.
>>
>> Probably, but I would hate to force people to code in a specific way for
>> it to work.
>>
>
> The intent of what I proposed is to be transparent for imports within
> functions. It would be a minor optimization if anything, but it would mean
> that there is a lazy mechanism in place.
>
> For top-level imports, unless *all* are made lazy, then there *must* be
> some indication in the code of whether to make it lazy or not.


Not true; importlib would make it dead-simple to whitelist what modules to
make lazy (e.g. your app code lazy but all stdlib stuff not, etc.).
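
A rough sketch of what such a whitelist could look like, using importlib.util.LazyLoader as found in later Python 3 releases (3.5+), so it is written with hindsight rather than against the importlib of this thread. The names LAZY_MODULES and import_maybe_lazy are hypothetical, and the whitelist content is only an example.

```python
import importlib
import importlib.util
import sys

# Hypothetical whitelist: only these modules are loaded lazily.
LAZY_MODULES = {"colorsys"}


def import_maybe_lazy(name):
    """Import eagerly unless the module is whitelisted for lazy loading."""
    if name in sys.modules:
        return sys.modules[name]
    if name not in LAZY_MODULES:
        return importlib.import_module(name)
    # Lazy path: the module body runs on first attribute access.
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("No module named %r" % name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module


colorsys = import_maybe_lazy("colorsys")  # nothing executed yet
print(callable(colorsys.rgb_to_hsv))      # first access triggers the load
```

The calling code needs no special marking; the decision lives entirely in the whitelist.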


Re: [Python-Dev] peps: Update with bugfix releases.

2012-02-08 Thread Ned Deily
In article <[email protected]>,
 "Martin v. Lowis"  wrote:

> Am 05.02.2012 21:34, schrieb Ned Deily:
> > In article 
> > <[email protected]>,
> >  [email protected] wrote:
> > 
> >>> I understand that but, to me, it makes no sense to send out truly  
> >>> broken releases.  Besides, the hash collision attack is not exactly  
> >>> new either.  Another few weeks can't make that much of a difference.
> >>
> >> Why would the release be truly broken? It surely can't be worse than
> >> the current releases (which apparently aren't truly broken, else
> >> there would have been no point in releasing them back then).
> > 
> > They were broken by the release of OS X 10.7 and Xcode 4.2 which were 
> > subsequent to the previous releases.  None of the currently available 
> > python.org installers provide a fully working system on OS X 10.7, or on 
> > OS X 10.6 if the user has installed Xcode 4.2 for 10.6.
> 
> In what way are the current releases not fully working? Are you
> referring to issues with building extension modules?

Yes
 
> If it's that, I wouldn't call that "truly broken". Plus, the releases
> continue to work fine on older OS X releases.

If not "truly", then how about "seriously broken"? And it's not quite 
the case that the releases work fine on older OS X releases.  The 
installers in question, the 64-/32-bit installer variants, work only on 
OS X 10.6 and above.  If the user installed the optional Xcode 4.2 for 
10.6, then they have the same problem with building extension modules as 
10.7 users do.

> So when you build a bug fix release, just build it with the same tool
> chain as the previous bug fix release, and all is fine.

I am not proposing changing the build tool chain for 3.2.x and 2.7.x bug 
fix releases.  But, users not being able to build extension modules out 
of the box with the default vendor-supplied build tools as they have in 
the past is not a case of all is fine, IMO.

However, this may all be a moot point now as I've subsequently proposed 
a patch to Distutils to smooth over the problem by checking for the case 
of gcc-4.2 being required but not available and, if so, automatically 
substituting clang instead.  (http://bugs.python.org/issue13590)   This 
trades off a certain risk of using clang for extension modules against 
the 100% certainty of users being unable to build extension modules.

-- 
 Ned Deily,
 [email protected]



Re: [Python-Dev] PEP for new dictionary implementation

2012-02-08 Thread Terry Reedy

On 2/8/2012 2:18 PM, Mark Shannon wrote:

A pretty clear draft PEP.


> Changes to repr() output and iteration order:
> For most cases, this will be unchanged.
> However for some split-table dictionaries the iteration order will
> change.
>
> Neither of these cons should be a problem.
> Modules which meddle with the internals of the dictionary
> implementation are already broken and should be fixed to use the API.


So are modules that depend on set and dict iteration order and the 
consequent representations.



> The iteration order of dictionaries was never defined and has always been
> arbitrary; it is different for Jython and PyPy.


I am pretty sure iteration order has changed between CPython versions in 
the past (and that when it did, people got caught). The documentation 
for doctest has section 25.2.3.6. Warnings. It starts with this very issue!

'''
doctest is serious about requiring exact matches in expected output. If 
even a single character doesn’t match, the test fails. This will 
probably surprise you a few times, as you learn exactly what Python does 
and doesn’t guarantee about output. For example, when printing a dict, 
Python doesn’t guarantee that the key-value pairs will be printed in any 
particular order, so a test like


>>> foo()
{"Hermione": "hippogryph", "Harry": "broomstick"}
is vulnerable! One workaround is to do

>>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
True
instead. Another is to do

>>> d = sorted(foo().items())
>>> d
[('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
'''
(Object addresses and full-precision float representations are also 
discussed.)


--
Terry Jan Reedy




Re: [Python-Dev] PEP for new dictionary implementation

2012-02-08 Thread Mark Shannon

Terry Reedy wrote:

> On 2/8/2012 2:18 PM, Mark Shannon wrote:
>
> A pretty clear draft PEP.
>
>> Changes to repr() output and iteration order:
>> For most cases, this will be unchanged.
>> However for some split-table dictionaries the iteration order will
>> change.
>>
>> Neither of these cons should be a problem.
>> Modules which meddle with the internals of the dictionary
>> implementation are already broken and should be fixed to use the API.
>
> So are modules that depend on set and dict iteration order and the
> consequent representations.
>
>> The iteration order of dictionaries was never defined and has always been
>> arbitrary; it is different for Jython and PyPy.
>
> I am pretty sure iteration order has changed between CPython versions in
> the past (and that when it did, people got caught). The documentation
> for doctest has section 25.2.3.6. Warnings. It starts with this very issue!
>
> '''
> doctest is serious about requiring exact matches in expected output. If
> even a single character doesn’t match, the test fails. This will
> probably surprise you a few times, as you learn exactly what Python does
> and doesn’t guarantee about output. For example, when printing a dict,
> Python doesn’t guarantee that the key-value pairs will be printed in any
> particular order, so a test like
>
>  >>> foo()
> {"Hermione": "hippogryph", "Harry": "broomstick"}
> is vulnerable! One workaround is to do
>
>  >>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
> True
> instead. Another is to do
>
>  >>> d = sorted(foo().items())
>  >>> d
> [('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
> '''
> (Object addresses and full-precision float representations are also
> discussed.)




There are a few things in the standard lib that rely on dict repr ordering:
http://bugs.python.org/issue13907
http://bugs.python.org/issue13909

I expect that the long-awaited fix to the hash-collision security issue
will expose a few more.

Version 2 of the new dict passes all these tests,
but that doesn't mean the tests are correct.

Cheers,
Mark.




Re: [Python-Dev] A new dictionary implementation

2012-02-08 Thread francis

Just more info: changeset is: 74843:20702d1acf17

Cheers,

francis



[Python-Dev] ctypes/utils.py problem

2012-02-08 Thread David Goulet
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi everyone,

I'm working with the LTTng (Linux Tracing) team and we came across a problem
with our user-space tracer and Python's default behavior. We provide a libc
wrapper that instruments free() and malloc() with a simple LD_PRELOAD of that
lib.

This lib *was* named "liblttng-ust-libc.so" and we came across Python software
registering to our trace registry daemon (meaning that somehow the python binary
is using our in-process library). We dug a bit and found this:

Lib/ctypes/utils.py:

def _findLib_ldconfig(name):
    # XXX assuming GLIBC's ldconfig (with option -p)
    expr = r'/[^\(\)\s]*lib%s\.[^\(\)\s]*' % re.escape(name)
    res = re.search(expr,
                    os.popen('/sbin/ldconfig -p 2>/dev/null').read())

The same pattern is also found, at least, in _findLib_gcc(name) and
_findSoname_ldconfig(name).

This causes Python to load any library whose name ends with "libc.so".

I don't know the reasons behind this but we are concerned about "future issues"
that can occur with this kind of behavior.

For now, we renamed our lib so everything is fine.
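
To make the problem concrete, here is a small reproduction of the matching behaviour outside of ctypes. The two ldconfig -p lines in SAMPLE_OUTPUT are made-up sample data, and strict_pattern is only one possible tightening (requiring "lib<name>." to start right after a path separator); it is an assumption for illustration, not a fix taken from CPython.

```python
import re

# Hypothetical excerpt of `ldconfig -p` output, our library listed first.
SAMPLE_OUTPUT = (
    "\tliblttng-ust-libc.so (libc6,x86-64) => /usr/lib/liblttng-ust-libc.so\n"
    "\tlibc.so.6 (libc6,x86-64) => /lib/x86_64-linux-gnu/libc.so.6\n"
)


def current_pattern(name):
    """The expression quoted above from _findLib_ldconfig."""
    return r'/[^\(\)\s]*lib%s\.[^\(\)\s]*' % re.escape(name)


def strict_pattern(name):
    """Assumed stricter variant: 'lib<name>.' must follow a '/' directly."""
    return r'/[^\(\)\s]*/lib%s\.[^\(\)\s]*' % re.escape(name)


loose = re.search(current_pattern("c"), SAMPLE_OUTPUT).group(0)
strict = re.search(strict_pattern("c"), SAMPLE_OUTPUT).group(0)
print(loose)   # /usr/lib/liblttng-ust-libc.so
print(strict)  # /lib/x86_64-linux-gnu/libc.so.6
```

With the current expression, "lib%s\." may match anywhere inside the file name, so looking up "c" picks whichever path containing "libc." comes first in the output.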

Thanks a lot guys.
David
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)

iQEcBAEBAgAGBQJPMv9BAAoJEELoaioR9I02jwkIALmLg0esubJL+TrZFEahNwz7
85RUKSa/GKDx2sagsi62PWy5RfvRABs5Ij6ldtyQoszyuZuOlM5B7rMrpDvO588P
WqO1lzT6rdO9uyq2B6vPZRjjAr++StLKyIBbQodQd8PJkEsdN0kJISdRgIrSFL/E
0+2aUllrRgsVxc/oOF2LG+u7828iAYPfB71pC4euj2PgiwffZZ6J5gH4Q+mrUqt0
KiYU5X+vCEzWLv+ZLtq+h2rVrLNk8cFTL5N092iMwFfooSC70urD5a0cTR6pf/iI
UfFvuIVROsqiT2MwQxHApyChkrLnX0eWDPdeZZAFjnWVm4QPy8q09m6qX5eHloA=
=9wj8
-END PGP SIGNATURE-


Re: [Python-Dev] A new dictionary implementation

2012-02-08 Thread francis

Hi Mark,
I've just cloned :


Repository: https://bitbucket.org/markshannon/cpython_new_dict



Do please try it on your machine(s).

that's a:
Linux random 3.1.0-1-amd64 #1 SMP Tue Jan 10 05:01:58 UTC 2012 x86_64 
GNU/Linux



and I'm getting:

gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. 
-I./Include -DPy_BUILD_CORE -o Objects/dictobject.o Objects/dictobject.c
gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. 
-I./Include -DPy_BUILD_CORE -o Objects/memoryobject.o Objects/memoryobject.c

Objects/dictobject.c: In function ‘dict_popitem’:
Objects/dictobject.c:2208:5: error: ‘PyDictKeyEntry’ has no member named ‘me_value’

make: *** [Objects/dictobject.o] Error 1
make: *** Waiting for unfinished jobs

Cheers

francis





Re: [Python-Dev] Add a new "locale" codec?

2012-02-08 Thread Steven D'Aprano

Simon Cross wrote:
> I think I'm -1 on a "locale" encoding because it refers to different
> actual encodings depending on where and when it's run, which seems
> surprising

Why is it surprising? Surely that's the whole point of a locale encoding: to 
use the locale encoding, whatever that happens to be at the time.


Perhaps I'm missing something, but I don't see how this proposal is any more 
surprising than the fact that (say) Decimal uses a global context if you don't 
specify one explicitly. Only this should be *less* surprising, because Decimal 
uses the global context by default, while this will use the global locale 
encoding only if you explicitly tell it to.
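
The Decimal comparison can be sketched as follows. This is only an illustration of the analogy; the helper names one_third and one_third_with_ambient_precision are made up, and "global context" here means the current context that the decimal module consults when none is passed.

```python
from decimal import Context, Decimal, localcontext


def one_third(context=None):
    """Divide 1 by 3, either implicitly or with an explicitly supplied context."""
    if context is None:
        # No context given: Decimal silently uses the current context.
        return Decimal(1) / Decimal(3)
    # Explicit context: the result no longer depends on ambient state.
    return context.divide(Decimal(1), Decimal(3))


def one_third_with_ambient_precision(prec):
    """Show that the implicit result follows whatever the ambient context says."""
    with localcontext() as ctx:  # temporary copy of the current context
        ctx.prec = prec
        return one_third()
```

The implicit call gives different digits depending on the ambient precision, much as a "locale" codec would give different bytes depending on the ambient locale; the difference is that the codec would have to be requested by name.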




--
Steven



Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3

2012-02-08 Thread Steven D'Aprano

Paul Moore wrote:
> I would suggest that, assuming python-dev want to take ownership of
> the module, one last-ditch attempt be made to contact Fredrik. We
> should email him,

I wouldn't call email "last-ditch". I'd call it "first-ditch".

I would expect that a last-ditch attempt would include trying to call him by 
phone, sending him a dead-tree letter by post, and if important enough, 
actually driving out to his home or place of work and trying to see him face 
to face.


(All depending on the importance of making contact, naturally.)

--
Steven


Re: [Python-Dev] ctypes/utils.py problem

2012-02-08 Thread Brett Cannon
Could you file a bug at bugs.python.org, David, so we don't lose track of
this?

On Wed, Feb 8, 2012 at 18:03, David Goulet  wrote:

> Hi everyone,
>
> I'm working with the LTTng (Linux Tracing) team and we came across a
> problem
> with our user-space tracer and Python default behavior. We provide a libc
> wrapper that instruments free() and malloc() with a simple LD_PRELOAD of
> that lib.
>
> This lib *was* named "liblttng-ust-libc.so" and we came across Python
> software registering to our trace registry daemon (meaning that somehow
> the python binary was using our in-process library). We dug a bit and
> found this:
>
> Lib/ctypes/util.py:
>
> def _findLib_ldconfig(name):
>     # XXX assuming GLIBC's ldconfig (with option -p)
>     expr = r'/[^\(\)\s]*lib%s\.[^\(\)\s]*' % re.escape(name)
>     res = re.search(expr,
>                     os.popen('/sbin/ldconfig -p 2>/dev/null').read())
>
> The same pattern also appears, at least, in _findLib_gcc(name) and
> _findSoname_ldconfig(name).
>
> This causes Python to load any library whose name ends with "libc.so".
>
> I don't know the reasons behind this but we are concerned about "future
> issues"
> that can occur with this kind of behavior.
>
> For now, we renamed our lib so everything is fine.
>
> Thanks a lot guys.
> David


Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread PJ Eby
On Wed, Feb 8, 2012 at 4:08 PM, Brett Cannon  wrote:

>
> On Wed, Feb 8, 2012 at 15:31, Terry Reedy  wrote:
>
>> For top-level imports, unless *all* are made lazy, then there *must* be
>> some indication in the code of whether to make it lazy or not.
>>
>
> Not true; importlib would make it dead-simple to whitelist what modules to
> make lazy (e.g. your app code lazy but all stdlib stuff not, etc.).
>

There are actually only a few things stopping all imports from being lazy.
"from x import y" immediately de-lazies them, after all.  ;-)

The main two reasons you wouldn't want imports to *always* be lazy are:

1. Changing sys.path or other parameters between the import statement and
the actual import
2. ImportErrors are likewise deferred until point-of-use, so conditional
importing with try/except would break.
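
The second point can be illustrated with a small sketch. LazyModule below is a hypothetical stand-in written for this example, not an importlib API; it postpones the real import until the first attribute access, which is enough to show where the ImportError ends up.

```python
import importlib


class LazyModule:
    """Hypothetical proxy that imports the real module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only reached for attributes not set in __init__, i.e. module attributes.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


def optional_import(name):
    """The usual 'use it if available' idiom, written against the lazy proxy."""
    try:
        return LazyModule(name)
    except ImportError:
        return None  # never reached: nothing is imported inside the try block


def fails_on_use(module):
    """Return True if touching the module is what finally raises ImportError."""
    try:
        module.some_attribute
    except ImportError:
        return True
    except AttributeError:
        return False
    return False
```

For a module that does not exist, optional_import() still returns a proxy instead of None, and the ImportError only appears later inside fails_on_use(), well away from the try/except that was meant to handle it.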


Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Nick Coghlan
On Thu, Feb 9, 2012 at 2:09 AM, Antoine Pitrou  wrote:
> I guess my point was: why is there a function call in that case? The
> "import" statement could look up sys.modules directly.
> Or the built-in __import__ could still be written in C, and only defer
> to importlib when the module isn't found in sys.modules.
> Practicality beats purity.

I quite like the idea of having builtin __import__ be a *very* thin
veneer around importlib that just does the "is this in sys.modules
already so we can just return it from there?" checks and delegates
other more complex cases to Python code in importlib.

Poking around in importlib.__import__ [1] (as well as
importlib._gcd_import), I'm thinking what we may want to do is break
up the logic a bit so that there are multiple helper functions that a
C version can call back into so that we can optimise certain simple
code paths to not call back into Python at all, and others to only do
so selectively.

Step 1: separate out the "fromlist" processing from __import__ into a
separate helper function

def _process_fromlist(module, fromlist):
    # Perform any required imports as per existing code:
    # http://hg.python.org/cpython/file/aba513307f78/Lib/importlib/_bootstrap.py#l987


Step 2: separate out the relative import resolution from _gcd_import
into a separate helper function.

def _resolve_relative_name(name, package, level):
    assert hasattr(name, 'rpartition')
    assert hasattr(package, 'rpartition')
    assert level > 0
    name = ...  # Recalculate as per the existing code:
    # http://hg.python.org/cpython/file/aba513307f78/Lib/importlib/_bootstrap.py#l889
    return name
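
As a rough, runnable illustration of what such a helper computes, the sketch below resolves a relative name against a package by dropping one trailing package component per extra level. The function name resolve_relative_name is made up for this example, and the body is an approximation rather than the code at the revision linked above; the final assertion cross-checks it against importlib.util.resolve_name from the standard library.

```python
import importlib.util


def resolve_relative_name(name, package, level):
    """Turn a relative import (name, package, level) into an absolute name."""
    assert hasattr(name, 'rpartition')
    assert hasattr(package, 'rpartition')
    assert level > 0
    # level=1 means "this package", level=2 its parent, and so on.
    bits = package.rsplit('.', level - 1)
    if len(bits) < level:
        raise ImportError('attempted relative import beyond top-level package')
    base = bits[0]
    return '{}.{}'.format(base, name) if name else base


# Cross-check one case against the standard library helper.
assert (resolve_relative_name('sibling', 'pkg.sub', 2)
        == importlib.util.resolve_name('..sibling', 'pkg.sub'))
```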

Step 3: Implement builtin __import__ in C (pseudo-code below):

def __import__(name, globals={}, locals={}, fromlist=[], level=0):
    if level > 0:
        name = importlib._resolve_relative_import(name)
    try:
        module = sys.modules[name]
    except KeyError:
        # Not cached yet, need to invoke the full import machinery
        # We already resolved any relative imports though, so
        # treat it as an absolute import
        return importlib.__import__(name, globals, locals, fromlist, 0)
    # Got a hit in the cache, see if there's any more work to do
    if not fromlist:
        # Duplicate relevant importlib.__import__ logic as C code
        # to find the right module to return from sys.modules
    elif hasattr(module, "__path__"):
        importlib._process_fromlist(module, fromlist)
    return module

This would then be similar to the way main.c already works when it
interacts with runpy - simple cases are handled directly in C, more
complex cases get handed over to the Python module.
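
A runnable approximation of that split, written in Python rather than C, might look like the sketch below. The name cached_import is hypothetical, only the simplest case (absolute import, empty fromlist, everything already cached) takes the fast path, and every other case is handed to importlib.__import__, so this is a sketch of the idea rather than the proposed implementation.

```python
import importlib
import sys


def cached_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Answer from sys.modules when possible, else use the full machinery."""
    if level == 0 and not fromlist:
        # `import a.b.c` binds the top-level package `a`, so that is what
        # must be returned once the whole dotted name is already cached.
        top_level = name.partition('.')[0]
        if (sys.modules.get(name) is not None
                and sys.modules.get(top_level) is not None):
            return sys.modules[top_level]
    # Relative imports, fromlist handling and cache misses: defer to importlib.
    return importlib.__import__(name, globals, locals, fromlist, level)
```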

Cheers,
Nick.

[1] http://hg.python.org/cpython/file/default/Lib/importlib/_bootstrap.py#l950

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] requirements for moving __import__ over to importlib?

2012-02-08 Thread Nick Coghlan
On Thu, Feb 9, 2012 at 11:28 AM, PJ Eby  wrote:
> The main two reasons you wouldn't want imports to *always* be lazy are:
>
> 1. Changing sys.path or other parameters between the import statement and
> the actual import
> 2. ImportErrors are likewise deferred until point-of-use, so conditional
> importing with try/except would break.

3. Module level code may have non-local side effects (e.g. installing
codecs, pickle handlers, atexit handlers)

A white-listing based approach to lazy imports would let you manage
all those issues without having to change all the code that actually
*does* the imports.
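
One way to picture a whitelist policy is the sketch below. It relies on importlib.util.LazyLoader, which was added to the standard library in Python 3.5 and so did not exist when this thread was written; the whitelist contents and the name import_with_policy are assumptions made for the example, which follows the lazy-import recipe in the importlib documentation.

```python
import importlib
import importlib.util
import sys

# Hypothetical policy: only these modules are allowed to load lazily.
LAZY_WHITELIST = {"colorsys"}


def import_with_policy(name, whitelist=LAZY_WHITELIST):
    """Import eagerly by default, lazily only for whitelisted modules."""
    if name in sys.modules:
        # Already imported, so there is nothing left to defer.
        return sys.modules[name]
    if name not in whitelist:
        return importlib.import_module(name)
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ImportError("cannot find module %r" % name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only arms the module; its code actually runs
    # on the first attribute access.
    loader.exec_module(module)
    return module
```

Modules whose top-level code installs codecs or atexit handlers would simply be left off the whitelist, so they keep their normal eager behaviour while the importing code stays unchanged.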

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] [Python-checkins] cpython: PEP 410

2012-02-08 Thread Nick Coghlan
On Thu, Feb 9, 2012 at 7:52 AM, victor.stinner
 wrote:
> http://hg.python.org/cpython/rev/f8409b3d6449
> changeset:   74832:f8409b3d6449
> user:        Victor Stinner 
> date:        Wed Feb 08 14:31:50 2012 +0100
> summary:
>  PEP 410

Ah, even when written by a core dev, a PEP should still be at Accepted
before we check anything in. PEP 410 is still at Draft.

Did Guido accept this one by private email? (He never made me his
delegate, and without that, my agreement doesn't count as acceptance
of the PEP).

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] [Python-checkins] cpython: PEP 410

2012-02-08 Thread Nick Coghlan
On Thu, Feb 9, 2012 at 2:48 PM, Nick Coghlan  wrote:
> On Thu, Feb 9, 2012 at 7:52 AM, victor.stinner
>  wrote:
>> http://hg.python.org/cpython/rev/f8409b3d6449
>> changeset:   74832:f8409b3d6449
>> user:        Victor Stinner 
>> date:        Wed Feb 08 14:31:50 2012 +0100
>> summary:
>>  PEP 410
>
> Ah, even when written by a core dev, a PEP should still be at Accepted
> before we check anything in. PEP 410 is still at Draft.

Never mind, I just saw the checkin that reverted the change.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Add a new "locale" codec?

2012-02-08 Thread Simon Cross
On Thu, Feb 9, 2012 at 2:35 AM, Steven D'Aprano  wrote:
> Simon Cross wrote:
>>
>> I think I'm -1 on a "locale" encoding because it refers to different
>> actual encodings depending on where and when it's run, which seems
>> surprising
>
>
> Why is it surprising? Surely that's the whole point of a locale encoding: to
> use the locale encoding, whatever that happens to be at the time.

I think there's a general expectation that if you encode something
with one codec you will be able to decode it with the same codec.
That's not necessarily true for the locale encoding.
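
A small sketch of that failure mode: since actually switching the process locale is not portable, the two "locale" lookups are simulated here by passing the encoding in force at write time and at read time explicitly. The helper names are made up for the example.

```python
def encode_then_decode(text, encoding_when_written, encoding_when_read):
    """Encode under one ambient encoding, decode under another."""
    data = text.encode(encoding_when_written)
    return data.decode(encoding_when_read)


def survives_roundtrip(text, encoding_when_written, encoding_when_read):
    """Return True only if the text comes back unchanged."""
    try:
        result = encode_then_decode(text, encoding_when_written,
                                    encoding_when_read)
    except UnicodeDecodeError:
        return False
    return result == text
```

With a fixed codec name both arguments are always the same and the round trip is safe. With a name that resolves to the current locale they can differ between the writer and the reader, giving either a UnicodeDecodeError (Latin-1 bytes read as UTF-8) or silent mojibake (UTF-8 bytes read as Latin-1).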