Re: [Python-Dev] cProfile with generator throwing
Eyal Lotem wrote:
> Hi. It seems that cProfile does not support throwing exceptions into
> generators properly, when an external timer routine is used.
>
> The problem is that _lsprof.c: ptrace_enter_call assumes that there
> are no exceptions set when it is called, which is not true when the
> generator frame is being gen_send_ex'd to send an exception into it
> (Maybe you could say that only CallExternalTimer assumes this, but I
> am not sure). This assumption causes its eventual call to
> CallExternalTimer to discover that an error is set and assume that it
> was caused by its own work (which it wasn't).
>
> I am not sure what the right way to fix this is, so I cannot send a patch.
> Here is a minimalist example to reproduce the bug:
>
> >>> import cProfile
> >>> import time
> >>> p=cProfile.Profile(time.clock)
> >>> def f():
> ...     yield 1
> ...
> >>> p.run("f().throw(Exception())")
> Exception exceptions.Exception: Exception() in object at 0xb7f5a304> ignored
> Traceback (most recent call last):
> File "", line 1, in
> File "/usr/lib/python2.5/cProfile.py", line 135, in run
> return self.runctx(cmd, dict, dict)
> File "/usr/lib/python2.5/cProfile.py", line 140, in runctx
> exec cmd in globals, locals
> File "", line 1, in
> File "", line 1, in f
> SystemError: error return without exception set
There might be a similar problem with trace functions, see bug #1733757 which
is quite obscure too.
Georg
--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
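The reproduction from the first message can be turned into a small check. This is only a sketch for a current Python 3 interpreter: time.perf_counter stands in for time.clock (which no longer exists there), runctx is used so that f does not have to live in __main__, and the function name reproduce is made up for illustration. On an interpreter with the bug, the SystemError branch would be taken; on a fixed one, the thrown exception simply propagates.

```python
import cProfile
import time


def f():
    yield 1


def reproduce():
    """Throw an exception into a generator while profiling with an
    external timer, and report which exception comes back out."""
    profiler = cProfile.Profile(time.perf_counter)  # external timer
    try:
        profiler.runctx("f().throw(Exception('thrown in'))", {"f": f}, {})
    except SystemError as exc:
        # The failure described in the report above.
        return "bug: %s" % exc
    except Exception as exc:
        # Expected: the thrown exception propagates unchanged.
        return "ok: %r" % exc
    return "no exception"


if __name__ == "__main__":
    print(reproduce())
```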
Re: [Python-Dev] Compiling 2.5.1 under Studio 11
> I am having a couple of issues compiling Python 2.5.1 under Sun Solaris
> Studio 11 on Solaris 8.
>
> Everything compiles correctly except the _ctypes module because it
> cannot use the libffi that comes with Python and it does not exist on
> the system.
>
> Has anyone gotten it to compile correctly using Studio 11?

This is not a question for python-dev; please ask it on comp.lang.python.

In any case, what processor are you using? I have compiled Python
successfully with Sun C 5.8.

> Also, during the pyexpat tests, Python generates a segfault.
>
> Are there any patches to fix these?

Without knowing what precisely the problem is, it is difficult to say
whether it has been fixed.

Regards,
Martin
[Python-Dev] zipfile and unicode filenames
Hi everyone,

Today I've stumbled upon a bug in my program that wasn't very
straightforward to understand. The problem is that I was passing
unicode filenames to zipfile.ZipFile.write and I had
sys.setdefaultencoding() in effect, which resulted in a situation
where most of the bytes generated in zipfile.ZipInfo.FileHeader would
pass thru, except for a few, which caused a codec error on another
machine (where filenames got infectiously upgraded to unicode). The
problem here is that it was absolutely unclear at first that I get
unicode filenames passed to write, and it incorrectly accepted them
silently. Is it worth submitting a bug report on this?

The desired behavior here would be to either

a) disallow unicode strings as arcname and raise an exception (since
   it is used in concatenation with raw data it is likely to cause
   problems because of auto upgrading raw data to unicode), or
b) silently encode unicode strings to raw strings (something like
   if isinstance(filename, unicode): filename = filename.encode()
   in the zipfile.ZipInfo constructor).

So, should I submit a bug report, and which behavior would be actually
correct?
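The two behaviors (a) and (b) can be sketched side by side. This is an illustration only, written for Python 3, where str plays the role of the 2.x unicode type and bytes the role of raw strings. The helper normalize_arcname, its strict flag and the UTF-8 default are all assumptions made for the example; none of them is part of the zipfile module.

```python
def normalize_arcname(arcname, encoding="utf-8", strict=False):
    """Return an archive name as raw bytes.

    strict=True  models option (a): text names are rejected outright.
    strict=False models option (b): text names are encoded explicitly,
    so no implicit default-encoding conversion happens later.
    """
    if isinstance(arcname, bytes):
        return arcname
    if isinstance(arcname, str):
        if strict:
            raise TypeError("arcname must be raw bytes, got text: %r" % arcname)
        return arcname.encode(encoding)
    raise TypeError("unsupported arcname type: %s" % type(arcname).__name__)


if __name__ == "__main__":
    print(normalize_arcname("report.txt"))
    print(normalize_arcname(b"report.txt"))
```

Either way the decision is made at the boundary, once, instead of surfacing later as a codec error when the name is concatenated with raw header data.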
[Python-Dev] Instance variable access and descriptors
Hi.

I was surprised to find in my profiling that instance variable access
was pretty slow. I looked through the CPython code involved, and
discovered something that really surprises me.

Python, probably through the valid assumption that most attribute
lookups go to the class, tries to look for the attribute in the class
first, and in the instance, second.

What Python currently does is quite peculiar!
Here's a short description of PyObject_GenericGetAttr:

A. Python looks for a descriptor in the _entire_ mro hierarchy
   (len(mro) class/type checks and dict lookups).
B. If Python found a descriptor and it has both get and set functions -
   it uses it to get the value and returns, skipping the next stage.
C. If Python either did not find a descriptor, or found one that has no
   setter, it will try a lookup in the instance dict.
D. If Python failed to find it in the instance, it will use the
   descriptor's getter, and if it has no getter it will use the
   descriptor itself.

I believe the average cost of A is much higher than that of C, because
there is just 1 instance dict to look through, and it is also typically
smaller than the class dicts (which matters in the rare cases of
worst-case hash lookup timings), while there are len(mro) dicts to look
for a descriptor in.

This means that for simple instance variable lookups, Python is paying
the full mro lookup price!

I believe that this should be changed, so that Python first looks for
the attribute in the instance's dict and only then through the dict's
mro.

This will have the following effects:

A. It will break code that uses instance.__dict__['var'] directly,
   when 'var' exists as a property with a __set__ in the class. I
   believe this is not significant.
B. It will simplify getattr's semantics. Python should _always_ give
   precedence to instance attributes over class ones, rather than have
   very weird special-cases (such as a property with a __set__).
C. It will greatly speed up instance variable access, especially when
   the class has a large mro.

I think obviously the code breakage is the worst problem. This could
probably be addressed by a transition version in which Python warns
about any instance attributes that existed in the mro as descriptors as
well.

What do you think?
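Steps A to D above can be modeled in pure Python. What follows is a rough sketch of the lookup order, not the C implementation: the names generic_getattr and _find_in_mro are invented for the example, and the "has a setter" test is approximated by checking the descriptor's type for __set__ or __delete__.

```python
_MISSING = object()


def _find_in_mro(tp, name):
    """Step A: scan every class dict along the MRO."""
    for klass in tp.__mro__:
        namespace = vars(klass)
        if name in namespace:
            return namespace[name]
    return _MISSING


def generic_getattr(obj, name):
    tp = type(obj)
    descr = _find_in_mro(tp, name)
    getter = None
    has_setter = False
    if descr is not _MISSING:
        dtype = type(descr)
        getter = getattr(dtype, "__get__", None)
        has_setter = hasattr(dtype, "__set__") or hasattr(dtype, "__delete__")

    # Step B: a descriptor with both getter and setter wins immediately.
    if getter is not None and has_setter:
        return getter(descr, obj, tp)

    # Step C: otherwise the instance dict is consulted.
    try:
        inst_dict = object.__getattribute__(obj, "__dict__")
    except AttributeError:
        inst_dict = None
    if inst_dict is not None and name in inst_dict:
        return inst_dict[name]

    # Step D: fall back to the getter, then to the class attribute itself.
    if getter is not None:
        return getter(descr, obj, tp)
    if descr is not _MISSING:
        return descr
    raise AttributeError(name)


class C(object):
    x = property(lambda self: self._x)

    def __init__(self):
        self._x = 42


if __name__ == "__main__":
    c = C()
    c.__dict__["x"] = 43
    print(generic_getattr(c, "x"), generic_getattr(c, "_x"))
```

Note that even the plain lookup of _x pays for step A first, which is the cost the message is complaining about.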
Re: [Python-Dev] Instance variable access and descriptors
On 6/9/07, Eyal Lotem <[EMAIL PROTECTED]> wrote:
> I believe that this should be changed, so that Python first looks for
> the attribute in the instance's dict and only then through the dict's
> mro.
[snip]
> What do you think?

Are you suggesting that the following code should print "43" instead of
"42"?
::

    >>> class C(object):
    ...     x = property(lambda self: self._x)
    ...     def __init__(self):
    ...         self._x = 42
    ...
    >>> c = C()
    >>> c.__dict__['x'] = 43
    >>> c.x
    42

If so, this is a pretty substantial backwards incompatibility, and you
should probably post this to python-ideas first to hash things out. If
people like it there, the right target is probably Python 3000, not
Python 2.x.

STeVe
--
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
--- Bucky Katt, Get Fuzzy
Re: [Python-Dev] Instance variable access and descriptors
> On 6/10/07, Steven Bethard <[EMAIL PROTECTED]> wrote:
> > On 6/9/07, Eyal Lotem <[EMAIL PROTECTED]> wrote:
> > > I believe that this should be changed, so that Python first looks for
> > > the attribute in the instance's dict and only then through the dict's
> > > mro.
> >
> > Are you suggesting that the following code should print "43" instead of
> > "42"?
> > ::
> >
> >     >>> class C(object):
> >     ...     x = property(lambda self: self._x)
> >     ...     def __init__(self):
> >     ...         self._x = 42
> >     ...
> >     >>> c = C()
> >     >>> c.__dict__['x'] = 43
> >     >>> c.x
> >     42

On 6/9/07, Eyal Lotem <[EMAIL PROTECTED]> wrote:
> Yes, I do suggest that.
> But it's important to notice that this is not a suggestion in order to
> improve Python, but one that makes it possible to get reasonable
> performance out of CPython. As such, I don't believe it should be done
> in Py3K.
>
> Firstly, like everything that breaks backwards compatibility, it is
> possible to have a transitional version that spits warnings for all
> problems (detect name clashes between properties and instance dict).

Sure, but then you're talking about really introducing this in Python
2.7, with 2.6 as a transitional version. So take a minute to look at
the release timelines:

http://www.python.org/dev/peps/pep-0361/
    The initial 2.6 target is for April 2008.

http://www.python.org/dev/peps/pep-3000/
    I hope to have a first alpha release (3.0a1) out in the first half
    of 2007; it should take no more than a year from then before the
    first proper release, named Python 3.0

So I'm expecting Python 3.0 to come out *before* 2.7. Thus if you're
proposing a backwards-incompatible change that would have to wait until
2.7 anyway, why not propose it for 3.0 where backwards-incompatible
changes are more acceptable?

STeVe
--
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
--- Bucky Katt, Get Fuzzy
Re: [Python-Dev] Instance variable access and descriptors
At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote:
>A. It will break code that uses instance.__dict__['var'] directly,
>when 'var' exists as a property with a __set__ in the class. I believe
>this is not significant.
>B. It will simplify getattr's semantics. Python should _always_ give
>precedence to instance attributes over class ones, rather than have
>very weird special-cases (such as a property with a __set__).

Actually, these are features that are both used and desirable; I've
been using them both since Python 2.2 (i.e., for many years
now). I'm -1 on removing these features from any version of Python,
even 3.0.

>C. It will greatly speed up instance variable access, especially when
>the class has a large mro.

...at the cost of slowing down access to properties and __slots__, by
adding an *extra* dictionary lookup there.

Note, by the way, that if you want to change attribute lookup
semantics, you can always override __getattribute__ and make it work
whatever way you like, without forcing everybody else to change *their*
code.
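The __getattribute__ route mentioned at the end of the message could look roughly like this. It is a sketch only: the mixin name InstanceFirst is invented here, and it assumes every instance has a __dict__ (a class using only __slots__ would need different handling).

```python
class InstanceFirst(object):
    """Hypothetical mixin: consult the instance dict before the class."""

    def __getattribute__(self, name):
        inst_dict = object.__getattribute__(self, "__dict__")
        if name in inst_dict:
            return inst_dict[name]
        # Not an instance attribute: fall back to the normal lookup,
        # which still honours properties and other descriptors.
        return object.__getattribute__(self, name)


class C(InstanceFirst):
    x = property(lambda self: self._x)

    def __init__(self):
        self._x = 42


if __name__ == "__main__":
    c = C()
    print(c.x)              # property is used, nothing shadows it yet
    c.__dict__["x"] = 43
    print(c.x)              # the instance dict entry now takes precedence
```

Only classes that opt in see the changed semantics. The trade-off is that every attribute access now goes through a Python-level method call, so this changes behavior but is unlikely to deliver the speedup the original proposal was after.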
[Python-Dev] Summary of Tracker Issues
ACTIVITY SUMMARY (06/03/07 - 06/10/07)
Tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue
number. Do NOT respond to this message.

 1645 open ( +0) /  8584 closed ( +0) / 10229 total ( +0)

Average duration of open issues: 822 days.
Median duration of open issues: 770 days.

Open Issues Breakdown
   open  1645 ( +0)
pending     0 ( +0)

Issues Now Closed (1)
_____________________

New issue test for email    87 days
       http://bugs.python.org/issue1001    admin
Re: [Python-Dev] Instance variable access and descriptors
I agree with Phillip with regard to the semantics. They are semantically
desirable. However, there is a patch to add a mro cache to speed up
these sorts of cases on the Python tracker, originally submitted by
Armin Rigo. He saw ~20% speedups, others see less. It is currently just
sitting there with no apparent activity. So if the overhead of mro
lookups is that bothersome, it may be well worth your time to review
the patch.

URL:
http://sourceforge.net/tracker/index.php?func=detail&aid=1700288&group_id=5470&atid=305470

-Kevin

On 6/9/07, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote:
> >A. It will break code that uses instance.__dict__['var'] directly,
> >when 'var' exists as a property with a __set__ in the class. I believe
> >this is not significant.
> >B. It will simplify getattr's semantics. Python should _always_ give
> >precedence to instance attributes over class ones, rather than have
> >very weird special-cases (such as a property with a __set__).
>
> Actually, these are features that are both used and desirable; I've
> been using them both since Python 2.2 (i.e., for many years
> now). I'm -1 on removing these features from any version of Python,
> even 3.0.
>
> >C. It will greatly speed up instance variable access, especially when
> >the class has a large mro.
>
> ...at the cost of slowing down access to properties and __slots__, by
> adding an *extra* dictionary lookup there.
>
> Note, by the way, that if you want to change attribute lookup
> semantics, you can always override __getattribute__ and make it work
> whatever way you like, without forcing everybody else to change *their*
> code.
[Python-Dev] Frame zombies
I was just looking through the code that handles frames (as part of my
current effort to determine how to improve on CPython's performance),
when I noticed the freelist/zombie mechanism for frame allocation
handling.

While the zombie mechanism seems like a nice optimization, I believe
there can be a couple of improvements.

Currently, each code object has a zombie frame that last executed it.
This zombie is reused when that code object is re-executed in a frame.
When a frame is released, it is reassigned as the zombie of the code,
and iff the code object already has a zombie assigned to it, it places
the frame in the freelist.

If I understand correctly, this means that the "freelist" is actually
only ever used for recursive-call frames that were released. It also
means that these released frames will be reassigned to other code
objects in the future - in which case they will be reallocated, perhaps
unnecessarily. "freelist" is just temporary storage for released
recursive calls. A program with no recursions will always have an empty
freelist.

What is bounding the memory consumption of this mechanism is a limit on
the number of frames in the freelist (and the fact that there is a
limited number of code objects, each of which may have an additional
zombie frame).

I believe there is a better way to implement this mechanism:

A. Replace the co_zombie frame with a co_zombie_freelist.
B. Remove the global freelist.

In other words, have a free list for each code object, rather than one
zombie per code object plus a global freelist. This can be memory-bound
by limiting the freelist size in each code object. This can use a bit
more memory if a recursion is called just once - and then discarded
(wasting, for example, 10 frames instead of 1), but can save a lot of
realloc calls when there is more than one recursion used in the same
program. It is also possible to substantially increase the number of
frames stored per code object, and then use some kind of more
sophisticated aging mechanism on the zombie freelists to periodically
get rid of unused freelists. That kind of mechanism would mean that
even in the case of recursive calls, virtually all frame allocs are
available from the freelist.

I also believe this to be somewhat simpler, as there is only one concept
(a zombie freelist) rather than 2 (a zombie frame per code object and a
freelist for recursive calls), and frames are never realloc'd, but only
allocated.

Should I make a patch?
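The proposed scheme (points A and B) can be modeled as a toy in Python. This is not CPython's frame code: Frame, PerCodeFreelist, recurse and the max_per_code limit are made-up names for illustration, and code objects are represented by plain strings.

```python
from collections import defaultdict


class Frame(object):
    """Stand-in for a frame object sized for one particular code object."""

    def __init__(self, code_name, size):
        self.code_name = code_name
        self.size = size


class PerCodeFreelist(object):
    """One bounded list of released frames per code object."""

    def __init__(self, max_per_code=10):
        self.max_per_code = max_per_code
        self._free = defaultdict(list)
        self.allocations = 0  # counts real allocations, for comparison

    def acquire(self, code_name, size):
        free = self._free[code_name]
        if free:
            # Reuse a frame that already has the right size: no realloc.
            return free.pop()
        self.allocations += 1
        return Frame(code_name, size)

    def release(self, frame):
        free = self._free[frame.code_name]
        if len(free) < self.max_per_code:
            free.append(frame)
        # Otherwise the frame is simply dropped (freed).


def recurse(pool, code_name, size, depth):
    """Simulate a recursive call chain that holds depth + 1 frames."""
    frame = pool.acquire(code_name, size)
    if depth > 0:
        recurse(pool, code_name, size, depth - 1)
    pool.release(frame)


if __name__ == "__main__":
    pool = PerCodeFreelist(max_per_code=10)
    for _ in range(3):
        recurse(pool, "f", 4, 5)
        recurse(pool, "g", 2, 5)
    print(pool.allocations)
```

In this model, alternating between two recursive functions allocates frames only on the first round; later rounds are served entirely from the per-code lists, at the price of keeping up to max_per_code idle frames per code object.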
Re: [Python-Dev] Instance variable access and descriptors
I must be missing something, as I really see no reason to keep the
existing semantics other than backwards compatibility (which can be
achieved by introducing a __fastattr__ or such).

Can you explain under which situations or find any example situation
where the existing semantics are desirable?

As for the mro cache - thanks for pointing it out - I think it can
serve as a platform for another idea that in conjunction with psyco,
can possibly speed up CPython very significantly (will create a thread
about this soon).

Please note that speeding up the mro-lookup solves only half of the
problem (if it was solved - which it seems not to have been), the more
important half of the problem remains, allow me to emphasize:

ALL instance attribute accesses look up in both instance and class
dicts, when it could look just in the instance dict. This is made worse
by the fact that the class dict lookup is more expensive (with or
without the mro cache).

Some code that accesses a lot of instance attributes in an inner loop
can easily be sped up by a factor of 2 or more (depending on the size
of the mro).

Eyal

On 6/10/07, Kevin Jacobs <[EMAIL PROTECTED]> wrote:
> I agree with Phillip with regard to the semantics. They are semantically
> desirable. However, there is a patch to add a mro cache to speed up these
> sorts of cases on the Python tracker, originally submitted by Armin Rigo.
> He saw ~20% speedups, others see less. It is currently just sitting there
> with no apparent activity. So if the overhead of mro lookups is that
> bothersome, it may be well worth your time to review the patch.
>
> URL:
> http://sourceforge.net/tracker/index.php?func=detail&aid=1700288&group_id=5470&atid=305470
>
> -Kevin
>
> On 6/9/07, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> >
> > At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote:
> > >A. It will break code that uses instance.__dict__['var'] directly,
> > >when 'var' exists as a property with a __set__ in the class. I believe
> > >this is not significant.
> > >B. It will simplify getattr's semantics. Python should _always_ give
> > >precedence to instance attributes over class ones, rather than have
> > >very weird special-cases (such as a property with a __set__).
> >
> > Actually, these are features that are both used and desirable; I've
> > been using them both since Python 2.2 (i.e., for many years
> > now). I'm -1 on removing these features from any version of Python,
> > even 3.0.
> >
> > >C. It will greatly speed up instance variable access, especially when
> > >the class has a large mro.
> >
> > ...at the cost of slowing down access to properties and __slots__, by
> > adding an *extra* dictionary lookup there.
> >
> > Note, by the way, that if you want to change attribute lookup
> > semantics, you can always override __getattribute__ and make it work
> > whatever way you like, without forcing everybody else to change
> > *their* code.
[Python-Dev] Fwd: Instance variable access and descriptors
On 6/10/07, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> At 12:23 AM 6/10/2007 +0300, Eyal Lotem wrote:
> >A. It will break code that uses instance.__dict__['var'] directly,
> >when 'var' exists as a property with a __set__ in the class. I believe
> >this is not significant.
> >B. It will simplify getattr's semantics. Python should _always_ give
> >precedence to instance attributes over class ones, rather than have
> >very weird special-cases (such as a property with a __set__).
>
> Actually, these are features that are both used and desirable; I've
> been using them both since Python 2.2 (i.e., for many years
> now). I'm -1 on removing these features from any version of Python,
> even 3.0.

It is the same feature, actually, two sides of the same coin.
Why do you use self.__dict__['propertyname'] when you can use
self._propertyname? Why even call the first form, which is both longer
and causes performance problems, "a feature"?

> >C. It will greatly speed up instance variable access, especially when
> >the class has a large mro.
>
> ...at the cost of slowing down access to properties and __slots__, by
> adding an *extra* dictionary lookup there.

It will slow down access to properties - but that slowdown is
insignificant:

A. The vast majority of lookups are *NOT* of properties. They are the
   rare case and should not be the focus of optimization.
B. Property access involves calling Python functions - which is heavier
   than a single dict lookup.
C. The dict lookup to find the property in the __mro__ can involve many
   dicts (so in those cases adding a single dict lookup is not heavy).

> Note, by the way, that if you want to change attribute lookup
> semantics, you can always override __getattribute__ and make it work
> whatever way you like, without forcing everybody else to change *their*
> code.

If I write my own __getattribute__ I lose the performance benefit that
I am after.

I do agree that code shouldn't be broken, which is why a transitional
mechanism that requires using __fastlookup__ can be used
(unfortunately, from __future__ cannot be used, as the change is not
local to a module but to a class hierarchy - unless one imports a
feature from __future__ into a class).
Re: [Python-Dev] zipfile and unicode filenames
> Today I've stumbled upon a bug in my program that wasn't very
> straightforward to understand.

Unfortunately, it isn't straightforward to understand your description
of it, either.

> The problem is that I was passing
> unicode filenames to zipfile.ZipFile.write and I had
> sys.setdefaultencoding() in effect

What do you mean here? How can sys.setdefaultencoding() be "in effect"?
There is always a default encoding; did you mean you changed the
default?

> which resulted in a situation
> where most of the bytes generated in zipfile.ZipInfo.FileHeader would
> pass thru, except for a few, which caused codec error on another
> machine (where filenames got infectiously upgraded to unicode).

Was the problem that most of the bytes would pass thru, or was the
problem that a few did not pass thru?

Why did filenames in the FileHeader get infectiously upgraded to
unicode on the other machine, but not on the first machine?

> The
> problem here is that it was absolutely unclear at first that I get
> unicode filenames passed to write, and it incorrectly accepted them
> silently. Is it worth to submit a bug report on this?

Let me try to rephrase what I understood so far:

"I changed the default system encoding from ASCII to some other value,
and that caused zipfile.py to generate an incorrect zipfile. Is that a
bug in zipfile?"

To that, the answer is a clear "no". If you change the default encoding,
you are on your own. Don't do that.

> So, should I submit a bug report, and which behavior would be actually
> correct?

The issue of non-ASCII file names in zipfiles is fairly well understood.
The ZIP format historically did not support them well. I believe this
has recently been improved, but that format change has not propagated
into the zipfile module, yet.

However, everybody is aware of the situation, so there is no need to
report a bug.

Regards,
Martin
Re: [Python-Dev] Frame zombies
> Should I make a patch?

-1. This could consume quite a lot of memory, and I doubt
that the speed improvement would be significant. Instead,
I would check whether performance could be improved by
just dropping the freelist. Looking at the code, I see
that it performs a realloc (!) of the frame object if
the one it found is too small. That surely must be
expensive, and should be replaced with a free/malloc pair
instead.

I'd be curious to see whether malloc on today's systems
is still so slow as to justify a free list. If it is,
I would propose to create multiple free lists per size
classes, e.g. for frames with 10, 20, 30, etc. variables,
rather than having free lists per code object.

Regards,
Martin
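The size-class alternative can be sketched in the same toy style. Everything here is an assumption for illustration: the class name SizeClassFreelists, the step of 10 variables, the per-class limit, and the use of a plain Python list as a stand-in for a frame's variable slots.

```python
class SizeClassFreelists(object):
    """Free lists keyed by rounded-up frame size (10, 20, 30, ...)."""

    def __init__(self, step=10, max_per_class=20):
        self.step = step
        self.max_per_class = max_per_class
        self._free = {}
        self.allocations = 0

    def size_class(self, nvars):
        # Round up to the next multiple of step, with a minimum of step.
        return max(1, -(-nvars // self.step)) * self.step

    def acquire(self, nvars):
        cls = self.size_class(nvars)
        free = self._free.get(cls)
        if free:
            return free.pop()
        self.allocations += 1
        return [None] * cls  # "frame" with room for cls variables

    def release(self, frame):
        free = self._free.setdefault(len(frame), [])
        if len(free) < self.max_per_class:
            free.append(frame)


if __name__ == "__main__":
    pool = SizeClassFreelists()
    frame = pool.acquire(7)
    pool.release(frame)
    print(pool.acquire(9) is frame)
```

Frames released by one code object can then serve any other code object in the same size class, so memory is bounded by the number of size classes rather than by the number of code objects, and no frame is ever resized.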
Re: [Python-Dev] Frame zombies
The freelist currently serves as a good optimization of a special case
of a recurring recursion. If the same code object (or one of the same
size) is used for recursive calls repeatedly, the freelist will
realloc-to-same-size (which probably has no serious cost) and thus the
cost of allocating/deallocating frames was averted.

I think that in general, the complexity of a sophisticated and
efficient aging mechanism is not worth it just to optimize recursive
calls. The only question is whether it is truly a memory problem, if
using, say, up to 50 frames per code object?

Note that _only_ recursions will have more than 1 frame attached.

How many recursions are used and then discarded?
How slow is it to constantly malloc/free frames in a recursion?

My proposal will accelerate the following example:

    def f(x, y):
        if 0 == x:
            return
        f(x-1, y)

    def g(x):
        if 0 == x:
            return
        g(x-1)

    while True:
        f(100, 100)
        g(100)

The current implementation will work well with the following:

    while True:
        f(100, 100)

But removing freelist altogether will not work well with any type of
recursion.

Eyal

On 6/10/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > Should I make a patch?
>
> -1. This could consume quite a lot of memory, and I doubt
> that the speed improvement would be significant. Instead,
> I would check whether performance could be improved by
> just dropping the freelist. Looking at the code, I see
> that it performs a realloc (!) of the frame object if
> the one it found is too small. That surely must be
> expensive, and should be replaced with a free/malloc pair
> instead.
>
> I'd be curious to see whether malloc on today's systems
> is still so slow as to justify a free list. If it is,
> I would propose to create multiple free lists per size
> classes, e.g. for frames with 10, 20, 30, etc. variables,
> rather than having free lists per code object.
>
> Regards,
> Martin
Re: [Python-Dev] Frame zombies
> Note that _only_ recursions will have more than 1 frame attached.

That's not true; in the presence of threads, the same method
may also be invoked more than one time simultaneously.

> But removing freelist altogether will not work well with any type of
> recursion.

How do you know that? Did you measure the time? On what system?
What were the results?

Performance optimization without measuring is just unacceptable.

Regards,
Martin
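A starting point for such a measurement might look like the following. It is only a sketch: f and g are the two recursive functions from the earlier message, while alternating, single and measure are names invented here, and timeit on whole calls is an assumption about what is worth timing. It does not isolate frame allocation from the rest of the call overhead; it merely gives numbers that can be compared across interpreter builds.

```python
import timeit


def f(x, y):
    if 0 == x:
        return
    f(x - 1, y)


def g(x):
    if 0 == x:
        return
    g(x - 1)


def alternating():
    # Two different recursive code objects used back to back.
    f(100, 100)
    g(100)


def single():
    # One recursive code object used repeatedly.
    f(100, 100)


def measure(repeat=5, number=200):
    """Return the best observed time per call, in seconds, per workload."""
    results = {}
    for name, workload in (("alternating", alternating), ("single", single)):
        timings = timeit.repeat(workload, repeat=repeat, number=number)
        results[name] = min(timings) / number
    return results


if __name__ == "__main__":
    for name, seconds in sorted(measure().items()):
        print("%-12s %.2e s per call" % (name, seconds))
```

Since alternating performs two recursions per call and single only one, the interesting comparison is how each figure changes between two interpreter builds (for example with and without the freelist), not the two figures against each other.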
