GeneratorExit should derive from BaseException, not Exception

2007-08-20 Thread Chad Austin
Hi all,

First, I'd like to describe a system that we've built here at IMVU in 
order to manage the complexity of our network- and UI-heavy application:

Our application is a standard Windows desktop application, with the main 
thread pumping Windows messages as fast as they become available.  On 
top of that, we've added the ability to queue arbitrary Python actions 
in the message pump so that they get executed on the main thread when 
it's ready.  You can think of our EventPump as being similar to Twisted's 
reactor.

On top of the EventPump, we have a TaskScheduler which runs "tasks" in 
parallel.  Tasks are generators that behave like coroutines, and it's 
probably easiest to explain how they work with an example (made up on 
the spot, so there may be minor typos):

def openContentsWindow(contents):
    # Imagine a notepad-like window with the URL's contents...
    # ...

@threadtask
def readURL(url):
    return urllib2.urlopen(url).read()

@task
def displayURL(url):
    with LoadingDialog():
        # blocks this task from running while contents are being
        # downloaded, but does not block main thread because readURL
        # runs in the threadpool.
        contents = yield readURL(url)

    openContentsWindow(contents)

A bit of explanation:

The @task decorator turns a generator-returning function into a 
coroutine that is run by the scheduler.  It can call other tasks via 
"yield" and block on network requests, etc.

All blocking network calls such as urllib2's urlopen and friends and 
xmlrpclib ServerProxy calls go behind the @threadtask decorator.  This 
means those functions will run in the thread pool and allow other ready 
tasks to execute in the meantime.
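
To make the mechanics concrete, here is a minimal sketch of how such decorators could be built.  It is not IMVU's TaskScheduler: threadtask, run_task and _pool are hypothetical names, the code targets current Python 3 rather than the Python 2.5 of this thread, and it drives a single task by blocking on each result, where a real scheduler would return to the event pump and run other ready tasks.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared pool; the size is arbitrary for this sketch.
_pool = ThreadPoolExecutor(max_workers=4)


def threadtask(func):
    """Calling the decorated function submits it to the pool and returns a Future."""
    def wrapper(*args, **kwargs):
        return _pool.submit(func, *args, **kwargs)
    return wrapper


def run_task(gen):
    """Drive one generator-based task until it finishes and return its result."""
    try:
        pending = next(gen)
        while True:
            try:
                result = pending.result()      # a real scheduler would not block here
            except Exception as exc:
                pending = gen.throw(exc)       # the failure surfaces at the yield
            else:
                pending = gen.send(result)     # the result becomes the yield's value
    except StopIteration as stop:
        return stop.value


@threadtask
def read_length(text):
    return len(text)


def display(text):
    length = yield read_length(text)
    return "%s has %d characters" % (text, length)


message = run_task(display("spam"))
```

Because failures are re-raised inside the generator with throw(), an ordinary try/except around the yield sees them with a normal-looking traceback.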

There are several benefits to this approach:

1) The logic is very readable.  The code doesn't have to go through any 
hoops to be performant or correct.
2) It's also very testable.  All of the threading-specific logic goes 
into the scheduler itself, which means our unit tests don't need to deal 
with any (many?) thread safety issues or races.
3) Exceptions bubble correctly through tasks, and the stack traces are 
what you would expect.
4) Tasks always run on the main thread, which is beneficial when you're 
dealing with external objects with thread-affinity, such as Direct3D and 
Windows.
5) Unlike threads, tasks can be cancelled.

ANYWAY, all advocacy aside, here is one problem we've run into:

Imagine a bit of code like this:

@task
def pollForChatInvites(chatGateway, userId, decisionCallback,
                       startChatCallback, timeProvider,
                       minimumPollInterval = 5):
    while True:
        now = timeProvider()

        try:
            result = yield chatGateway.checkForInvite({'userId': userId})
            logger.info('checkForInvite2 returned %s', result)
        except Exception:
            logger.exception('checkForInvite2 failed')
            result = None
        # ...
        yield Sleep(10)

This is real code that I wrote in the last week.  The key portion is the 
try/except block.  Basically, there are many reasons the checkForInvite2 call 
can fail.  Maybe a socket.error (connection timeout), maybe some kind of 
httplib error, maybe an xmlrpclib.ProtocolError...  I actually don't 
care how it fails.  If it fails at all, then sleep for a while and try 
again.  All fine and good.

The problem is that, if the task is cancelled while it's waiting on 
checkForInvite2, GeneratorExit gets caught and handled rather than 
(correctly) bubbling out of the task.  GeneratorExit is similar in 
practice to SystemExit here, so it would make sense for it to be a 
BaseException as well.

So, my proposal is that GeneratorExit derive from BaseException instead 
of Exception.
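
The failure mode is easy to reproduce in isolation.  The sketch below is a stand-in, not the IMVU code: poll and cancel are hypothetical names, and it targets current Python, where GeneratorExit already derives from BaseException (the hierarchy changed in Python 2.6).  Passing BaseException as the caught class imitates what "except Exception" did under 2.5; passing Exception shows the behaviour being proposed.

```python
def poll(events, caught):
    """Hypothetical polling task; `caught` is the class its handler traps."""
    try:
        for _attempt in range(2):              # two polls are enough for the demo
            try:
                yield "waiting on request"
            except caught:
                events.append("handler swallowed the exception")
    finally:
        events.append("cleanup ran")


def cancel(gen):
    """Cancel a suspended task the way a scheduler might, via close()."""
    try:
        gen.close()
    except RuntimeError:
        # close() raises RuntimeError when the generator yields again
        # instead of letting GeneratorExit propagate.
        return "cancellation ignored"
    return "cancelled"


old_events, new_events = [], []
old_style = poll(old_events, BaseException)    # imitates the 2.5 hierarchy
new_style = poll(new_events, Exception)        # the proposed hierarchy
next(old_style)
next(new_style)
old_result = cancel(old_style)
new_result = cancel(new_style)
```

In the first case the task keeps running after cancellation and its finally block has not executed; in the second the task stops and the cleanup runs.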

p.s. Should I have sent this mail to python-dev directly?  Does what I'm 
saying make sense?  Does this kind of thing need a PEP?

-- 
Chad Austin
http://imvu.com/technology
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: GeneratorExit should derive from BaseException, not Exception

2007-08-21 Thread Chad Austin
Hi Terry,

Thank you for your feedback.  Responses inline:

Terry Reedy wrote:
> "Chad Austin" <[EMAIL PROTECTED]> wrote in message 
> news:[EMAIL PROTECTED]
> | try:
> |     result = yield chatGateway.checkForInvite({'userId': userId})
> |     logger.info('checkForInvite2 returned %s', result)
> 
> would not
> except GeneratorExit: 
> solve your problem?

Yes, we could add an "except GeneratorExit: raise" clause to every place
we currently catch Exception, but I feel like this is one of those
things where it's hard to get it right in all places and also hard to
cover with unit tests.  Instead, we'll have subtle bugs where finally
clauses don't run because the GeneratorExit was swallowed.
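
For reference, the workaround under discussion looks roughly like the sketch below.  The names poll_with_guard and check are hypothetical and the code is written for current Python, where the first except clause is redundant but harmless; under 2.5 it would have to be repeated in front of every broad handler inside a task.

```python
def poll_with_guard(check, failures):
    """Hypothetical polling task that re-raises GeneratorExit explicitly."""
    while True:
        try:
            yield check()
        except GeneratorExit:
            raise                      # cancellation: let it propagate
        except Exception as exc:
            failures.append(exc)       # anything else: record it and poll again


failures = []
task = poll_with_guard(lambda: "request", failures)
first = next(task)                         # task is now waiting on the request
retry = task.throw(IOError("timeout"))     # a failed request is recorded, then retried
task.close()                               # cancellation ends the task
```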

Also, SystemExit and KeyboardInterrupt were made into BaseExceptions for
the same reasons as I'm giving.  (As I understand it, anyway.)

> | except Exception:
> 
> Such catchalls are known to be prone to catch too much
> and are therefore not encouraged ;-).
> As in 'use at your own risk'.
> Guido encourages specific catches just for the reasons you give here.

More below:

> There was a *long* discussion of the current 2.5 exception hierarchy on 
> pydev.  Search either python.org's or gmane's archive if you want to pursue 
> this.  But I expect the people involved would say much the same as above.

I've actually read the background on the exception hierarchy (and agree
with it all), especially other suggestions that GeneratorExit derive
from BaseException.  As I understand it, Guido's objections are threefold:

1) The previous "generators as coroutines" examples were too
theoretical:  I've wanted GeneratorExit to derive from BaseException for
months now, but didn't write this proposal until I actually wrote code
that failed in the presence of task cancellation.

2) You should avoid catching everything with except Exception:  I think
that's too idealistic.  Just search publicly available Python code for
try: except: blocks.  :)  Sometimes, you really _do_ want to catch
everything.  When you're making a network request that involves
xmlrpclib, urllib2, httplib, etc. you don't actually care what the error
was.  (Well, except that the exceptions are submitted for automated
analysis.)  Similarly, when loading a cache file with pickle, I don't
care what went wrong, because it's not critical and should not be turned
into a crash for the user.  (We automatically report exceptions that
bubble into the main loop as crashes.)

3) If GeneratorExit escapes from the generator somehow and gets raised
in the main loop, then it will bubble out of the application like
SystemExit and KeyboardInterrupt would:  I think this argument is
somewhat specious, because I can't imagine how that would happen.  You'd
have to store exceptions in your generator and explicitly bubble them
out somehow.  Our crash handling has to specially handle
KeyboardInterrupt and SystemExit anyway, since there are currently
non-Exception exceptions, such as strings and custom classes that forgot
to derive from Exception, that should count as crashes.
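
As an illustration of point 2, a non-critical cache load might look like the following sketch.  load_cache and its default parameter are hypothetical, and the code targets current Python; the point is only that any failure, whatever its type, falls back to a default instead of crashing.

```python
import pickle


def load_cache(path, default=None):
    """Return the unpickled cache at `path`, or `default` if anything goes wrong."""
    try:
        with open(path, "rb") as handle:
            return pickle.load(handle)
    except Exception:
        # Missing file, truncated data, incompatible pickle: all treated alike.
        return default
```

If a handler like this sat around a yield inside a task on Python 2.5, it would also trap GeneratorExit, which is the problem described in this thread.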

I personally can't think of any cases where I would _want_ to handle
GeneratorExit.  I just want finally: and with: clauses to do the right
thing when a task is cancelled.  Anyway, I haven't encountered any
serious bugs due to this yet...  I'm just worried that if a task is
holding some resource and blocking on something, then the resource won't
get released.  If this really does come up, then I do have a little bit
of Python + ctypes that replaces GeneratorExit with ImvuGeneratorExit
(deriving from BaseException), but that's not very appealing.

Thanks again,

-- 
Chad Austin
http://imvu.com/technology



Re: GeneratorExit should derive from BaseException, not Exception

2007-08-21 Thread Chad Austin
Oops, forgot to mention this:

I wouldn't be opposed to a different extension that would effectively 
let me accomplish the same goals...  arbitrary exception filters. 
Imagine this:

try:
    raise GeneratorExit
except ExceptionFilter:
    # blah

where ExceptionFilter is any object that can be tested for containment. 
  Perhaps implemented like this:

class ExceptionFilter(object):
    def __init__(self):
        self.includes = set()
        self.excludes = set()

        self.include = self.includes.add
        self.exclude = self.excludes.add

    def __contains__(self, exc):
        return any(isinstance(exc, cls) for cls in self.includes) and \
            not any(isinstance(exc, cls) for cls in self.excludes)

ImvuExceptionFilter = ExceptionFilter()
ImvuExceptionFilter.include(Exception)
ImvuExceptionFilter.exclude(GeneratorExit)

Then, our code could just "catch" ImvuExceptionFilter.  This type of 
extension would be backwards compatible with the current except 
(FooError, BarError) tuple syntax.

I've never hacked on CPython itself, so I don't know what kind of 
changes there would be involved, but if there is sufficient pushback 
against making GeneratorExit derive from BaseException, I think this is 
a fine alternative.  Thoughts?
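
The "except <filter object>" form is hypothetical; Python only accepts exception classes (or tuples of them) in an except clause.  The same filtering can be approximated today with a broad handler plus a containment test, as in this sketch.  call_filtered is an invented helper, and the filter includes BaseException rather than Exception so that the GeneratorExit exclusion still has an effect on current Python, where GeneratorExit no longer derives from Exception.

```python
class ExceptionFilter(object):
    """Same idea as the class in the post: include and exclude sets of classes."""

    def __init__(self):
        self.includes = set()
        self.excludes = set()
        self.include = self.includes.add
        self.exclude = self.excludes.add

    def __contains__(self, exc):
        return (any(isinstance(exc, cls) for cls in self.includes)
                and not any(isinstance(exc, cls) for cls in self.excludes))


def call_filtered(func, exc_filter, default=None):
    """Hypothetical helper: swallow only the exceptions matched by exc_filter."""
    try:
        return func()
    except BaseException as exc:
        if exc in exc_filter:
            return default
        raise                          # not in the filter: keep propagating


# Demo filter only; a real one would likely exclude KeyboardInterrupt
# and SystemExit as well.
task_filter = ExceptionFilter()
task_filter.include(BaseException)
task_filter.exclude(GeneratorExit)
```

The cost compared with real syntax support is that the exception is caught and re-raised rather than never being caught at all.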

Chad


clearerr called on NULL FILE* ?

2006-05-02 Thread Chad Austin
Hi all,

My first post to the list.  :)  I'm debugging one of our application 
crashes, and I thought maybe one of you has seen something similar 
before.  Our application is mostly Python, with some work being done in 
a native C++ module.  Anyway, I'm getting a memory access violation at 
the following stack:


CRASHING THREAD
EXCEPTION POINTERS: 0x0012e424
 ExceptionRecord: 0x0012e518
 ExceptionCode: 0xc0000005 EXCEPTION_ACCESS_VIOLATION
 ExceptionFlags: 0x
 ExceptionAddress: 0x7c901010
 NumberParameters: 2
 ExceptionInformation[0]: 0x
 ExceptionInformation[1]: 0x0034
 ExceptionRecord: 0x

THREAD ID: 10b0  frame count: 4
PYTHON23!0x000baa00 - PyFile_Type
PYTHON23!0x0003ac27 - PyFile_SetEncoding
   MSVCRT!0x00030a06 - clearerr
ntdll!0x1010 - RtlEnterCriticalSection


Here's my understanding:  something is getting called on a PyFileObject 
where f_fp is NULL, and clearerr in the multithreaded runtime tries to 
enter an invalid critical section.  PyFile_SetEncoding appears in 
the stack, but I can't figure out from the Python source how 
SetEncoding calls clearerr.

Based on the timing of the crashes, I also think it might have something 
to do with log rollovers in RotatingFileHandler.

Has anyone run into something similar?  I don't expect anyone to spend a 
lot of time on this, but if there are any quick tips, they would be 
greatly appreciated...

We're using Python 2.3.5 and Visual C++ 6.

--
Chad Austin
http://imvu.com/technology



Re: clearerr called on NULL FILE* ?

2006-05-10 Thread Chad Austin
Sorry to respond to myself; I wanted to give an update on this crash.  It turns 
out it's a race condition with multiple threads accessing the same Python file 
object!

http://sourceforge.net/tracker/index.php?func=detail&aid=595601&group_id=5470&atid=105470

Python-dev thread at 
http://mail.python.org/pipermail/python-dev/2003-June/036537.html

I wrote about the experience at http://aegisknight.livejournal.com/128191.html. 
  I agree that our program was incorrect to be writing to a log on one thread 
while it rotated them on another, but it'd be nice to get an exception that 
unambiguously shows what's going on rather than having random crashes reported 
in the field.
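
One application-level way to avoid this class of crash is to make sure writing and rotation can never touch the file object at the same time.  The sketch below is not the fix discussed in the linked tracker item, and LockedLog is a hypothetical class written for current Python; it only illustrates serialising both operations behind a single lock.

```python
import os
import threading


class LockedLog(object):
    """Hypothetical wrapper that serialises writes and rotation on one file."""

    def __init__(self, path):
        self._path = path
        self._lock = threading.Lock()
        self._file = open(path, "a")

    def write(self, line):
        with self._lock:
            self._file.write(line + "\n")
            self._file.flush()

    def rotate(self, backup_path):
        # Close, rename and reopen while holding the lock, so no writer
        # can ever see the closed file object.
        with self._lock:
            self._file.close()
            os.replace(self._path, backup_path)
            self._file = open(self._path, "a")

    def close(self):
        with self._lock:
            self._file.close()
```

Any thread may call write() or rotate(); the lock guarantees that a write either completes before the file is closed or starts after it has been reopened.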

Chad
