Serge Orlov wrote: >> # So the question becomes: how can I make this work >> # in a graceful manner?
> change the return statement with this code: > > return (substitution.encode(error.encoding,"practical").decode( > error.encoding), error.start+1) Thanks, that was a quite neat recursive solution. :) I wouldn't have thought of that. I ended up doing it without the recursion, by testing the individual problematic code points with .encode() within the handler, and catching the possible exceptions: --- 8< --- # This is our original problematic code point: c = error.object[error.start] while 1: # Search for a substitute code point in # our table: c = table.get(c) # If a substitute wasn't found, convert the original code # point into a hexadecimal string representation of itself # and exit the loop. if c == None: c = u"[U+%04x]" % ord(error.object[error.start]) break # A substitute was found, but we're not sure if it is OK # for for our target encoding. Let's check: try: c.encode(error.encoding,'strict') # No exception; everything was OK, we # can break off from the loop now break except UnicodeEncodeError: # The mapping that was found in the table was not # OK for the target encoding. Let's loop and try # again; there might be a better (more generic) # substitution in the chain waiting for us. pass --- 8< --- -- znark -- http://mail.python.org/mailman/listinfo/python-list