Nick Coghlan added the comment:

To bring back Victor's comments from the list:

- stdout/stderr are fairly easy to handle, since the underlying buffers can be 
flushed before switching the encoding and error settings. Yes, there's a risk 
of creating mojibake, but that's unavoidable and, for this use case, trumped by 
the pragmatic need to support overriding the output encoding in a robust 
fashion (i.e. not breaking sys.__stdout__ or sys.__stderr__, and not crashing 
if something else displays output during startup, for example, when running 
under "python -v").

- stdin is more challenging, since it isn't entirely clear yet how to handle 
the case where data is already buffered internally. Victor proposes that it's 
acceptable to simply disallow changing the encoding of a stream that isn't 
seekable. My feeling is that such a restriction would largely miss the point, 
since the original use case that prompted this issue was shell pipeline 
processing, where stdin will often be a pipe (and pipes are not seekable).
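
For the stdout/stderr case in the first bullet, the "flush, then switch" 
idea can be sketched as below. This is only an illustration, not the patch 
under discussion: it relies on TextIOWrapper.reconfigure(), which exists in 
Python 3.7 and later, and the helper name switch_output_encoding and the 
BytesIO stand-in for the real stdout buffer are assumptions made for the 
example.

```python
import io


def switch_output_encoding(stream, encoding, errors="strict"):
    """Flush pending text, then change encoding/errors on the same object.

    Hypothetical helper for illustration; requires Python 3.7+ because it
    uses TextIOWrapper.reconfigure().
    """
    # Anything written so far is encoded with the old settings first.
    stream.flush()
    stream.reconfigure(encoding=encoding, errors=errors)


# A BytesIO stands in for the binary buffer underneath sys.stdout.
raw = io.BytesIO()
out = io.TextIOWrapper(raw, encoding="ascii")
same_object = out

out.write("plain ")
switch_output_encoding(out, "utf-8")
out.write("caf\u00e9")
out.flush()
```

Because the wrapper object itself is kept, other references to it (the way 
sys.__stdout__ refers to the original stream) stay valid after the switch.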

I think the guiding use case here really needs to be this one: "How do I 
implement the equivalent of 'iconv' as a Python 3 script, without breaking 
internal interpreter state invariants?"

My current thought is that, instead of seeking, the input case can better be 
handled by manipulating the read-ahead buffer directly. Something like (for the 
pure Python version):

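   # Remember the old encoding before overwriting it; it is needed below.
   old_encoding = self._encoding
   self._encoding = new_encoding
   if self._decoder is not None:
       # Recover the bytes behind the read-ahead: re-encode the chars that
       # were decoded but not yet consumed, then append any bytes the old
       # decoder is still holding.
       old_data = self._get_decoded_chars().encode(old_encoding)
       old_data += self._decoder.getstate()[0]
       # _get_decoder() is expected to pick up the new self._encoding.
       decoder = self._get_decoder()
       new_chars = ''
       if old_data:
           new_chars = decoder.decode(old_data)
       self._set_decoded_chars(new_chars)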
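
To make the idea concrete outside of TextIOWrapper internals, here is a 
standalone sketch using only the codecs module. The function name 
redecode_pending and the sample data are made up for the example. It also 
assumes the old decoding was lossless (latin-1 here), since re-encoding can 
only recover the original bytes in that case.

```python
import codecs


def redecode_pending(decoded_chars, pending_bytes, old_encoding, new_encoding):
    """Re-interpret read-ahead data under a new encoding.

    decoded_chars: text already decoded with old_encoding but not consumed.
    pending_bytes: bytes the old decoder was still holding (getstate()[0]).
    Returns the re-decoded text and the new incremental decoder.
    """
    old_data = decoded_chars.encode(old_encoding) + pending_bytes
    decoder = codecs.getincrementaldecoder(new_encoding)()
    new_chars = decoder.decode(old_data) if old_data else ""
    return new_chars, decoder


# The stream really contains UTF-8, but was opened as latin-1.
data = "na\u00efve caf\u00e9".encode("utf-8")
first_chunk, rest = data[:3], data[3:]  # the split falls inside a character

old_decoder = codecs.getincrementaldecoder("latin-1")()
read_ahead = old_decoder.decode(first_chunk)  # mojibake at this point

new_chars, new_decoder = redecode_pending(
    read_ahead, old_decoder.getstate()[0], "latin-1", "utf-8"
)
# The new decoder keeps the incomplete byte until the rest arrives.
text = new_chars + new_decoder.decode(rest, final=True)
```

Note that the new decoder ends up holding the trailing partial character, so 
later reads from the underlying buffer continue cleanly.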

(A similar mechanism could actually be used to support an "initial_data" 
parameter to TextIOWrapper, which would help in general encoding detection 
situations where changing encoding *in-place* isn't needed, but the application 
would like an easy way to "put back" the initial data for inclusion in the text 
stream without making assumptions about the underlying buffer implementation)
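
As a rough illustration of the "put back" idea, the following sketch gets a 
similar effect today by wrapping the binary stream rather than by adding a 
parameter to TextIOWrapper. Everything named here (_PrependedRaw, 
text_with_initial_data) is hypothetical and not part of the io module; it 
only assumes the underlying stream provides readinto().

```python
import io


class _PrependedRaw(io.RawIOBase):
    """Raw stream that yields initial_data first, then the wrapped stream."""

    def __init__(self, initial_data, stream):
        self._initial = bytes(initial_data)
        self._stream = stream

    def readable(self):
        return True

    def readinto(self, b):
        if self._initial:
            # Serve the bytes that were "put back" before anything else.
            n = min(len(b), len(self._initial))
            b[:n] = self._initial[:n]
            self._initial = self._initial[n:]
            return n
        return self._stream.readinto(b)


def text_with_initial_data(stream, initial_data, encoding):
    """Hypothetical helper: text view over initial_data + rest of stream."""
    raw = _PrependedRaw(initial_data, stream)
    return io.TextIOWrapper(io.BufferedReader(raw), encoding=encoding)


# Sniff a few bytes (for example, to guess the encoding), then put them back.
source = io.BytesIO("h\u00e9llo\nw\u00f6rld\n".encode("utf-8"))
head = source.read(4)
text = text_with_initial_data(source, head, "utf-8")
content = text.read()
```

This works for pipes as well, since nothing is ever seeked, but it replaces 
the stream object instead of changing it in place.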

Also, StringIO should implement this new API as a no-op.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15216>
_______________________________________