[issue15216] Support setting the encoding on a text stream after creation

INADA Naoki Wed, 11 Jan 2017 02:35:46 -0800

INADA Naoki added the comment:

> Inada, I think you messed up the positioning of bits of the patch. E.g. there 
> are now test methods declared inside a helper function (rather than a test 
> class).


I'm sorry.  `patch -p1` merged the previous patch into the wrong place, and the 
tests passed accidentally.

> Since it seems other people are in favour of this API, I would like to expand 
> it a bit to cover two use cases (see set_encoding-newline.patch):
> 
> * change the error handler without affecting the main character encoding
> * set the newline encoding (also suggested by Serhiy)

+1.  Since stdio is configured before the Python program starts running, 
TextIOWrapper should be as configurable as possible after creation.

> Regarding Serhiy’s other suggestion about buffering parameters, perhaps 
> TextIOWrapper.line_buffering could become a writable attribute instead, and 
> the class could grow a similar write_through attribute. I don’t think these 
> affect encoding or decoding, so they could be treated independently.

Could they go into a separate new issue?
This issue is already too long to read.

> The algorithm for rewinding unread data is complicated and can fail. What is 
> the advantage of using it? What is the use case for reading from a stream and 
> then changing the encoding, without a guarantee that it will work?
>
> Even if it is enhanced to never “fail”, it will still have strange behaviour, 
> such as data loss when a decoder is fed a single byte and produces multiple 
> characters (e.g. CR newline, backslashreplace, UTF-7).

When I posted set_encoding-7.patch, I hadn't read the io module deeply.  I had 
just resolved the conflicts and run the tests.
After that, I read the code and I feel the same way (see msg285111 and msg285112).
Let's drop support for changing the encoding while reading.
Allowing the stdin encoding to be changed only before anything has been read 
from it is still a significant step.
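
To make the quoted data-loss concern concrete, here is a minimal sketch using 
only str/bytes methods (no TextIOWrapper involved).  It shows one input byte 
turning into several characters under the backslashreplace error handler, which 
is why a count of unread characters cannot simply be mapped back to a byte 
offset when rewinding.

```python
# A single undecodable byte becomes a multi-character escape sequence.
data = b"\xff"
text = data.decode("utf-8", errors="backslashreplace")

# One byte in, four characters out: a backslash, "x", "f", "f".
print(len(data), len(text), text)
```

If a reader had consumed only two of those four characters, there would be no 
byte position that corresponds to "halfway through" the escape.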


> One step in the right direction IMO would be to only support calling 
> set_encoding() when no extra read data has been buffered (or to explicitly 
> say that any buffered data is silently dropped). So there is no support for 
> changing the encoding halfway through a disk file, but it may be appropriate 
> if you can regulate the bytes being read, e.g. from a terminal (user input), 
> pipe, socket, etc.

Totally agree.
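
As an illustration of why buffered read data matters, the following sketch 
(using an in-memory BytesIO as a stand-in for a disk file) shows that 
TextIOWrapper reads ahead of what the caller asked for.  After a single 
readline(), the underlying binary stream has already been consumed past the 
first line, so those bytes were decoded with the old settings.

```python
import io

raw = io.BytesIO(b"line1\nline2\n")
text = io.TextIOWrapper(raw, encoding="ascii")

first = text.readline()   # the caller only asked for one line
consumed = raw.tell()     # but the wrapper pulled a whole chunk from raw

print(repr(first), consumed, len(raw.getvalue()))
```

With a terminal, pipe, or socket the writer can control how many bytes are 
available at a time, which is why the restriction is less of a problem there.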


> But I would be happy enough without set_encoding(), and with something like 
> my rewrap() function at the bottom of 
> <https://github.com/vadmium/data/blob/master/data.py#L526>. It returns a 
> fresh TextIOWrapper, but when you exit the context manager you can continue 
> to reuse the old stream with the old settings.

I want one obvious way to control the encoding and error handler from Python 
(not from an environment variable).
Rewrapping the stream seems like a hack rather than the obvious way.
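
For readers unfamiliar with the rewrapping approach, here is a rough sketch of 
the idea.  This is a hypothetical helper written for illustration, not the code 
at the URL quoted above; it assumes the stream exposes its binary layer as 
`stream.buffer`, and the name `rewrap` and its keyword arguments are only 
borrowed for the example.

```python
import io
from contextlib import contextmanager


@contextmanager
def rewrap(stream, **kwargs):
    """Hypothetical helper: temporarily wrap stream.buffer with new settings."""
    stream.flush()  # push text pending in the old wrapper down to the buffer
    wrapper = io.TextIOWrapper(stream.buffer, **kwargs)
    try:
        yield wrapper
    finally:
        # detach() flushes the temporary wrapper and releases the buffer
        # without closing it, so the old stream stays usable.
        wrapper.detach()


raw = io.BytesIO()
old = io.TextIOWrapper(raw, encoding="ascii", errors="strict")
old.write("a")
with rewrap(old, encoding="ascii", errors="backslashreplace") as tmp:
    tmp.write("\u20ac")  # not encodable as ASCII, so it is escaped
old.write("b")
old.flush()
result = raw.getvalue()
print(result)
```

It works for writing, but the caller has to manage flushing and two wrapper 
objects over one buffer, which is the part that feels hacky.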

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15216>
_______________________________________