[issue7651] Python3: guess text file charset using the BOM

2015-01-07 Thread Jon Dufresne
Changes by Jon Dufresne -- nosy: +jdufresne

[issue7651] Python3: guess text file charset using the BOM

2013-01-02 Thread STINNER Victor
STINNER Victor added the comment: The idea was somehow rejected on the python-dev mailing list. I'm not really motivated to work on this issue since I never see any file starting with a BOM on Linux, and I'm only working on Linux. So I just close this issue. If someone is motivated to work on

[issue7651] Python3: guess text file charset using the BOM

2012-07-07 Thread Florent Xicluna
Florent Xicluna added the comment: For the implementation part, there's something which already plays with the BOM in the tokenize module. See tokenize.open(), which uses tokenize.detect_encoding() to read the BOM in some cases. -- nosy: +flox
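To illustrate the tokenize behaviour mentioned above, here is a minimal sketch. It assumes Python source bytes as input, since tokenize.detect_encoding() is designed for source files: it recognises only the UTF-8 BOM and the PEP 263 coding cookie, not UTF-16 or UTF-32 BOMs. The helper name detect() is made up for this example.

```python
import codecs
import io
import tokenize


def detect(data):
    """Return the encoding tokenize.detect_encoding() reports for source bytes."""
    # detect_encoding() takes a readline callable and returns
    # (encoding, lines_already_read); only the encoding is used here.
    encoding, _lines = tokenize.detect_encoding(io.BytesIO(data).readline)
    return encoding


with_bom = detect(codecs.BOM_UTF8 + b"x = 1\n")
without_bom = detect(b"x = 1\n")
print(with_bom, without_bom)  # utf-8-sig utf-8
```

A general-purpose text reader would need to go beyond this, which is what the rest of the thread discusses.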

[issue7651] Python3: guess text file charset using the BOM

2012-07-07 Thread Łukasz Langa
Changes by Łukasz Langa -- type: -> enhancement

[issue7651] Python3: guess text file charset using the BOM

2012-07-07 Thread Łukasz Langa
Łukasz Langa added the comment: After reading the mailing list thread at http://mail.python.org/pipermail/python-dev/2010-January/097102.html and weighing other concerns (e.g. how to behave in write-only and read-write modes), it looks like a PEP might be necessary to solve this once and f

[issue7651] Python3: guess text file charset using the BOM

2012-03-20 Thread Łukasz Langa
Changes by Łukasz Langa -- assignee: -> lukasz.langa

[issue7651] Python3: guess text file charset using the BOM

2010-07-23 Thread Łukasz Langa
Łukasz Langa added the comment: I agree with MvL that this is a broader issue that shouldn't be patched in user code (e.g. #7519) but on the codec level. The sniff codec idea seems neat. -- nosy: +lukasz.langa

[issue7651] Python3: guess text file charset using the BOM

2010-04-13 Thread Walter Dörwald
Walter Dörwald added the comment: Yes, that's the posting I was referring to. I wonder why the link is gone.

[issue7651] Python3: guess text file charset using the BOM

2010-04-13 Thread Éric Araujo
Éric Araujo added the comment: The link has gone. Is this the message you’re referring to? http://mail.python.org/pipermail/python-dev/2010-January/097115.html Regards -- nosy: +merwok

[issue7651] Python3: guess text file charset using the BOM

2010-01-09 Thread Walter Dörwald
Walter Dörwald added the comment: IMHO this is the wrong approach. As Martin v. Löwis suggested (http://mail.python.org/pipermail/python-dev/2010-January/094841.html), the best solution would be a new codec (which he named sniff) that autodetects the encoding on reading. This doesn't requ
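As a rough illustration of the autodetection such a codec would perform, the sketch below picks an encoding from a leading BOM. It is not the proposed sniff codec and does not register anything with the codecs machinery; sniff_encoding(), sniff_decode() and the "utf-8" fallback are assumptions made for this example. Only the codecs.BOM_* constants and the standard codec names are real.

```python
import codecs

# Longer BOMs are checked first: the UTF-32-LE BOM (FF FE 00 00)
# begins with the UTF-16-LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(head, default="utf-8"):
    """Guess a codec name from the first bytes of a file (hypothetical helper)."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default  # no BOM: the fallback is an assumption, not part of the proposal


def sniff_decode(data, default="utf-8"):
    """Decode bytes with the sniffed codec; these codecs consume the BOM themselves."""
    return data.decode(sniff_encoding(data[:4], default))
```

For example, sniff_decode(codecs.BOM_UTF16_BE + "hé".encode("utf-16-be")) yields "hé", while input without a BOM is decoded with the fallback encoding.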

[issue7651] Python3: guess text file charset using the BOM

2010-01-08 Thread STINNER Victor
STINNER Victor added the comment: New version of the patch, which is shorter and cleaner, fixes the last bug (seek) and no longer changes the default behaviour (checking for a BOM is now explicit): * BOM checking is now optional (explicit): use open(filename, encoding="BOM"). open(filename, "w", e

[issue7651] Python3: guess text file charset using the BOM

2010-01-07 Thread STINNER Victor
STINNER Victor added the comment: Oops, fix read() method of my previous patch. -- Added file: http://bugs.python.org/file15783/open_bom-2.patch

[issue7651] Python3: guess text file charset using the BOM

2010-01-07 Thread STINNER Victor
STINNER Victor added the comment: open_bom.patch is the proof of concept. It only works in read mode. The idea is to delay the creation of the encoding and the decoder. We wait for just after the first read_chunk(). The patch changes the default behaviour of open(): if the file starts with a
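The "delay the decoder until after the first chunk" idea can be sketched outside the io module as follows. This is not the code from open_bom.patch, which modifies the text I/O layer itself; read_text_lazy() and its parameters are hypothetical, and the sketch assumes the first read returns at least four bytes when the file is that long.

```python
import codecs
import io


def read_text_lazy(raw, chunk_size=4096, default="utf-8"):
    """Yield decoded text; the decoder is created only after the first read."""
    decoder = None
    while True:
        chunk = raw.read(chunk_size)
        if decoder is None:
            # First chunk: inspect the leading bytes, then build the decoder.
            if chunk.startswith(codecs.BOM_UTF8):
                encoding = "utf-8-sig"
            elif chunk.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
                encoding = "utf-32"
            elif chunk.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
                encoding = "utf-16"
            else:
                encoding = default  # assumed fallback when no BOM is present
            decoder = codecs.getincrementaldecoder(encoding)()
        if not chunk:
            # End of file: flush whatever the decoder still buffers.
            tail = decoder.decode(b"", final=True)
            if tail:
                yield tail
            return
        text = decoder.decode(chunk)
        if text:
            yield text


sample = io.BytesIO(codecs.BOM_UTF16_LE + "héllo".encode("utf-16-le"))
print("".join(read_text_lazy(sample, chunk_size=4)))  # héllo
```

The sketch covers read mode only, like the proof of concept described above; seeking and write modes would need separate handling.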

[issue7651] Python3: guess text file charset using the BOM

2010-01-07 Thread Antoine Pitrou
Antoine Pitrou added the comment: You should ask on the mailing-list (python-dev) because this is an important behaviour change which I'm not sure will get accepted. -- nosy: +pitrou

[issue7651] Python3: guess text file charset using the BOM

2010-01-06 Thread STINNER Victor
New submission from STINNER Victor: If the file starts with a BOM, open(filename) should be able to guess the charset. It would be helpful for many high level modules:
- #7519: ConfigParser
- #7185: csv
- and any module using open() to read a text file
Currently, the user has to choose bet