Adam Olsen added the comment:

On 11/1/07, James G. sack (jim) <[EMAIL PROTECTED]> wrote:
>
> James G. sack (jim) added the comment:
>
> Adam Olsen wrote:
> > Adam Olsen added the comment:
> >
> > The problem with "being tolerate" as you suggest is you lose the ability
> > to round-trip. Read in a file using the UTF-8 signature, write it back
> > out, and suddenly nothing else can open it.
>
> I'm sorry, I don't see the round-trip problem you describe.
>
> If codec utf_8 or utf_8_sig were to accept input with or without the
> 3-byte BOM, and write it as currently specified without/with the BOM
> respectively, then _I_ can reread again with either utf_8 or utf_8_sig.
>
> No round trip problem _for me_.
>
> Now If I need to exchange with some else, that's a different matter. One
> way or another I need to know what format they need and create the
> output they require for their input.
>
> Am I missing something in your statement of a problem?
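
To make the scenario quoted above concrete, here is a minimal sketch of what happens to the bytes when a file carrying the UTF-8 signature is read tolerantly and written back without it. It assumes present-day Python 3 behaviour of the "utf-8" and "utf-8-sig" codecs, which may differ from the codecs as they stood when this thread was written; the sample text and variable names are made up for illustration.

```python
import codecs

# Hypothetical sample text; any string would do.
text = "h\u00e9llo"

# A file written with the UTF-8 signature starts with the 3-byte BOM.
with_sig = text.encode("utf-8-sig")
assert with_sig.startswith(codecs.BOM_UTF8)

# A tolerant reader ("utf-8-sig") strips the signature while decoding ...
decoded = with_sig.decode("utf-8-sig")

# ... so writing the result back with plain "utf-8" yields different bytes:
# the signature that another program may rely on is gone.
rewritten = decoded.encode("utf-8")
tolerant_round_trips = rewritten == with_sig

# A non-guessing reader ("utf-8") keeps the BOM as the character U+FEFF,
# so the same encode step reproduces the original bytes exactly.
strict = with_sig.decode("utf-8")
strict_round_trips = strict.encode("utf-8") == with_sig

print(tolerant_round_trips, strict_round_trips)
```

The point of the sketch is only that the tolerant path changes the bytes on disk while the non-guessing path does not; whether the changed file still opens elsewhere depends on the other program.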

You don't seem to think it's important to interact with other programs.
If you're importing with no intent to write out to a common format, then
yes, autodetecting the BOM is just fine. Python needs a more general
default though, and not guessing is part of that.

> > Conceptually, these signatures shouldn't even be part of the encoding;
> > they're a prefix in the file indicating which encoding to use.
>
> Yes, I'm aware of that, but you can't predict what you may find in dusty
> archives, or what someone may give to you. IMO, that's the basis of
> being tolerant in what you accept, is it not?

Garbage in, garbage out. There are a lot of protocols with whitespace,
capitalization, etc. that you can fudge around while retaining the same
contents; character set encodings aren't one of them.

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1328>
__________________________________