[issue1328] feature request: force BOM option

2007-11-20 Thread Guido van Rossum
Changes by Guido van Rossum: nosy: -gvanrossum

[issue1328] feature request: force BOM option

2007-11-19 Thread James G. sack (jim)
James G. sack (jim) added the comment: More discussion of utf_8.py decoding behavior (and possible change): For my needs, I would like the decoding parts of the utf_8 module to treat an initial BOM as an optional signature and skip it if there is one (just like the utf_8_sig decoder). In fact

[issue1328] feature request: force BOM option

2007-11-15 Thread Walter Dörwald
Walter Dörwald added the comment:
> For utf16, (arguably) a missing BOM should merely assume machine endianness.
> For utf_16_le, utf_16_be input, both should accept & discard a BOM.
> On output, I'm not sure; maybe all should write a BOM unless passed a flag
> signifying no bom?
> Or to preserve
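As a rough illustration of the "accept & discard a BOM" idea quoted above: in current Python 3 the fixed-endianness decoders keep a leading BOM as the character U+FEFF. The helper `decode_discarding_bom` below is hypothetical (it is not part of the codecs module) and only sketches the requested behavior.

```python
import codecs

def decode_discarding_bom(data, encoding):
    """Hypothetical helper: decode, then drop one leading U+FEFF if present."""
    text = data.decode(encoding)
    return text[1:] if text.startswith("\ufeff") else text

# Little-endian data that happens to start with a BOM.
data = codecs.BOM_UTF16_LE + "x".encode("utf-16-le")

kept = data.decode("utf-16-le")                        # BOM survives as U+FEFF
discarded = decode_discarding_bom(data, "utf-16-le")   # BOM removed
```

Note that this sketch cannot tell a signature apart from a genuine leading U+FEFF character in the data, which is part of what the thread is debating.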

[issue1328] feature request: force BOM option

2007-11-15 Thread Walter Dörwald
Walter Dörwald added the comment:
jgsack wrote:
> If codec utf_8 or utf_8_sig were to accept input with or without the
> 3-byte BOM, and write it as currently specified without/with the BOM
> respectively, then _I_ can reread again with either utf_8 or utf_8_sig.
That's exactly what the utf_8_

[issue1328] feature request: force BOM option

2007-11-15 Thread James G. sack (jim)
James G. sack (jim) added the comment: re: msg57041, I'm sorry if I gave the wrong impression about interacting with other programs. I started this feature request with some half-baked thinking, which I tried to revise in my second post. Anyway I'm most interested right now in lobbying for a c

[issue1328] feature request: force BOM option

2007-11-01 Thread Adam Olsen
Adam Olsen added the comment:
On 11/1/07, James G. sack (jim) <[EMAIL PROTECTED]> wrote:
> James G. sack (jim) added the comment:
>
> Adam Olsen wrote:
> > Adam Olsen added the comment:
> >
> > The problem with "being tolerant" as you suggest is you lose the ability
> > to round-trip. Read in

[issue1328] feature request: force BOM option

2007-11-01 Thread James G. sack (jim)
James G. sack (jim) added the comment:
Adam Olsen wrote:
> Adam Olsen added the comment:
>
> The problem with "being tolerant" as you suggest is you lose the ability
> to round-trip. Read in a file using the UTF-8 signature, write it back
> out, and suddenly nothing else can open it.
I'm sorry

[issue1328] feature request: force BOM option

2007-11-01 Thread Adam Olsen
Adam Olsen added the comment: The problem with "being tolerant" as you suggest is that you lose the ability to round-trip. Read in a file using the UTF-8 signature, write it back out, and suddenly nothing else can open it. Conceptually, these signatures shouldn't even be part of the encoding; they'r
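The round-trip concern can be sketched as follows, assuming current Python 3 codec behavior. The variable names are illustrative only, and remembering whether the signature was present is just one possible workaround, not something proposed in the thread.

```python
import codecs

original = codecs.BOM_UTF8 + b"data"

# Tolerant read: the signature is silently dropped.
text = original.decode("utf-8-sig")

# Writing back with plain utf-8 loses the signature, so the bytes differ.
lossy = text.encode("utf-8")

# Possible workaround (an assumption): remember whether a signature was there.
had_sig = original.startswith(codecs.BOM_UTF8)
faithful = text.encode("utf-8-sig" if had_sig else "utf-8")
```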

[issue1328] feature request: force BOM option

2007-10-27 Thread Gabriel Genellina
Changes by Gabriel Genellina: nosy: +gagenellina

[issue1328] feature request: force BOM option

2007-10-26 Thread James G. sack (jim)
James G. sack (jim) added the comment: OK, I will work on it. I have just downloaded trunk and will see what I can do. Might be a week or two. ..jim

[issue1328] feature request: force BOM option

2007-10-26 Thread Guido van Rossum
Guido van Rossum added the comment: If you can, please submit a patch that fixes all those issues, with unit tests and doc changes if at all possible. That will make it much easier to evaluate the ramifications of your proposal(s).

[issue1328] feature request: force BOM option

2007-10-26 Thread James G. sack (jim)
James G. sack (jim) added the comment: re: msg56782 Yes, of course I can explicitly write the BOM. I did realize that after my first post ( my-'duh' :-[ ). But after playing some more, I do think this issue has become a worthwhile one. My second post msg56780 asks that utf_8 be tolerant of t

[issue1328] feature request: force BOM option

2007-10-26 Thread Guido van Rossum
Guido van Rossum added the comment: Can't you force a BOM by simply writing \ufeff at the start of the file? nosy: +gvanrossum
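For reference, the byte order mark is the code point U+FEFF (U+FFFE is its byte-swapped noncharacter). A minimal sketch of the "just write it yourself" approach, assuming current Python 3 and using illustrative variable names:

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark character

text = BOM + "data"

# The fixed-endianness codecs add no BOM of their own, so the explicit
# character becomes the BOM in whichever byte order is chosen.
raw_le = text.encode("utf-16-le")
raw_be = text.encode("utf-16-be")

# The generic utf-16 decoder then detects the byte order and strips the BOM.
back_le = raw_le.decode("utf-16")
back_be = raw_be.decode("utf-16")
```

Because the character is encoded by the codec itself, the same line of code yields the correct BOM bytes for either byte order.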

[issue1328] feature request: force BOM option

2007-10-25 Thread James G. sack (jim)
James G. sack (jim) added the comment: Later note: kind of weird! On my LE machine, utf16 reads my BE-formatted test data (no BOM) apparently assuming some kind of surrogate format, until it finds an "illegal UTF-16 surrogate". That I fail to understand, especially since it quits upon seeing
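One plausible reading of this observation, sketched in current Python 3: without a BOM, big-endian bytes decoded as little-endian are simply byte-swapped code units, and the decoder only fails once a swapped unit lands in the surrogate range. The sketch uses utf-16-le explicitly to stand in for "utf16 on an LE machine"; that substitution and the sample strings are assumptions made for illustration.

```python
# Big-endian data with no BOM.
be_bytes = "AB".encode("utf-16-be")        # b"\x00A\x00B"

# Read as little-endian: each code unit is byte-swapped, but still valid.
swapped = be_bytes.decode("utf-16-le")     # U+4100 U+4200, not "AB"

# U+00D8 in big-endian is b"\x00\xd8"; read as little-endian it becomes the
# code unit 0xD800, a high surrogate with no low surrogate following it.
bad = "\u00d8A".encode("utf-16-be")
try:
    bad.decode("utf-16-le")
    failed = False
except UnicodeDecodeError:
    failed = True
```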

[issue1328] feature request: force BOM option

2007-10-25 Thread James G. sack (jim)
James G. sack (jim) added the comment: Feature Request REVISION: Upon reflection and more playing around with some test cases, I wish to revise my feature request. I think the utf8 codecs should accept input with or without the "sig". On output, only the utf_8_sig should
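For comparison, here is a short sketch of how the two UTF-8 codecs treat the signature in current Python 3 (the thread itself predates it, so this is not necessarily what the poster observed). The variable names are illustrative.

```python
import codecs

with_sig = codecs.BOM_UTF8 + "abc".encode("utf-8")
without_sig = "abc".encode("utf-8")

# utf-8-sig accepts input with or without the signature.
sig_a = with_sig.decode("utf-8-sig")
sig_b = without_sig.decode("utf-8-sig")

# Plain utf-8 keeps the signature as a leading U+FEFF character.
plain = with_sig.decode("utf-8")

# On output, only utf-8-sig writes the signature.
out_sig = "abc".encode("utf-8-sig")
out_plain = "abc".encode("utf-8")
```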

[issue1328] feature request: force BOM option

2007-10-25 Thread James G. sack (jim)
New submission from James G. sack (jim): The behavior of codecs utf_16_[bl]e is to omit the BOM. In a testing environment (and perhaps elsewhere), a forced BOM is useful. I'm requesting an optional argument, something like force_BOM=False. I guess it would require such an option in multiple func
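Until such an option exists, the effect can be approximated by hand. The sketch below assumes current Python 3; `encode_with_bom` is a hypothetical helper, not an existing codecs function, and it covers only the two fixed-endianness UTF-16 codecs.

```python
import codecs

def encode_with_bom(text, encoding="utf-16-le"):
    """Hypothetical helper: encode text and prepend the matching UTF-16 BOM."""
    boms = {
        "utf-16-le": codecs.BOM_UTF16_LE,
        "utf-16-be": codecs.BOM_UTF16_BE,
    }
    return boms[encoding] + text.encode(encoding)

plain = "hi".encode("utf-16-le")              # no BOM, as the report describes
forced = encode_with_bom("hi", "utf-16-le")   # BOM followed by the same bytes
```

Data produced this way can be read back with the generic utf-16 codec, which uses the BOM to pick the byte order.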