[issue21927] BOM appears in stdin when using Powershell

2020-10-12 Thread Jason R. Coombs
Jason R. Coombs added the comment: Thanks Eryk for following up. Glad to hear the issue has been fixed upstream!

[issue21927] BOM appears in stdin when using Powershell

2020-10-12 Thread Eryk Sun
Eryk Sun added the comment: I'm closing this as a third-party issue with older versions of PowerShell. Newer versions of PowerShell set the output encoding to UTF-8 without a BOM preamble. For example:
PS C:\> $PSVersionTable.PSVersion
Major Minor Patch PreReleaseLabel BuildLabel

[issue21927] BOM appears in stdin when using Powershell

2014-07-16 Thread eryksun
eryksun added the comment:
> PS C:\Users\jaraco> echo £ | py -3 -c "import sys; print(repr(sys.stdin.buffer.read()))"
> b'?\r\n'
> Curiously, it appears as if powershell is actually receiving a question mark from the pipe.
PowerShell calls ReadConsoleW to read the console input buffer, i.
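One way a question mark can end up in the pipe is a lossy conversion on the sending side. The following is a minimal sketch of that effect in Python only. It assumes the sender converts the text to an encoding that cannot represent the character (ASCII is used here purely for illustration; the snippet above does not confirm which encoding was involved), and `lossy_encode` is a hypothetical helper name.

```python
def lossy_encode(text: str, encoding: str = "ascii") -> bytes:
    """Encode text, replacing every unencodable character with b'?'."""
    # errors="replace" substitutes a question mark instead of raising
    # UnicodeEncodeError. ASCII is an assumption made for this sketch.
    return text.encode(encoding, errors="replace")


# "\u00a3" is the pound sign used in the report.
print(lossy_encode("\u00a3\r\n"))
```

Once the replacement has happened, the receiving process only ever sees the byte 0x3F, so no decoding choice on the Python side can recover the original character.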

[issue21927] BOM appears in stdin when using Powershell

2014-07-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The repr of a bytes object never contains non-ASCII characters, therefore Python is actually receiving a question mark from the pipe. What are the results of the following commands?
py -3 -c "import sys; sys.stdout.buffer.write(bytes(range(128, 256)))"
py -3 -c "import sys; sys.

[issue21927] BOM appears in stdin when using Powershell

2014-07-16 Thread STINNER Victor
STINNER Victor added the comment: Please use ascii() instead of repr() in your test to identify who replaces characters with question marks.
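The difference between the two built-ins can be sketched as follows. `ascii()` escapes every non-ASCII character, so its result passes through any console or pipe encoding unchanged, while `repr()` of a `str` may keep the character and leave it open to a later substitution. `safe_repr` is a hypothetical wrapper name used only for this illustration.

```python
def safe_repr(value: object) -> str:
    """Return a representation made of ASCII characters only."""
    return ascii(value)


pound = "\u00a3"  # the pound sign

# repr() keeps the non-ASCII character in the result for str objects.
assert repr(pound) == "'\u00a3'"

# ascii() escapes it, so nothing downstream can turn it into '?'.
print(safe_repr(pound))

# For bytes objects both built-ins already produce ASCII-only output.
print(safe_repr(b"\xa3"))
```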

[issue21927] BOM appears in stdin when using Powershell

2014-07-16 Thread Jason R. Coombs
Jason R. Coombs added the comment: Here I use the British pound symbol to attempt to answer that question. I've disabled the environment variable PYTHONIOENCODING and not set any code page or loaded any other Powershell profile settings.
PS C:\Users\jaraco> echo £
£
PS C:\Users\jaraco> chcp
Ac

[issue21927] BOM appears in stdin when using Powershell

2014-07-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:
> - when stdin is a pipe (ex: echo "abc"|python ...), the stdin encoding becomes cp1252 (ANSI code page) because os.device_encoding(0) returns None; cp1252 is the result of locale.getpreferredencoding(False) (ANSI code page).
> sys.stdin.readline() does
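The fallback described in the quoted text can be inspected with a short sketch like the one below. `describe_stdin` is a hypothetical helper; the values it reports depend on the platform, the Python version, and settings such as PYTHONIOENCODING or UTF-8 mode, so no particular output is implied.

```python
import locale
import os
import sys


def describe_stdin() -> dict:
    """Collect the values that influence how stdin is decoded."""
    try:
        # os.device_encoding() returns None when the descriptor is not
        # connected to a terminal, for example when stdin is a pipe.
        device = os.device_encoding(sys.stdin.fileno())
    except (AttributeError, OSError, ValueError):
        # stdin may be missing or replaced by an object without a real fd.
        device = None
    return {
        "device_encoding": device,
        "preferred_encoding": locale.getpreferredencoding(False),
        "stdin_encoding": getattr(sys.stdin, "encoding", None),
    }


print(describe_stdin())
```

Running it once interactively and once with piped input (for example `echo abc | python script.py`, where `script.py` is whatever name the sketch is saved under) shows whether the device encoding drops to `None` for the pipe.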

[issue21927] BOM appears in stdin when using Powershell

2014-07-15 Thread Jason R. Coombs
Jason R. Coombs added the comment: I agree there appears to be an inconsistency in how Powershell handles pipes between child processes and between itself and child processes. I'm not complaining about Python, but rather trying to find the best practice here. I'm currently using PYTHONIOENCOD

[issue21927] BOM appears in stdin when using Powershell

2014-07-15 Thread R. David Murray
R. David Murray added the comment: I find it amusing that the complaint is that Python isn't detecting the BOM and using the info when powershell produces it, but when python produces the BOM, it is powershell that isn't detecting it and using the information. So it looks like there's a bug h

[issue21927] BOM appears in stdin when using Powershell

2014-07-11 Thread Jason R. Coombs
Jason R. Coombs added the comment: I get different results than @haypo when testing Powershell on Windows 8.1 with Python 3.4.1:
C:\Users\jaraco> chcp 1252
Active code page: 1252
C:\Users\jaraco> $env:PYTHONIOENCODING=''
> How do you change the console encoding? Using the chcp command?
Yes. I

[issue21927] BOM appears in stdin when using Powershell

2014-07-11 Thread STINNER Victor
STINNER Victor added the comment: See also issues #1602 (Windows console) and #16587 (stdin, _setmode() and wprintf). I tried msvcrt.setmode(0, 0x4): set stdin mode to _O_U8TEXT. In this mode, echo "abc"|python -c "import sys; print(ascii(sys.stdin.read()))" displays "\xff\xfea\x00b\x00c\
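The leading `\xff\xfe` in that output is the UTF-16 little-endian byte order mark. The sketch below shows how such a byte sequence decodes; the bytes are constructed here for illustration and are not taken from the truncated output above.

```python
import codecs

# UTF-16 LE BOM followed by "abc" encoded as UTF-16 LE (illustrative data).
raw = codecs.BOM_UTF16_LE + "abc".encode("utf-16-le")
assert raw == b"\xff\xfea\x00b\x00c\x00"

# The generic "utf-16" codec reads the BOM to pick the byte order and
# removes it from the decoded text.
with_detection = raw.decode("utf-16")

# "utf-16-le" fixes the byte order up front, so the BOM is decoded as an
# ordinary character (U+FEFF) and stays in the result.
without_detection = raw.decode("utf-16-le")

print(ascii(with_detection))
print(ascii(without_detection))
```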

[issue21927] BOM appears in stdin when using Powershell

2014-07-11 Thread STINNER Victor
STINNER Victor added the comment:
> The BOM (byte order mark) appears in the standard input stream. When using cmd.exe, the BOM is not present. This behavior occurs in CP1252 as well as CP65001.
How do you change the console encoding? Using the chcp command? I'm surprised that you get a U

[issue21927] BOM appears in stdin when using Powershell

2014-07-11 Thread Jason R. Coombs
Jason R. Coombs added the comment: I've tested it and setting PYTHONIOENCODING='utf-8-sig' starts to get there. It causes Python to consume the BOM on stdin, but it also causes stdout to print a spurious non-printable character in the output:
C:\Users\jaraco> echo foo | ./print-input
foo
The
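Both halves of that behaviour follow from how the `utf-8-sig` codec works, which can be sketched without touching the real standard streams. The byte strings below are illustrative stand-ins for what arrives on stdin and what leaves on stdout.

```python
import codecs

# Input side: a UTF-8 BOM followed by the payload.
incoming = codecs.BOM_UTF8 + b"foo\n"

# "utf-8-sig" drops a leading BOM while decoding ...
decoded_sig = incoming.decode("utf-8-sig")
# ... whereas plain "utf-8" keeps it as the character U+FEFF.
decoded_plain = incoming.decode("utf-8")

# Output side: "utf-8-sig" writes a BOM when encoding, which is the extra
# non-printable character seen on stdout.
encoded_sig = "foo\n".encode("utf-8-sig")

print(ascii(decoded_sig), ascii(decoded_plain), encoded_sig)
```

Because PYTHONIOENCODING applies one encoding to stdin, stdout, and stderr together, a setup that wants the BOM removed on input but not added on output would need to configure the streams separately. On Python 3.7 and later, `sys.stdin.reconfigure(encoding="utf-8-sig")` is one option for that.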

[issue21927] BOM appears in stdin when using Powershell

2014-07-10 Thread Jason R. Coombs
Jason R. Coombs added the comment: I'm not sure what you're suggesting. Are you suggesting that Powershell is wrong here and that Powershell's attempt here to provide more detail about content encoding is wrong? Or are you suggesting that every client that reads from stdin should detect that i

[issue21927] BOM appears in stdin when using Powershell

2014-07-08 Thread Ezio Melotti
Ezio Melotti added the comment: I would argue that adding the BOM is a Powershell issue, and I'm not sure Python should do anything about it. There are probably cases where people expect the BOM to be received by python, so stripping it is probably not an option. As for detecting, it should ha

[issue21927] BOM appears in stdin when using Powershell

2014-07-06 Thread Jason R. Coombs
New submission from Jason R. Coombs: Consider this simple example in Powershell (Windows 8.1):
C:\Users\jaraco> cat .\print-input.py
import sys
print(next(sys.stdin))
C:\Users\jaraco> echo foo | .\print-input.py
foo
The BOM (byte order mark) appears in the standard input stream. When using
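The report can be approximated without PowerShell by wrapping a byte buffer the way Python wraps stdin. This is a sketch under two assumptions: that the piped data starts with a UTF-8 BOM (the snippet above does not say which BOM was observed), and that stdin is decoded by an `io.TextIOWrapper` with default newline handling. `first_line` is a hypothetical helper name.

```python
import codecs
import io


def first_line(stream_bytes: bytes, encoding: str) -> str:
    """Decode a byte stream like a text-mode stdin and return its first line."""
    reader = io.TextIOWrapper(io.BytesIO(stream_bytes), encoding=encoding)
    # Equivalent to next(sys.stdin) in the script from the report.
    return next(reader)


# Simulated pipe contents: BOM, payload, Windows line ending.
piped = codecs.BOM_UTF8 + b"foo\r\n"

# Decoded as plain UTF-8, the BOM survives as U+FEFF in front of the text.
print(ascii(first_line(piped, "utf-8")))

# Decoded as utf-8-sig, the BOM is consumed.
print(ascii(first_line(piped, "utf-8-sig")))
```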