Eryk Sun <eryk...@gmail.com> added the comment:

I'm closing this as a third-party issue with older versions of PowerShell. 
Newer versions of PowerShell set the output encoding to UTF-8 without a BOM 
preamble. For example:

    PS C:\> $PSVersionTable.PSVersion

    Major  Minor  Patch  PreReleaseLabel BuildLabel
    -----  -----  -----  --------------- ----------
    7      0      3

    PS C:\> $OutputEncoding.EncodingName
    Unicode (UTF-8)

    PS C:\> echo ¡¢£¤¥ | py -3 -X utf8 -c "print(ascii(input()))"
    '\xa1\xa2\xa3\xa4\xa5'

It's still possible to manually set the output encoding to include a BOM 
preamble. For example:

    PS C:\> $OutputEncoding = [System.Text.Encoding]::UTF8
    PS C:\> $OutputEncoding.GetPreamble()
    239
    187
    191
    PS C:\> echo ¡¢£¤¥ | py -3 -X utf8 -c "print(ascii(input()))"
    '\ufeff\xa1\xa2\xa3\xa4\xa5'
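
As a rough cross-check of the output above, the three preamble bytes can be compared against the UTF-8 BOM constant in Python's codecs module. This is only a sketch: the names `preamble`, `piped`, and `shown` are illustrative, and the byte string merely imitates what PowerShell would write to the pipe (without the trailing newline).

```python
import codecs

# The three values printed by $OutputEncoding.GetPreamble().
preamble = bytes([239, 187, 191])

# They match the UTF-8 encoding of U+FEFF, i.e. the UTF-8 BOM.
is_utf8_bom = preamble == codecs.BOM_UTF8

# Imitate the piped data: preamble followed by the UTF-8 bytes of the
# five characters U+00A1..U+00A5 (written as escapes to stay ASCII-safe).
piped = preamble + "\xa1\xa2\xa3\xa4\xa5".encode("utf-8")

# Decoding as plain "utf-8" keeps the BOM as a leading U+FEFF character.
shown = ascii(piped.decode("utf-8"))
print(is_utf8_bom, shown)
```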

I don't know what would be appropriate for Python's I/O stack in terms of 
detecting and handling a UTF-8 preamble on any type of file (console/terminal, 
pipe, disk). One option is to use the "utf-8-sig" encoding instead of "utf-8"; 
the alternative is to keep letting scripts detect and handle an initial BOM 
character (U+FEFF) however they see fit. That discussion needs a new issue if 
people are interested in supporting new behavior.
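
To make the two alternatives concrete, here is a minimal sketch of each, run against an in-memory stream rather than a real pipe. It is not a proposal for the I/O stack: `strip_bom` is a hypothetical helper, and `raw` only stands in for BOM-prefixed piped input.

```python
import codecs
import io

def strip_bom(text):
    """Drop one leading U+FEFF, if present (hypothetical helper)."""
    return text[1:] if text.startswith("\ufeff") else text

# Stand-in for piped input that starts with a UTF-8 preamble.
raw = codecs.BOM_UTF8 + "\xa1\xa2\xa3\xa4\xa5\n".encode("utf-8")

# Option 1: decode with "utf-8-sig", which consumes an initial BOM.
sig = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8-sig").readline()

# Option 2: decode with plain "utf-8" and let the script strip the BOM.
plain = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8").readline()
handled = strip_bom(plain)

print(ascii(sig), ascii(plain), ascii(handled))
```

In a real script, the first option would presumably amount to calling sys.stdin.reconfigure(encoding="utf-8-sig") (Python 3.7+) before reading any input; the second leaves stdin alone and applies something like strip_bom to the first line read.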

----------
resolution:  -> third party
stage:  -> resolved
status: open -> closed

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue21927>
_______________________________________