On Tue, Oct 22, 2019 at 8:16 AM Albert-Jan Roskam
<sjeik_ap...@hotmail.com> wrote:
> On 18 Oct 2019 20:36, Chris Angelico <ros...@gmail.com> wrote:
> On Sat, Oct 19, 2019 at 5:29 AM Jagga Soorma <jagg...@gmail.com> wrote:
> >
> > Hello,
> >
> > I am writing my second python script and got it to work using
> > python2.x.  However, realized that I should be using python3 and it
> > seems to fail with the following message:
> >
> > --
> > Traceback (most recent call last):
> >   File "test_script.py", line 29, in <module>
> >     test_cmd = ("diskcmd -u " + x + " | grep -v '\*' | awk '{print $1,
> > $3, $4, $9, $10}'" )
> > TypeError: Can't convert 'bytes' object to str implicitly
> > --
> >
> > I then run this command and save the output like this:
> >
> > --
> > test_info = (subprocess.check_output( test_cmd,
> > stderr=subprocess.STDOUT, shell=True )).splitlines()
> > --
> >
> > Looks like the command output is in bytes and I can't simply wrap that
> > around str().  Thanks in advance for your help with this.
> >
> > That's correct. The output of the command is, by default, given to you
> > in bytes.
>
> Do you happen to know why this is the default?

Because at the OS level, it's all bytes.

> And is there a reliable way to figure out the encoding? On POSIX, it's
> probably UTF-8. On Windows I usually use cp437, but knowing Windows, it
> could be any code page (you can even change it with chcp.exe).

Reliable? Nope. You can guess at what your local console would expect,
but there's no way to be certain what a program will produce. You
can't even be sure that the program will produce text - for instance,
I have quite often piped data into or out of FFmpeg, which means the
encoding isn't "UTF-8" or "Windows-1252", but is something like
"16-bit 44 kHz WAV".
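
For the "guess" part, here is a minimal sketch of asking Python what
encoding the local environment prefers. The helper name
guess_console_encoding is made up for illustration. The result only
describes your own locale settings; it says nothing about what a
particular child program will actually write.

```python
import codecs
import locale


def guess_console_encoding():
    """Return the locale's preferred encoding as a best-effort guess.

    Hypothetical helper: this reflects local settings only, not what
    any given subprocess really emits.
    """
    # False means "do not call setlocale()"; just report current settings.
    name = locale.getpreferredencoding(False)
    try:
        # Normalise aliases to Python's canonical codec name.
        return codecs.lookup(name).name
    except LookupError:
        # Python has no codec under this name; return it unchanged.
        return name


print(guess_console_encoding())
```

Treat the returned name as a starting point for decoding, not as a
guarantee.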

If you're uncertain, I would recommend attempting to decode the data
as either ASCII or UTF-8. Most of the encodings you'll come across
will be ASCII-compatible, meaning that decoding as ASCII will either
succeed and give the right result, or fail with a clear exception.
UTF-8 is designed to be similarly reliable, so you should generally be
able to assume that a successful UTF-8 decode will give you the
correct result.
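
As a rough sketch of that approach (the function name decode_output is
an invented one, and the sample byte strings are only illustrative),
you could try the strict decodes in order and refuse to guess beyond
that:

```python
def decode_output(data):
    """Try ASCII, then UTF-8; return (text, encoding_used).

    Hypothetical helper. Both decodes are strict, so a wrong guess
    raises instead of silently producing garbage.
    """
    for encoding in ("ascii", "utf-8"):
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Neither worked: the data may be binary or in a legacy code page.
    raise ValueError("output is neither valid ASCII nor valid UTF-8")


# Illustrative inputs standing in for subprocess output.
print(decode_output(b"plain text"))
print(decode_output("caf\u00e9".encode("utf-8")))
```

The bytes returned by subprocess.check_output() can be passed straight
to a function like this. If it raises, that is the point at which you
have to know (or be told) what the program really produces.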

