On Tue, Oct 22, 2019 at 8:16 AM Albert-Jan Roskam
<sjeik_ap...@hotmail.com> wrote:
>
> On 18 Oct 2019 20:36, Chris Angelico <ros...@gmail.com> wrote:
>
> > On Sat, Oct 19, 2019 at 5:29 AM Jagga Soorma <jagg...@gmail.com> wrote:
> > >
> > > Hello,
> > >
> > > I am writing my second python script and got it to work using
> > > python2.x. However, realized that I should be using python3 and it
> > > seems to fail with the following message:
> > >
> > > --
> > > Traceback (most recent call last):
> > >   File "test_script.py", line 29, in <module>
> > >     test_cmd = ("diskcmd -u " + x + " | grep -v '\*' | awk '{print $1,
> > > $3, $4, $9, $10}'" )
> > > TypeError: Can't convert 'bytes' object to str implicitly
> > > --
> > >
> > > I then run this command and save the output like this:
> > >
> > > --
> > > test_info = (subprocess.check_output( test_cmd,
> > > stderr=subprocess.STDOUT, shell=True )).splitlines()
> > > --
> > >
> > > Looks like the command output is in bytes and I can't simply wrap that
> > > around str(). Thanks in advance for your help with this.
> >
> > That's correct. The output of the command is, by default, given to you
> > in bytes.
>
> Do you happen to know why this is the default?

Because at the OS level, it's all bytes.

> And is there a reliable way to figure out the encoding? On posix, it's
> probably utf8, but on windows I usually use cp437, but knowing windows, it
> could be any codepage (you can even change it with chcp.exe)

Reliable? Nope. You can guess at what your local console would expect,
but there's no way to be certain what a program will produce. You
can't even be sure that the program will produce text - for instance,
I have quite often piped data into or out of FFMPEG, which means the
encoding isn't "UTF-8" or "Windows-1252", but is something like
"16-bit 44KHz WAV".

If you're uncertain, I would recommend attempting to decode the data
as either ASCII or UTF-8. Most of the encodings you'll come across
will be ASCII-compatible, meaning that decoding as ASCII will either
succeed and give the right result, or fail with a clear exception.
UTF-8 is designed to be similarly reliable, so you should generally be
able to assume that a successful UTF-8 decode will give you the
correct result.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
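
A minimal sketch of the decode-and-fail-loudly approach described above, for Python 3. The helper name `decode_output` is hypothetical, and the child process is only a stand-in that emits two ASCII lines in place of the original `diskcmd` pipeline; neither comes from the thread.

```python
import subprocess
import sys


def decode_output(raw: bytes) -> str:
    """Decode command output strictly, trying ASCII first, then UTF-8.

    ASCII is a subset of UTF-8, so the second attempt only matters for
    non-ASCII data. Both decodes are strict: bad bytes raise instead of
    being silently replaced.
    """
    for encoding in ("ascii", "utf-8"):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Neither decode worked: the data may use another codepage, or it
    # may not be text at all (for example, audio piped from a tool).
    raise ValueError("output is neither ASCII nor UTF-8")


# Stand-in command (an assumption for illustration): a child Python
# process that writes known bytes to stdout.
child_code = "import sys; sys.stdout.buffer.write(b'user1 10 20\\nuser2 30 40\\n')"
raw = subprocess.check_output([sys.executable, "-c", child_code],
                              stderr=subprocess.STDOUT)

# check_output() returns bytes by default; decode before mixing with str.
lines = decode_output(raw).splitlines()
for line in lines:
    print("entry: " + line)
```

If the encoding is known in advance, `subprocess.check_output()` also accepts an `encoding` argument (Python 3.6 and later) and then returns `str` directly; the explicit helper is mainly useful when you want to control what happens on a failed decode.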