On Thu, Jan 29, 2015 at 02:42:31AM -0500, Robert Simmons wrote:
> On Thu, Jan 29, 2015 at 2:29 AM, Roland Smith <rsm...@xs4all.nl> wrote:
> > On Thu, Jan 29, 2015 at 01:38:21AM -0500, Robert Simmons wrote:
> >> I'm having a unicode problem on FreeBSD lang/python34 that does not
> >> appear on MacOS X. I've condensed the problem to one single line to
> >> enter in the interpreter:
> >>
> >> FreeBSD:
> >> Python 3.4.2 (default, Jan 28 2015, 22:23:57)
> >> [GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final
> >> 208032)] on freebsd10
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> >>> b'\xc3\xa2'.decode('utf-8')
> >> '\xe2'
> >>
> >> MacOS X:
> >> Python 3.4.2 (default, Oct 19 2014, 17:55:38)
> >> [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.54)] on darwin
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> >>> b'\xc3\xa2'.decode('utf-8')
> >> 'â'
> >>
> >> Why is Python on FreeBSD incorrectly decoding this?
> >
> > Works fine here (FreeBSD 10.1-STABLE #0 r276653 amd64):
> >
> > Python 3.4.2 (default, Nov 4 2014, 19:34:48)
> > [GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final
> > 208032)] on freebsd10
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> b'\xc3\xa2'.decode('utf-8')
> > 'â'
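
The two sessions quoted above can be compared without relying on what the
terminal is able to draw. The snippet below is only a sketch. It assumes
(nothing in this thread confirms it yet) that the FreeBSD interpreter decoded
the bytes correctly and merely displayed the result with an escape because its
stdout encoding cannot represent the character. The names `decoded` and
`shown` are illustrative.

```python
import unicodedata

# The same bytes as in the quoted sessions.
decoded = b'\xc3\xa2'.decode('utf-8')

# The result is a single code point, U+00E2, whatever the terminal can show.
print(len(decoded), hex(ord(decoded)))
print(unicodedata.name(decoded))

# ascii() escapes every non-ASCII character in the repr.
print(ascii(decoded))

# Roughly what the interactive prompt falls back to when repr(value) cannot
# be encoded in sys.stdout.encoding: it re-encodes with 'backslashreplace'.
# ASCII is used here as a stand-in for a non-UTF-8 stdout encoding.
shown = repr(decoded).encode('ascii', 'backslashreplace').decode('ascii')
print(shown)
```

If the first two lines of output are the same on both machines, the decoding
itself is fine and the difference is only in how the interpreter displays the
string.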

(please don't top-post)

> What is the output from print(sys.stdout.encoding) on your system?

Python 3.4.2 (default, Nov 4 2014, 19:34:48)
[GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final
208032)] on freebsd10
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> print(sys.stdout.encoding)
UTF-8

> And, can you explain how to change that on mine so that it is UTF-8?
> Mine is a default fresh install, btw.

In /etc/login.conf, I set LC_ALL=en_US.UTF-8 in the setenv entry of the
default class:

default:\
        :passwd_format=sha512:\
        :copyright=/etc/COPYRIGHT:\
        :welcome=/etc/motd:\
        :setenv=MAIL=/var/mail/$,BLOCKSIZE=K,LC_ALL=en_US.UTF-8:\
        :path=/sbin /bin /usr/sbin /usr/bin /usr/games /usr/local/sbin /usr/local/bin

After editing the file, rebuild the capability database with
"cap_mkdb /etc/login.conf" and log in again for the change to take effect.

I also use a Unicode-aware X terminal (rxvt-unicode). In case you're not using
X11, the new vt(4) device uses UTF-8, but the old sc(4) doesn't support it at
all, AFAIK.

Roland
-- 
R.F.Smith                                   http://rsmith.home.xs4all.nl/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 5753 3324 1661 B0FE 8D93 FCED 40F6 D5DC A38A 33E0 (keyID: A38A33E0)