Re: Python 2.x or 3.x, which is faster?

BartC Wed, 09 Mar 2016 06:07:10 -0800

On 09/03/2016 02:18, Steven D'Aprano wrote:

On Wed, 9 Mar 2016 12:28 pm, BartC wrote:

(Which wasn't as painful as I'd expected. However the next project I
have in mind is 20K lines rather than 0.7K. For that I'm looking at some
mechanical translation I think. And probably some library to wrap around
Python's i/o.)


You almost certainly don't need another wrapper around Python's I/O, making
it slower still. You need to understand what Python's I/O is doing.

Well, the original project will be using its file i/o library. So it'll use the same interface that will be reimplemented on top of Python i/o.

And input operations mainly consist of grabbing an entire file at once. Output is a little more mixed.

If you open a file in binary mode, Python will give you a stream of bytes
(ordinal values 0 through 255 inclusive). Python won't modify or change
those bytes in any way. Whatever it reads from disk, it will give to you.

If you open a file in text mode, Python 3 will give you a stream of Unicode
code points (ordinal values 0 through 0x10FFFF). Earlier versions of Python
3 may behave somewhat strangely with so-called "astral characters": I
recommend that you avoid anything below version 3.3. Unless you are
including (e.g.) Chinese or ancient Phoenician in your text file, you
probably won't care.
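
A minimal sketch of the difference between the two modes. The file name demo.txt and its contents are made up for illustration, and the encoding is passed explicitly so that the result does not depend on the platform default:

```python
import os
import tempfile

# Hypothetical sample file: name and contents are assumptions for illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "wb") as f:
    f.write("a\u20ac".encode("utf-8"))  # 'a' plus a euro sign: 4 bytes on disk

# Binary mode: the raw bytes, unmodified, each in the range 0..255.
with open(path, "rb") as f:
    raw = f.read()
byte_values = list(raw)

# Text mode: Unicode code points, decoded with the encoding named here.
with open(path, "r", encoding="utf-8") as f:
    text = f.read()
code_points = [ord(ch) for ch in text]

print(byte_values)   # [97, 226, 130, 172]
print(code_points)   # [97, 8364]
```

Both reads grab the whole file in one call; only the decoding step differs.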

I've just tried a UTF-8 file and am getting some odd results. With a file containing [three euro symbols]:


€€€

(including a 3-byte UTF-8 marker at the start), and opened in text mode, Python 3 gives me this series of bytes (i.e. the ord() of each character):

And prints the resulting string as: ï»¿â‚¬â‚¬â‚¬. Although this latter might depend on my console's code page setting. Changing it to UTF-8 however (CHCP 65001 in Windows) gives me this error when I run the program again:


----------
Fatal Python error: Py_Initialize: can't initialize sys standard streams
LookupError: unknown encoding: cp65001

This application has requested the Runtime to terminate it in an unusual way.

Please contact the application's support team for more information.
----------

(That was with 3.1; 3.4 gives the same set of characters as above, and shows the string differently, but still wrong. PyPy 3.2.4 gives a different set of byte values, all 0..255, and a different string again, although it now contains some actual € characters.)
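
The printed string ï»¿â‚¬â‚¬â‚¬ is what you get when the UTF-8 bytes of such a file are decoded as cp1252. The sketch below reproduces that in memory; it assumes (this is not stated in the post) that cp1252 was the default text encoding on the machine in question, and it uses the standard "utf-8-sig" codec to show a decode that also drops the leading marker:

```python
import codecs

# Bytes of the file described above: a UTF-8 BOM followed by three euro signs.
data = codecs.BOM_UTF8 + "\u20ac\u20ac\u20ac".encode("utf-8")

# Assumption: the platform default text encoding was cp1252.
wrong = data.decode("cp1252")
wrong_ords = [ord(ch) for ch in wrong]

# "utf-8-sig" decodes UTF-8 and skips a leading BOM if one is present.
right = data.decode("utf-8-sig")

print(wrong_ords[:6])                  # [239, 187, 191, 226, 8218, 172]
print(right == "\u20ac\u20ac\u20ac")   # True
```

The same codec name can be passed when opening a file, e.g. open(name, encoding="utf-8-sig"), so the result no longer depends on the console or system code page.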

So I think I'll skip Unicode handling to start off with! (I've already had plenty of fun and games with it in the past.)


--
Bartc



--
https://mail.python.org/mailman/listinfo/python-list
