Hi Matt, I'm in no position to comment on your wider point, but...
On 9 May 2013 19:59, Matt Newell <newe...@blur.com> wrote:

> On Monday, May 06, 2013 07:49:25 AM Phil Thompson wrote:
> > The first PyQt5 snapshots are now available. You will need the current
> > SIP snapshot. PyQt5 can be installed alongside PyQt4.
> >
> > I welcome any suggestions for additional changes - as PyQt5 is not
> > intended to be compatible with PyQt4 it is an opportunity to fix and
> > improve things.
> >
> > Current changes from PyQt4:
> >
> > - Versions of Python earlier than v2.6 are not supported.
> >
> > - PyQt4 supported a number of different API versions (QString, QVariant
> >   etc.). PyQt5 only implements v2 of those APIs for all versions of Python.
> >
>
> I haven't looked into this deeper but I am a bit worried about the possible
> performance impacts of QString always being converted to a python
> str/unicode. (Not to mention the added porting work when going
> c++ <-> python).
>
> The vast majority of the PyQt code that we use loads data from libraries
> that deal with Qt types, and either directly loads that data into widgets,
> or does some processing then loads the data into widgets. I suspect that
> this kind of usage is very common.
>
> As an example a user of QtSql with the qsqlpsql driver that loads data and
> displays it in a list view is going to see the following data
> transformations/copies:
>
> PyQt4 with v1 QString api:
>
>   libpq data comes from socket
>     -> QString (probable utf8->utf16)
>     -> PyQt wrapper of QString (actual data not copied or converted)
>     -> QString (pointer dereference to get Qt type)
>
> PyQt5, PyQt4 with v2 QString api:
>
>   libpq data comes from socket
>     -> QString (probable utf8->utf16)
>     -> unicode (deep copy of data)
>     -> QString (deep copy of data)
>
> So instead of one conversion we now have one conversion and two deep
> copies.
>
> Another very probable side-effect is that in many cases either the original
> QString and/or the unicode object will be held in memory, resulting in two
> or possibly even three copies of the data. Even if all but the last stage
> is freed, there will still be 2 or 3 copies in memory during processing
> depending on how the code is written, which can reduce performance quite a
> bit depending on data size because of cpu cache flushing.
>
> So far this is completely theoretical, and I'm sure in a large portion of
> applications will have no noticeable effect, however I don't like the idea
> that things may get permanently less efficient for apps that do process and
> display larger data sets.
>
> The one thing that stands out to me as possibly being a saving grace is the
> fact that (at least in my understanding) both Qt and python use utf16 as
> their internal string format, which means fast copies instead of slower
> conversions, and that it may be possible with some future Qt/python changes
> to actually allow QString -> unicode -> QString without any data copies.
>

FWIW, the Python-uses-roughly-utf16 meme is a common oversimplification.
First, as I'm sure most people know, there are significant changes between
Python2 str/unicode and Python3 str, and those are inevitably reflected in
how the CPython API is used on either side of the 2/3 boundary. What is less
well known is that CPython changed significantly between 3.2 and 3.3: the
latter can store a str as an array of 8, 16 or 32 bit values, with automatic
run-time conversions between them (and API changes to match). So whatever
else happens within PyQt, I don't think the aspiration to the old 1-copy
model can be relied on.

In the event, this is what I came up with for the QString to str direction
(corrections/optimisations welcome!):

    PyObject *Python::unicode(const QString &string)
    {
    #if PY_MAJOR_VERSION < 3
        /* Python 2.x. http://docs.python.org/2/c-api/unicode.html */
        PyObject *s = PyString_FromString(PQ(string));
        if (!s)
            return NULL;    /* Don't Py_DECREF a failed allocation. */
        PyObject *u = PyUnicode_FromEncodedObject(s, "utf-8", "strict");
        Py_DECREF(s);
        return u;
    #elif PY_MINOR_VERSION < 3
        /* Python 3.2 or less. http://docs.python.org/3.2/c-api/unicode.html#unicode-objects */
    #ifdef Py_UNICODE_WIDE
        /* Py_UNICODE is 32 bits wide, so decode from UTF-16. */
        return PyUnicode_DecodeUTF16((const char *)string.constData(), string.length() * 2, 0, 0);
    #else
        /* Py_UNICODE is 16 bits wide; QChar * needs an explicit cast. */
        return PyUnicode_FromUnicode((const Py_UNICODE *)string.constData(), string.length());
    #endif
    #else
        /* Python 3.3 or greater. http://docs.python.org/3.3/c-api/unicode.html#unicode-objects */
        return PyUnicode_FromKindAndData(PyUnicode_2BYTE_KIND, string.constData(), string.length());
    #endif
    }

The referenced URLs contain more material.

Hth, Shaheed

> At some point I will try to do some benchmarks and look into the actual
> code to see if there is an elegant solution to this potential problem.
>
> kMatt
> _______________________________________________
> PyQt mailing list PyQt@riverbankcomputing.com
> http://www.riverbankcomputing.com/mailman/listinfo/pyqt
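
P.S. To make the 3.3 storage point a bit more concrete, here is a rough,
standalone C++ sketch (no Python or Qt headers needed) of the
width-by-content idea as I understand it: scan the UTF-16 data for its
largest code point, then pick 1, 2 or 4 bytes per character. The helper
names maxCodePoint and bytesPerChar are made up for illustration and are not
part of CPython, SIP or PyQt, and the real CPython logic has more cases
(ASCII-only strings get their own compact form, for instance), so treat this
as a simplification rather than a description of the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: largest code point in a UTF-16 buffer. Well-formed
// surrogate pairs are combined; a lone surrogate is counted as itself.
char32_t maxCodePoint(const std::u16string &utf16)
{
    char32_t maxCp = 0;
    for (std::size_t i = 0; i < utf16.size(); ++i) {
        char32_t cp = utf16[i];
        const bool isHigh = cp >= 0xD800 && cp <= 0xDBFF;
        if (isHigh && i + 1 < utf16.size()) {
            const char32_t low = utf16[i + 1];
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i; // consume the low surrogate as well
            }
        }
        if (cp > maxCp)
            maxCp = cp;
    }
    return maxCp;
}

// Hypothetical helper: bytes per character that a width-by-content scheme
// would need for a string whose largest code point is maxCp.
int bytesPerChar(char32_t maxCp)
{
    assert(maxCp <= 0x10FFFF); // outside the Unicode range is a caller bug
    if (maxCp < 0x100)
        return 1; // fits in 8 bits
    if (maxCp < 0x10000)
        return 2; // fits in 16 bits
    return 4;     // needs the full 32 bits
}
```

With these, bytesPerChar(maxCodePoint(u"hello")) gives 1, a string
containing U+20AC gives 2, and anything outside the BMP gives 4. So a
QString's 16-bit buffer only lines up with the Python-side layout in the
middle case, which is why I would not count on a plain memory copy.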