On Aug 21, 10:27 am, Andreas Löscher <andreas.loesc...@s2005.tu-chemnitz.de> wrote:
> from Python/ceval.c:
>
> 1316 case BINARY_ADD:
> 1317     w = POP();
> 1318     v = TOP();
> 1319     if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
> 1320         /* INLINE: int + int */
> 1321         register long a, b, i;
> 1322         a = PyInt_AS_LONG(v);
> 1323         b = PyInt_AS_LONG(w);
> 1324         /* cast to avoid undefined behaviour
> 1325            on overflow */
> 1326         i = (long)((unsigned long)a + b);
> 1327         if ((i^a) < 0 && (i^b) < 0)
> 1328             goto slow_add;
> 1329         x = PyInt_FromLong(i);
> 1330     }
> 1331     else if (PyString_CheckExact(v) &&
> 1332              PyString_CheckExact(w)) {
> 1333         x = string_concatenate(v, w, f, next_instr);
> 1334         /* string_concatenate consumed the ref to v */
> 1335         goto skip_decref_vx;
> 1336     }
> 1337     else {
> 1338       slow_add:
> 1339         x = PyNumber_Add(v, w);
> 1340     }
> 1341     Py_DECREF(v);
> 1342   skip_decref_vx:
> 1343     Py_DECREF(w);
> 1344     SET_TOP(x);
> 1345     if (x != NULL) continue;
> 1346     break;
>
> 1532 case INPLACE_ADD:
> 1533     w = POP();
> 1534     v = TOP();
> 1535     if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
> 1536         /* INLINE: int + int */
> 1537         register long a, b, i;
> 1538         a = PyInt_AS_LONG(v);
> 1539         b = PyInt_AS_LONG(w);
> 1540         i = a + b;
> 1541         if ((i^a) < 0 && (i^b) < 0)
> 1542             goto slow_iadd;
> 1543         x = PyInt_FromLong(i);
> 1544     }
> 1545     else if (PyString_CheckExact(v) &&
> 1546              PyString_CheckExact(w)) {
> 1547         x = string_concatenate(v, w, f, next_instr);
> 1548         /* string_concatenate consumed the ref to v */
> 1549         goto skip_decref_v;
> 1550     }
> 1551     else {
> 1552       slow_iadd:
> 1553         x = PyNumber_InPlaceAdd(v, w);
> 1554     }
> 1555     Py_DECREF(v);
> 1556   skip_decref_v:
> 1557     Py_DECREF(w);
> 1558     SET_TOP(x);
> 1559     if (x != NULL) continue;
> 1560     break;
>
> As for using integers, the first case (lines 1319 and 1535) is true in both, and there is no difference in the code.
> However, Python uses a huge switch-case construct to execute its opcodes, and INPLACE_ADD comes after BINARY_ADD, hence the difference in speed.
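The quoted claim is easy to check yourself with a timeit micro-benchmark; this is just a sketch, and whether any gap shows up at all depends on the interpreter version and build:

```python
import timeit

# Compare BINARY_ADD (x = x + 1) against INPLACE_ADD (x += 1) on ints.
binary = timeit.timeit("x = x + 1", setup="x = 0", number=1_000_000)
inplace = timeit.timeit("x += 1", setup="x = 0", number=1_000_000)
print(f"x = x + 1: {binary:.3f}s")
print(f"x += 1:    {inplace:.3f}s")
```

On many builds the two numbers are within noise of each other, which matches the "nothing you should consider when writing fast code" conclusion below.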
That fragment of ceval.c is from a 2.x version. Python 2.x supports both a PyInt and a PyLong type, and the ceval loop optimizes only the PyInt case. On my system, I could not measure a difference between binary and in-place addition. Python 3.x behaves differently:

    TARGET(BINARY_ADD)
        w = POP();
        v = TOP();
        if (PyUnicode_CheckExact(v) && PyUnicode_CheckExact(w)) {
            x = unicode_concatenate(v, w, f, next_instr);
            /* unicode_concatenate consumed the ref to v */
            goto skip_decref_vx;
        }
        else {
            x = PyNumber_Add(v, w);
        }
        Py_DECREF(v);
    skip_decref_vx:
        Py_DECREF(w);
        SET_TOP(x);
        if (x != NULL) DISPATCH();
        break;

    TARGET(INPLACE_ADD)
        w = POP();
        v = TOP();
        if (PyUnicode_CheckExact(v) && PyUnicode_CheckExact(w)) {
            x = unicode_concatenate(v, w, f, next_instr);
            /* unicode_concatenate consumed the ref to v */
            goto skip_decref_v;
        }
        else {
            x = PyNumber_InPlaceAdd(v, w);
        }
        Py_DECREF(v);
    skip_decref_v:
        Py_DECREF(w);
        SET_TOP(x);
        if (x != NULL) DISPATCH();
        break;

ceval just calls PyNumber_Add or PyNumber_InPlaceAdd. If you look at the code for PyNumber_InPlaceAdd (in abstract.c), it calls an internal function, binary_iop1, with pointers to nb_inplace_add and nb_add. binary_iop1 then checks whether nb_inplace_add exists. The PyLong type does not implement nb_inplace_add, so the check fails and binary_iop1 uses nb_add instead.

In recent versions of gmpy and gmpy2, I implemented the nb_inplace_add function, and performance (for the gmpy.mpz type) is much better for in-place addition.

For the adventuresome, gmpy2 implements a mutable integer type called xmpz. It isn't much faster until the values are so large that the memory-copy times become significant. (Some old gmpy documentation implies that operations with mutable integers should be much faster. With aggressive caching of deleted objects, the object-creation overhead is very low, so the big win for mutable integers is reduced to avoiding memory copies.)

casevh

> To be clear, this is nothing you should consider when writing fast code.
> Complexity-wise they both are the same.

--
http://mail.python.org/mailman/listinfo/python-list
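P.S. The nb_inplace_add fallback described above can also be observed from pure Python: int (PyLong) defines no __iadd__, so += falls back to __add__ and rebinds the name to a new object, while list does define __iadd__ and mutates in place. A minimal sketch:

```python
# int has no nb_inplace_add (no __iadd__ at the Python level), so
# += falls back to __add__ and rebinds the name to a new object.
x = 10**100                 # a PyLong well past any small-int cache
before = id(x)
x += 1
assert id(x) != before      # int: a brand-new object was created

# list implements in-place concatenation, so += mutates the object.
lst = [1, 2]
before = id(lst)
lst += [3]
assert id(lst) == before    # list: same object, modified in place

print(hasattr(int, '__iadd__'), hasattr(list, '__iadd__'))  # False True
```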