On Sun, Dec 17, 2017 at 12:22:20PM +0100, Xen wrote:
> Neal McBurnett wrote on 16-12-2017 18:16:
> > For more on the rationale for changes related to iterators see
> > http://portingguide.readthedocs.io/en/latest/iterators.html
>
> That entire rationale is only explained with one word "memory
> consumption".
>
> So now you are changing the design of your _grammar_ just so that the
> resulting code will use less memory.
>
> That is the job of the compiler, not the developer.
I don't think the document above does a particularly good job of explaining it, and I think you've fundamentally misunderstood things, perhaps by extrapolating too much from toy examples.

zip() takes iterables as its inputs; concrete lists are only one kind of iterable. Iteration constructs are very widespread in non-trivial Python code, and it's common to use iterators to express constructions where you can cheaply extract a few elements but it would be expensive to extract them all.

For example, I spend most of my time working on database-backed web applications, which is a very popular application for Python. In that context, it's commonplace to make database queries via functions that return iterators and lazily load results. You then iterate over these to build a page of results (which can use things like LIMIT and OFFSET when compiling the SQL queries), and you render and return that. If you accidentally call something that consumes the whole input iterable in the process, then it's going to do a *lot* of database traffic for some queries, and it doesn't take much of that to utterly destroy the performance of your application. (There's a small sketch of this below.)

This is not something that the compiler can optimise, because the *contract* of zip et al in Python 2 was that they would consume their entire inputs (up to the shortest one in the case of zip, anyway); iteration is visible to the program and can have external side-effects, so it's not something that can be quietly optimised out given the design of the language. Talking about memory consumption of the result is relevant in some cases, sure, but it's certainly not the whole story; what often matters is the work involved in materialising the whole iterable, and that can be very significant indeed.

In Python 2, there were many functions that took iterables as input and returned concrete lists, consuming their entire inputs in the process. In most cases there were versions of these that operated in a lazy fashion and returned iterables instead, but they were generally hidden off in the itertools module and less obvious compared to the built-in versions. Effectively, the language did the wrong thing by default.

Python 3 turns these around to give preference to the versions that take iterables as input and return iterables as output, and says that if you want a list then you have to use list() or similar to get one. This reduces the cognitive load of the language: instead of remembering the different names for the list-returning and iterable-returning versions of various things, you only have to remember one version and the general rule that you use list() to materialise a whole iterable into a list (which was already useful for other things even in Python 2). It makes the language simpler to learn, because there are fewer rules and they compose well; and it makes it easier to do what's usually the right thing. This comes at the cost of a bit of porting effort for some code that started out in Python 2, of which there'll be less and less as time goes on.

To put it another way: "don't perform operations on collections of unbounded size" is pretty much the number one rule for webapps that I've picked up over the last few years, and Python 3 takes this lesson and applies it to the core language.
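To make the web-application point concrete, here's a minimal sketch. fetch_rows() is something I've invented for illustration, not a real database API; it just simulates a lazily-evaluated query result where each row costs a round trip:

    import itertools

    def fetch_rows():
        """Simulate a lazy database cursor (invented for this sketch)."""
        for i in range(1000000):
            print("fetching row", i)   # stands in for real database traffic
            yield i

    # Python 3: zip() is lazy, so building a ten-row page only pulls ten
    # rows from each "query".
    page = list(itertools.islice(zip(fetch_rows(), fetch_rows()), 10))

    # Under Python 2 semantics, zip() would have consumed both inputs in
    # full before islice() saw anything: two million "round trips" to
    # build a ten-row page.

Run under Python 3 this prints twenty "fetching row" lines; with Python 2's eager zip() it would have printed two million.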
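And here's the renaming in miniature (the executable lines are Python 3; the Python 2 names are shown in comments):

    xs, ys = [1, 2, 3], [4, 5, 6]

    # Python 2: two names for each operation, eager by default:
    #   zip(xs, ys)   -> list     itertools.izip(xs, ys)  -> iterator
    #   map(f, xs)    -> list     itertools.imap(f, xs)   -> iterator
    #   range(n)      -> list     xrange(n)               -> lazy sequence
    #   d.items()     -> list     d.iteritems()           -> iterator

    # Python 3: one name each, lazy by default; ask for a list explicitly.
    pairs = zip(xs, ys)            # an iterator; nothing consumed yet
    first = next(pairs)            # pulls one element from each input
    all_pairs = list(zip(xs, ys))  # the old Python 2 behaviour, spelled out

Note that code that only iterates, such as "for a, b in zip(xs, ys):" or a comprehension, runs unchanged on both versions; that's the common case discussed next.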
in" or a comprehension, and in those common cases the programmer doesn't have to change anything at all. In cases where they do need to change something, it has the useful effect of highlighting that something a little unusual may be going on, rather than hiding behaviour that's potentially catastrophic at scale behind an innocuous-looking built-in function. > Meanwhile Python 3.4 can be excessively slower than 2.7. SO WHERE'S THE > GAIN? It will no doubt depend on the benchmark, and rather than cherry-picking a single one it's likely more interesting to look at either a wide range of benchmarks, or at the specific application in question. Counterpoint, which also links to much more data: https://lwn.net/Articles/725114/ -- Colin Watson [cjwat...@ubuntu.com] -- Ubuntu-devel-discuss mailing list Ubuntu-devel-discuss@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel-discuss