On 19 Jan., 21:06, Gerald Britton <gerald.brit...@gmail.com> wrote: > [snip] > > > > > Yes, list building from a generator expression *is* expensive. And > > join has to do it, because it has to iterate twice over the iterable > > passed in: once for calculating the memory needed for the joined > > string, and once more to actually do the join (this is implementation > > dependent, of course). If the iterable is a list already, the list > > building is not needed. > > if join has to iterate twice over the iterable, how does this work? > > $ python3.1 > Python 3.1.1+ (r311:74480, Nov 2 2009, 14:49:22) > [GCC 4.4.1] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > > >>> l = map(str, (x for x in range(10) if int(x)%2)) > >>> '.'.join(l) > '1.3.5.7.9' > > If join had to iterate twice over l, it would be consumed on the first > pass.
Yes. (Coincidentally, l is consumed in the first execution of the Timer ()-statement, which is why I had to add the call to list. Not to mention the xrange example of Stephen. But all this is not linked to what join does internally.) > If it works as you say then join would have to copy the > iterable on the first pass, effectively turning it into a list. Yes, that's what I'm saying above in the first two lines of what you quoted. I should have added somehting like "AFAIK CPython does it that way". > Though I didn't read through it, I would suppose that join could use a > dynamic-table approach to hold the result, starting with some > guesstimate then expanding the result buffer if and when needed. Probably. But that's not what happens. Try "".join("" for x in range(10**10) and watch join eating memory. -- http://mail.python.org/mailman/listinfo/python-list