On 2005-02-18, Andy Dustman <[EMAIL PROTECTED]> wrote:
> The reason it does this is exactly why you said: It iterates over the
> sequence and gets the sum of the lengths, adds the length of n-1
> separators, and then allocates a string this size. Then it iterates
> over the list again to build up the string.

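For reference, the two-pass strategy described in the quote can be sketched in Python roughly as follows. This is only an illustration, not the interpreter's C code: the name two_pass_join is hypothetical, and modern bytes/bytearray stand in for the 8-bit str under discussion.

```python
def two_pass_join(sep, items):
    """Illustrative sketch of the two-pass strategy; not CPython's actual code."""
    # Materialize the argument so it can be walked twice, even if it is
    # a generator or other one-shot iterable.
    seq = list(items)
    if not seq:
        return b""

    # Pass 1: total size = sum of element lengths + (n - 1) separators.
    total = sum(len(s) for s in seq) + len(sep) * (len(seq) - 1)
    buf = bytearray(total)  # one allocation of exactly the final size

    # Pass 2: copy each separator and element into its final position.
    pos = 0
    for i, s in enumerate(seq):
        if i:
            buf[pos:pos + len(sep)] = sep
            pos += len(sep)
        buf[pos:pos + len(s)] = s
        pos += len(s)
    return bytes(buf)
```

Note that the list() call is what forces a generator's entire output to be held in memory alongside the result.
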
The other (and, I suspect, the real) reason for materializing the
argument is to be able to call unicode.join if it finds Unicode
elements in the sequence. If it finds such an element, unicode.join
has to be called on the entire sequence; the part already accumulated
can't be used because unicode.join wants to call PyUnicode_FromObject
on all the elements. Since it can't know whether the original argument
is reiterable, it has to keep around the materialized sequence.

> For generators, you'd have to make a trial allocation and start
> appending stuff as you go, periodically resizing. This *might* end up
> being more efficient in the case of generators, but the only way to
> know for sure is to write the code and benchmark it.

Even if it's not faster, it should use about half as much memory for
non-sequence arguments. That can be a big win if elements are being
generated on the fly (e.g., it's a generator that does something other
than just iterate over an existing sequence).
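
A minimal sketch of the append-and-resize alternative, assuming we only care about byte strings and ignore the Unicode fallback discussed above. The name incremental_join is hypothetical, and bytearray is used here as a stand-in for a manually resized C buffer.

```python
def incremental_join(sep, items):
    """Illustrative single-pass join; never materializes the input."""
    buf = bytearray()  # bytearray takes care of resizing as it grows
    first = True
    for s in items:
        if not first:
            buf += sep  # separator goes between elements only
        buf += s
        first = False
    return bytes(buf)


def numbered_lines(n):
    # Elements are produced on the fly; no list of them exists anywhere.
    for i in range(n):
        yield b"line %d" % i
```

With a generator such as numbered_lines, only the growing buffer and the current element are alive at any moment, which is where the memory saving would come from. Whether it is also faster would still have to be benchmarked.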