On 6/18/2010 3:57 PM, Pierre Reinbold wrote:
Hi all,

This is my first post on the list. I'm mainly a sysadmin and no expert
in programming languages, so this may be a stupid question

no

but it puzzles me.

I was pondering on the documentation of the function product(*args,
**kwds) in the itertools module. It is said that:

This function is equivalent to the following code, except that the
actual implementation does not build up intermediate results in memory:

I suspect the actual algorithm is an iterative algorithm that simulates nested for-loops to produce one item at a time, much like the working nested generator version I give below.

def product(*args, **kwds):
     # product('ABCD', 'xy') -->  Ax Ay Bx By Cx Cy Dx Dy
     # product(range(2), repeat=3) -->  000 001 010 011 100 101 110 111
     pools = map(tuple, args) * kwds.get('repeat', 1)
     result = [[]]
     for pool in pools:
         result = [x+[y] for x in result for y in pool]

This runs iter(result) to completion *before* rebinding the result to 'result'.

     for prod in result:
         yield tuple(prod)

Ok, then what about using a generator expression instead of a list
comprehension to avoid generating the intermediate results in memory?

Letting the **kwds trick aside, it may give:

def genexp_product(*args):
     pools = map(tuple, args)
     result = [[]]
     for pool in pools:
         result = (x+[y] for x in result for y in pool)

The name binding in the first for-clause in a genexp is handled slightly differently from that in subsequent for-clauses. I suspect that this is relevant here. This code rebinds 'result' to a new generator *without* running running the previously bound 'result' generator.

     for prod in result:
         yield tuple(prod)

This runs N nested generators, but which?

but this do not work as expected:

print list(product("ABC", "xy"))
[('A', 'x'), ('A', 'y'), ('B', 'x'), ('B', 'y'), ('C', 'x'), ('C', 'y')]
print list(genexp_product("ABC", "xy"))
[('x', 'x'), ('x', 'y'), ('y', 'x'), ('y', 'y')]

My question is why ? What is the final generator "result" at the end of
the main loop in genexp_product ? How is it build exactly ? Is this an
effet of some sort of "lazy evaluation" ?

That and/or a namespace issue.

Let's apply Reedy's Rule: when you have trouble understanding a function expression, replace it with the (near) equivalent def statement. (Among other advantages, one can insert print calls!)

Genexps, like lambdas, are specialized function expressions.

def augment(s_of_s, s):
     for x in s_of_s:
          for y in s:
               yield x+[y]

def gen_product(*args):
    pools = map(tuple, args)
    result = [[]]
    for pool in pools:
        result = augment(result,pool)
    for prod in result:
        yield tuple(prod)

print(list(gen_product("ABC", "xy")))

>>> #3.1
[('A', 'x'), ('A', 'y'), ('B', 'x'), ('B', 'y'), ('C', 'x'), ('C', 'y')]

Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to