Well I tried the NumPy array thing that I was talking about, to parallelise the problem, and there were some difficulties with it. Firstly, the pruning makes a really big difference to the speed, and you don't get that if you're trying to parallelise the problem because what is an equivalent calculation for one set of numbers is obviously not for another set. I think this problem disappears when you consider very large sets of numbers though (where the cost of doing equivalent computations vanishes in comparison to the alternative cost of starting up the whole recursive computation from scratch many times). The second problem is that you can't weed out division by zero and intermediate fractions. I haven't looked at the internals of how NumPy deals with these though, so this might be fixable or it might not.
For my part, I'd consider the following equivalences to be right for defining equivalent expressions (where here a, b and c are any subexpression, and 1 and 0 means any subexpression that evaluates to 1 and 0). Commutativity: a*b <-> b*a a+b <-> b+a Associativity: (a+b)+c <-> a+(b+c) (a+b)-c <-> a+(b-c) (a-b)+c <-> a-(b-c) (a-b)-c <-> a-(b+c) (a*b)*c <-> a*(b*c) (a*b)/c <-> a*(b/c) (a/b)*c <-> a/(b/c) (a/b)/c <-> a/(b*c) Units (1 is multiplicative unit, 0 is additive unit): a*1 <-> a a/1 <-> a a+0 <-> a a-0 <-> a Substitution (swapping equal starting numbers is equivalent): expr(a,b,c,...,z) <-> expr(s(a,b,c,...,z)) where a,b,c,...,z are the original numbers given and s is a permutation of (a,b,c...,z) so that (a,b,c,...z) evaluates to the same thing as s(a,b,c,...,z) or equivalently, expr1 <-> expr2 if str(expr1)==str(expr2) Then, any two expressions which can be transformed into one another by the equivalences above are equivalent. Commutativity and units can be easily implemented as you go and most of the programs suggested so far do this, by for example only allowing a*b or a+b if a>=b, and not allowing a*1 or 1*a, etc. Substitution can be implemented by just taking set(map(str,expressions)) at the end. The associativity ones are a little more tricky to implement, but Arnaud's idea of the fully ordered binary tree seems to do it. Another way of saying it is that any subexpression consisting only of + and - operations should be reduced to a+b+c+...-z-y-x-.... where a>b>c>... and z>y>x>... (and similarly for an expression involving only * and /). Terry, I'd also be interested in a copy of your stack simulation code, btw. -- http://mail.python.org/mailman/listinfo/python-list