Recursive generators and backtracking search
I've been using generators to implement backtracking search for a while now. Unfortunately, my code is large and complex enough (it does unification on math expressions) that it's hard to post a simple example. So I decided to look for a simpler problem that could be used to demonstrate the technique I am talking about.

I noticed that PEP 255 (Simple Generators) refers to an implementation of the "8 Queens" problem in the lib/test directory. Looking at the code, I see that while it does use generators, it doesn't use them recursively. As an alternative, I'd like to present the following implementation. If you compare this one with the one in lib/test/test_generator.py, you will agree (I hope) that by using recursive generators to implement backtracking, the resulting code is a little more straightforward and intuitive:

    # Example of using generators to implement backtracking search.
    # The example below is an implementation of the classic "N queens"
    # problem (place N queens on an N x N chessboard so that none are
    # threatened by the others.)
    #
    # Board representation: Since no two queens can be on the same
    # row, the board is represented as a tuple of length N, where
    # each element is the column occupied by the queen on that row.

    def queens( bsize ):

        # Function to test if a queen is threatened by any previously
        # placed queen.
        def threaten( qarray, newpos ):
            # Check the column and both diagonals
            dist = len( qarray )        # Distance between rows
            for q in qarray:
                if q == newpos: return True         # Same column
                if q + dist == newpos: return True  # diagonal
                if q - dist == newpos: return True  # diagonal
                dist -= 1
            return False

        def qsearch( qarray = () ):
            for q in range( 0, bsize ):        # Try each position
                if not threaten( qarray, q ):  # If not threatened
                    pos = qarray + ( q, )      # Append to the pos array
                    if len( pos ) >= bsize:    # Are we done?
                        yield pos              # Yield the answer
                    else:                      # recursively call new generator
                        for pos in qsearch( pos ):
                            yield pos

        print "Queens problem for", bsize, "x", bsize, "board."

        for ans in qsearch():
            # Print out the board
            print "+" + "---+" * bsize
            for q in ans:
                print "|" + " |" * q + " Q |" + " |" * (bsize - q - 1)
            print "+" + "---+" * bsize
            print

    queens( 8 )

Now, you may be wondering what my point is. Well, first, I want to encourage people to think about using Python as a language for complex heuristic search problems. Traditionally, LISP and Prolog have been the languages of choice for "AI"-type programming; however, Python has a clear advantage in readability and maintainability, as well as being much better integrated into modern computing environments (in terms of available interpreters, IDEs, libraries, etc.)

Secondly, I want to lobby for additional support in the language and standard libraries for handling such problems. There are a number of fairly obvious language enhancements which would make the above example even simpler - for example, the idea of being able to return the output of one generator directly from another, instead of having to iterate through all of the results and then re-yield them, has already been discussed in this forum.
Re: Recursive generators and backtracking search
Alex Martelli wrote:
> for x in whatever_other_iterable: yield x
>
> into (say)
>
> yield from whatever_other_iterable
>
> is minute and not worth changing the syntax (even though something like
> 'yield from' would mean no keywords would need to be added).

I agree that the improvement is minor, but I'll take what I can get. Also, I think there could be a potential efficiency improvement as well - right now, each result has to crawl its way up the stack of yields. For an 8 x 8 board, each final result gets yielded 8 times. A 'yield from' might potentially be able to splice the results directly into the output of the generator using some kind of iterator chaining logic. I'm not sure how feasible this is, however.

A more compelling benefit would be some means of yielding a value across multiple stack frames. Currently the generator semantics are limited, in that they only allow co-routine behavior between a caller and its directly called subroutine.

What I am more interested in, however, is the use of backtracking search in Python, and its application to more complex search problems. The main benefit of the generator approach is the elimination of callbacks, allowing the consumer of the search results to retain its local state in a convenient way. (Anyone who's ever struggled with the SAX API for XML parsing knows what I am talking about.) One difference between these more complex problems and the simple example I posted is that it isn't just simple recursion at work. Each level of search may invoke a variety of different search strategies for each sub-part of the problem, which themselves will do the same with their own portion.

Here's a thought experiment: What would it take to make Python the premier language for AI research? (Not that I am proposing that this should be the Python community's goal; this is more in the way of a brainstorming session.)
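To make the comparison concrete, here is a minimal sketch (the flattening example is illustrative, not from the original post). The first version is the re-yield loop that works today; the second uses the proposed 'yield from' spelling and would not run on a current interpreter:

    def flatten_loop( tree ):
        for node in tree:
            if isinstance( node, list ):
                # Every leaf below this point is re-yielded at this level,
                # once per enclosing frame.
                for leaf in flatten_loop( node ):
                    yield leaf
            else:
                yield node

    def flatten_proposed( tree ):
        for node in tree:
            if isinstance( node, list ):
                yield from flatten_proposed( node )   # proposed: delegate directly
            else:
                yield node

    # list( flatten_loop( [ 1, [ 2, [ 3, 4 ] ], 5 ] ) )  =>  [1, 2, 3, 4, 5]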
Speeding up multiple regex matches
I've run into this problem a couple of times. Say I have a piece of text that I want to test against a large number of regular expressions, where a different action is taken based on which regex successfully matched. The naive approach is to loop through each regex, and stop when one succeeds. However, I am finding this to be too slow for my application -- currently 30% of the run time is being taken up in the regex matching.

I thought of a couple of approaches, but I am not sure how to make them work:

1) Combine all of the regular expressions into one massive regex, and let the regex state machine do all the discriminating. The problem with this is that it gives you no way to determine which regex was the matching one.

2) Use the first character of the text string to cut down the search size. The problem here is that since I don't know the regex patterns in advance, I would need to inspect each regex and determine the possible set of starting characters that could be matched. This would require creating my own regex parser.

3) Use a lexical scanner. This is probably overkill for what I want to do.

4) Write my own regex system that does what I want. This is more work than I want to do.

Any thoughts?
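For reference, the naive loop described above looks something like this (the pattern/action pairs here are made up for illustration):

    import re

    # hypothetical pattern -> action table
    rules = [
        ( re.compile( r"0x[\da-fA-F]+|\d+" ), lambda m: ( "number", m.group() ) ),
        ( re.compile( r"[A-Za-z_]\w*" ),      lambda m: ( "name", m.group() ) ),
        ( re.compile( r"\s+" ),               lambda m: ( "space", m.group() ) ),
    ]

    def classify( text ):
        # Try each regex in turn; stop at the first one that matches.
        for regex, action in rules:
            m = regex.match( text )
            if m:
                return action( m )
        return None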
Re: Speeding up multiple regex matches
OK, that worked really well. In particular, the "lastindex" property of the match object can be used to tell exactly which group matched, without having to sequentially search the list of groups.

In fact, I was able to use your idea to cobble together a poor man's lexer which I am calling "reflex" (Regular Expressions For Lexing). Here's an example of how it's used:

    # Define the states using an enumeration
    State = Enum( 'Default', 'Comment', 'String' )

    # Create a scanner
    scanner = reflex.scanner( State.Default )
    scanner.rule( "\s+" )
    scanner.rule( "/\*", reflex.set_state( State.Comment ) )
    scanner.rule( "[a-zA-Z_][\w_]*", KeywordOrIdent )
    scanner.rule( "0x[\da-fA-F]+|\d+", token=TokenType.Integer )
    scanner.rule( "(?:\d+\.\d*|\.\d+)(?:[eE]?[+-]?\d+)|\d+[eE]?[+-]?\d+",
                  token=TokenType.Real )

    # Multi-line comment state
    scanner.state( State.Comment )
    scanner.rule( "\*/", reflex.set_state( State.Default ) )
    scanner.rule( "(?:[^*]|\*(?!/))+" )

    # Now, create an instance of the scanner
    token_stream = scanner( input_file_iter )
    for token in token_stream:
        print token

Internally, it creates an array of patterns and actions for each state. Then when you ask it to create a scanner instance, it combines all of the patterns into one large regular expression. Input lines are matched against this regex, and if a match succeeds, the match object's 'lastindex' property is used to look up the actions to perform in the array.
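The underlying technique, independent of the reflex wrapper, is small enough to sketch directly (the patterns and handlers here are made up): each pattern becomes one capturing group in a combined regex, and m.lastindex tells you which group matched.

    import re

    # hypothetical pattern/handler pairs (no nested capturing groups inside them)
    handlers = [
        ( r"0x[\da-fA-F]+|\d+", lambda text: ( "integer", text ) ),
        ( r"[a-zA-Z_]\w*",      lambda text: ( "identifier", text ) ),
        ( r"\s+",               lambda text: ( "whitespace", text ) ),
    ]

    # Combine into a single regex, one group per pattern
    combined = re.compile( "|".join( "(%s)" % p for p, h in handlers ) )

    def scan_one( text, pos = 0 ):
        m = combined.match( text, pos )
        if m is None:
            return None
        handler = handlers[ m.lastindex - 1 ][ 1 ]   # lastindex is 1-based
        return handler( m.group( m.lastindex ) ), m.end()

    # scan_one( "0x1F rest" )  =>  ( ( 'integer', '0x1F' ), 4 )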
Permutation Generator
I'm sure I am not the first person to do this, but I wanted to share this: a generator which returns all permutations of a list:

    def permute( lst ):
        if len( lst ) <= 1:
            yield lst
        else:
            head = lst[:1]
            for x in permute( lst[1:] ):
                # Insert the head element at every possible position
                for i in range( len( x ) + 1 ):
                    yield x[:i] + head + x[i:]

-- Talin
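A quick illustrative check of the generator above (not part of the original post):

    for p in permute( [ 1, 2, 3 ] ):
        print p

This prints each of the six orderings of the three-element list.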
Dictionary inheritance
I want to make a dictionary that acts like a class - in other words, one that supports inheritance: if you attempt to find a key that isn't present, it searches a "base" dictionary, which in turn searches its base, and so on. Now, I realize it's fairly trivial to code something like this using UserDict, but given that classes and modules already have this behavior, is there some built-in type that already does this? (This is for doing nested symbol tables and such.)

---

Also, on a completely different subject: has there been much discussion about extending the use of the 'is' keyword to do type comparisons a la C# (e.g. "if x is list:")?

-- Talin
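A minimal sketch of the hand-rolled approach mentioned above (the class name is made up; this relies on dict's __missing__ hook, which requires Python 2.5 or later):

    class ChainedDict( dict ):
        """A dict that falls back to a 'base' dict for keys it doesn't have."""
        def __init__( self, base = None, **kwargs ):
            dict.__init__( self, **kwargs )
            self.base = base

        def __missing__( self, key ):
            # Called by dict when a key lookup fails in this mapping.
            if self.base is not None:
                return self.base[ key ]
            raise KeyError( key )

    # globals_scope = ChainedDict( None, x = 1 )
    # locals_scope = ChainedDict( globals_scope, y = 2 )
    # locals_scope[ 'x' ]  =>  1  (found in the enclosing scope)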
Further questions on dictionaries, namespaces, etc.
Thanks to everyone for pointing out my bone-headed errors :) I don't *usually* make that sort of mistake...

What I'm attempting to do is create an implementation of the unification algorithm. Now, it's fairly straightforward to do unification on my own custom types; all I have to do is define a unify() method for each class:

    def unify( self, other, bindings ):
        ...

However, this doesn't work so well when the objects being unified are built-in Python types (tuples, integers, etc.) This requires the "un-pythonic" testing of each individual type and then calling the appropriate unification function for the given type.

Another thing that is important is that this particular unifier supports commutative functions (which accept their arguments in any order), which is what the permutation stuff is for. (I haven't done associativity yet; still working on it.)

The unifier is structured as a stack of generators, each returning all possible matches. This allows the higher levels of the unifier to backtrack by processing alternatives produced by the lower-level generators.

So here's my list of further questions:

1) Is there a better way to do "function overloading" on built-in types than using a whole series of "if type( x ) is y"?

2) Is there an easy way to determine if a given object has a callable method named "unify"? I know how to determine if there's an attribute with that name, but not how to determine whether or not that attribute is callable.

3) The "bindings" object is a dictionary that is constructed as the unification proceeds. (So, for example, if I attempt to unify "x + 1" with "2 + 1", it will have the binding "x : 2" in the dictionary.) I want this bindings object to behave like a function call scope, in that you can "see" the variables in the enclosing scope. In fact, I'd like to be able to use a regular Python namespace (such as a module) as the enclosing scope, so that the unification algorithm has access to all of the variable definitions within that module. (Again, this is part of the notion that I want to be able to do unification operations on normal Python data structures, rather than specially "wrapped" types.) In fact, I had thought about the idea of the bindings being an actual Python "module" object; however, that doesn't appear to be possible (or what I want, for that matter). Similarly, being able to take the local function scope, capture it in a closure, and export that to the outside world was also something I briefly pondered.

Because of the backtracking nature of the matcher, I need to be able to "undefine" bindings that have been previously defined. The way I am currently doing this is to create a new "scope" each time I add a new binding, where the new scope's "parent" is the previous scope. (So in fact my dictionary has only one entry in it.) Something like this:

    for new_bindings in generate_matches( a, b, old_bindings ):
        ...

If none of the alternatives pan out, it simply discards "new_bindings" and reverts back to the old_bindings. So my question here is: do I want to continue using a subclass of dict for this, or something more exotic?

4) I've seen a couple of code examples on the net where people use the idiom "(x for x in [])" to represent a "null" iterator, i.e. one that immediately terminates. How is this any different from simply returning "()"? (Or range( n, n ) for that matter.) Which one is the most efficient? And if they are different, how about adding a null iterator to itertools :)

-- Talin
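For what it's worth, here is a minimal sketch of two of the pieces discussed above (all names are made up): a type-keyed dispatch table as an alternative to a chain of "if type( x ) is y" tests, and a check for a callable unify method.

    def unify_tuples( a, b, bindings ): pass    # placeholder
    def unify_ints( a, b, bindings ): pass      # placeholder

    # Dispatch on built-in types via a dict keyed by type, instead of
    # a series of "if type( x ) is y" tests.
    unifiers = {
        tuple: unify_tuples,
        int:   unify_ints,
    }

    def unify( a, b, bindings ):
        # Prefer a user-defined unify() method if the object has a callable one.
        method = getattr( a, "unify", None )
        if callable( method ):
            return method( b, bindings )
        handler = unifiers.get( type( a ) )
        if handler is not None:
            return handler( a, b, bindings )
        raise TypeError( "don't know how to unify %r" % ( a, ) )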
Yielding a chain of values
I'm finding that in a lot of places within my code, I want to return the output of a generator from another generator. Currently the only method I know of to do this is to explicitly loop over the results from the inner generator, and yield each one:

    for x in inner():
        yield x

I was wondering if there was a more efficient and concise way to do this. And if there isn't, then what about extending the * syntax used for lists, i.e.:

    yield *inner()

The * would indicate that you want to iterate through the given expression, and yield each value in turn. You could also use it on ordinary lists:

    yield *[ 1, 2, 3 ]

Anyway, just an idle thought on a Sunday morning :)
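One existing tool that gets part of the way there is itertools.chain: when a generator's entire output is just the concatenation of other iterables (and not interleaved with other logic), the function can simply return the chained iterator instead of re-yielding by hand. A small sketch (the inner generators are made up):

    from itertools import chain

    def inner_a():
        yield 1
        yield 2

    def inner_b():
        yield 3

    def outer():
        # Behaves, from the caller's point of view, like yielding everything
        # from inner_a() followed by everything from inner_b().
        return chain( inner_a(), inner_b() )

    # list( outer() )  =>  [1, 2, 3]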
Re: Fun with decorators and unification dispatch
Well, the Matrix matching function now works as described above:

    @Arity( MatchMatrix( MatchInteger.n, MatchInteger.n ).x )

Now I am trying to see if I can write the rules for Derivative()...
PEP 3102 for review and comment
(Note: PEPs in the 3xxx number range are intended for Python 3000; however, this particular PEP may be backported if there is support for it.)

PEP: 3102
Title: Keyword-Only Arguments
Version: $Revision: 46053 $
Last-Modified: $Date: 2006-05-19 22:23:44 -0700 (Fri, 19 May 2006) $
Author: Talin
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 22-Apr-2006
Python-Version: 3.0
Post-History: 28-Apr-2006, May-19-2006

Abstract

This PEP proposes a change to the way that function arguments are assigned to named parameter slots. In particular, it enables the declaration of "keyword-only" arguments: arguments that can only be supplied by keyword and which will never be automatically filled in by a positional argument.

Rationale

The current Python function-calling paradigm allows arguments to be specified either by position or by keyword. An argument can be filled in either explicitly by name, or implicitly by position.

There are often cases where it is desirable for a function to take a variable number of arguments. The Python language supports this using the 'varargs' syntax ('*name'), which specifies that any 'left over' arguments be passed into the varargs parameter as a tuple.

One limitation on this is that currently, all of the regular argument slots must be filled before the vararg slot can be.

This is not always desirable. One can easily envision a function which takes a variable number of arguments, but also takes one or more 'options' in the form of keyword arguments. Currently, the only way to do this is to define both a varargs argument, and a 'keywords' argument (**kwargs), and then manually extract the desired keywords from the dictionary.

Specification

Syntactically, the proposed changes are fairly simple. The first change is to allow regular arguments to appear after a varargs argument:

    def sortwords(*wordlist, case_sensitive=False):
        ...

This function accepts any number of positional arguments, and it also accepts a keyword option called 'case_sensitive'. This option will never be filled in by a positional argument, but must be explicitly specified by name.

Keyword-only arguments are not required to have a default value. Since Python requires that all arguments be bound to a value, and since the only way to bind a value to a keyword-only argument is via keyword, such arguments are therefore 'required keyword' arguments. Such arguments must be supplied by the caller, and they must be supplied via keyword.

The second syntactical change is to allow the argument name to be omitted for a varargs argument. The meaning of this is to allow for keyword-only arguments for functions that would not otherwise take a varargs argument:

    def compare(a, b, *, key=None):
        ...

The reasoning behind this change is as follows. Imagine for a moment a function which takes several positional arguments, as well as a keyword argument:

    def compare(a, b, key=None):
        ...

Now, suppose you wanted to have 'key' be a keyword-only argument. Under the above syntax, you could accomplish this by adding a varargs argument immediately before the keyword argument:

    def compare(a, b, *ignore, key=None):
        ...

Unfortunately, the 'ignore' argument will also suck up any erroneous positional arguments that may have been supplied by the caller.
Given that we'd prefer any unwanted arguments to raise an error, we could do this:

    def compare(a, b, *ignore, key=None):
        if ignore:  # If ignore is not empty
            raise TypeError

As a convenient shortcut, we can simply omit the 'ignore' name, meaning 'don't allow any positional arguments beyond this point'.

(Note: After much discussion of alternative syntax proposals, the BDFL has pronounced in favor of this 'single star' syntax for indicating the end of positional parameters.)

Function Calling Behavior

The previous section describes the difference between the old behavior and the new. However, it is also useful to have a description of the new behavior that stands by itself, without reference to the previous model. So this next section will attempt to provide such a description.

When a function is called, the input arguments are assigned to formal parameters as follows:

- For each formal parameter, there is a slot which will be used to contain the value of the argument assigned to that parameter.

- Slots which have had values assigned to them are marked as 'filled'. Slots which have no value assigned to them yet are considered 'empty'.
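To illustrate the proposed semantics with the 'compare' example from the specification (this shows the behavior the PEP proposes, not that of any released Python at the time of writing):

    def compare( a, b, *, key = None ):
        pass

    compare( 1, 2 )               # OK: key takes its default value
    compare( 1, 2, key = len )    # OK: key supplied by keyword
    compare( 1, 2, len )          # Error: a positional argument can never fill 'key'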
Re: PEP 3102 for review and comment
Allowing keyword arguments without defaults may seem like a syntactical change, but in fact it's not. You can do this in Python now - any argument can be a 'keyword' argument, whether it has a default value or not. So for example:

    def func( a, b, c=0 ):
        pass

    func( a=1, b=2, c=3 )

In other words, the use of '=' to indicate a keyword argument, and the use of '=' to indicate a default argument value, have *nothing* to do with each other, other than the fact that they both have the requirement that they have to come after positional arguments. If you wanted to require keyword-only arguments to have default values, that would require adding an additional check, which would be a syntactical change.

-- Talin
Python download problem
Good evening,

I would like to ask for help regarding Python installation. I'm a beginner and I would like to challenge myself to start coding using the Python language, but unfortunately I'm having problems while installing it.

I used python.org to install the latest Windows version of Python (3.8.5) on my Windows 10 64-bit computer. I clicked the link to download the file "Windows (x86-64 executable installer)", opened it, checked the boxes, and continued with the steps until it installed successfully.

After that I opened CMD, wrote "python --version", pressed enter, and got this error:

    'python' is not recognized as an internal or external command, operable program or batch file.

I tried to fix it by following these steps: Edit the system environment variables >> Environment Variables >> Path >> Edit >> browse to the .exe path, C:\Program Files\Python38 >> OK (this is the path that I pasted in), and the same problem occurred.

I then opened Python IDLE, wrote print("HI"), pressed enter, and got HI as a result.

Does that mean that Python is working fine on my PC? And if yes, can I use IDLE to run future scripts instead of CMD, or do I need to fix the CMD?

Thank you
'isa' keyword
Although I realize the perils of even suggesting polluting the Python namespace with a new keyword, I often think that it would be useful to consider defining an operator for testing whether or not an item is a member of a category.

Currently, we have the 'in' operator, which tests for membership within a container, and that works very well -- in particular, it allows such membership tests to be expressed in a very natural way. So for example, whereas in C++ I always have to say:

    if (dependencies.find( name ) != dependencies.end())

in Python I can simply say:

    if name in dependencies:

...which is much more readable and intuitive.

At the same time, however, I recognize that there is a logical difference between membership in a container and membership in a category. For example, although a bear is a member of the class of mammals, it doesn't make as much sense to say "if bear in mammal". Similarly, you wouldn't want to use the 'in' keyword as a replacement for isinstance(), i.e. "if name in str".

I propose the word 'isa' because the term 'isa hierarchy' is commonly used to indicate a tree of types. So the syntax would look like this:

    if bear isa mammal:
    if name isa str:

(I suppose it would look prettier to put a space between "is" and "a", but there are many obvious reasons why you don't want "a" to be a keyword!)

The "isa" operator would of course be overloadable, perhaps by an accessor function called __isa__, which works similarly to __contains__.

The potential uses for this are not limited to isinstance() sugar, however. For example:

    if image isa gif:
    elif image isa jpeg:
    elif image isa png:

In this case, we're not testing for object identity (which tests whether two variables are referring to the same object), or even object equivalence (which tests whether two objects are of equal value), nor are we testing for membership within a container -- instead we're testing for membership within a type hierarchy, where 'type' can be defined to mean whatever the programmer wants.

Of course, at this point, I am sure that someone will point out that I should be using method overloading and inheritance rather than explicit testing of types. However, try writing an efficient __cmp__ function solely by method overloading -- or any other function that deals with more than one object argument, where the action to be taken depends on the combination of types of both arguments. This can be solved with multi-method dispatch, but such methods are complex, non-standard, and have somewhat dubious performance characteristics. It's generally faster and simpler to dispatch based on the type of one of the arguments, and then test the types of the other arguments.
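Since 'isa' is only a proposal, here is a rough sketch (entirely hypothetical) of how the operator could be emulated today as a plain function, dispatching to an __isa__ method on the category object in much the same way that 'in' dispatches to __contains__:

    def isa( obj, category ):
        # Hypothetical helper emulating the proposed operator.
        check = getattr( category, '__isa__', None )
        if check is not None:
            return check( obj )
        # Fall back to an ordinary type test when the category is a class.
        return isinstance( obj, category )

    class ImageFormat( object ):
        """A category object that classifies raw image data by signature."""
        def __init__( self, signature ):
            self.signature = signature
        def __isa__( self, data ):
            return data.startswith( self.signature )

    gif = ImageFormat( 'GIF8' )
    png = ImageFormat( '\x89PNG' )

    # isa( open( 'pic.gif', 'rb' ).read(), gif )  =>  True (assuming a real GIF file)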
Re: 'isa' keyword
Thanks for all the responses :) I realized up front that this suggestion is unlikely to gain approval, for reasons eloquently stated above. However, there are still some interesting issues raised that I would like to discuss. Let me first respond to a few of the comments:

> What's the difference between this and ``isinstance`` ?

What's the difference between 'in' and 'has_key()'? 1) It's shorter and more readable, and 2) it can be overridden to mean different things for different container types.

> What's wrong with:
> if image.isa(gif):
> elif image.isa(jpeg):
> elif image.isa(png):

That forces the classification logic to be put into the instance, rather than in the category. With the "in" keyword, the "__contains__" function belongs to the container, not the contained item, which is as it should be, since an item can be in multiple containers.

> Especially considering that checking parameters with "isinstance" is
> considered bad form with Python's duck typing.

Here's an example where the strict OOP style of programming breaks down. I'll use SCons as an example. In SCons, there is a "Depends" function that can take a filename, a list of filenames, or a build target (which is a Python object). So the logic looks like this:

    def Depends( target ):
        if isinstance( target, str ):
            ...
        elif isinstance( target, list ):
            ...
        elif isinstance( target, BuildTarget ):
            ...
        else:
            error

You can't use method overloading here, because you are dealing with built-in Python objects (except for the BuildTarget).

I can think of several different cases where you would have to resort to logic like this:

-- Where you are trying to distinguish between built-in Python types
-- Where you are trying to distinguish between types that are created by another, independent module and which you can't modify
-- Where you are trying to do multi-method dispatch logic

As an example of the latter, imagine a music sequencer application that edits a stream of MIDI events. Let's suppose that there are various "classes" of events:

    Note On
    Note Off
    Aftertouch
    Pitchbend
    Control Change

In addition, let's suppose that we have a variety of different ways of editing these events:

    "Piano Roll" Editor - edits in horizontal "piano roll" form
    "Drum Machine" Editor - notes are placed on a grid
    Event List Editor - a text list of events
    Music Notation Editor - uses conventional notes and staves

All of these editors operate on the same underlying data, which is a stream of events. Each of these editors has a "repaint" function to render the stream of events. So the rendering of an event depends *both* on the class of the event and the class of the editor. So you could organize it this way:

    class PianoRollEditor:
        def repaint( self, event ):
            if isinstance( event, Note ):
                # draw note
            elif isinstance( event, Aftertouch ):
                # draw ...etc

You could also invert the logic (which is somewhat clumsier):

    class Note:
        def repaint( self, context ):
            if isinstance( context, PianoRollEditor ):
                ...etc...

Now, I realize that some folks have built multi-method dispatch systems for Python (I did one myself at one point.) However, these tend to be somewhat slow and clunky without language support (and no, I am not asking for Python to become Dylan or CLOS.) But it would be nice to know if there was a cleaner way to solve the above problems...
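As a point of comparison, here is a rough sketch of doing the two-argument dispatch with a dict keyed on a pair of types (the class names follow the MIDI example above, but the dispatch table itself is an illustration, not an existing library). It avoids the isinstance chains at the cost of registering each combination explicitly:

    class Note( object ): pass
    class Aftertouch( object ): pass
    class PianoRollEditor( object ): pass
    class EventListEditor( object ): pass

    def draw_note_pianoroll( editor, event ): print "piano-roll note"
    def draw_note_eventlist( editor, event ): print "event-list note"

    # (editor type, event type) -> repaint function
    repaint_table = {
        ( PianoRollEditor, Note ): draw_note_pianoroll,
        ( EventListEditor, Note ): draw_note_eventlist,
    }

    def repaint( editor, event ):
        handler = repaint_table.get( ( type( editor ), type( event ) ) )
        if handler is None:
            raise TypeError( "no repaint rule for this combination" )
        return handler( editor, event )

    # repaint( PianoRollEditor(), Note() ) prints "piano-roll note"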
Replacement for lambda - 'def' as an expression?
I've been reading about how "lambda" is going away in Python 3000 (or at least, that's the stated intent), and while I agree for the most part with the reasoning, at the same time I'd be sad to see the notion of "anonymous functions" go - partly because I use them all the time.

Of course, one can always create a named function. But there are a lot of cases, such as multimethods / generics and other scenarios where functions are treated as data, where you have a whole lot of functions and it can be tedious to come up with a name for each one.

For example, my current hobby project is implementing pattern matching similar to Prolog in Python. The dispatcher I am making allows you to create "overloaded" versions of a function that take different patterns as their input arguments, so that Simplify( (add, x, y) ) calls a different method than Simplify( (log, x) ) -- in other words, the choice of which code is executed is based on the structure of the tuple that is passed into it. However, in order for this to work, I need to be able to assign a block of Python code to a particular pattern, and having to invent a named function for each pattern is a burden :)

Anyway, here's an example, then, of how 'def' could be used:

    add = def( a, b ):
        return a + b

The lack of a function name signals that this is an anonymous function. The def keyword defines the function using the same syntax as always - the arguments are in parentheses, and are unevaluated; the colon marks the beginning of a suite. In fact, it looks a lot like the existing lambda, with a couple of differences:

1) It uses the familiar "def" keyword, which every Python beginner understands, instead of the somewhat unfamiliar "lambda".

2) The arguments are enclosed in parentheses, instead of a bare tuple followed by a colon, again reiterating the similarity to the normal usage of "def".

3) The statements are a real suite instead of a pseudo-suite - they can consist of multiple lines of statements.

Like all statements whose last argument is a suite, you can put the body of the function on a single line:

    add = def( a, b ): return a + b

(If this were Perl, you could also omit the "return", since in Perl the last evaluated expression in the function body is what gets returned if there's no explicit return statement.)

What about passing an anonymous function as an argument, which is the most common case? This gets tricky, because you can't embed a suite inside of an expression. Or can you? The most powerful option would be to leverage the fact that you can already do line breaks inside of parentheses. So the "def" keyword would tell the parser to restart the normal indentation calculations, which would terminate whenever an unmatched brace or paren was encountered:

    a = map( (def( item ):
        item = do_some_calculation( item )
        return item
        ), list )

The one-liner version looks a lot prettier, of course:

    a = map( (def( item ): return item * item), list )

And it looks even nicer if we switch the order of the arguments around, since you can now use the final paren of the enclosing function call to terminate the def suite:

    a = map( list, def( item ): return item * item )

Unfortunately, there's no other good way I can think of to signal the end of the block of statements without introducing some radical new language construct. (Besides, if being an expression is good enough for 'yield', why shouldn't def get the same privilege? :)
Re: Replacement for lambda - 'def' as an expression?
I like the decorator idea. Unfortunately, the version of Python I am using is pre-decorator, and there are various issues involved in upgrading on Mac OS X (due to the built-in Python 2.3 being used by the OS itself.) I'll have to look into how to upgrade without breaking too much...

Some further examples of what I am trying to do. First let me state what my general goal is: there are lots of inference engines out there, from Prolog to Yacas, but most of them rely on a custom interpreter. What I want to find out is whether I can build a solver, not by creating a new language on top of Python, but rather by giving solver-like capabilities to a Python programmer. Needless to say, this involves a number of interesting hacks, and part of the motivation for my suggestion(s) is reducing the hack factor.

So, at the risk of being visited by Social Services for my abuse of Python operators, here's a sample of how the solver works:

    # Define a function with multiple arities
    Simplify = Function()

    # Define some arities. We overload __setitem__ to define an arity.
    # Param is a class whose metaclass defines __getattr__ to return a new
    # instance of Param with the given parameter name.
    Simplify[ ( add, Param.x, 0 ) ] = lambda x: Simplify( x )   # x + 0 = x
    Simplify[ ( mul, Param.x, 1 ) ] = lambda x: Simplify( x )   # x * 1 = x
    Simplify[ ( mul, Param.x, 0 ) ] = lambda x: 0               # x * 0 = 0
    Simplify[ Param.x ] = lambda x: x                           # Fallback case

    # Invoke the function. Should print the value of x
    print Simplify( (add, x, 0) )

Of course, what I really want is not def or lambda; what I really want is to be able to define functions that take suites as arguments. But that would be crazy talk :)

    Define( "Simplify", args ):
        code
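For comparison, here is a rough sketch of what the decorator-based registration mentioned at the top might look like, once on a decorator-capable Python (2.4 or later). Everything here is hypothetical and assumes the Function/Param objects described above; the point is that each rule body gets a throwaway named function instead of being squeezed into a lambda:

    # Hypothetical: register a rule body against a pattern via a decorator.
    def rule( generic, pattern ):
        def register( fn ):
            generic[ pattern ] = fn   # same __setitem__ hook as above
            return fn
        return register

    @rule( Simplify, ( add, Param.x, 0 ) )
    def simplify_add_zero( x ):
        return Simplify( x )

    @rule( Simplify, ( mul, Param.x, 0 ) )
    def simplify_mul_zero( x ):
        return 0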
Fun with decorators and unification dispatch
Been playing around a bit more with developing my Python inference engine, and I thought it might be of general interest (plus, I am finding the criticism useful).

My original goal was to brush up on my math skills. Now, I've long felt that the best way to learn a language *thoroughly* is to write a compiler for it. So why not write a compiler for math? However, algebra and calculus aren't just about evaluating an expression and getting the answer; they are about *manipulating* mathematical expressions, applying transformations to them. Systems that do this (such as Mathematica and Yacas) are called "computer algebra systems", so I decided to see how hard it would be to implement one in Python.

A computer algebra system is essentially a specialized kind of expert system - in other words, it is a set of transformation rules, where an input expression is matched against a database of patterns, and the result of the database query determines what transformation is to be made. Most such systems start by implementing their own, specialized programming language, but I wanted instead to see if I could add an inference engine capability to Python itself.

So what I have is a dispatch mechanism that maintains a database of Python functions, keyed by the input pattern. Unlike multimethod systems, the input patterns are not merely a list of types, but can be trees containing both constants and variables.

Here's a trivial example, using the classic recursive algorithm for computing the factorial of a number:

    # Declare that "Factorial" is a generic function
    Factorial = Function()

    # Define Factorial( 0 )
    @Arity( 0 )
    def Factorial():
        return 1

    # Define Factorial( x )
    @Arity( MatchInteger.x )
    def Factorial( x ):
        return x * Factorial( x - 1 )

    print Factorial( 12 )

"Function" produces an instance of a generic function, which is a callable object. When called, it searches its list of patterns to determine the function to dispatch to.

The "Arity" decorator adds the function as a special case of the generic function. It adds the specific function to the generic's internal pattern database, and also returns the generic function as its return result, so that (in this case) the name "Factorial" is bound to the generic function object.

"MatchInteger" is a class that produces a matching object, otherwise known as a bound variable. In this case, the variable's name is "x". (It overloads the "__getattr__" method to get the variable name.) When the pattern matcher encounters a matching object, it attempts to bind the corresponding portion of the expression to that variable. It does this by adding the mapping of variable name to value into a dictionary of variable bindings. When the function is called, the dictionary of variable bindings is expanded and passed to the function (i.e. func( **bindings )), so that the variable that was bound to "x" now gets passed in as the "x" parameter of the function.

MatchInteger itself is a type of "qualified match" (that is, a variable that only binds under certain conditions), and could be defined as:

    MatchInteger = QualifiedMatch( lambda x: type( x ) is int )

(Although it is not in fact defined this way.) QualifiedMatch takes a list of matching predicates, which are applied to the expression before binding can take place.
Here's a more complex example, which is a set of rules for simplifying an expression:

    Simplify = Function()

    # x => x
    @Arity( MatchAny.x )
    def Simplify( x ):
        return x

    # x * 0 => 0
    @Arity( ( Multiply, MatchAny.x, 0 ) )
    def Simplify( x ):
        return 0

    # x * 1 => x
    @Arity( ( Multiply, MatchAny.x, 1 ) )
    def Simplify( x ):
        return Simplify( x )

    # x + 0 => x
    @Arity( ( Add, MatchAny.x, 0 ) )
    def Simplify( x ):
        return Simplify( x )

    # x + x => 2x
    @Arity( ( Add, MatchAny.x, MatchAny.x ) )
    def Simplify( x ):
        return ( Multiply, 2, Simplify( x ) )

    # General recursion rule
    @Arity( ( MatchAny.f, MatchAny.x, MatchAny.y ) )
    def Simplify( f, x, y ):
        return ( Simplify( f ), Simplify( x ), Simplify( y ) )

And in fact if I call the function:

    print Pretty( Simplify( Parse( "(x + 2) * 1" ) ) )
    print Pretty( Simplify( Parse( "x * 1 + 0" ) ) )
    print Pretty( Simplify( Parse( "y + y" ) ) )
    print Pretty( Simplify( Parse( "(x + y) + (x + y)" ) ) )

it prints:

    x + 2
    x
    2 * y
    2 * (x + y)

The argument matcher tries to prioritize matches so that more specific matches (i.e. containing more constants) are matched before more general matches. This is perhaps too unsophisticated a scheme, but it seems to have worked so far.

The pattern matcher also looks to see if the object being matched has the "commute" or "associate" property. If it finds "commute", it attempts to match against all possible permutations of the input arguments (with special optimized logic for functions of one and two arguments). If the "associate" property is found, it first tries to flatten the expression (transforming (+ a (+ b c)) into (+ a b c)), and then generating all possible partitions of the
Re: Fun with decorators and unification dispatch
Yes, it was a typo.

Even though the function has not yet been bound to the name "Factorial" when it calls the decorator, the function's __name__ attribute is set to it, so I use that to look up the name of the generic. Here's the source for Arity:

    def Arity( *pattern ):
        """A function decorator that defines a specific arity of a generic
        function. This registers the specific implementation with the
        generic function (which must be in the global scope.)"""
        def inner( f ):
            if isinstance( f, Function ):
                generic = f
                f = generic.last_defined
            else:
                name = f.__name__
                if not name in f.func_globals:
                    raise Exception( "Generic function " + name + " has not been defined." )
                generic = f.func_globals[ name ]
                generic.name = name
                generic.last_defined = f
            generic.add_arity( pattern, f )
            return generic
        return inner

There are a couple of kludges here:

1) The generic doesn't know its own name until you define at least one specialization for it. Otherwise, you would have to say:

    Factorial = Function( "Factorial" )

which I consider verbose and redundant.

2) The whole last_defined thing is to allow multiple arities for a single specialized function. Since Arity normally returns the generic, and not the specialized func, as its return value, that means that any additional decorators will be applied to the generic rather than the specialized func. last_defined is a kludge for getting around that, but only for decorators that understand the situation.