20050207 text pattern matching

# -*- coding: utf-8 -*-
# Python

# Suppose you want to replace all strings of the form
# <img src="some.gif" width="30" height="20">
# with
# <img src="some.png" width="30" height="20">
# in your HTML files.

# You can use the "re" module.

import re

text = r'''<html>
blab blab
<P>
look at this <img src="./some.gif" width="30" height="20"> pict
and this one <img class="float-right" src="../that.gif">,
both are beautiful, but also look: <img src ="my.gif">,
and sequel <img src= "girl.gif"> yeah!
</p>
'''

new = re.sub(r'''src\s*=\s*"([^"]+)\.gif"''', r'''src="\1.png"''', text)
print(new)

# The first argument to re.sub is a regex pattern.
# The second argument is the replacement string,
# which can refer to captured groups (the \1).
# The third argument is the text to be searched.
# An optional fourth argument is the number of replacements
# to make. If omitted, all matches are replaced.

# see
# http://python.org/doc/lib/module-re.html

--------------------
# Similar code in Perl uses s///. For example:

$text = "123";
$text =~ s/2/9/;
print $text;

----------------------

In languages, human or computer, there's a notion of expressiveness.

English, for example, is very expressive in manifestation: witness all the poetry, implications, allusions, connotations, and dictions. There are a myriad ways to say one thing, fuzzy and warm and all. But when we look at what things it can say, its power of expression with respect to meaning, or its efficiency or precision, we find natural languages incapable. This can be seen in several ways. A sure way is through logic, linguistics, and/or what's called Philosophy of Language. One can also glean the incapacity and inadequacy of natural languages directly by studying the artificial language Lojban, where one realizes that natural languages are not only imprecise and inefficient, but that a huge number of things are nearly impossible to express in them.

One thing commonly misunderstood in the computing industry is the notion of expressiveness.
If a language has a vocabulary of (smile, laugh, grin, giggle, chuckle, guffaw, cackle), then that language will not be as expressive as a language with just (severe, slight, laugh, cry). The former is "expressive" in terms of fluff, whereas the latter is expressive with respect to meaning.

Similarly, in computer languages, expressiveness is significant with respect to semantics, not syntactical variation.

These two contrasting ideas can be easily seen by comparing Perl and Python, with their text pattern matching abilities as one specific example. Perl is a language of syntactical variegations. Python, on the other hand, does not even allow changes in the code's indentation, but its efficiency and power of expression with respect to semantics (i.e. algorithms) showcases Perl's poverty in specification.

 Xah
 http://xahlee.org/PageTwo_dir/more.html
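
P.S. The comments in the Python example mention an optional fourth argument to re.sub that limits the number of replacements. Here is a minimal sketch of it, using a shortened sample string that is an assumption for illustration and not part of the original text. It also shows re.subn, which the text above does not mention; it is the standard "re" function that returns the new string together with the number of replacements made.

```python
import re

# Hypothetical sample input, shortened for illustration.
text = '<img src="a.gif"> <img src="b.gif"> <img src="c.gif">'
pattern = r'src\s*=\s*"([^"]+)\.gif"'
replacement = r'src="\1.png"'

# count=1 limits the substitution to the first match only.
first_only = re.sub(pattern, replacement, text, count=1)

# re.subn replaces all matches and also returns how many were replaced.
all_replaced, n = re.subn(pattern, replacement, text)

print(first_only)
print(all_replaced)
print(n)
```

Passing count as a keyword keeps the call readable and avoids confusing it with the flags argument.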