Hi Duncan

> Nick Craig-Wood wrote:
>> Which translates to
>> match = re.search('(blue|white|red)', t)
>> if match:
>> else:
>> if match:
>> else:
>> if match:
>
> This of course gives priority to colours and only looks for garments or
> footwear if it hasn't matched on a prior pattern. If you actually
> wanted to match the first occurrence of any of these (or if the condition
> was re.match instead of re.search) then named groups can be a nice way of
> simplifying the code:

A good point. And a good example of when to use named capture group
references. This is easily extended to 'spit out' all other occurring
categories (see below).

> PATTERN = '''
> (?P<c>blue|white|red)
> ...

This is one nice thing in Python's regex syntax; you have to emulate
the ?P-thing in other regex systems more or less 'awk'-wardly ;-)

> For something this simple the titles and group names could be the
> same, but I'm assuming real code might need a bit more.

No no, this is quite good because it involves some math-generated
table-code lookup. I managed somehow to extend your example in order
to spit out all matches and their corresponding category:

    import re

    # re.VERBOSE ignores the whitespace used to align the alternatives
    PATTERN = '''
        (?P<c>blue |white |red    )
      | (?P<g>socks|tights        )
      | (?P<f>boot |shoe  |trainer)
    '''
    PATTERN = re.compile(PATTERN, re.VERBOSE)
    TITLES = { 'c': 'Colour', 'g': 'Garment', 'f': 'Footwear' }

    t = 'blue socks and red shoes'
    for match in PATTERN.finditer(t):
        # lastgroup is the name of the alternative that matched
        grp = match.lastgroup
        print("%s: %s" % (TITLES[grp], match.group(grp)))

which writes out the expected:

    Colour: blue
    Garment: socks
    Colour: red
    Footwear: shoe

The corresponding Perl program would look like this:

    $PATTERN = qr/
        (blue |white |red    )(?{'c'})
      | (socks|tights        )(?{'g'})
      | (boot |shoe  |trainer)(?{'f'})
    /x;

    %TITLES = (c =>'Colour', g =>'Garment', f =>'Footwear');

    $t = 'blue socks and red shoes';
    print "$TITLES{$^R}: $^N\n" while( $t=~/$PATTERN/g );

and prints the same:

    Colour: blue
    Garment: socks
    Colour: red
    Footwear: shoe

You don't have nice named match references (?P<..>) in Perl-5, so you
have to emulate this by an ordinary code assertion (?{..}) and set some
value ($^R) on the fly - which is not that bad in the end (imho).

(?{..}) means "zero-width code assertion"; it sets the Perl-predefined
$^R to the value that the code in {...} evaluates to, and $^N holds the
text of the most recently closed capture group.

As you can see, the pattern matching related part reduces from 4 lines
to one line.
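
Back on the Python side, the finditer/lastgroup loop above can also collect the matches per category instead of printing them one by one. This is only a minimal sketch of that idea: the function name by_category is made up for illustration, and it assumes the same pattern and titles as in the example above.

```python
import re

# Same pattern and titles as in the example above.
PATTERN = re.compile(r'''
    (?P<c>blue |white |red    )
  | (?P<g>socks|tights        )
  | (?P<f>boot |shoe  |trainer)
''', re.VERBOSE)
TITLES = {'c': 'Colour', 'g': 'Garment', 'f': 'Footwear'}


def by_category(text):
    """Group every match in text under the title of its named group.

    by_category is a hypothetical helper, not part of the original post.
    """
    found = {}
    for match in PATTERN.finditer(text):
        # Exactly one alternative matches, so lastgroup names it.
        title = TITLES[match.lastgroup]
        found.setdefault(title, []).append(match.group())
    return found


print(by_category('blue socks and red shoes'))
```

For the sample string this maps 'Colour' to ['blue', 'red'], 'Garment' to ['socks'] and 'Footwear' to ['shoe'].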

If you didn't need the dictionary lookup and could get away with the
category names directly, all you'd have to do would be this:

    $PATTERN = qr/
        (blue |white |red    )(?{'Colour'})
      | (socks|tights        )(?{'Garment'})
      | (boot |shoe  |trainer)(?{'Footwear'})
    /x;

    $t = 'blue socks and red shoes';
    print "$^R: $^N\n" while( $t=~/$PATTERN/g );

What's the point of all that? IMHO, Python's regex support is quite
good and useful, but won't give you an edge over Perl's in the end.

Thanks & Regards

Mirco
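
PS: the same shortcut should work on the Python side as well, as Duncan hinted: if the group names themselves serve as titles, the TITLES dictionary can go. A minimal sketch, assuming the titles are valid Python identifiers (the function name categories is made up for illustration):

```python
import re

# Group names double as category titles, so no lookup table is needed.
PATTERN = re.compile(r'''
    (?P<Colour>  blue |white |red    )
  | (?P<Garment> socks|tights        )
  | (?P<Footwear>boot |shoe  |trainer)
''', re.VERBOSE)


def categories(text):
    """Return (category, word) pairs in the order they occur in text.

    categories is a hypothetical helper, not part of the original post.
    """
    return [(m.lastgroup, m.group()) for m in PATTERN.finditer(text)]


for title, word in categories('blue socks and red shoes'):
    print("%s: %s" % (title, word))
```

Group names have to be valid identifiers, so a title containing a space or punctuation would still need the dictionary.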