Re: regex: multiple matching for one string

Scott David Daniels Fri, 24 Jul 2009 08:51:27 -0700

ru...@yahoo.com wrote:

Nick Dumas wrote:

On 7/23/2009 9:23 AM, Mark Lawrence wrote:

scriptlear...@gmail.com wrote:

For example, I have a string "#a=valuea;b=valueb;c=valuec;", and I
would like to extract the values (valuea, valueb, and valuec).  How do
I do that in Python?  The group method will only return the matched
part.  Thanks.


p = re.compile('#a=*;b=*;c=*;')
m = p.match(line)
if m:
    print m.group(),

IMHO a regex for this is overkill, a combination of string methods such
as split and find should suffice.


You're saying that something like the following
is better than the simple regex used by the OP?
[untested]
values = []
parts = line.split(';')
if len(parts) != 4: raise SomeError()
for p, expected in zip (parts[:-1], ('#a','b','c')):
    name, x, value = p.partition ('=')
    if name != expected or x != '=':
        raise SomeError()
    values.append (value)
print values[0], values[1], values[2]

I call straw man: [tested]
    line = "#a=valuea;b=valueb;c=valuec;"
    d = dict(single.split('=', 1)
             for single in line.split(';') if single)
    d['#a'], d['b'], d['c']
If you want checking code, add:
    if len(d) != 3:
        raise ValueError('Expected 3 keys, got %s in %r' % (
                             sorted(d), line))
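
Pulled together, the split-based approach with its check might look like the sketch below. The function name parse_pairs and the expected parameter are made up for illustration and are not from the thread; the check compares key names rather than only counting them, which is a slightly stricter assumption than the len(d) test above.

```python
def parse_pairs(line, expected=('#a', 'b', 'c')):
    """Split 'k=v;k=v;' text into a dict and check the key names.

    parse_pairs and expected are hypothetical names for this sketch.
    """
    # line.split(';') leaves an empty piece after the trailing ';',
    # which the 'if single' filter skips.
    d = dict(single.split('=', 1)
             for single in line.split(';') if single)
    # A piece without '=' already makes dict() raise ValueError;
    # this catches missing or extra keys as well.
    if sorted(d) != sorted(expected):
        raise ValueError('Unexpected keys: %s in %r' % (sorted(d), line))
    return d


d = parse_pairs("#a=valuea;b=valueb;c=valuec;")
values = (d['#a'], d['b'], d['c'])
```

Any malformed line ends up as a ValueError, so the caller has a single exception type to handle.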

Blech, not in my book.  The regex checks the
format of the string, extracts the values, and
does so very clearly.  Further, it is easily
adapted to other similar formats, or evolutionary
changes in format.  It is also (once one is
familiar with regexes -- a useful skill outside
of Python too) easier to get right (at least in
a simple case like this.)

The posted regex doesn't work; this might be homework, so
I'll not fix the two problems.  The fact that you did not
see the failure weakens your claim of "does so very clearly."
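
For anyone who wants to check that claim, the sketch below runs the pattern as posted and then one possible repair. The repaired pattern is an assumption about what the poster intended, not code from the thread, and the names posted, fixed and values are illustrative only.

```python
import re

line = "#a=valuea;b=valueb;c=valuec;"

# The pattern exactly as posted. '=*' means "zero or more '='
# characters", so nothing in it can consume 'valuea' and the
# match fails.
posted = re.compile('#a=*;b=*;c=*;')
posted_match = posted.match(line)

# One possible repair (an assumption, not the original poster's code):
# '[^;]*' consumes each value up to the next ';', and the parentheses
# capture it so it can be read back with groups().
fixed = re.compile(r'#a=([^;]*);b=([^;]*);c=([^;]*);$')
m = fixed.match(line)
values = m.groups() if m else None
```

Here posted_match is None, while values holds the three captured strings. Without the parentheses, m.group() would return only the whole matched text, which is the behaviour the original question complains about.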

--Scott David Daniels
scott.dani...@acm.org
--
http://mail.python.org/mailman/listinfo/python-list
