Re: [Tutor] really basic py/regex

Steven D'Aprano Sat, 31 Mar 2018 00:11:59 -0700

On Fri, Mar 30, 2018 at 02:00:13PM -0400, bruce wrote:
> Hi.
> 
> Trying to quickly get the re.match(....)  to extract the groups from the 
> string.
> 
> x="MATH 59900/40 [47490] - THE "
> 
> The regex has to return MATH, 59900, 40, and 47490


Does it have to be a single regex? The simplest way is to split the 
above into words, apply a regex to each word separately, and filter out 
anything you don't want with a blacklist:

import re
regex = re.compile(r'\w+')  # one or more word characters (letters, digits, underscore)

string = "MATH 59900/40 [47490] - THE "
blacklist = set(['THE'])  # in Python 2.7 or 3, you can write {'THE'}

words = string.split()
results = []
for word in words:
    results.extend(regex.findall(word))

results = [word for word in results if word not in blacklist]
print(results)
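
To see why the loop uses extend() rather than append(), it may help to look
at what findall() gives back for each word. This is only an illustrative
sketch; the name per_word is mine and isn't needed for the solution:

```python
import re

regex = re.compile(r'\w+')
string = "MATH 59900/40 [47490] - THE "

# Pair each whitespace-separated word with the list findall() returns for it.
per_word = [(word, regex.findall(word)) for word in string.split()]
for word, found in per_word:
    print(repr(word), '->', found)

# 'MATH' -> ['MATH']
# '59900/40' -> ['59900', '40']
# '[47490]' -> ['47490']
# '-' -> []
# 'THE' -> ['THE']
```

A single word can yield several matches ('59900/40') or none at all ('-'),
which is why each list is merged into results instead of appended whole.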


Here's an alternative solution that filters as it goes, rather than in a
separate pass:

# version 2
words = string.split()
results = []
for word in words:
    for w in regex.findall(word):
        if w not in blacklist:
            results.append(w)

print(results)
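
If it really does have to be a single regex with re.match, something like
the following sketch might do. It assumes every string has the same shape as
your sample (a name, two numbers separated by a slash, then a number in
square brackets); the pattern and the variable names are my guess, not
something tested against your real data:

```python
import re

x = "MATH 59900/40 [47490] - THE "

# Assumed layout: NAME digits/digits [digits], anything after that is ignored.
pattern = re.compile(r'(\w+)\s+(\d+)/(\d+)\s+\[(\d+)\]')

match = pattern.match(x)
# match is None when the string doesn't fit the assumed layout
groups = match.groups() if match else None
print(groups)  # ('MATH', '59900', '40', '47490')
```

Unlike the findall approach, this gives no match at all as soon as a line
deviates from that layout, so check for None before calling groups().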



-- 
Steve
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
