On Apr 29, 8:46 am, Julien <[EMAIL PROTECTED]> wrote:
> I'd like to select terms in a string, so I can then do a search in my
> database.
>
> query = ' " some words" with and "without quotes " '
> p = re.compile(magic_regular_expression) $ <--- the magic happens
> m = p.match(query)
>
> I'd like m.groups() to return:
> ('some words', 'with', 'and', 'without quotes')
>
> Is that achievable with a single regular expression, and if so, what
> would it be?
>
Julien -

I dabbled with re's for a few minutes trying to get your solution,
then punted and used pyparsing instead. Pyparsing will run slower
than re, but many people find it much easier to work with readable
class names and instances rather than re's typoglyphics:

    from pyparsing import OneOrMore, Word, printables, dblQuotedString, removeQuotes

    # when a quoted string is found, remove the quotes,
    # then strip whitespace from the contents
    # (a one-argument parse action receives the matched tokens)
    dblQuotedString.setParseAction(removeQuotes, lambda tokens: tokens[0].strip())

    # define terms to be found in query string
    term = dblQuotedString | Word(printables)
    query_terms = OneOrMore(term)

    # parse query string to extract terms
    query = ' " some words" with and "without quotes " '
    print(tuple(query_terms.parseString(query)))

Gives:

    ('some words', 'with', 'and', 'without quotes')

The pyparsing wiki is at http://pyparsing.wikispaces.com. You'll find
an examples page that includes a search query parser, and pointers to
a number of online documentation and presentation sources.

-- Paul
--
http://mail.python.org/mailman/listinfo/python-list
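
For comparison, here is a rough sketch of an re-only approach. It is not from the post above, and it does not use p.match() with m.groups(): a compiled pattern has a fixed number of groups, so a variable number of terms cannot come back from a single match. The sketch scans the string with finditer() instead. The names TERM and split_terms are made up for illustration, and the pattern assumes quotes are balanced and that a quoted phrase starts at a word boundary.

```python
import re

# Hypothetical pattern: either a double-quoted run (group 1)
# or a bare run of non-whitespace characters (group 2).
TERM = re.compile(r'"([^"]*)"|(\S+)')

def split_terms(query):
    """Return the search terms found in query as a tuple of strings."""
    terms = []
    for m in TERM.finditer(query):
        if m.group(1) is not None:
            # Quoted phrase: drop the padding inside the quotes.
            term = m.group(1).strip()
        else:
            # Bare word: \S+ never includes surrounding whitespace.
            term = m.group(2)
        if term:  # skip empty phrases such as ""
            terms.append(term)
    return tuple(terms)

query = ' " some words" with and "without quotes " '
print(split_terms(query))
```

An unbalanced quote falls through to the bare-word branch and is kept as part of the word, which may or may not be what a search front end wants; the pyparsing version is easier to extend if the query syntax grows.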