Steven D'Aprano wrote:

> Oh, a further thought...
>
>
> On Thursday 05 May 2016 16:46, Stephen Hansen wrote:
>
>> On Wed, May 4, 2016, at 11:04 PM, Steven D'Aprano wrote:
>>> Start by writing a function or a regex that will distinguish strings
>>> that match your conditions from those that don't. A regex might be
>>> faster, but here's a function version.
>>> ... snip ...
>>
>> Yikes. I'm all for the idea that one shouldn't go to regex when Python's
>> powerful string type can answer the problem more clearly, but this seems
>> to go out of its way to do otherwise.
>>
>> I don't even care about faster: It's overly complicated. Sometimes a
>> regular expression really is the clearest way to solve a problem.
>
> Putting non-ASCII letters aside for the moment, how would you match these
> specs as a regular expression?
>
> - All uppercase ASCII letters (A to Z only), optionally separated into
>   words by either a bare ampersand (e.g. "AAA&AAA") or an ampersand with
>   leading and trailing spaces (spaces only, not arbitrary whitespace):
>   "AAA & AAA".
>
> - The number of spaces on either side of the ampersands need not be the
>   same: "AAA& BBB & CCC" should match.
>
> - Leading or trailing spaces, or spaces not surrounding an ampersand, must
>   not match: "AAA BBB" must be rejected.
>
> - Leading or trailing ampersands must also be rejected. This includes the
>   case where the string is nothing but ampersands.
>
> - Consecutive ampersands "AAA&&&BBB" and the empty string must be
>   rejected.
>
>
> I get something like this:
>
> r"(^[A-Z]+$)|(^([A-Z]+[ ]*\&[ ]*[A-Z]+)+$)"
>
>
> but it fails on strings like "AA & A & A". What am I doing wrong?
>
>
> For the record, here's my brief test suite:
>
>
> import re
>
> def test(pat):
>     for s in ("", " ", "&", "A A", "A&", "&A", "A&&A", "A& &A"):
>         assert re.match(pat, s) is None
>     for s in ("A", "A & A", "AA&A", "AA & A & A"):
>         assert re.match(pat, s)
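
As for what goes wrong in the quoted pattern: the repeated group
([A-Z]+[ ]*\&[ ]*[A-Z]+) consumes a word on both sides of every ampersand, so
a word in the middle of the string has to be shared between two repetitions.
That only works when the middle word has at least two letters to split, which
is why "AA & AA & A" matches but "AA & A & A" does not. Below is a minimal
sketch of that diagnosis. The names ORIGINAL, RESTRUCTURED and matches() are
made up for this illustration, and the restructured pattern is one possible
fix under the specs quoted above, not necessarily the only one.

```python
import re

# The pattern from the quoted question, unchanged.
ORIGINAL = r"(^[A-Z]+$)|(^([A-Z]+[ ]*\&[ ]*[A-Z]+)+$)"

# Assumed restructuring (illustrative): one leading word, then zero or more
# "separator + word" units, so every word is consumed exactly once.
RESTRUCTURED = r"^[A-Z]+( *& *[A-Z]+)*$"


def matches(pat, s):
    """Return True if pat matches s under re.match semantics."""
    return re.match(pat, s) is not None


# Print how each pattern treats a few strings from the specs.
for s in ("A & A", "AA & AA & A", "AA & A & A", "AAA& BBB & CCC", "A&&A"):
    print(repr(s), matches(ORIGINAL, s), matches(RESTRUCTURED, s))
```

Moving the separator inside the repeated unit, rather than repeating a whole
"word & word" pair, is what removes the need to split a middle word.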

>>> import re
>>> def test(pat):
...     for s in ("", " ", "&", "A A", "A&", "&A", "A&&A", "A& &A"):
...         assert re.match(pat, s) is None
...     for s in ("A", "A & A", "AA&A", "AA & A & A"):
...         assert re.match(pat, s)
...
>>> test("^A+( *& *A+)*$")
>>>

-- 
https://mail.python.org/mailman/listinfo/python-list