On 09-03-15 at 16:17, Tim Chase wrote:
> On 2015-03-09 15:29, Antoon Pardon wrote:
>> On 09-03-15 at 13:50, Tim Chase wrote:
>>>> (?:(?!_|\d)\w)\w+
>>> If you don't have to treat it as an atom, you can simplify that to
>>> just
>>>
>>> (?!_|\d)\w+
>>>
>>> which just means that the first character can't be an underscore
>>> or digit.
>>>
>>> Though for a Py3 identifier, the underscore is acceptable as a
>>> first character ("__init__"), so you can simplify it even further
>>> to just
>>>
>>> (?!\d)\w+
>> No that doesn't work. To begin with my attempt above should have
>> been:
>>
>> (?:(?!_|\d)\w)\w*
> Did you actually test my suggestion? The "(?!\d)\w+" means "one or
> more Word characters, but the first one can't be a digit" because
> the "(?!...)" is zero-width. This should match single-character
> strings including a single underscore.

I had done some tests, but due to a misunderstanding I broke off testing
prematurely. I didn't grasp the lookahead nature of the (?! construct
and saw it as just a negation of the regular expression involved.

But IIUC the (?!\d) checks that the next character is not a digit
without advancing the position in the string, so the later check for
\w+ happens as if (?!\d) hadn't been present. In effect, the same part
of the string is checked against two sub regular expressions.

-- 
Antoon Pardon
-- 
https://mail.python.org/mailman/listinfo/python-list
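
P.S. A quick way to check the zero-width behaviour described above is a
short interpreter session. This is only a sketch using the standard re
module; the names IDENT and is_identifier_like are made up for
illustration and do not come from the thread.

```python
import re

# Pattern from the thread: one or more word characters,
# where the first one must not be a digit.
IDENT = re.compile(r"(?!\d)\w+")


def is_identifier_like(s):
    # Hypothetical helper: True if the whole string matches the pattern.
    return IDENT.fullmatch(s) is not None


# The lookahead alone matches at position 0 and consumes nothing,
# so \w+ afterwards starts from the same position.
print(re.match(r"(?!\d)", "abc").end())

# The full pattern therefore covers the string from its first character.
print(IDENT.match("_x1").span())

# Single characters, including a lone underscore, are accepted;
# a leading digit and the empty string are rejected.
for s in ["_", "x", "__init__", "9lives", ""]:
    print(repr(s), is_identifier_like(s))
```

Note that this only holds when the match is anchored at the start, as
with match() or fullmatch(). With search() the engine is free to retry
at later positions, so re.search(r"(?!\d)\w+", "9lives") would find
"lives" rather than fail.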