On Wed, Feb 06, 2019 at 11:48:51AM +0100, Kashyap Chamarthy wrote:
> On Tue, Feb 05, 2019 at 08:44:12PM -0500, John Snow wrote:
> > On 2/5/19 8:49 AM, Marc-André Lureau wrote:

[...]

> > >          < command-name > [ arg-name1=arg1 ] ... [ arg-nameN=argN ]
> > >          """
> > > -        cmdargs = shlex.split(cmdline)
> > > +        cmdargs =
> > > re.findall(r'''(?:[^\s"']|"(?:\\.|[^"])*"|'(?:\\.|[^'])*')+''', cmdline)

Dan Berrangé explained on IRC this way:

    In plain english what that is saying is: give me all blocks of text
    which are:

     (a) set of chars not including " or '
     (b) a set of chars surrounded by ".."
     (c) a set of chars surrounded by '...

> > It might really be nice to have a comment briefly explaining the regex.
> > This is pretty close to symbol soup.
>
> Yeah, a little comment explaining it would be nice.

For my own education today I learned that ?: is called a "non-capturing
match".  For reference, quoting from the `perldoc perlretut`:

[quote]
Non-capturing groupings

A group that is required to bundle a set of alternatives may or may not
be useful as a capturing group.  If it isn't, it just creates a
superfluous addition to the set of available capture group values,
inside as well as outside the regexp.  Non-capturing groupings, denoted
by "(?:regexp)", still allow the regexp to be treated as a single unit,
but don't establish a capturing group at the same time.  Both capturing
and non-capturing groupings are allowed to co-exist in the same regexp.
Because there is no extraction, non-capturing groupings are faster than
capturing groupings.

Non-capturing groupings are also handy for choosing exactly which parts
of a regexp are to be extracted to matching variables:

    # match a number, $1-$4 are set, but we only want $1
    /([+-]?\ *(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?)/;

    # match a number faster, only $1 is set
    /([+-]?\ *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?)/;

    # match a number, get $1 = whole number, $2 = exponent
    /([+-]?\ *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE]([+-]?\d+))?)/;

Non-capturing groupings are also useful for removing nuisance elements
gathered from a split operation where parentheses are required for some
reason:

    $x = '12aba34ba5';
    @num = split /(a|b)+/, $x;    # @num = ('12','a','34','a','5')
    @num = split /(?:a|b)+/, $x;  # @num = ('12','34','5')
[/quote]

[...]

-- 
/kashyap
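
P.S. As a starting point for the comment asked for above, here is a
rough Python sketch that spells out the same regex with re.VERBOSE and
shows what it returns.  The name split_cmdline and the sample command
line are made up for illustration and are not taken from the patch.
The last two lines show why the non-capturing form matters here:
re.findall() returns the group contents instead of the whole match as
soon as the pattern contains a capturing group.

```python
import re

# Same pattern as in the patch, only spread out and annotated.
TOKEN_RE = re.compile(r'''
    (?:                    # a token is one or more of these, back to back:
        [^\s"']            #   one char that is neither whitespace nor a quote
      | "(?:\\.|[^"])*"    #   a double-quoted run, backslash escapes allowed
      | '(?:\\.|[^'])*'    #   a single-quoted run, backslash escapes allowed
    )+
''', re.VERBOSE)


def split_cmdline(cmdline):
    """Hypothetical helper: split on whitespace that is outside quotes."""
    return TOKEN_RE.findall(cmdline)


# Made-up input; the quotes stay part of each returned token.
sample = """some-command name="two words" props='{"key": "a b"}'"""
print(split_cmdline(sample))
# ['some-command', 'name="two words"', 'props=\'{"key": "a b"}\'']

# Capturing vs. non-capturing, using the string from the perldoc quote.
print(re.findall(r'(a|b)+', '12aba34ba5'))    # ['a', 'a']
print(re.findall(r'(?:a|b)+', '12aba34ba5'))  # ['aba', 'ba']
```

With a capturing group the tokens themselves would be lost, since
findall() would hand back only the last thing the group matched.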