Forum: Cfengine Help Subject: Re: regex help Author: sauer Link to topic: https://cfengine.com/forum/read.php?3,21705,21776#msg-21776
Yes, I'm saying put a .* at the end. :)

To handle the zero-width look-arounds, imagine that there's an index which keeps track of which character of the string you're currently comparing against. Meanwhile, the regex keeps track of which part of the expression it's evaluating.

Say we have the expression /[a-z]+\d+/. Comparing it against the string 123abc456, the regex parser says "well, first I'm looking for a member of the set a to z". So, it starts at 1, doesn't match. Goes to 2, doesn't match. Eventually it gets to a, and matches. Now it's found one, but it's supposed to find one or more. So, it looks for either more members of that set, or a digit. It plods along, eventually finding a digit. For a plain yes/no match it could stop right there, since "one or more" is fulfilled by just one digit and that's the end of the expression. Because the + is greedy, though, the matched text (and anything you capture by putting the expression in parens) runs on through all the digits. I think we're on the same page up to this point.

But then we throw in the zero-width expressions. When the regex matching encounters a zero-width assertion, imagine that the parser says "ok, hang on a second. I'm gonna copy the index we were at and, in a child process, go check this pattern out". So, it heads off, checking the zero-width expression to see if it precedes or follows the place where the main index is located. But then, after checking that, the important thing to remember is that it returns to the original index before continuing to check the rest of the pattern.

If the zero-width positive look-ahead is at the end of the pattern, fine. However, that's not the end of the pattern in an anchored cfengine expression. The end of the pattern is a $, i.e., a pattern which matches the end of the line. So, you have to follow the zero-width look-ahead assertion with a pattern which matches from the index as it stood before the zero-width check to the end of the line.

Say you want to match one or more lower-case letters followed by a number, but the number can not start with a 1.
You'd say: /[a-z]+(?!1)\d+/

You match the letters with the [a-z]+, then note that they should not be followed by a 1, but *should* be followed by numbers. Compare it against abc123. Both the (?!1) and the \d start matching immediately after the "c", which is the end of the [a-z]+ match. The [a-z]+ ends at the first non-letter. (Here that character is a 1, so the look-ahead rejects abc123; abc234 would match.)

If you don't want the numeric sequence to start with a 1 or a 2, one long-winded way to express that could be /[a-z]+(?!1)(?!2)\d+/

You've got two zero-width expressions with nothing separating them, so they both start just after the c in my example, and then the \d still also starts after the "c". The same deal applies with the lookbehind, except that it checks backwards from the index instead of forwards. :)

Basically, your expression should match the whole string without the zero-width expressions, and then the zero-width parts add additional description to the pattern without consuming any space in the match. In your string, you want to match the command and anything else to the end of the line (the .*), but that "anything else" can not include the redirection to /dev/null (the negative lookahead).

I feel like there should be an animated gif or something showing the progression through a regex here. :) Until that time, does this clarify or further muddy things?
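
In lieu of the animated gif, here is a rough sketch of the same points using Python's re module as a stand-in for cfengine's regex engine. That substitution is an assumption: both support these look-aheads, but they are not the same engine. re.fullmatch plays the role of the implicit anchoring, and the helper anchored_match, the command name mycmd, and the exact shape of the /dev/null pattern are all made up for illustration rather than taken from the original thread.

```python
import re


def anchored_match(pattern, text):
    """Mimic an anchored match: the pattern must cover the whole string."""
    return re.fullmatch(pattern, text) is not None


# Letters, then a number that must not start with 1.
NOT_STARTING_WITH_1 = r"[a-z]+(?!1)\d+"

# Same idea with two look-aheads stacked at the same position.
NOT_STARTING_WITH_1_OR_2 = r"[a-z]+(?!1)(?!2)\d+"

# Look-ahead with nothing after it: it checks the digits but never consumes them.
LOOKAHEAD_ONLY = r"[a-z]+(?!1)"

# Hypothetical command pattern: "mycmd", then anything to end of line,
# as long as ">/dev/null" does not appear anywhere after the command.
NO_DEVNULL = r"mycmd(?!.*>/dev/null).*"


def lookahead_span(text):
    """Show that a look-ahead adds no width to the match.

    Returns the matched text and the index where the match ends.
    """
    m = re.search(r"[a-z]+(?=\d)", text)
    return m.group(), m.end()


if __name__ == "__main__":
    for pattern, text in [
        (NOT_STARTING_WITH_1, "abc234"),
        (NOT_STARTING_WITH_1, "abc123"),
        (NOT_STARTING_WITH_1_OR_2, "abc234"),
        (NOT_STARTING_WITH_1_OR_2, "abc345"),
        (LOOKAHEAD_ONLY, "abc234"),
        (NO_DEVNULL, "mycmd --flag"),
        (NO_DEVNULL, "mycmd --flag >/dev/null"),
    ]:
        print(pattern, repr(text), anchored_match(pattern, text))
    print(lookahead_span("abc234"))
```

The LOOKAHEAD_ONLY case is the one that bites in an anchored expression: the look-ahead is satisfied, but the index goes back to just after the "c", nothing consumes the digits, and the anchored match fails. Following the look-ahead with \d+ or .* is what lets the pattern reach the end of the line.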