Mark,

I'm Cc'ing this to the mailing list so that the rest of the user community can chime in (since I may be barking up the wrong tree).

This is in response to the many bug reports I see that involve regexes and that are, in fact, not bugs but misunderstandings about which characters are regex characters. It's a thought that came to me in an "if I had it all to do over" frame of mind, and maybe you'll find a way to incorporate it in the future somehow; consider it a long-term feature request. It may also be possible to do this with a control promise that changes the overall behavior of regex interpretation.
The biggest problem is that, to a n00b, it is not always clear where the regex is, or when a regex is expected. Currently, wherever a regex is allowed, the string contains the regex (which may or may not contain regex characters, and which looks like any other string). So you get code that looks like this:

> bundle edit_line comment_lines_matching
> {
> vars:
>
>   "regexes" slist => { "one.*", "two.*", "four.*" };
>
> replace_patterns:
>
>   "^($(regexes))$"
>     replace_with => comment("# ");
>
>   ".*foo.*"
>     replace_with => comment("# ");
> }
>
> bundle agent wintest
> {
> vars:
>
>   "dim_array"
>     int => readstringarray("array_name","/tmp/array","#[^\n]*",":",10,4000);
>
> files:
>
>   "c:\tmp\file"
>     delete => nodir,
>     pathtype => "literal"; # force literal string interpretation
>
>   "C:/windows/tmp/f\d"
>     delete => nodir,
>     pathtype => "regex"; # force regular expression interpretation
> }

I propose introducing a new semantic distinction with a control promise and two new syntactic elements.

0) The default (current) behavior is unchanged, but a new behavior can be enabled with the control promise "explicit_regex".

1) In the new behavioral mode, a quoted string like "foo.*" is just a string containing 5 characters ("foo", a literal '.', and a literal '*').

2) Also in the new behavioral mode, an explicit regex string like /foo.*/ or r"foo.*" or r'foo.*' is a string that contains a regex. The r"str" notation is a convenience, to spare lots of backslashes in filenames (where each '/' inside /.../ would otherwise need escaping).
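To make the proposed distinction concrete, here is a minimal sketch in Python rather than CFEngine, since the proposed syntax does not exist yet. Python's re module stands in for the regex engine, the helper name `matches` and its `is_regex` flag are hypothetical (they are not part of any CFEngine API), and the sketch assumes a pattern must match the whole string.

```python
import re


def matches(pattern: str, text: str, is_regex: bool) -> bool:
    """Return True if text matches pattern over its full length.

    is_regex=False models a plain quoted string under the proposed
    explicit_regex mode: every character is taken literally.
    is_regex=True models an explicit regex such as /foo.*/ or r"foo.*".
    """
    source = pattern if is_regex else re.escape(pattern)
    return re.fullmatch(source, text) is not None


# Plain string "foo.*": only the five literal characters match.
print(matches("foo.*", "foo.*", is_regex=False))
print(matches("foo.*", "foobar", is_regex=False))

# Explicit regex /foo.*/: anything starting with "foo" matches.
print(matches("foo.*", "foobar", is_regex=True))
```

The point of the sketch is only that the same five characters select different things depending on whether they are read as a string or as a regex, which is the distinction the new syntax would make visible.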
This means that the CFEngine code above (with the appropriate control variable) will now look like this:

> body common control
> {
>   explicit_regex => "true";
> }
>
> bundle edit_line comment_lines_matching
> {
> vars:
>
>   "regexes" slist => { "one.*", "two.*", "four.*" };
>
> replace_patterns:
>
>   /^($(regexes))$/ # The slist contains strings, but they are expanded
>                    # and then the result is interpreted as a regex
>     replace_with => comment("# ");
>
>   /.*foo.*/
>     replace_with => comment("# ");
> }
>
> bundle agent wintest
> {
> vars:
>
>   "dim_array"
>     int => readstringarray("array_name","/tmp/array",/#[^\n]*/,":",10,4000);
>     # 3rd parameter is an explicit regex, 4th parameter is a string
>     # (both could be regexes)
>
> files:
>
>   "c:\tmp\file"
>     delete => nodir; # literal string interpretation is automatic,
>                      # because the string is simple and not a regex string
>
>   r"C:/windows/tmp/f\d"
>     delete => nodir; # regular expression interpretation is automatic,
>                      # because the string is a regex
> }

The biggest problem today is that a user may specify a file as "file.ext" and not realize that this can also select a file named "filesext" or "file-ext" or "file_ext" (to get just the one file, they need to say "file\.ext").

I also just saw a user report where

> perms => system("0400", "DOMAIN+USER", "sysmgt")

didn't work as expected, and it needed to be

> perms => system("0400", "DOMAIN\+USER", "sysmgt")

I think it would be so much clearer if the regex/string distinction were explicit.

What do you all think? I'd be willing to help implement this, of course.

-Dan