Jason Bailey wrote:

> My script first reads the DHCPD configuration file into memory -
> variable "filebody". It then utilizes the re module to find the
> configuration details for the wanted "shared network".
>
> The config file might look something like this:
>
> ######################################
>
> shared-network My-Network-MOHE {
>   subnet 192.168.0.0 netmask 255.255.248.0 {
>     option routers 192.168.0.1;
>     option tftp-server-name "192.168.90.12";
>     pool {
>       deny dynamic bootp clients;
>       range 192.168.0.20 192.168.7.254;
>     }
>   }
> }
>
> shared-network My-Network-CDCO {
>   subnet 192.168.8.0 netmask 255.255.248.0 {
>     option routers 10.101.8.1;
>     option tftp-server-name "192.168.90.12";
>     pool {
>       deny dynamic bootp clients;
>       range 192.168.8.20 192.168.15.254;
>     }
>   }
> }
>
> shared-network My-Network-FECO {
>   subnet 192.168.16.0 netmask 255.255.248.0 {
>     option routers 192.168.16.1;
>     option tftp-server-name "192.168.90.12";
>     pool {
>       deny dynamic bootp clients;
>       range 192.168.16.20 192.168.23.254;
>     }
>   }
> }
>
> ######################################
>
> Suppose I'm trying to grab the shared network called "My-Network-FECO"
> from the above config file stored in the variable 'filebody'.
>
> First I have my variable 'shared_network' which contains the string
> "My-Network-FECO".
>
> I compile my regex:
> m = re.compile(r"^(shared\-network (" + re.escape(shared_network) + r")
> \{((\n|.|\r\n)*?)(^\}))", re.MULTILINE|re.UNICODE)

This code does not run as posted. Applying Occam’s Razor, I think you meant
to post

  m = re.compile(r"^(shared\-network (" + re.escape(shared_network) + r") \{((\n|.|\r\n)*?)(^\}))", re.MULTILINE|re.UNICODE)

(If you post long lines, know where your automatic word wrap happens.)

> I search for regex matches in my config file:
> m.search(filebody)

I find using the identifier “m” for the expression very strange. Usually I
reserve “m” to hold the *matches* for an expression on a string. Consider
“r” or “rx” or something else instead of “m” for the expression.

> Unfortunately, I get no matches. From output on the command line, I can
> see that Python is adding extra backslashes to my re.compile string. I
> have added the raw 'r' in front of the strings to prevent it, but to no
> avail.

Python is not adding backslashes to your string. What the console shows is
the string *representation* (repr()), in which every literal backslash is
written as two. Note that the console-printed representations of strings do
not have an “r” in front of them: what you see is what you would have needed
to write for equivalent code if you had not used “r”. (Unlike some other
languages, Python does not distinguish between single-quoted and
double-quoted strings with regard to parsing. Hence the r'…' feature, the
triple-quoted string, and the .format() method.)

The escaped HYPHEN-MINUSes (“-”) are not why you get no matches, though:
in an expression, r'\-' is an (unnecessarily) escaped HYPHEN-MINUS and still
matches a plain “-”. Outside of a character class you never need to escape
that character, and in a character class you can put it as first or last
character instead (“[-…]” or “[…-]”). You have escaped the first
HYPHEN-MINUS; re.escape() has escaped the other two for you:

| >>> re.escape('-')
| '\\-'

I presume this behavior is because of character classes, and the idea that
the return value should work at any position in a character class.

Since the backslashes are harmless, the mismatch has to come from somewhere
else. Note that your expression requires exactly one space between the
network name and “{”, and a “}” at the very start of a line; I would compare
that against repr() of the relevant part of filebody.
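
For what it is worth, the following can be pasted into an interpreter to
see what the pieces discussed above actually evaluate to. It is only a
minimal sketch: the sample line in `text` is made up for illustration and is
not taken from a real dhcpd.conf.

```python
import re

# A raw string literal and its non-raw equivalent are the same string.
raw = r"shared\-network"
assert raw == "shared\\-network"

# repr() (what the interactive console echoes) writes the backslash twice;
# print() shows the actual content, which has a single backslash.
print(repr(raw))   # 'shared\\-network'
print(raw)         # shared\-network
print(len(raw))    # 15

# re.escape() escapes the HYPHEN-MINUS ...
escaped = re.escape("My-Network-FECO")
print(escaped)     # My\-Network\-FECO

# ... and the escaped form still matches the unescaped text.
text = "shared-network My-Network-FECO {"   # hypothetical sample line
rx = re.compile(raw + " (" + escaped + r") \{")
match = rx.search(text)
print(match.group(1))   # My-Network-FECO
```

The expression is held in `rx` and the result of the search in `match`, in
line with the naming suggested above.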

ISTM that using re.escape() for the network name is fine here, since the
characters it escapes (such as “-”) still match themselves literally. I do
not see a reason for making the entire expression a group (only for making
the network name a group).

You should refrain from parsing non-regular languages with a *single*
regular expression (multiple expressions, or expressions with alternation in
a loop, are usually fine; this can be used for building efficient parsers).
That holds for Python’s regular expressions too, even though they are not
exactly “regular” in the theoretical computer science sense. See the Chomsky
hierarchy and Jeffrey E. F. Friedl’s insightful textbook “Mastering Regular
Expressions”.

It is possible that there is a Python module for parsing ISC dhcpd
configuration files already. If so, you should use that instead.

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me.
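
The advice above about combining a small expression with a loop, rather
than one big expression, could be sketched roughly like this. It is not a
dhcpd.conf parser: the function name `find_shared_network` and the trimmed
sample configuration are made up for illustration, and the code assumes
that braces never occur inside quoted strings or comments.

```python
import re


def find_shared_network(filebody, name):
    """Return the text of the named shared-network block, or None.

    Hypothetical helper; assumes no braces inside strings or comments.
    """
    # Small expression: only locate the header line of the wanted block.
    header = re.compile(
        r"^shared-network\s+" + re.escape(name) + r"\s*\{", re.MULTILINE
    )
    m = header.search(filebody)
    if m is None:
        return None

    # Loop: count braces until the one opened by the header is closed.
    depth = 1
    for pos in range(m.end(), len(filebody)):
        ch = filebody[pos]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return filebody[m.start():pos + 1]
    return None  # braces were not balanced


# Trimmed, made-up sample in the style of the configuration quoted above.
sample = """\
shared-network My-Network-MOHE {
  subnet 192.168.0.0 netmask 255.255.248.0 {
    option routers 192.168.0.1;
  }
}

shared-network My-Network-FECO {
  subnet 192.168.16.0 netmask 255.255.248.0 {
    pool {
      range 192.168.16.20 192.168.23.254;
    }
  }
}
"""

block = find_shared_network(sample, "My-Network-FECO")
print(block)
```

Counting braces sidesteps the question of how the closing “}” is indented,
which a single expression anchored on “^\}” depends on.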