Re: python3 regex?
Hey, all; thanks for the replies - reading data in one slurp vs line by line was the issue. In my perl programs, when reading files, I generally do it all in one swell foop and will probably end up doing so again in this case due to the layout of the text; but, that's my issue.

Thanks again. I appreciate the tip.

Doug O'Leary
-- https://mail.python.org/mailman/listinfo/python-list
iterating over multi-line string
Hey;

I have a multi-line string that's the result of reading a file filled with 'dirty' text. I read the file in one swoop to make data cleanup a bit easier - getting rid of extraneous tabs, spaces, newlines, etc. That part's done.

Now, I want to collect data in each section of the data. Sections are started with a specific header and end when the next header is found.

    ^1\. Upgrade to the latest version of Apache HTTPD
    ^2\. Disable insecure TLS/SSL protocol support
    ^3\. Disable SSLv2, SSLv3, and TLS 1.0. The best solution is to only have TLS 1.2 enabled
    ^4\. Disable HTTP TRACE Method for Apache
    [[snip]]

There's something like 60 lines of worthless text before that first header line so I thought I'd skip through them with:

    x=0   # Current index
    hx=1  # human readable index
    rgs = '^' + str(hx) + r'\. ' + monster['vulns'][x]
    hdr = re.compile(rgs)
    for l in data.splitlines():
        while not hdr.match(l):
            next(l)
        print(l)

which resulted in a TypeError stating that str is not an iterator. More googling resulted in:

    iterobj = iter(data.splitlines())
    for l in iterobj:
        while not hdr.match(l):
            next(iterobj)
        print(l)

I'm hoping to see that first header; however, I'm getting another error:

    Traceback (most recent call last):
      File "./testies.py", line 30, in <module>
        next(iterobj)
    StopIteration

I'm not quite sure what that means... Does that mean I got to the end of data w/o finding my header?

Thanks for any hints/tips/suggestions.

Doug O'Leary
Re: iterating over multi-line string
Hey;

Never mind; I finally found the meaning of StopIteration. I guess my google-fu is a bit weak this morning.

Thanks

Doug
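For anyone who lands on this thread later: in the second loop, the inner while never rebinds l, so its condition can never change. Each pass only pulls another line out of iterobj, and once the iterator is exhausted next() raises StopIteration. Below is a minimal sketch of one way to skip the leading junk and collect the sections with a single for loop. The function name split_sections and the sample text are made up for illustration, and the headers are assumed to look like "1. <title>" at the start of a line.

```python
import re


def split_sections(text, titles):
    """Group lines under numbered headers such as '1. <title>'.

    Lines that appear before the first header are dropped.
    """
    # One anchored pattern per expected header, numbered from 1.
    headers = [re.compile(r'^%d\. %s' % (n, re.escape(t)))
               for n, t in enumerate(titles, start=1)]
    sections = {}
    current = None   # title of the section being collected
    pending = 0      # index of the next header we expect to see
    for line in text.splitlines():
        if pending < len(headers) and headers[pending].match(line):
            current = titles[pending]
            sections[current] = []
            pending += 1
        elif current is not None:
            sections[current].append(line)
    return sections


# Made-up sample data, shaped like the headers quoted in the post.
sample = """junk line
more junk
1. Upgrade to the latest version of Apache HTTPD
details about the upgrade
2. Disable insecure TLS/SSL protocol support
details about TLS
"""
titles = ["Upgrade to the latest version of Apache HTTPD",
          "Disable insecure TLS/SSL protocol support"]
result = split_sections(sample, titles)
for title, lines in result.items():
    print(title, lines)
```

The for loop does all the advancing, so there is no manual next() call that could run off the end of the data.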
more python3 regex?
Hey;

This one seems like it should be easy but I'm not getting the expected results.

I have a chunk of data over which I can iterate line by line and print out the expected results:

    for l in q.findall(data):
    #   if re.match(r'(Name|")', l):
    #       continue
        print(l)

    $ ./testies.py | wc -l
    197

I would like to skip any line that starts with 'Name' or a double quote:

    $ ./testies.py | perl -ne 'print if (m{^Name} || m{^"})'
    Name IP Address,Site,
    "",,7 of 64
    Name,IP Address,Site,
    "",,,8 of 64
    Name,IP Address,Site,
    "",,,9 of 64
    Name,IP Address,Site,
    "",,,10 of 64
    Name,IP Address,Site,
    "",,,11 of 64
    Name IP Address,Site,

    $ ./testies.py | perl -ne 'print unless (m{^Name} || m{^"})' | wc -l
    186

When I run with the two lines uncommented, *everything* gets skipped:

    $ ./testies.py
    $

Same thing when I use a pre-defined pattern object:

    skippers = re.compile(r'Name|"')
    for l in q.findall(data):
        if skippers.match(l):
            continue
        print(l)

Like I said, this seems like it should be pretty straightforward so I'm obviously missing something basic.

Any hints/tips/suggestions gratefully accepted.

Doug O'Leary
Re: more python3 regex?
Hey, all;

The print suggestion was the key clue. Turned out my loop was slurping the whole of data in one big line. Searching for a line that begins with Name when it's in the middle of the string is... obviously not going to work so well.

Took me a bit to get that working and, once I did, I realized I was on the wrong track altogether. In perl, if possible, I will read a file entirely as manipulation of one large data structure is easier in some ways. Even with perl, though, that approach is the wrong one for this data.

While I learned lots, the key lesson is forcing data to match an algorithm works as well in python as it does in perl. Go figure.

My 200+ line script that didn't work so well is now 63 lines, including comments... and works perfectly.

Outstanding! Thanks for putting up with noob questions.

Doug
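To make the failure mode above concrete: match() only looks at the start of the string it is handed, so if the loop yields the whole file as one string that happens to begin with Name, the single item is skipped and nothing prints. A small sketch follows; the sample data is invented, and only the skippers pattern comes from the post.

```python
import re

skippers = re.compile(r'Name|"')

# Invented sample: two lines to skip, two lines to keep.
data = ('Name,IP Address,Site,\n'
        '"",,,8 of 64\n'
        'host1,10.0.0.1,east\n'
        'host2,10.0.0.2,west\n')

# The bug: the whole blob starts with 'Name', so it matches as one item.
blob_is_skipped = skippers.match(data) is not None

# match() anchors at the start of its argument, so feed it one line at a time.
kept = [line for line in data.splitlines() if not skippers.match(line)]

# Alternative on the whole string: re.MULTILINE lets ^ and $ match at every
# line boundary, and the negative lookahead rejects the unwanted prefixes.
# Using .+ rather than .* means blank lines are dropped as well.
whole = re.findall(r'^(?!Name|").+$', data, flags=re.MULTILINE)

print(blob_is_skipped)
print(kept)
print(whole)
```

Both approaches keep the same lines here; the line-by-line version is usually easier to read when more per-line logic follows.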
xml parsing with lxml
Hey;

I'm trying to gather information from a number of weblogic configuration xml files using lxml. I've found any number of tutorials on the web but they all seem to assume a knowledge that I apparently don't have... that, or I'm just being rock stupid today - that's a distinct possibility too.

The xml looks like:

    Domain1 10.3.5.0 [[snipp]] [[realm children snipped] myrealm [[snip]] [[snip]] [[snip]] [[snip]] [[snip]] byTime 14 02:00 Info [[snip]] 40024 true snip]] [[children snipped]] ${hostname} ${hostname} 40022 javac [[children snipped] [[rest snipped]

The tutorials all start out well enough with:

    $ python
    Python 3.5.2 (default, Aug 22 2016, 09:04:07)
    [GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from lxml import etree
    >>> doc = etree.parse('config.xml')

Now what? For instance, how do I list the top level children of .*?? In that partial list, it'd be name, domain-version, security-configuration, log, and server.

For some reason, I'm not able to make the conceptual leap to get to the first step of those tutorials.

The end goal of this exercise is to programmatically identify weblogic clusters and their hosts.

Thanks

Doug O'Leary
Re: xml parsing with lxml
On Friday, October 7, 2016 at 3:21:43 PM UTC-5, John Gordon wrote:
> root = doc.getroot()
> for child in root:
>     print(child.tag)
>

Excellent! Thank you, sir! That'll get me started. Appreciate the reply.

Doug O'Leary
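One wrinkle with the loop quoted above: when the document declares a default namespace, each child.tag comes back as '{uri}name' rather than the bare name. Here is a minimal sketch of listing just the local names. It uses the standard library's xml.etree.ElementTree so it runs without lxml installed; iterating over an element and reading .tag work the same way in lxml.etree. The root element name, the namespace URI and the sample document are assumptions for illustration, not taken from a real weblogic config.

```python
import xml.etree.ElementTree as ET

# Invented sample; the namespace URI and the <domain> root are placeholders.
SAMPLE = """<domain xmlns="http://example.com/ns/domain">
  <name>Domain1</name>
  <domain-version>10.3.5.0</domain-version>
  <server><name>srv1</name></server>
</domain>"""

# With a file on disk this would be: root = ET.parse('config.xml').getroot()
root = ET.fromstring(SAMPLE)


def local_name(tag):
    """Strip the '{uri}' prefix that namespaced tags carry."""
    return tag.rsplit('}', 1)[-1]


# Iterating over an element yields its direct children only.
children = [local_name(child.tag) for child in root]
print(children)
```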
lxml and xpath(?)
Hey;

Reasonably new to python and incredibly new to xml, much less trying to parse it. I need to identify cluster nodes from a series of weblogic xml configuration files. I've figured out how to get 75% of them; now, I'm going after the edge case and I'm unsure how to proceed.

Weblogic xml config files start with namespace definitions, then a number of child elements, some of which have children of their own.

The element that I'm interested in is <server>, which will usually have a subelement called <listen-address> containing the hostname that I'm looking for.

Following the paradigm of "we love standards, we got lots of them", this model doesn't work everywhere. Where it doesn't work, I need to look for a subelement of <server> called <machine>. That element contains an alias which is expanded in a different root child, at the same level as <server>.

So, picture worth a 1000 words:

    < [[ heinous namespace xml snipped ]] > [[text]] ... EDIServices_MS1 ... EDIServices_MC1 ... EDIServices_MS2 ... EDIServices_MC2 ... EDIServices_MC1 EDIServices_MC1 SSL host001 7001 EDIServices_MC2 EDIServices_MC2 host002 7001

So, running it on 'normal' config, I get:

    $ ./lxml configs/EntsvcSoa_Domain_config.xml
    EntsvcSoa_CS => host003.myco.com
    EntsvcSoa_CS => host004.myco.com

Running it against the abi-normal config, I'm currently getting:

    $ ./lxml configs/EDIServices_Domain_config.xml
    EDIServices_CS => EDIServices_MC1
    EDIServices_CS => EDIServices_MC2

Using the examples above, I would like to translate EDIServices_MC1 and EDIServices_MC2 to host001 and host002 respectively.
The primary loop is:

    for server in root.findall('ns:server', namespaces):
        cs = server.find('ns:cluster', namespaces)
        if cs is None:
            continue
        # cluster_name = server.find('ns:cluster', namespaces).text
        cluster_name = cs.text
        listen_address = server.find('ns:listen-address', namespaces)
        server_name = listen_address.text
        if server_name is None:
            machine = server.find('ns:machine', namespaces)
            if machine is None:
                continue
            else:
                server_name = machine.text
        print("%-15s => %s" % (cluster_name, server_name))

(it's taken me days to write 12 lines of code... good thing I don't do this for a living :) )

Rephrased, I need to find the listen address under the root child whose name matches the <machine> value under the corresponding <server> child.

From some of the examples on the web, I believe xpath might help but I've not been able to get even the simple examples working. Go figure, I just figured out what a namespace is...

Any hints/tips/suggestions greatly appreciated, especially with complete noob tutorials for xpath.

Thanks for your time.

Doug O'Leary
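One possible way to do that translation, sketched under assumptions: build a dictionary from machine alias to listen address first, then fall back to it when a server has no usable listen-address of its own. The sample XML below is invented; it assumes the alias is defined in a root-level <machine> element that has a <name> child and a <listen-address> somewhere beneath it, which is a guess based on the flattened sample in the post. The namespace URI and the cluster_hosts name are placeholders too. The code uses the standard library's xml.etree.ElementTree so it runs without lxml; findall and findtext take the same arguments in lxml.etree, and the path strings are the limited XPath subset both support.

```python
import xml.etree.ElementTree as ET

NS = {'ns': 'http://example.com/ns/domain'}  # placeholder namespace URI

# Invented sample; the element layout is an assumption, see above.
SAMPLE = """<domain xmlns="http://example.com/ns/domain">
  <server>
    <name>EDIServices_MS1</name>
    <machine>EDIServices_MC1</machine>
    <cluster>EDIServices_CS</cluster>
  </server>
  <server>
    <name>EntsvcSoa_MS1</name>
    <cluster>EntsvcSoa_CS</cluster>
    <listen-address>host003.myco.com</listen-address>
  </server>
  <server><name>AdminServer</name></server>
  <machine>
    <name>EDIServices_MC1</name>
    <node-manager>
      <listen-address>host001</listen-address>
    </node-manager>
  </machine>
</domain>"""


def cluster_hosts(root, namespaces):
    """Return (cluster, host) pairs, resolving machine aliases to hosts."""
    # Map each root-level machine alias to the listen address beneath it.
    machines = {}
    for machine in root.findall('ns:machine', namespaces):
        name = machine.findtext('ns:name', namespaces=namespaces)
        addr = machine.findtext('.//ns:listen-address', namespaces=namespaces)
        if name and addr:
            machines[name] = addr

    pairs = []
    for server in root.findall('ns:server', namespaces):
        cluster = server.findtext('ns:cluster', namespaces=namespaces)
        if not cluster:
            continue  # not part of a cluster
        host = server.findtext('ns:listen-address', namespaces=namespaces)
        if not host:
            # Missing or empty listen-address: fall back to the machine alias.
            alias = server.findtext('ns:machine', namespaces=namespaces)
            if alias is None:
                continue
            host = machines.get(alias, alias)  # keep the alias if unresolved
        pairs.append((cluster, host))
    return pairs


root = ET.fromstring(SAMPLE)
for cluster, host in cluster_hosts(root, NS):
    print("%-15s => %s" % (cluster, host))
```

Using findtext instead of find(...).text also sidesteps the AttributeError that find() followed by .text raises when the element is absent.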