On 12/31/1969 05:00 PM, wrote:
> Hello everybody,
> I'm coming from a Perl background and am trying to parse some Exim logfiles
> into a data structure of dictionaries. The regex and geoip part works fine
> and I'd like to save the email address, the countries (from logins) and the
> count of logins.
>
> The structure I'd like to have:
>
> result = {
>     'f...@bar.de': {
>         'Countries': [DE,DK,UK]
>         'IP': ['192.168.1.1','172.10.10.10']
>         'Count': [12]
>     }
>     'b...@foo.de': {
>         'Countries': [DE,SE,US]
>         'IP': ['192.168.1.2','172.10.10.11']
>         'Count': [23]
>     }
> }

I presume that's pseudo-code, since it's missing punctuation (commas between
elements) and the country codes are not quoted.

> I don't have a problem when I do these three separately like this with a
> one-dimensional dict (snippet):
>
> result = defaultdict(list)
>
> with open('/var/log/exim4/mainlog',encoding="latin-1") as logfile:
>     for line in logfile:
>         result = pattern.search(line)
>         if (result):
>             login_ip = result.group("login_ip")
>             login_auth = result.group("login_auth")
>             response = reader.city(login_ip)
>             login_country = response.country.iso_code
>             if login_auth in result and login_country in result[login_auth]:
>                 continue
>             else:
>                 result[login_auth].append(login_country)
>         else:
>             continue
>
> This checks if the login_country exists within the list of the specific
> login_auth key, adds them if they don't exist and gives me the results I want.
> This also works for the ip addresses and the number of logins without any
> problems.
>
> As I don't want to repeat these loops three times with three different data
> structures I'd like to do this in one step. There are two main problems I
> don't understand right now:
>
> 1. How do I check if a value exists within a list which is the value of a key
> which is again a value of a key in my understanding exists? What I like to do:
>
> if login_auth in result and (login_country in result[login_auth][Countries])
>     continue

You don't actually need to check (there's a Python aphorism that goes
something like "It's easier to ask forgiveness than permission"). You can do:

    try:
        result[login_auth]['Countries'].append(login_country)
    except KeyError:
        # means there was no entry for login_auth
        # so add one here

That will happily add another instance of a country if it's already there,
but there's no problem with going and cleaning the 'Countries' value later.
One trick is to take that list, convert it to a set, then convert it back to
a list if you need a list of unique values.

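Here is a minimal sketch of that try/except idea with the "add one here"
branch filled in, followed by the set-based cleanup. The keys 'Countries',
'IP' and 'Count' come from the structure in the question. The helper names
record_login and dedupe, the sample address and the sample values are made up
for illustration, and 'Count' is kept as a plain integer rather than a
one-element list, which is my assumption.

```python
def record_login(result, login_auth, login_ip, login_country):
    """Add one login to result, creating the entry the first time we see it."""
    try:
        entry = result[login_auth]
    except KeyError:
        # No entry for login_auth yet, so add one here.
        entry = result[login_auth] = {'Countries': [], 'IP': [], 'Count': 0}
    # Duplicates are allowed at this stage; they are cleaned up in dedupe().
    entry['Countries'].append(login_country)
    entry['IP'].append(login_ip)
    entry['Count'] += 1


def dedupe(result):
    """Clean up afterwards: list -> set -> sorted list drops duplicates."""
    for entry in result.values():
        entry['Countries'] = sorted(set(entry['Countries']))
        entry['IP'] = sorted(set(entry['IP']))


result = {}
record_login(result, 'user@example.org', '192.168.1.1', 'DE')
record_login(result, 'user@example.org', '192.168.1.1', 'DE')
record_login(result, 'user@example.org', '172.10.10.10', 'DK')
dedupe(result)
print(result)
```

Note that converting to a set loses the original order, which is why the
sketch sorts the values when turning them back into lists.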
In your snippet you're overloading the name result, so this won't work
literally: you initialize it as a defaultdict outside the loop, then also
assign the regex match to it inside the loop. I assume you can figure out how
to fix that up.
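For what it's worth, one way the loop could look once the regex match gets
its own name. This is only a sketch: the pattern, the sample lines and the
fake_geoip table are stand-ins I invented so the example runs on its own (the
real regex and the GeoIP reader are not shown in the original post). It also
uses dict.setdefault with explicit membership checks instead of try/except,
which is a different technique from the one suggested above.

```python
import re

# Hypothetical pattern and log lines, not the real Exim format.
pattern = re.compile(
    r"login_ip=(?P<login_ip>\S+) login_auth=(?P<login_auth>\S+)"
)
sample_lines = [
    "login_ip=192.168.1.1 login_auth=user@example.org",
    "some unrelated line",
    "login_ip=172.10.10.10 login_auth=user@example.org",
    "login_ip=192.168.1.1 login_auth=user@example.org",
]
# Stand-in for reader.city(login_ip).country.iso_code.
fake_geoip = {"192.168.1.1": "DE", "172.10.10.10": "DK"}

result = {}
for line in sample_lines:
    match = pattern.search(line)  # 'match', so 'result' is not overwritten
    if not match:
        continue
    login_ip = match.group("login_ip")
    login_auth = match.group("login_auth")
    login_country = fake_geoip[login_ip]
    # Fetch the entry for this address, creating it if it is missing.
    entry = result.setdefault(
        login_auth, {"Countries": [], "IP": [], "Count": 0}
    )
    if login_country not in entry["Countries"]:
        entry["Countries"].append(login_country)
    if login_ip not in entry["IP"]:
        entry["IP"].append(login_ip)
    entry["Count"] += 1

print(result)
```

With the membership checks in place, all three values are collected in a
single pass and no cleanup step is needed afterwards.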