> I'd try to avoid copying the list and instead just iterate over it:
>
>
>   def iterate_by_key(l, key):
>       for d in l:
>           try:
>               yield d[key]
>           except KeyError:
>               continue

Hm, that won't work for me because I don't know all the keys beforehand.
I could certainly do a unique(list.keys()) or something like that
beforehand, but I guess this does away with the speed advantage.

> Since your operation not only iterates over a list but first sorts it, it
> requires a modification which must not happen while iterating. You work
> around this by copying the list first.

So when I go like

    for item in list:
        item[1].sort()

I actually modify *list*? I didn't realize that; I thought it'd just
be a copy of it. Anyway, I could just try

    for item in list:
        newitem = sorted( item[1] )

in that case.

> which is a no-no. Create a custom iterator function (IIRC they are
> called "generators") and you should be fine.

I'll look into this, thanks for the hint.

Cheers,
Nico


On Tue, May 4, 2010 at 12:46 PM, Ulrich Eckhardt
<eckha...@satorlaser.com> wrote:
> Nico Schlömer wrote:
>> I ran into a bit of an unexpected issue here with itertools, and I
>> need to say that I discovered itertools only recently, so maybe my way
>> of approaching the problem is "not what I want to do".
>>
>> Anyway, the problem is the following:
>> I have a list of dictionaries, something like
>>
>> [ { "a": 1, "b": 1, "c": 3 },
>>   { "a": 1, "b": 1, "c": 4 },
>>   ...
>> ]
>>
>> and I'd like to iterate through all items with, e.g., "a":1. What I do
>> is sort and then groupby,
>>
>> my_list.sort( key=operator.itemgetter('a') )
>> my_list_grouped = itertools.groupby( my_list, operator.itemgetter('a') )
>>
>> and then just very simply iterate over my_list_grouped,
>>
>> for my_item in my_list_grouped:
>>     # do something with my_item[0], my_item[1]
>
> I'd try to avoid copying the list and instead just iterate over it:
>
>
>   def iterate_by_key(l, key):
>       for d in l:
>           try:
>               yield d[key]
>           except KeyError:
>               continue
>
> Note that you could also ask the dictionary first if it has the key, but I'm
> told this way is even faster since it only requires a single lookup
> attempt.
>
>
>> Now, inside this loop I'd like to again iterate over all items with
>> the same 'b'-value -- no problem, just do the above inside the loop:
>>
>> for my_item in my_list_grouped:
>>     # group by keyword "b"
>>     my_list2 = list( my_item[1] )
>>     my_list2.sort( key=operator.itemgetter('b') )
>>     my_list_grouped = itertools.groupby( my_list2,
>>                                          operator.itemgetter('b') )
>>     for e in my_list_grouped:
>>         # do something with e[0], e[1]
>>
>> That seems to work all right.
>
> Since your operation not only iterates over a list but first sorts it, it
> requires a modification which must not happen while iterating. You work
> around this by copying the list first.
>
>> Now, the problem occurs when this all is wrapped into an outer loop, such
>> as
>>
>> for k in [ 'first pass', 'second pass' ]:
>>     for my_item in my_list_grouped:
>>         # bla, the above
>>
>> To be able to iterate more than once through my_list_grouped, I have
>> to convert it into a list first, so outside all loops, I go like
>>
>> my_list.sort( key=operator.itemgetter('a') )
>> my_list_grouped = itertools.groupby( my_list, operator.itemgetter('a') )
>> my_list_grouped = list( my_list_grouped )
>>
>> This, however, makes it impossible to do the inner sort and
>> groupby-operation; you just get the very first element, and that's it.
>
> I believe that you are doing a modifying operation inside the iteration,
> which is a no-no. Create a custom iterator function (IIRC they are
> called "generators") and you should be fine. Note that this should also
> perform better since copying and sorting are not exactly for free, though
> you may not notice that with small numbers of objects.
>
> Uli
>
> --
> Sator Laser GmbH
> Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
>
> --
> http://mail.python.org/mailman/listinfo/python-list
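
A possible way to get the nested, repeatable grouping described in the thread is sketched below. It relies on the documented behaviour of itertools.groupby: each group is an iterator that shares the underlying iterable with the groupby object, so a group has to be stored as a list before the groupby object is advanced if it is needed later. The helper name group_by_key and every sample row after the first two are assumptions made for illustration; they do not come from the original posts.

```python
import itertools
import operator


def group_by_key(dicts, key):
    """Return [(value, [dicts sharing that value]), ...].

    Hypothetical helper: each group is turned into a list right away,
    so it stays usable after the groupby object has moved on.
    """
    getter = operator.itemgetter(key)
    ordered = sorted(dicts, key=getter)  # sorted() leaves the input list untouched
    return [(value, list(group))
            for value, group in itertools.groupby(ordered, getter)]


# First two rows are from the thread; the last two are made-up sample data.
my_list = [
    {"a": 1, "b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 4},
    {"a": 1, "b": 2, "c": 5},
    {"a": 2, "b": 1, "c": 6},
]

# Group once, outside all loops; the result is a plain list of (key, list) pairs.
grouped_by_a = group_by_key(my_list, "a")

seen = []
for k in ["first pass", "second pass"]:
    for a_value, a_items in grouped_by_a:
        # Inner grouping uses its own names and never rebinds grouped_by_a.
        for b_value, b_items in group_by_key(a_items, "b"):
            seen.append((k, a_value, b_value, [d["c"] for d in b_items]))
```

Because every group is a list, the outer structure can be walked as many times as needed, and the inner sort works on a fresh sorted copy rather than on the data being iterated. The cost is the extra memory for the stored groups, which is the copying overhead mentioned earlier in the thread.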