Why not write a helper function? Something like

    def group_by(iterable, groupfunc, itemfunc=lambda x: x, sortfunc=lambda x: x):
        # Python 2 & 3 compatible!
        # Pass sortfunc=None to keep each group's items in input order.
        D = {}
        for x in iterable:
            # Append in place rather than rebuilding the list for every item.
            D.setdefault(groupfunc(x), []).append(itemfunc(x))
        if sortfunc is not None:
            for group in D:
                D[group] = sorted(D[group], key=sortfunc)
        return D

Then:

    student_list = [('james', 'Dublin'), ('jim', 'Cork'), ('mary', 'Cork'),
                    ('fred', 'Dublin')]
    student_by_school = group_by(student_list, lambda stu_sch: stu_sch[1],
                                 lambda stu_sch: stu_sch[0])
    print(student_by_school)

    {'Dublin': ['fred', 'james'], 'Cork': ['jim', 'mary']}

Regards
Rob Cliffe
On 28/06/2018 16:25, Nicolas Rolin wrote:
Hi,
I use list and dict comprehensions a lot, and a problem I often have is
doing the equivalent of a group_by operation (to use SQL terminology).
For example, if I have a list of (student, school) tuples and I want
the list of students by school, the only option I'm left with is
to write

    from collections import defaultdict

    student_by_school = defaultdict(list)
    for student, school in student_school_list:
        student_by_school[school].append(student)

What I would expect is a comprehension syntax allowing me
to write something along the lines of:

    student_by_school = {group_by(school): student
                         for student, school in student_school_list}

or any other syntax that allows me to regroup items from an iterable.

Small FAQ:

Q: Why include something in comprehensions when you can do it in a
small number of lines?
A: A really valuable part of list and dict comprehensions is that
they allow the developer to be explicit about what he
wants to do on a given line.
If you see a comprehension, you know that the developer wanted to have
an iterable and not have any side effect other than depleting the
iterator (if he respects reasonable code guidelines).
Initializing an object and writing a for loop to construct it is both
too long and not explicit enough about what is intended.
That should be reserved for intrinsically complex operations, not one of
the basic operations one may want to do with lists and dicts.

Q: Why group by in particular?
A: If we take SQL queries
(https://en.wikipedia.org/wiki/SQL_syntax#Queries) as a reasonable way
of seeing how people need to manipulate data on a day-to-day basis, we
can see that dict comprehensions already cover most of the basic
operations, the only missing ones being GROUP BY and HAVING.

Q: Why not use it on lists, with syntax such as

    student_by_school = [
        school, student
        for student, school in student_school_list
        group by school
    ]

?
A: It would create either a discrepancy with iterators or a perhaps
misleading semantic (the one from itertools.groupby, which requires
the iterable to be sorted by the grouping key in order to be useful).
Having the option to do it with a dict removes any ambiguity and should
be enough to cover most "group by" applications.
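
To make the itertools.groupby caveat concrete, here is a small sketch. It
reuses the (student, school) tuples from earlier in this thread; group_runs
is a hypothetical helper name chosen for illustration, not a library function.

```python
from itertools import groupby
from operator import itemgetter

student_school_list = [('james', 'Dublin'), ('jim', 'Cork'),
                       ('mary', 'Cork'), ('fred', 'Dublin')]


def group_runs(pairs):
    """Group (student, school) pairs with itertools.groupby, in the order given."""
    # groupby only merges *consecutive* items that share the same key.
    return [(school, [student for student, _ in run])
            for school, run in groupby(pairs, key=itemgetter(1))]


# Unsorted input: the same school can appear in several separate groups.
unsorted_groups = group_runs(student_school_list)

# Sorting by the grouping key first yields one group per school.
sorted_groups = group_runs(sorted(student_school_list, key=itemgetter(1)))

print(unsorted_groups)
print(sorted_groups)
```

On the unsorted list, 'Dublin' shows up as two separate groups, which is
probably not what someone asking for a "group by" expects; after sorting by
school there is exactly one group per school.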

Examples:

    edible_list = [('fruit', 'orange'), ('meat', 'eggs'), ('meat', 'spam'),
                   ('fruit', 'apple'), ('vegetable', 'fennel'),
                   ('fruit', 'pineapple'), ('fruit', 'pineapple'),
                   ('vegetable', 'carrot')]
    edible_list_by_food_type = {group_by(food_type): edible
                                for food_type, edible in edible_list}
    print(edible_list_by_food_type)

    {'fruit': ['orange', 'apple', 'pineapple', 'pineapple'],
     'meat': ['eggs', 'spam'], 'vegetable': ['fennel', 'carrot']}

    bank_transactions = [200.0, -357.0, -9.99, -15.6, 4320.0, -12000]
    splited_bank_transactions = {group_by('credit' if amount > 0 else 'debit'): amount
                                 for amount in bank_transactions}
    print(splited_bank_transactions)

    {'credit': [200.0, 4320.0], 'debit': [-357.0, -9.99, -15.6, -12000]}
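
For comparison, here is a rough sketch of how the two examples above can be
computed in current Python, without any new syntax. group_pairs is a
hypothetical helper name, and the sketch assumes the proposed comprehension
would collect values into lists in input order, keeping duplicates.

```python
from collections import defaultdict


def group_pairs(pairs):
    """Collect (key, value) pairs into a dict of lists, keeping input order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)  # plain dict, so missing keys raise KeyError again


edible_list = [('fruit', 'orange'), ('meat', 'eggs'), ('meat', 'spam'),
               ('fruit', 'apple'), ('vegetable', 'fennel'),
               ('fruit', 'pineapple'), ('fruit', 'pineapple'),
               ('vegetable', 'carrot')]
bank_transactions = [200.0, -357.0, -9.99, -15.6, 4320.0, -12000]

# The pairs are already (food_type, edible), so they can be grouped directly.
edible_by_type = group_pairs(edible_list)

# A generator expression computes the grouping key for each amount.
split_transactions = group_pairs(
    ('credit' if amount > 0 else 'debit', amount)
    for amount in bank_transactions)

print(edible_by_type)
print(split_transactions)
```

The grouping key is computed by an ordinary generator expression, so the
"what goes in which group" logic still sits on a single line; only the
accumulation step lives in the helper.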
--
Nicolas Rolin
_______________________________________________
Python-ideas mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/