Hello,
`itertools.groupby` is evaluated lazily, and the `g`s in your code are
iterators, not lists. This might be the cause of your problem. Converting
each group to a list might help here, e.g.:
grp2 = [(k, list(g)) for k, g in groupby(grp1, lambda e: e[1])]
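A minimal sketch of the difference, using made-up sample data (`rows` and
`by_second` are hypothetical names, not taken from your code). It only
illustrates the lazy behaviour and that materialized groups are plain,
picklable data; whether the raw group iterators can be pickled at all
depends on the Python version:

```python
import pickle
from itertools import groupby

# Hypothetical sample data standing in for one partition,
# already sorted by the grouping key e[1].
rows = [("a", 1), ("b", 1), ("c", 2)]

def by_second(e):
    return e[1]

# Lazy: store the group iterators without consuming them.
lazy = [(k, g) for k, g in groupby(rows, by_second)]
first_key, first_group = lazy[0]
# The underlying groupby has already moved on, so the first group is empty.
stale = list(first_group)

# Eager: turn each group into a list while it is still the current group.
eager = [(k, list(g)) for k, g in groupby(rows, by_second)]

# Lists and tuples survive a pickle round trip unchanged.
restored = pickle.loads(pickle.dumps(eager))
```

Here `stale` ends up empty even though the first group had two rows,
while `eager` keeps every row and can be serialized safely.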
HTH
Eike
2016-08-05 7:31 GMT+02:00 林家銘:
> Hi
I wrote a map function to aggregate data in a partition. This function
uses itertools.groupby more than twice, and then I get a pickle error.
Here is what I do:
===Driver Code===
pair_count = df.mapPartitions(lambda iterable: pair_func_cnt(iterable))
pair_count.collect()
===M