Hi,

I want to obtain a grouped DataFrame from an existing DataFrame.

A snippet of the column I need to group by is below.

> df.select("To").show()

To
ArrayBuffer(vance...
ArrayBuffer(vance...
ArrayBuffer(rober...
ArrayBuffer(richa...
ArrayBuffer(guill...
ArrayBuffer(m..pr...
ArrayBuffer(rich....
ArrayBuffer(issue...
ArrayBuffer(jim.f...
ArrayBuffer(richa...


A sample value from that column is below.

> df.select("To").collect()[0]

Row(To=[u'[email protected]', u'[email protected]',
u'[email protected]', u'[email protected]',
u'[email protected]', u'[email protected]',
u'[email protected]', u'[email protected]',
u'[email protected]', u'[email protected]',
u'[email protected]', u'[email protected]',
u'[email protected]'])

I want to group by the "To" column, but by each individual recipient
of the email rather than by the entire list.

Is there a way to do this using the DataFrame groupBy method?
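
To make the desired result concrete, here is a minimal plain-Python
sketch of the semantics I am after. The rows and addresses are made up
for illustration, and `count_by_recipient` is a hypothetical helper,
not part of any Spark API. I assume the DataFrame equivalent would
flatten the array first (possibly with `explode` from
`pyspark.sql.functions`) and then call groupBy, but I have not
verified that.

```python
from collections import Counter

# Hypothetical stand-in for the "To" column: one list of recipients per row.
# These addresses are invented for the example, not taken from the real data.
rows = [
    ["alice@example.com", "bob@example.com"],
    ["alice@example.com"],
    ["carol@example.com", "bob@example.com", "alice@example.com"],
]


def count_by_recipient(rows):
    """Flatten each row's recipient list, then count rows per recipient."""
    return Counter(recipient for row in rows for recipient in row)


counts = count_by_recipient(rows)

# One group per recipient rather than one group per distinct list.
for recipient, n in sorted(counts.items()):
    print(recipient, n)
```

The point is that the grouping key is a single address, so a row with
several recipients contributes to several groups.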


Regards,
Suraj
