Hi,

I'm using the Spark DataFrame API.
I'm trying to pass sum() a list parameter containing column names as strings.
When I put the column names directly into the function, the script works.
When I try to pass them to the function as a single list parameter, I get this error:
"
py4j.protocol.Py4JJavaError: An error occurred while calling o155.sum.
: java.lang.ClassCastException: java.util.ArrayList cannot be cast to
java.lang.String
"
Passing the same kind of list parameter to groupBy() works fine.

This is my script:

groupBy_cols = ['date_expense_int', 'customer_id']
agged_cols_list = ['total_customer_exp_last_m','total_customer_exp_last_3m']

df = df.groupBy(groupBy_cols).sum(agged_cols_list)
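
In case it helps, here is a minimal, self-contained sketch that should reproduce the error (the toy rows are made up for illustration; any numeric columns with these names should behave the same):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# made-up rows, just to have the four columns from my real data
df = spark.createDataFrame(
    [(20200101, 1, 10.0, 30.0),
     (20200101, 2, 5.0, 12.0)],
    ['date_expense_int', 'customer_id',
     'total_customer_exp_last_m', 'total_customer_exp_last_3m'])

groupBy_cols = ['date_expense_int', 'customer_id']
agged_cols_list = ['total_customer_exp_last_m', 'total_customer_exp_last_3m']

df.groupBy(groupBy_cols).sum(agged_cols_list)   # raises the Py4JJavaError above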


When I write it like this, it works:

df = df.groupBy(groupBy_cols).sum('total_customer_exp_last_m', 'total_customer_exp_last_3m')
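
So I assume sum() only accepts column names as separate string arguments. Unpacking the list seems to match that form (just my guess that this is the intended usage), but I'd still like to understand why the list itself can't be passed:

df = df.groupBy(groupBy_cols).sum(*agged_cols_list)   # unpack the list into separate string arguments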

I also tried giving sum() a list of Column objects:

from pyspark.sql.functions import col

agged_cols_list2 = []
for i in agged_cols_list:
    agged_cols_list2.append(col(i))

That didn't work either. Is there a way to pass a list of column names (or Columns) to sum()?
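
From the API docs I wonder whether agg() is the intended way to do this from a list, by building the sum expressions myself and unpacking them (a sketch of what I mean, not something from my original script):

from pyspark.sql import functions as F

df = df.groupBy(groupBy_cols).agg(*[F.sum(c) for c in agged_cols_list])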


