Hi, I'm using the Spark DataFrame API. I'm trying to pass sum() a list parameter containing column names as strings. When I put the column names directly into the function, the script works; when I try to provide them to the function as a parameter of type list, I get this error:

py4j.protocol.Py4JJavaError: An error occurred while calling o155.sum.
: java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.lang.String

Using the same kind of list parameter for groupBy() works. This is my script:
groupBy_cols = ['date_expense_int', 'customer_id']
agged_cols_list = ['total_customer_exp_last_m', 'total_customer_exp_last_3m']

df = df.groupBy(groupBy_cols).sum(agged_cols_list)

When I write it like this, it works:

df = df.groupBy(groupBy_cols).sum('total_customer_exp_last_m', 'total_customer_exp_last_3m')

I also tried to give sum() a list of Column objects:

agged_cols_list2 = []
for i in agged_cols_list:
    agged_cols_list2.append(col(i))

That also didn't work.
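A minimal sketch of what I would expect to be equivalent to the working call, assuming the issue is that sum() takes the column names as separate string arguments rather than one list, so the list has to be unpacked with *. The sample DataFrame below is made up just to make the snippet self-contained:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# hypothetical sample data reusing the column names from the script above
df = spark.createDataFrame(
    [(20200101, 1, 10.0, 30.0), (20200101, 1, 5.0, 15.0)],
    ['date_expense_int', 'customer_id',
     'total_customer_exp_last_m', 'total_customer_exp_last_3m'])

groupBy_cols = ['date_expense_int', 'customer_id']
agged_cols_list = ['total_customer_exp_last_m', 'total_customer_exp_last_3m']

# unpack the list so sum() receives separate string arguments,
# matching the form of the call that works
df_out = df.groupBy(groupBy_cols).sum(*agged_cols_list)

# alternatively, agg() accepts a dict mapping column name -> aggregate function
df_out2 = df.groupBy(groupBy_cols).agg({c: 'sum' for c in agged_cols_list})

df_out.show()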