Re: [Spark SQL]: Can't write DataFrame after using explode function on multiple columns.

2020-08-03 Thread Henrique Oliveira
> ger than it has to be. You might see if defining those columns with list comprehensions forming a single select() statement makes for a smaller DAG.
>
> On Mon, Aug 3, 2020 at 10:06 AM Henrique Oliveira wrote:
>
>> Hi Patrick, thank you for your quick response.
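
The single-select() suggestion quoted above could look roughly like the sketch below. The thread snippet does not show which columns are meant, so trim() is only a hypothetical stand-in for whatever per-column expression is needed, and the names trimmed_exprs and trim_in_one_select are made up for illustration. It also assumes simple column names that need no quoting. The point is that all expressions are built with one list comprehension and passed to a single projection, rather than adding one projection per column.

```python
def trimmed_exprs(columns, to_trim):
    """Build one SQL expression string per column.

    Columns listed in to_trim are wrapped in trim(); the rest pass through.
    trim() is a hypothetical stand-in for the real per-column expression.
    """
    return [f"trim({c}) AS {c}" if c in to_trim else c for c in columns]


def trim_in_one_select(df, to_trim):
    # A single selectExpr() call yields one projection in the plan,
    # instead of one projection per column added in a loop.
    return df.selectExpr(*trimmed_exprs(df.columns, set(to_trim)))
```

This only applies to ordinary column expressions. Spark accepts a single generator such as explode_outer per select clause, so several array columns cannot be exploded in one select() this way.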

Re: [Spark SQL]: Can't write DataFrame after using explode function on multiple columns.

2020-08-03 Thread Henrique Oliveira
Hi Patrick, thank you for your quick response. That's exactly what I think. Actually, the result of this processing is an intermediate table that is going to be used to generate other views. Another approach I'm trying now is to move the "explosion" step into this "view generation" step, this wa

[Spark SQL]: Can't write DataFrame after using explode function on multiple columns.

2020-08-01 Thread Henrique Oliveira
I have a PySpark method that applies the explode function on every Array column on the DataFrame.

    def explode_column(df, column):
        select_cols = list(df.columns)
        col_position = select_cols.index(column)
        select_cols[col_position] = explode_outer(column).alias(column)
        return df.select
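
For readers following along, here is a minimal sketch of how a helper like the one above might be applied to every array column. The rest of the original message is cut off in this listing, so this is not the author's code: the names array_columns and explode_all_arrays are hypothetical, and detecting array columns by checking that the df.dtypes type string starts with "array" is an assumption.

```python
def array_columns(dtypes):
    """Return the names of array-typed columns.

    dtypes is a list of (name, type_string) pairs, as returned by
    DataFrame.dtypes, e.g. [("id", "bigint"), ("tags", "array<string>")].
    """
    return [name for name, dtype in dtypes if dtype.startswith("array")]


def explode_all_arrays(df):
    # Imported here so array_columns() stays usable without Spark installed.
    from pyspark.sql.functions import explode_outer

    for name in array_columns(df.dtypes):
        # Spark allows only one generator per select clause, so each
        # array column is exploded in its own select().
        df = df.select(
            [explode_outer(c).alias(c) if c == name else c for c in df.columns]
        )
    return df
```

Because each explode_outer produces one output row per array element, exploding several array columns in sequence multiplies the row count, which is worth keeping in mind before writing the result.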