Dropping nested dataframe column

Ross.Cramblit Thu, 10 Mar 2016 11:49:00 -0800

Is there any support for dropping a nested column in a DataFrame? I have tried 
dropping it both by Column reference and by the column name as a string, but 
in both cases the returned DataFrame is unchanged.


>>> df = sqlContext.jsonRDD(sc.parallelize(['{"properties": {"col1": "a", "col2": "b"}}']))
>>> df.printSchema()
root
 |-- properties: struct (nullable = true)
 |    |-- col1: string (nullable = true)
 |    |-- col2: string (nullable = true)

>>> df.drop(df['properties']['col1']).printSchema()
root
 |-- properties: struct (nullable = true)
 |    |-- col1: string (nullable = true)
 |    |-- col2: string (nullable = true)

>>> df.drop('col1').printSchema()
root
 |-- properties: struct (nullable = true)
 |    |-- col1: string (nullable = true)
 |    |-- col2: string (nullable = true)
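
The output above suggests that drop() only matches top-level column names. One 
possible workaround, not confirmed anywhere in this thread, is to rebuild the 
struct from the fields that should stay. The sketch below assumes that 
pyspark.sql.functions.col and struct are available in the Spark version in use; 
the helper names fields_to_keep and drop_nested_fields are hypothetical and not 
part of Spark.

```python
def fields_to_keep(field_names, drop):
    """Return the struct field names that are not listed in `drop`, in order."""
    drop = set(drop)
    return [name for name in field_names if name not in drop]


def drop_nested_fields(df, struct_name, drop):
    """Replace the struct column `struct_name` with a copy lacking the `drop` fields.

    Hypothetical helper: it rebuilds the struct rather than calling df.drop().
    """
    # Imported lazily so fields_to_keep can be used without Spark installed.
    from pyspark.sql.functions import col, struct

    # Look up the struct's field names from the DataFrame schema.
    struct_field = next(f for f in df.schema.fields if f.name == struct_name)
    names = [f.name for f in struct_field.dataType.fields]

    # Select each surviving nested field and alias it to keep its original name.
    kept = [col(struct_name + "." + name).alias(name)
            for name in fields_to_keep(names, drop)]
    return df.withColumn(struct_name, struct(*kept))
```

With the df from the transcript, this would be called as 
drop_nested_fields(df, "properties", ["col1"]).printSchema(). Newer Spark 
releases (3.1 and later, as far as I know) also offer Column.dropFields, which 
may make the helper unnecessary.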
