This is really strange.
>>> # Spark 1.3.1
>>> import pyspark.sql.types
>>> print type(results)
<class 'pyspark.sql.dataframe.DataFrame'>
>>> a = results.take(1)[0]
>>> print type(a)
<class 'pyspark.sql.types.Row'>
>>> print pyspark.sql.types.Row
<class 'pyspark.sql.types.Row'>
>>> print type(a) == pyspark.sql.types.Row
False
>>> print isinstance(a, pyspark.sql.types.Row)
False
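The two classes print identically, so one thing worth checking is whether they are literally the same class object. A rough sketch, assuming the same session as above (so a is still the row from take(1)):

import pyspark.sql.types

row_cls = type(a)
module_cls = pyspark.sql.types.Row

# Two distinct classes can share the same repr; compare identity directly.
print row_cls is module_cls
print id(row_cls), id(module_cls)

# See where each class claims to live.
print row_cls.__module__, row_cls.__name__
print module_cls.__module__, module_cls.__name__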
If I instead construct a manually as follows, then both type checks pass fine:
a = pyspark.sql.types.Row('name')('Nick')
Is this a bug? What can I do to narrow down the source?
results is a massive DataFrame of spark-perf results.
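In the meantime, here is another check I could run to see where the runtime class actually comes from, and whether it at least subclasses the Row exported by pyspark.sql.types (again just a sketch against the same session):

import inspect
import pyspark.sql.types

# Walk the MRO of the row's runtime class; a dynamically generated
# class may report a module of None here.
for base in type(a).__mro__:
    print base, inspect.getmodule(base)

# Even if the class objects differ, the runtime class might still
# be a subclass of the exported Row.
print issubclass(type(a), pyspark.sql.types.Row)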
Nick