Re: IOError on createDataFrame

2015-08-31 Thread fsacerdoti

There are two issues here:
1. Suppression of the true reason for failure. The Spark runtime reports "TypeError", but that is not why the operation failed.
2. The low performance of loading a pandas dataframe.
DISCUSSION
Number (1) is easily fixed, and is the primary purpose of my post. Number (2)

IOError on createDataFrame

2015-08-28 Thread fsacerdoti

Hello, Similar to the thread below [1], when I tried to create an RDD from a 4 GB pandas dataframe I encountered the error "TypeError: cannot create an RDD from type:". However, looking into the code shows that this is raised from a generic "except Exception:" clause (pyspark/sql/context.py:238 in
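
To illustrate the masking pattern described above, here is a minimal sketch. It is not the PySpark source: `flaky_parallelize`, `create_masking`, and `create_narrow` are hypothetical names, and the I/O failure is simulated. It only shows how a broad "except Exception:" handler can turn an unrelated IOError into a misleading TypeError, and one possible way to keep the original error visible.

```python
def flaky_parallelize(data):
    """Hypothetical stand-in for the step that fails; always raises IOError."""
    raise IOError("simulated I/O failure")


def create_masking(data):
    # Broad handler: every failure, whatever its cause,
    # is reported to the caller as a type problem.
    try:
        return flaky_parallelize(data)
    except Exception:
        raise TypeError("cannot create an RDD from type: %s" % type(data))


def create_narrow(data):
    # Only the type check can raise TypeError here;
    # I/O errors from the real work propagate unchanged.
    if not isinstance(data, (list, tuple)):
        raise TypeError("cannot create an RDD from type: %s" % type(data))
    return flaky_parallelize(data)


if __name__ == "__main__":
    for create in (create_masking, create_narrow):
        try:
            create([1, 2, 3])
        except Exception as exc:
            print("%s -> %s: %s" % (create.__name__, type(exc).__name__, exc))
```

With the narrow version, a caller passing a valid list sees the I/O error itself rather than a complaint about the input type, which is the behaviour the post argues for.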