Github user dondrake commented on the pull request:
https://github.com/apache/spark/pull/4521#issuecomment-74328474
This failure comes from my test, but saving a Long shouldn't fail with an
exception saying that an Integer can't be converted to a Long.
```
File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql.py", line 1475, in pyspark.sql.SQLContext.parquetFile
Failed example:
    srdd.saveAsParquetFile(parquetFile)
Exception raised:
    Traceback (most recent call last):
      File "/usr/lib64/python2.6/doctest.py", line 1253, in __run
        compileflags, 1) in test.globs
      File "<doctest pyspark.sql.SQLContext.parquetFile[4]>", line 1, in <module>
        srdd.saveAsParquetFile(parquetFile)
      File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql.py", line 1906, in saveAsParquetFile
        self._jschema_rdd.saveAsParquetFile(path)
      File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
        self.target_id, self.name)
      File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
        format(target_id, '.', name), value)
    Py4JJavaError: An error occurred while calling o715.saveAsParquetFile.
    : org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 55.0 failed 1 times, most recent failure: Lost task 3.0 in stage 55.0 (TID 147, localhost): java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long
        at scala.runtime.BoxesRunTime.unboxToLong(BoxesRunTime.java:110)
        at org.apache.spark.sql.catalyst.expressions.GenericRow.getLong(Row.scala:153)
        at org.apache.spark.sql.parquet.MutableRowWriteSupport.consumeType(ParquetTableSupport.scala:350)
        at org.apache.spark.sql.parquet.MutableRowWriteSupport.write(ParquetTableSupport.scala:328)
        at org.apache.spark.sql.parquet.MutableRowWriteSupport.write(ParquetTableSupport.scala:314)
        at
```
It appears to be casting (not converting) an Integer to a Long, which you
can't do. But why does it think this is an Integer in the first place, when
it's defined as a LongType in both Python and the Spark Scala code?
I can confirm that I saw this in Spark 1.2.0, which is what motivated me to
open this JIRA and to add this additional test case.
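
One way to narrow this down is sketched below. It rests on an assumption
that is not confirmed anywhere in this thread: that the Python-to-JVM
bridge boxes any Python integer small enough for 32 bits as a
java.lang.Integer, regardless of the declared schema. If that holds, the
fields at risk are the ones declared as long whose actual value fits in a
signed 32-bit int. The helper `suspect_long_fields` and the plain
(name, type) schema layout are hypothetical and are not part of PySpark.

```python
# Hypothetical diagnostic, not part of PySpark.
# ASSUMPTION: small Python ints reach the JVM as java.lang.Integer even
# when the schema declares the field as a long. Under that assumption,
# this flags the fields that could trigger the ClassCastException above.

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def suspect_long_fields(schema, row):
    """Return names of fields declared "long" whose value fits in 32 bits.

    schema: list of (field_name, type_name) pairs, e.g. [("id", "long")]
    row:    list of values in the same order as schema
    """
    suspects = []
    for (name, type_name), value in zip(schema, row):
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if type_name == "long" and is_int and INT32_MIN <= value <= INT32_MAX:
            suspects.append(name)
    return suspects


schema = [("id", "long"), ("big", "long"), ("name", "string")]
row = [1, 2**40, "a"]
print(suspect_long_fields(schema, row))
```

Here only `id` is reported: `big` is too large for 32 bits and `name` is
not declared as a long. Running a check like this over the test data would
show whether the failing rows line up with the small-integer fields.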