[ https://issues.apache.org/jira/browse/SPARK-50815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-50815.
---------------------------------
    Fix Version/s: 4.1.0
       Resolution: Fixed

Issue resolved by pull request 49487
[https://github.com/apache/spark/pull/49487]

> Fix bug where passing null Variants in createDataFrame causes it to fail
> ------------------------------------------------------------------------
>
>                 Key: SPARK-50815
>                 URL: https://issues.apache.org/jira/browse/SPARK-50815
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Harsh Motwani
>            Assignee: Harsh Motwani
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.1.0
>
>
> Passing "None" as one of the Variant values causes createDataFrame to fail.
> This could also cause issues in other code-paths such as UDFs.
> ```
> >>> spark.createDataFrame([(VariantVal(bytearray([12, 1]), bytearray([1, 0, 0])),), (None,)], "v variant").show()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/Users/harsh.motwani/spark/python/pyspark/sql/session.py", line 1567, in createDataFrame
>     return super(SparkSession, self).createDataFrame(  # type: ignore[call-overload]
>   File "/Users/harsh.motwani/spark/python/pyspark/sql/session.py", line 1611, in _create_dataframe
>     if not is_remote_only() and isinstance(data, RDD):
>   File "/Users/harsh.motwani/spark/python/pyspark/sql/session.py", line 1191, in _createFromLocal
>     print("TUPLE DATA: ", tupled_data)
>   File "/Users/harsh.motwani/spark/python/pyspark/sql/session.py", line 1191, in <listcomp>
>     print("TUPLE DATA: ", tupled_data)
>   File "/Users/harsh.motwani/spark/python/pyspark/sql/types.py", line 1493, in toInternal
>     return tuple(
>   File "/Users/harsh.motwani/spark/python/pyspark/sql/types.py", line 1494, in <genexpr>
>     f.toInternal(v) if c else v
>   File "/Users/harsh.motwani/spark/python/pyspark/sql/types.py", line 1095, in toInternal
>     return self.dataType.toInternal(obj)
>   File "/Users/harsh.motwani/spark/python/pyspark/sql/types.py", line 1587, in toInternal
>     assert isinstance(variant, VariantVal)
> AssertionError
> ```
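
For readers following the traceback: the failure comes from an unconditional isinstance assert on a cell that is None. The sketch below reproduces that shape without Spark and shows one possible null guard. It is an illustration only, not the change made in pull request 49487. VariantValStub, to_internal_strict and to_internal_null_safe are hypothetical names, and the dict layout of the converted value is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class VariantValStub:
    """Hypothetical stand-in for a Variant value: raw value bytes plus metadata bytes."""
    value: bytes
    metadata: bytes


def to_internal_strict(variant: Any) -> Dict[str, bytes]:
    # Same shape as the failing frame in the traceback: None trips the assert.
    assert isinstance(variant, VariantValStub)
    return {"value": variant.value, "metadata": variant.metadata}


def to_internal_null_safe(variant: Optional[VariantValStub]) -> Optional[Dict[str, bytes]]:
    # Assumed guard: a null cell passes through before any type check runs.
    if variant is None:
        return None
    assert isinstance(variant, VariantValStub)
    return {"value": variant.value, "metadata": variant.metadata}


# One populated row and one null row, mirroring the reproduction in the report.
rows = [VariantValStub(bytes([12, 1]), bytes([1, 0, 0])), None]
converted = [to_internal_null_safe(v) for v in rows]
```

With the guard in place the null row converts to None instead of raising, while non-null rows are still type-checked. The same pattern would apply to any other code path that converts Variant cells one at a time, such as the UDF case mentioned in the description.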