[ https://issues.apache.org/jira/browse/SPARK-50839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913531#comment-17913531 ]
Hyukjin Kwon commented on SPARK-50839:
--------------------------------------

Actually I think we should just leave this behavior as is ... maybe we had to allow this in Spark Classic too.. Spark Connect is already out, and I want to avoid breaking changes. Let's leave as is for now.

> Spark Connect createDataFrame behavior does not match with classic Spark
> ------------------------------------------------------------------------
>
>                 Key: SPARK-50839
>                 URL: https://issues.apache.org/jira/browse/SPARK-50839
>             Project: Spark
>          Issue Type: Bug
>          Components: Connect, PySpark
>    Affects Versions: 4.0.0
>            Reporter: Harsh Motwani
>            Priority: Major
>
> Classic Spark and Spark Connect behave differently when `List(Integer)` is
> passed as the first argument instead of `List(Row)` or `List(Tuple)`. I
> personally agree with the classic Spark behavior and Spark Connect should
> fail in this case as well.
> In classic Spark shell (PySpark):
> {code:java}
> >>> spark.createDataFrame([1], "v int").show()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   ...
> pyspark.errors.exceptions.base.PySparkTypeError:
> [FIELD_DATA_TYPE_UNACCEPTABLE] StructType([StructField('v', IntegerType(),
> True)]) can not accept object 1 in type <class 'int'>.
> {code}
> In Spark Connect shell (PySpark):
> {code:java}
> >>> spark.createDataFrame([1], "v int").show()
> 25/01/15 14:05:01 WARN CheckAllocator: More than one DefaultAllocationManager
> on classpath. Choosing first found
> +---+
> |  v|
> +---+
> |  1|
> +---+
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
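
Since the difference is being left in place, callers who want the same result in classic Spark and Spark Connect can pass row-shaped data instead of bare scalars. The following is only a minimal sketch of that workaround, not part of Spark: `as_rows` is a hypothetical helper name, and the commented-out call assumes an existing `spark` session.

```python
def as_rows(values):
    """Wrap bare scalars in 1-tuples so every element is row-shaped.

    Hypothetical helper, not a PySpark API. Tuples (including Row, which
    subclasses tuple), lists and dicts are passed through unchanged.
    """
    rows = []
    for value in values:
        if isinstance(value, (tuple, list, dict)):
            rows.append(value)
        else:
            rows.append((value,))
    return rows


# With a live session (assumed to be named `spark`), the normalized input
# is a list of tuples, which both classic Spark and Spark Connect accept:
#     spark.createDataFrame(as_rows([1]), "v int").show()
print(as_rows([1, (2,), 3]))
```

Normalizing on the caller side avoids depending on either mode's handling of `List(Integer)`, which the comment above leaves unchanged.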