[ https://issues.apache.org/jira/browse/SPARK-51312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Gekk reassigned SPARK-51312:
--------------------------------

    Assignee: Mihailo Milosevic

> Fix createDataFrame from RDD[Row]
> ---------------------------------
>
>                 Key: SPARK-51312
>                 URL: https://issues.apache.org/jira/browse/SPARK-51312
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 4.1.0
>            Reporter: Mihailo Milosevic
>            Assignee: Mihailo Milosevic
>            Priority: Major
>              Labels: pull-request-available
>
> Execution fails when we run the following query:
> {code:java}
> val schema = new org.apache.spark.sql.types.StructType().add("a", "date")
> val rdd = spark.sparkContext.parallelize(
>   org.apache.spark.sql.Row(java.time.LocalDate.of(2020, 5, 13)) :: Nil)
> spark.createDataFrame(rdd, schema).collect()
> {code}
>
> This happens because spark.sql.datetime.java8API.enabled is off by default.
> If we flip the flag, we get the correct result. However, since Rows are
> something that users provide, createDataFrame needs to accept both
> java.sql.Date and java.time.LocalDate inputs. This is important because
> errors are hard to catch in this case: createDataFrame is a lazy
> transformation (as RDDs are lazy), so nothing is evaluated until we call
> an action such as collect().
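
The lenient handling requested in the description could look roughly like the sketch below. It is not the actual patch: `LenientDate` and `toEpochDays` are hypothetical names introduced for illustration, and the sketch ignores the calendar rebasing and time zone details that Spark's own converters deal with. It only shows the idea of accepting either external date type and mapping both to days since the epoch, which is how Spark stores DateType values internally.

```scala
import java.sql.Date
import java.time.LocalDate

// Hypothetical helper, not part of Spark: accepts either external date type
// regardless of how spark.sql.datetime.java8API.enabled is set.
object LenientDate {
  // Returns the number of days since 1970-01-01 for the given value.
  def toEpochDays(value: Any): Int = value match {
    case d: LocalDate => Math.toIntExact(d.toEpochDay)
    case d: Date      => Math.toIntExact(d.toLocalDate.toEpochDay)
    case other =>
      // Fail eagerly with a clear message instead of deep inside a lazy job.
      throw new IllegalArgumentException(
        s"Expected java.sql.Date or java.time.LocalDate, got: $other")
  }
}
```

With a conversion of this shape, `LenientDate.toEpochDays(LocalDate.of(2020, 5, 13))` and `LenientDate.toEpochDays(Date.valueOf("2020-05-13"))` yield the same value, so a Row holding either type would produce the same date column.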