@Olivier, did you use Scala's parallel collections by any chance? If not, what form of concurrency were you using?
2015-09-10 13:01 GMT-07:00 Andrew Or <and...@databricks.com>:

> Thanks for reporting this, I have filed
> https://issues.apache.org/jira/browse/SPARK-10548.
>
> 2015-09-10 9:09 GMT-07:00 Olivier Toupin <olivier.tou...@gmail.com>:
>
>> Look at this code:
>>
>> https://github.com/apache/spark/blob/branch-1.5/sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala#L42
>>
>> and
>>
>> https://github.com/apache/spark/blob/branch-1.5/sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala#L87
>>
>> This exception is there to prevent nested `withNewExecutionId` calls, but
>> what if two concurrent commands happen to run on the same thread? Then the
>> thread-local getLocalProperty will return an execution id, triggering that
>> exception.
>>
>> This is not hypothetical: one of our Spark jobs crashes randomly with the
>> following stack trace (using Spark 1.5; it ran without problems in Spark
>> 1.4.1):
>>
>> java.lang.IllegalArgumentException: spark.sql.execution.id is already set
>>     at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:87)
>>     at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:1904)
>>     at org.apache.spark.sql.DataFrame.collect(DataFrame.scala:1385)
>>
>> Also imagine the following:
>>
>> future { df1.count() }
>> future { df2.count() }
>>
>> Could we double-check whether this is an issue?
>>
>> --
>> View this message in context:
>> http://apache-spark-developers-list.1001551.n3.nabble.com/Concurrency-issue-in-SQLExecution-withNewExecutionId-tp14035.html
>> Sent from the Apache Spark Developers List mailing list archive at
>> Nabble.com.
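For anyone trying to reproduce this outside Spark: the failure mode is consistent with the execution id living in an inheritable thread-local (SparkContext's local properties are an InheritableThreadLocal), so a pooled worker thread created while an execution is active inherits the id and keeps it after the execution finishes. The following is a minimal standalone sketch of that mechanism, not Spark's actual code; the object and method names (`ExecutionIdLeak`, `demonstrateLeak`) and the id value are made up for illustration:

```scala
// Sketch (assumption, not Spark source): how an inherited thread-local
// execution id can leak into a long-lived worker thread, so that a later,
// unrelated command on that thread hits "spark.sql.execution.id is already set".
object ExecutionIdLeak {
  // Stand-in for SparkContext.localProperties: an InheritableThreadLocal,
  // so child threads copy the parent's value at thread-creation time.
  private val executionId = new InheritableThreadLocal[String]

  // Stand-in for the check in SQLExecution.withNewExecutionId.
  def withNewExecutionId[T](body: => T): T = {
    if (executionId.get != null) {
      throw new IllegalArgumentException("spark.sql.execution.id is already set")
    }
    executionId.set("42")
    try body finally executionId.remove()
  }

  // Returns the exception raised on the worker thread (null if none).
  def demonstrateLeak(): Throwable = {
    @volatile var failure: Throwable = null
    withNewExecutionId {
      // A thread created *inside* an active execution inherits the id.
      // Thread pools keep such threads alive long after the execution ends.
      val worker = new Thread(new Runnable {
        def run(): Unit = {
          try {
            withNewExecutionId(()) // a later, unrelated command on this thread
          } catch {
            case t: Throwable => failure = t
          }
        }
      })
      worker.start()
      worker.join()
    }
    failure
  }

  def main(args: Array[String]): Unit = {
    println(s"Worker thread failed with: ${demonstrateLeak().getMessage}")
  }
}
```

If this is the mechanism, it would also explain why `future { df1.count() }; future { df2.count() }` crashes intermittently: whether a given future lands on a tainted pool thread depends on scheduling.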