Re: Error while running Spark SQL join when using Spark 1.0.1

2014-07-15 Thread Keith Simmons
Cool. So Michael's hunch was correct: it is a thread issue. I'm currently using a tarball build, but I'll do a Spark build with the patch as soon as I have a chance and test it out. Keith On Tue, Jul 15, 2014 at 4:14 PM, Zongheng Yang wrote: > Hi Keith & gorenuru, > > This patch (https://git

Re: Error while running Spark SQL join when using Spark 1.0.1

2014-07-15 Thread Zongheng Yang
Hi Keith & gorenuru, This patch (https://github.com/apache/spark/pull/1423) solves the errors for me in my local tests. If possible, can you guys test this out to see if it solves your test programs? Thanks, Zongheng On Tue, Jul 15, 2014 at 3:08 PM, Zongheng Yang wrote: > - user@incubator > > H

Re: Error while running Spark SQL join when using Spark 1.0.1

2014-07-15 Thread Zongheng Yang
- user@incubator Hi Keith, I did reproduce this using local-cluster[2,2,1024], and the errors look almost the same. Just wondering, despite the errors did your program output any result for the join? On my machine, I could see the correct output. Zongheng On Tue, Jul 15, 2014 at 1:46 PM, Micha

Re: Error while running Spark SQL join when using Spark 1.0.1

2014-07-15 Thread Michael Armbrust
Thanks for the extra info. At a quick glance the query plan looks fine to me. The class IntegerType does build a type tag, so I wonder if you are seeing the Scala issue manifest in some new way. We will attempt to reproduce locally. On Tue, Jul 15, 2014 at 1:41 PM, gorenuru wrote: > Just my

Re: Error while running Spark SQL join when using Spark 1.0.1

2014-07-15 Thread gorenuru
Just my "few cents" on this. I'm having the same problems with v1.0.1, but this bug is sporadic and looks like it is related to object initialization. What's more, I'm not using SQL or anything like it. I just have a utility class like this: object DataTypeDescriptor { type DataType = String val BOOLE

Re: Error while running Spark SQL join when using Spark 1.0.1

2014-07-15 Thread Keith Simmons
Sure thing. Here you go: == Logical Plan == Sort [key#0 ASC] Project [key#0,value#1,value#2] Join Inner, Some((key#0 = key#3)) SparkLogicalPlan (ExistingRdd [key#0,value#1], MapPartitionsRDD[2] at mapPartitions at basicOperators.scala:176) SparkLogicalPlan (ExistingRdd [value#2,key#3], M

Re: Error while running Spark SQL join when using Spark 1.0.1

2014-07-15 Thread Michael Armbrust
Can you print out the queryExecution? (i.e. println(sql().queryExecution)) On Tue, Jul 15, 2014 at 12:44 PM, Keith Simmons wrote: > To give a few more details of my environment in case that helps you > reproduce: > > * I'm running spark 1.0.1 downloaded as a tar ball, not built myself > *

Re: Error while running Spark SQL join when using Spark 1.0.1

2014-07-15 Thread Keith Simmons
To give a few more details of my environment in case that helps you reproduce: * I'm running Spark 1.0.1 downloaded as a tarball, not built myself * I'm running in standalone mode, with 1 master and 1 worker, both on the same machine (though the same error occurs with two workers on two machines

Re: Error while running Spark SQL join when using Spark 1.0.1

2014-07-15 Thread Zongheng Yang
FWIW, I am unable to reproduce this using the example program locally. On Tue, Jul 15, 2014 at 11:56 AM, Keith Simmons wrote: > Nope. All of them are registered from the driver program. > > However, I think we've found the culprit. If the join column between two > tables is not in the same colu

Re: Error while running Spark SQL join when using Spark 1.0.1

2014-07-15 Thread Keith Simmons
Nope. All of them are registered from the driver program. However, I think we've found the culprit. If the join column between two tables is not in the same column position in both tables, it triggers what appears to be a bug. For example, this program fails: import org.apache.spark.SparkConte

Re: Error while running Spark SQL join when using Spark 1.0.1

2014-07-15 Thread Michael Armbrust
Are you registering multiple RDDs of case classes as tables concurrently? You are possibly hitting SPARK-2178, which is caused by SI-6240. On Tue, Jul 15, 2014 at 10:49 AM, Keith Simmons wrote: > H