SparkSQL TPC-H query 3 joining multiple tables

2014-09-03 Thread Samay
AND o_orderdate < '1995-03-15' AND l_shipdate > '1995-03-15' GROUP BY l_orderkey, o_orderdate, o_shippriority ORDER BY revenue DESC, o_orderdate LIMIT 10;

The same syntax works when I join two tables (TPC-H query 12, for instance). Any ideas as to what the issue is? Thanks

SparkSQL: Key not valid while running TPC-H

2014-09-22 Thread Samay
k.storage.blockManagerSlaveTimeoutMs 10
spark.shuffle.memoryFraction 0.3
spark.shuffle.consolidateFiles true
spark.shuffle.file.buffer.kb 512
spark.akka.timeout 600
spark.akka.framesize 512
spark.akka.threads 8
spark.core.connection.ack.wait.timeout 600
spark.spark.sql.shuffle.partitions 320
Re
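
One detail in the settings quoted above may be worth a second look: the last key reads `spark.spark.sql.shuffle.partitions`, whereas the property Spark SQL documents is `spark.sql.shuffle.partitions`, so the value 320 may not be taking effect. The following is only a minimal sketch for spotting such doubled prefixes in a flattened key/value listing. The helper names `parse_flat_conf` and `doubled_prefix_keys` are hypothetical, and the truncated first key from the message is left out rather than guessed.

```python
def parse_flat_conf(text):
    """Pair up whitespace-separated tokens as key -> value (both kept as strings)."""
    tokens = text.split()
    return dict(zip(tokens[0::2], tokens[1::2]))


def doubled_prefix_keys(conf):
    """Return keys that start with a repeated 'spark.' prefix."""
    return [key for key in conf if key.startswith("spark.spark.")]


# Settings copied from the message, minus the truncated first entry.
snippet = (
    "spark.shuffle.memoryFraction 0.3 "
    "spark.shuffle.consolidateFiles true "
    "spark.shuffle.file.buffer.kb 512 "
    "spark.akka.timeout 600 "
    "spark.akka.framesize 512 "
    "spark.akka.threads 8 "
    "spark.core.connection.ack.wait.timeout 600 "
    "spark.spark.sql.shuffle.partitions 320"
)

conf = parse_flat_conf(snippet)
suspects = doubled_prefix_keys(conf)
print(suspects)
```

Whether the misspelled key is related to the reported error is not established by the thread; the check only flags the name for manual review.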

SparkSQL: Freezing while running TPC-H query 5

2014-09-23 Thread Samay
I am seeing similar behaviour on several other queries where there are long pauses of 200-300s before the query starts making progress on the master. Some of the queries complete while the others do not. Any help would be appreciated. Regards, Samay spark-defaults.conf <http://apache-spark-user-

Re: SparkSQL: Freezing while running TPC-H query 5

2014-09-23 Thread Samay
Hey Dan, Thanks for your reply. I have a couple of questions. 1) Were you able to verify that this is caused by GC? If so, could you let me know how? 2) If this is GC, do you know of any tuning I can do to reduce the GC pauses? Regards, Samay On Tue, Sep 23, 2014 at 11:15 PM, Dan