Hi, I can't successfully execute a query with a WINDOW function.
The statements are as follows:

    val orcFile = sqlContext.read.parquet("/data/flash/spark/dat14sn").filter("upper(project)='EN'")
    orcFile.registerTempTable("d1")
    sqlContext.sql("SELECT day,page,dense_rank() OVER (PARTITION BY day ORDER BY pageviews DESC) as rank FROM d1").filter("rank <= 20").sort($"day",$"rank").collect().foreach(println)

With the default spark.driver.memory I get java.lang.OutOfMemoryError: Java heap space. The same happens if I set spark.driver.memory=10g.

When I set spark.driver.memory=45g (the box has 256GB of RAM), the execution fails with a different error:

    15/12/29 23:03:19 WARN HeartbeatReceiver: Removing executor 0 with no recent heartbeats: 129324 ms exceeds timeout 120000 ms

I also see that GC takes a lot of time.

What is the proper way to execute the statements above?

I see similar problems reported here:

http://stackoverflow.com/questions/32196859/org-apache-spark-shuffle-fetchfailedexception
http://stackoverflow.com/questions/32544478/spark-memory-settings-for-count-action-in-a-big-table