Hi,

I can't successfully execute a query that uses a WINDOW function.

The statements are as follows:

val orcFile = sqlContext.read.parquet("/data/flash/spark/dat14sn").filter("upper(project)='EN'")
orcFile.registerTempTable("d1")
sqlContext.sql("SELECT day, page, dense_rank() OVER (PARTITION BY day ORDER BY pageviews DESC) AS rank FROM d1")
  .filter("rank <= 20")
  .sort($"day", $"rank")
  .collect()
  .foreach(println)
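
For reference, what the query is meant to compute (the top 20 pages per day by dense rank) can be sketched on a tiny in-memory collection. This is only an illustration in plain Scala, not Spark code: the PageView case class, the DenseRankSketch object, and the sample rows are hypothetical and are not taken from the real data set.

```scala
// Hypothetical row type and helper object, for illustration only.
case class PageView(day: String, page: String, pageviews: Long)

object DenseRankSketch {
  // Made-up sample rows; the real table is far larger.
  val sample: Seq[PageView] = Seq(
    PageView("2015-12-01", "A", 100L),
    PageView("2015-12-01", "B", 100L),
    PageView("2015-12-01", "C", 50L),
    PageView("2015-12-02", "A", 10L)
  )

  // dense_rank: rows with equal pageviews share a rank, and ranks have no gaps.
  def denseRank(group: Seq[PageView]): Seq[(PageView, Int)] = {
    val distinctDesc = group.map(_.pageviews).distinct.sorted(Ordering[Long].reverse)
    val rankOf = distinctDesc.zipWithIndex.map { case (v, i) => v -> (i + 1) }.toMap
    group.map(r => r -> rankOf(r.pageviews))
  }

  // Mirrors: PARTITION BY day, ORDER BY pageviews DESC, rank <= n, sort by (day, rank).
  def topPerDay(rows: Seq[PageView], n: Int): Seq[(PageView, Int)] =
    rows
      .groupBy(_.day)
      .toSeq
      .flatMap { case (_, group) => denseRank(group) }
      .filter { case (_, rank) => rank <= n }
      .sortBy { case (r, rank) => (r.day, rank) }

  def main(args: Array[String]): Unit =
    topPerDay(sample, 20).foreach { case (r, rank) =>
      println(s"${r.day},${r.page},$rank")
    }
}
```

Because tied rows share a rank, the filter on rank <= 20 can return more than 20 rows for a given day.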

With the default spark.driver.memory, I get java.lang.OutOfMemoryError: Java heap space.
The same happens if I set spark.driver.memory=10g.

When I set spark.driver.memory=45g (the box has 256 GB of RAM), the execution
fails with a different error:

15/12/29 23:03:19 WARN HeartbeatReceiver: Removing executor 0 with no recent heartbeats: 129324 ms exceeds timeout 120000 ms

I also see that GC takes a lot of time.

What is the proper way to execute the statements above?

I see similar problems reported here:
http://stackoverflow.com/questions/32196859/org-apache-spark-shuffle-fetchfailedexception
http://stackoverflow.com/questions/32544478/spark-memory-settings-for-count-action-in-a-big-table
