Can you try to write the result into another file instead? Let's see if there
is any issue on the executor side.
sqlContext.sql("SELECT day,page,dense_rank() OVER (PARTITION BY day ORDER BY
pageviews DESC) as rank FROM d1").filter("rank <=
20").sort($"day",$"rank").write.parquet("/path/to/file")
-----Original Message-----
From: vadimtk [mailto:[email protected]]
Sent: Wednesday, December 30, 2015 9:29 AM
To: [email protected]
Subject: Problem with WINDOW functions?
Hi,
I can't successfully execute a query with a WINDOW function.
The statements are as follows:
val orcFile = sqlContext.read.parquet("/data/flash/spark/dat14sn")
  .filter("upper(project)='EN'")
orcFile.registerTempTable("d1")

sqlContext.sql("SELECT day, page, dense_rank() OVER (PARTITION BY day ORDER BY pageviews DESC) AS rank FROM d1")
  .filter("rank <= 20")
  .sort($"day", $"rank")
  .collect().foreach(println)
With the default spark.driver.memory I get java.lang.OutOfMemoryError: Java heap space.
The same happens if I set spark.driver.memory=10g.
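Side note: spark.driver.memory only takes effect if it is set before the driver
JVM starts (via --driver-memory on spark-shell/spark-submit or in
spark-defaults.conf); setting it on SparkConf from an already-running
application has no effect. A quick way to confirm what the running application
actually picked up, assuming a Spark 1.x sc is in scope:

// None means the key was never set, so the built-in default applies.
sc.getConf.getOption("spark.driver.memory")
sc.getConf.getOption("spark.executor.memory")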
When I set spark.driver.memory=45g (the box has 256 GB of RAM), the execution
fails with a different error:
15/12/29 23:03:19 WARN HeartbeatReceiver: Removing executor 0 with no recent
heartbeats: 129324 ms exceeds timeout 120000 ms
I also see that GC takes a lot of time.
What is the proper way to execute the statements above?
I see similar problems reported here:
http://stackoverflow.com/questions/32196859/org-apache-spark-shuffle-fetchfailedexception
http://stackoverflow.com/questions/32544478/spark-memory-settings-for-count-action-in-a-big-table
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]