From: Reynold Xin [mailto:r...@databricks.com]
Sent: Thursday, September 17, 2015 12:28 AM
To: Pete Robbins
Cc: Dev
Subject: Re: Unable to acquire memory errors in HiveCompatibilitySuite
SparkEnv for the driver was created in SparkContext. The default
parallelism field is set to the number of slots (max number of active
tasks). Maybe we can just use the default parallelism to compute that in
local mode.
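A minimal sketch of that idea, assuming the fallback sits next to the existing core-count lookup (the helper name here is mine, not from any actual patch):

    // Hypothetical sketch: prefer the driver's defaultParallelism (the number
    // of task slots, e.g. 32 for local[32]) over the physical core count when
    // running in local mode.
    def slotsForPageSizing(sc: SparkContext, numCores: Int): Int = {
      if (numCores > 0) numCores
      else if (sc.isLocal) sc.defaultParallelism
      else Runtime.getRuntime.availableProcessors()
    }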
On Wednesday, September 16, 2015, Pete Robbins wrote:
So forcing the ShuffleMemoryManager to assume 32 cores, and therefore
calculate a page size of 1MB, makes the tests pass.
How can we determine the correct value to use in getPageSize rather than
Runtime.getRuntime.availableProcessors()?
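For reference, the 1.5-era heuristic has roughly this shape (a paraphrased sketch of ShuffleMemoryManager, not the exact code; treat the constants as approximate):

    // Paraphrased sketch of the page-size heuristic: split execution memory
    // across cores with a safety factor, round up to a power of two, then
    // clamp to [1MB, 64MB].
    def nextPowerOf2(n: Long): Long =
      if (java.lang.Long.highestOneBit(n) == n) n
      else java.lang.Long.highestOneBit(n) << 1

    def pageSize(maxMemory: Long, numCores: Int): Long = {
      val cores =
        if (numCores > 0) numCores else Runtime.getRuntime.availableProcessors()
      val safetyFactor = 16
      val size = nextPowerOf2(maxMemory / cores / safetyFactor)
      math.min(64L * 1024 * 1024, math.max(1L * 1024 * 1024, size))
    }
    // pageSize(515396075L, 8) == 4MB, while pageSize(515396075L, 32) == 1MB,
    // which is why forcing 32 cores made the tests pass.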
On 16 September 2015 at 10:17, Pete Robbins wrote:
I see what you are saying. Full stack trace:
java.io.IOException: Unable to acquire 4194304 bytes of memory
    at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.acquireNewPage(UnsafeExternalSorter.java:368)
    at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalS
Can you paste the entire stack trace of the error? In your original email
you only included the last function call.
Maybe I'm missing something here, but I still think the bad heuristic is
the issue.
Some operators pre-reserve memory before running anything in order to avoid
starvation. For example, …
ok so let me try again ;-)
I don't think the page size calculation matters, apart from hitting the
allocation limit earlier if the page size is too large.
If a task is going to need X bytes, it is going to need X bytes. In this
case, for at least one of the tasks, X > maxMemory / numActiveTasks.
It is exactly the issue here, isn't it?
We are using memory / N, where N should be the maximum number of active
tasks. In the current master, we use the number of cores to approximate the
number of tasks -- but it turned out to be a bad approximation in tests
because it is set to 32 to increase concurrency.
Oops... I meant to say "The page size calculation is NOT the issue here"
On 16 September 2015 at 06:46, Pete Robbins wrote:
The page size calculation is the issue here, as there is plenty of free
memory, although there is maybe a fair bit of wasted space in some pages.
It is that when we have a lot of tasks, each is only allowed to reach 1/N of
the available memory, and several of the tasks bump into that limit. With
task…
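Rough numbers for that cap, using the figures printed further down the thread (illustrative only; as I read the 1.5 code, ShuffleMemoryManager lets each of N active tasks acquire at most maxMemory / N, and guarantees roughly 1 / (2N)):

    val maxMemory = 515396075L               // from the getPageSize output below
    val activeTasks = 32                     // the local[32] test configuration
    val perTaskCap = maxMemory / activeTasks // ~16MB per task
    // A task already holding three 4MB pages (12MB) would blow past its cap
    // on the next 4MB acquireNewPage call, even with free memory in the pool.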
Maybe we can change the heuristic in the memory calculation to use
SparkContext.defaultParallelism if it is local mode.
On Tue, Sep 15, 2015 at 10:28 AM, Pete Robbins wrote:
Yes, and at least there is an override by setting spark.sql.test.master to
local[8]; in fact local[16] worked on my 8-core box.
I'm happy to use this as a workaround, but the hard-coded 32 will make
build/tests fail on a clean checkout if you only have 8 cores.
On 15 September 2015 at 17:40, M
That test explicitly sets the number of executor cores to 32.
object TestHive
extends TestHiveContext(
new SparkContext(
System.getProperty("spark.sql.test.master", "local[32]"),
On Mon, Sep 14, 2015 at 11:22 PM, Reynold Xin wrote:
This is the culprit:
https://issues.apache.org/jira/browse/SPARK-8406
"2. Make `TestHive` use a local mode `SparkContext` with 32 threads to
increase parallelism
The major reason for this is that the original parallelism of 2 is too
low to reproduce the data loss issue. Also, higher concurrency …"
Ok, so it looks like the max number of active tasks reaches 30. I'm not
setting anything, as it is a clean environment with a clean Spark code
checkout. I'll dig further to see why so many tasks are active.
Cheers,
On 15 September 2015 at 07:22, Reynold Xin wrote:
Yea, I think this is where the heuristic is failing -- it uses 8 cores to
approximate the number of active tasks, but the tests somehow are using 32
(maybe because the test explicitly sets it to that, or you set it yourself;
I'm not sure which).
On Mon, Sep 14, 2015 at 11:06 PM, Pete Robbins wrote:
Reynold, thanks for replying.
getPageSize parameters: maxMemory=515396075, numCores=0
Calculated values: cores=8, default=4194304
So am I getting a large page size as I only have 8 cores?
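Those two values are consistent with the heuristic sketched earlier in the thread (assuming integer division and the power-of-two round-up):

    // numCores = 0 falls back to Runtime.getRuntime.availableProcessors() = 8
    515396075L / 8 / 16   // = 4026531, rounds up to 4194304 (the 4MB default)
    515396075L / 32 / 16  // = 1006632, rounds up to 1048576 (the 1MB page
                          //   size seen when forcing 32 cores)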
On 15 September 2015 at 00:40, Reynold Xin wrote:
Pete - can you do me a favor?
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/shuffle/ShuffleMemoryManager.scala#L174
Print the parameters that are passed into the getPageSize function, and
check their values.
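Something like this at the top of getPageSize would do it (a sketch; the exact parameter names may differ):

    println(s"getPageSize parameters: maxMemory=$maxMemory, numCores=$numCores")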
On Mon, Sep 14, 2015 at 4:32 PM, Reynold Xin wrote:
Is this on latest master / branch-1.5?
Out of the box we reserve only 16% (0.2 * 0.8) of the memory for execution
(e.g. aggregate, join) / shuffle sorting. With a 3GB heap, that's roughly
480MB. So each task gets 480MB / 32 = 15MB, and each operator reserves at
least one page for execution. If your page size …
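For what it's worth, those two fractions reproduce the maxMemory value printed earlier in the thread exactly (the 0.2 and 0.8 being, presumably, the 1.5 defaults for spark.shuffle.memoryFraction and spark.shuffle.safetyFraction):

    val heap = 3L * 1024 * 1024 * 1024        // 3GB test heap
    val execution = (heap * 0.2 * 0.8).toLong // = 515396075, Pete's maxMemory
    val perTask = execution / 32              // ~16MB per task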