Found the reporting bug in 1.6.2:
Utils.bytesToString(maxMemory - blocksMemoryUsed) should have been used in
the "stored as values in memory" log line. Compare the 2.0.1 line with the
1.6.2 one:

https://github.com/apache/spark/blob/v2.0.1/core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala#L154

https://github.com/apache/spark/blob/v1.6.2/core/src/main/scala/org/apache/spark/storage/MemoryStore.scala#L396
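
Notice in the logs below that "free" climbs by exactly the 65.0 MB block size
each time a block is stored (260.7 MB -> 325.7 MB), i.e. the 1.6.2 message is
effectively printing the memory used by blocks under the "free" label. A tiny
standalone demo of the corrected arithmetic (my own sketch, not the actual
MemoryStore code; the names are borrowed from the 2.0.1 line above):

    object FreeMemoryLogDemo {
      def mb(b: Long): String = f"${b / (1024.0 * 1024.0)}%.1f MB"
      def main(args: Array[String]): Unit = {
        val maxMemory = 5L * 1024 * 1024 * 1024    // storage limit (~5 GB)
        var blocksMemoryUsed = 0L                  // taken by cached blocks
        for (size <- Seq.fill(3)(65L * 1024 * 1024)) {
          blocksMemoryUsed += size
          // 2.0.1-style report: "free" shrinks as blocks are cached
          println(s"stored (estimated size ${mb(size)}, " +
            s"free ${mb(maxMemory - blocksMemoryUsed)})")
        }
      }
    }

With the 1.6.2 message, the same run would show the "free" number growing by
65.0 MB per block, which is exactly the symptom reported below.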

Regards,

Sushrut Ikhar


On Fri, Oct 28, 2016 at 7:57 PM, Sushrut Ikhar <sushrutikha...@gmail.com>
wrote:

> Hi,
> I am seeing a wrong computation of available storage memory, which is
> leading to executor failures. I have allocated 8g of memory with params:
> spark.memory.fraction=0.7
> spark.memory.storageFraction=0.4
> As expected, I was able to see 5.2 GB of storage memory in the UI.
> However, as per the memory store logs, free memory is increasing as the
> rdds get cached, when ideally it should be decreasing from 5.2 GB toward 0.
>
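A quick sanity check on that 5.2 GB figure: the unified region is roughly
(usable heap - 300 MB reserve) * spark.memory.fraction, and the JVM reports a
bit less heap than -Xmx. A back-of-the-envelope sketch, where the ~7.7 GB
visible-heap figure is an assumption, not a measurement:

    // Hypothetical sizing check for an 8g executor, spark.memory.fraction=0.7.
    // Runtime.getRuntime.maxMemory is typically a bit below -Xmx
    // (survivor-space accounting); ~7.7 GB usable heap is assumed here.
    val heapMb   = 7.7 * 1024                 // assumed usable heap in MB
    val reserved = 300.0                      // fixed reserved memory in MB
    val unified  = (heapMb - reserved) * 0.7  // spark.memory.fraction
    println(f"unified region ~ ${unified / 1024}%.1f GB")  // prints ~5.2 GB

spark.memory.storageFraction=0.4 only marks how much of that region is
protected from eviction by execution; storage can borrow the whole region,
which is why the UI shows the full ~5.2 GB.
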
> Eventually the executor runs OOM, whereas the memory store reports that it
> still has ~5 GB available.
>
>
> Attached executor logs -
>
> 2016-10-28 12:36:51,996 INFO  [Executor task launch worker-10]
> executor.Executor (Logging.scala:logInfo(58)) - Running task 324.0 in stage
> 1.0 (TID 760)
> 2016-10-28 12:36:52,019 INFO  [Executor task launch worker-10]
> spark.CacheManager (Logging.scala:logInfo(58)) - Partition rdd_10_324 not
> found, computing it
> 2016-10-28 12:36:52,031 INFO  [Executor task launch worker-8]
> storage.MemoryStore (Logging.scala:logInfo(58)) - *Block rdd_10_145
> stored as values in memory (estimated size 65.0 MB, free 260.7 MB)*
> 2016-10-28 12:36:52,062 INFO  [Executor task launch worker-10] 
> storage.ShuffleBlockFetcherIterator
> (Logging.scala:logInfo(58)) - Getting 426 non-empty blocks out of 426 blocks
> 2016-10-28 12:36:52,109 INFO  [Executor task launch worker-10] 
> storage.ShuffleBlockFetcherIterator
> (Logging.scala:logInfo(58)) - Started 37 remote fetches in 47 ms
> 2016-10-28 12:36:52,130 INFO  [Executor task launch worker-8]
> executor.Executor (Logging.scala:logInfo(58)) - Finished task 145.0 in
> stage 1.0 (TID 581). 268313 bytes result sent to driver
> 2016-10-28 12:36:52,150 INFO  [dispatcher-event-loop-7] 
> executor.CoarseGrainedExecutorBackend
> (Logging.scala:logInfo(58)) - Got assigned task 761
> 2016-10-28 12:36:52,150 INFO  [Executor task launch worker-8]
> executor.Executor (Logging.scala:logInfo(58)) - Running task 325.0 in stage
> 1.0 (TID 761)
> 2016-10-28 12:36:52,164 INFO  [Executor task launch worker-8]
> spark.CacheManager (Logging.scala:logInfo(58)) - Partition rdd_10_325 not
> found, computing it
> 2016-10-28 12:36:52,247 INFO  [Executor task launch worker-8] 
> storage.ShuffleBlockFetcherIterator
> (Logging.scala:logInfo(58)) - Getting 426 non-empty blocks out of 426 blocks
> 2016-10-28 12:36:52,264 INFO  [Executor task launch worker-8] 
> storage.ShuffleBlockFetcherIterator
> (Logging.scala:logInfo(58)) - Started 37 remote fetches in 18 ms
> 2016-10-28 12:36:52,591 INFO  [Executor task launch worker-6]
> storage.MemoryStore (Logging.scala:logInfo(58)) -* Block rdd_10_45 stored
> as values in memory (estimated size 65.0 MB, free 325.7 MB)*
> 2016-10-28 12:36:52,646 INFO  [Executor task launch worker-6]
> executor.Executor (Logging.scala:logInfo(58)) - Finished task 45.0 in stage
> 1.0 (TID 481). 266368 bytes result sent to driver
>
> Eventual failure logs -
>
> 2016-10-28 12:53:06,718 WARN  [Executor task launch worker-13]
> storage.MemoryStore (Logging.scala:logWarning(70)) - Not enough space to
> cache rdd_10_656 in memory! (computed 45.2 MB so far)
> 2016-10-28 12:53:06,718 INFO  [Executor task launch worker-13]
> storage.MemoryStore (Logging.scala:logInfo(58)) - Memory use = 5.0 GB
> (blocks) + 211.4 MB (scratch space shared across 103 tasks(s)) = 5.2 GB.
> Storage limit = 5.2 GB.
> 2016-10-28 12:53:06,718 INFO  [Executor task launch worker-13]
> storage.BlockManager (Logging.scala:logInfo(58)) - Found block rdd_10_656
> locally
> 2016-10-28 12:53:06,719 INFO  [Executor task launch worker-12]
> storage.MemoryStore (Logging.scala:logInfo(58)) - 1 blocks selected for
> dropping
> 2016-10-28 12:53:06,720 INFO  [Executor task launch worker-12]
> storage.BlockManager (Logging.scala:logInfo(58)) - Dropping block
> rdd_10_719 from memory
> 2016-10-28 12:53:06,720 INFO  [Executor task launch worker-12]
> storage.BlockManager (Logging.scala:logInfo(58)) - Writing block rdd_10_719
> to disk
> 2016-10-28 12:53:06,736 ERROR [Executor task launch worker-15]
> executor.Executor (Logging.scala:logError(95)) - Exception in task 657.0 in
> stage 4.0 (TID 4565)
> java.lang.OutOfMemoryError: Unable to acquire 262144 bytes of memory, got
> 85027
> at org.apache.spark.memory.MemoryConsumer.allocateArray(
> MemoryConsumer.java:91)
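
A side note on that final error: "Unable to acquire ... got ..." is thrown by
Spark's own memory manager when a task is granted less execution memory than
it requested; it is not the JVM heap itself filling up. A minimal sketch of
the failure pattern, my paraphrase rather than the actual MemoryConsumer
source:

    // Hypothetical paraphrase: the task asked for 262144 bytes of execution
    // memory but was granted only 85027, so Spark throws a controlled OOM.
    def acquireOrDie(required: Long, granted: Long): Long = {
      if (granted < required)
        throw new OutOfMemoryError(
          s"Unable to acquire $required bytes of memory, got $granted")
      granted
    }

    acquireOrDie(262144L, 85027L)  // reproduces the message above

So the executor dies asking for 256 KB of execution scratch space while the
storage pool sits at its 5.2 GB limit.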
>
>
> Is this a bug, or am I setting something wrong?
>
> Regards,
>
> Sushrut Ikhar
>
>
