Found the reporting bug in 1.6.2: Utils.bytesToString(maxMemory - blocksMemoryUsed) should have been used.
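Concretely, the 1.6 message prints the memory *used* by cached blocks where it says "free", which is why the number grows by exactly the block size on every successful put. A paraphrased sketch of the two versions (see the links just below for the exact code; variable names follow the 2.x MemoryStore, and the 1.6.2 body here is reconstructed, not copied verbatim):

    // Spark 1.6.2, MemoryStore.scala (paraphrased): the value labelled "free"
    // is the space consumed by cached blocks, so it climbs as blocks are added.
    logInfo("Block %s stored as %s in memory (estimated size %s, free %s)".format(
      blockId, valuesOrBytes, Utils.bytesToString(size),
      Utils.bytesToString(blocksMemoryUsed)))

    // Spark 2.0.1, MemoryStore.scala (paraphrased): reports the actual free space.
    logInfo("Block %s stored as %s in memory (estimated size %s, free %s)".format(
      blockId, valuesOrBytes, Utils.bytesToString(size),
      Utils.bytesToString(maxMemory - blocksMemoryUsed)))

That makes the per-block "free" figures misleading, while the "Memory use = 5.0 GB (blocks) + ... Storage limit = 5.2 GB" line in the quoted logs shows the underlying accounting tracking usage correctly. (A quick sanity check of the 5.2 GB figure itself is in the P.S. at the bottom.)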
The relevant lines:
https://github.com/apache/spark/blob/v2.0.1/core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala#L154
https://github.com/apache/spark/blob/v1.6.2/core/src/main/scala/org/apache/spark/storage/MemoryStore.scala#L396

Regards,
Sushrut Ikhar
https://about.me/sushrutikhar

On Fri, Oct 28, 2016 at 7:57 PM, Sushrut Ikhar <sushrutikha...@gmail.com> wrote:

> Hi,
> I am seeing a wrong computation of available storage memory, which is
> leading to executor failures. I have allocated 8g of memory with params:
> spark.memory.fraction=0.7
> spark.memory.storageFraction=0.4
> As expected, the UI shows 5.2 GB of storage memory.
> However, the MemoryStore logs show free memory *increasing* as RDDs are
> cached, when it should ideally be decreasing from 5.2 GB towards 0.
>
> Eventually the executor runs OOM while the MemoryStore reports 5 GB
> still available.
>
> Attached executor logs -
>
> 2016-10-28 12:36:51,996 INFO [Executor task launch worker-10] executor.Executor (Logging.scala:logInfo(58)) - Running task 324.0 in stage 1.0 (TID 760)
> 2016-10-28 12:36:52,019 INFO [Executor task launch worker-10] spark.CacheManager (Logging.scala:logInfo(58)) - Partition rdd_10_324 not found, computing it
> 2016-10-28 12:36:52,031 INFO [Executor task launch worker-8] storage.MemoryStore (Logging.scala:logInfo(58)) - *Block rdd_10_145 stored as values in memory (estimated size 65.0 MB, free 260.7 MB)*
> 2016-10-28 12:36:52,062 INFO [Executor task launch worker-10] storage.ShuffleBlockFetcherIterator (Logging.scala:logInfo(58)) - Getting 426 non-empty blocks out of 426 blocks
> 2016-10-28 12:36:52,109 INFO [Executor task launch worker-10] storage.ShuffleBlockFetcherIterator (Logging.scala:logInfo(58)) - Started 37 remote fetches in 47 ms
> 2016-10-28 12:36:52,130 INFO [Executor task launch worker-8] executor.Executor (Logging.scala:logInfo(58)) - Finished task 145.0 in stage 1.0 (TID 581). 268313 bytes result sent to driver
> 2016-10-28 12:36:52,150 INFO [dispatcher-event-loop-7] executor.CoarseGrainedExecutorBackend (Logging.scala:logInfo(58)) - Got assigned task 761
> 2016-10-28 12:36:52,150 INFO [Executor task launch worker-8] executor.Executor (Logging.scala:logInfo(58)) - Running task 325.0 in stage 1.0 (TID 761)
> 2016-10-28 12:36:52,164 INFO [Executor task launch worker-8] spark.CacheManager (Logging.scala:logInfo(58)) - Partition rdd_10_325 not found, computing it
> 2016-10-28 12:36:52,247 INFO [Executor task launch worker-8] storage.ShuffleBlockFetcherIterator (Logging.scala:logInfo(58)) - Getting 426 non-empty blocks out of 426 blocks
> 2016-10-28 12:36:52,264 INFO [Executor task launch worker-8] storage.ShuffleBlockFetcherIterator (Logging.scala:logInfo(58)) - Started 37 remote fetches in 18 ms
> 2016-10-28 12:36:52,591 INFO [Executor task launch worker-6] storage.MemoryStore (Logging.scala:logInfo(58)) - *Block rdd_10_45 stored as values in memory (estimated size 65.0 MB, free 325.7 MB)*
> 2016-10-28 12:36:52,646 INFO [Executor task launch worker-6] executor.Executor (Logging.scala:logInfo(58)) - Finished task 45.0 in stage 1.0 (TID 481). 266368 bytes result sent to driver
>
> Eventual failure logs -
>
> 2016-10-28 12:53:06,718 WARN [Executor task launch worker-13] storage.MemoryStore (Logging.scala:logWarning(70)) - Not enough space to cache rdd_10_656 in memory! (computed 45.2 MB so far)
> 2016-10-28 12:53:06,718 INFO [Executor task launch worker-13] storage.MemoryStore (Logging.scala:logInfo(58)) - Memory use = 5.0 GB (blocks) + 211.4 MB (scratch space shared across 103 tasks(s)) = 5.2 GB. Storage limit = 5.2 GB.
> 2016-10-28 12:53:06,718 INFO [Executor task launch worker-13] storage.BlockManager (Logging.scala:logInfo(58)) - Found block rdd_10_656 locally
> 2016-10-28 12:53:06,719 INFO [Executor task launch worker-12] storage.MemoryStore (Logging.scala:logInfo(58)) - 1 blocks selected for dropping
> 2016-10-28 12:53:06,720 INFO [Executor task launch worker-12] storage.BlockManager (Logging.scala:logInfo(58)) - Dropping block rdd_10_719 from memory
> 2016-10-28 12:53:06,720 INFO [Executor task launch worker-12] storage.BlockManager (Logging.scala:logInfo(58)) - Writing block rdd_10_719 to disk
> 2016-10-28 12:53:06,736 ERROR [Executor task launch worker-15] executor.Executor (Logging.scala:logError(95)) - Exception in task 657.0 in stage 4.0 (TID 4565)
> java.lang.OutOfMemoryError: Unable to acquire 262144 bytes of memory, got 85027
>         at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:91)
>
> Is this a bug, or am I setting something wrong?
>
> Regards,
> Sushrut Ikhar
> https://about.me/sushrutikhar
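P.S. A quick sanity check of the 5.2 GB figure from the quoted mail, sketched from how 1.6's UnifiedMemoryManager sizes the unified region (the 300 MB reserved constant and the derivation are from memory, so treat this as approximate):

    // Rough sketch of UnifiedMemoryManager.getMaxMemory in Spark 1.6,
    // assuming an 8g executor heap:
    val systemMemory   = Runtime.getRuntime.maxMemory  // typically a bit under -Xmx
    val reservedMemory = 300L * 1024 * 1024            // reserved system memory
    val memoryFraction = 0.7                           // spark.memory.fraction
    val maxMemory = ((systemMemory - reservedMemory) * memoryFraction).toLong
    // (8192 MB - 300 MB) * 0.7 ≈ 5.4 GB; since maxMemory usually comes in a
    // bit under -Xmx, this lands near the 5.2 GB the UI shows.
    // spark.memory.storageFraction=0.4 only marks the slice protected from
    // eviction; storage may borrow the rest, which is why both the UI and the
    // "Storage limit" log line report the full unified region.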