I will increase memory for the job... will that also fix it?
On Apr 10, 2015 12:43 PM, "Reza Zadeh" <r...@databricks.com> wrote:

> You should pull in this PR: https://github.com/apache/spark/pull/5364
> It should resolve that. It is in master.
> Best,
> Reza
>
> On Fri, Apr 10, 2015 at 8:32 AM, Debasish Das <debasish.da...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I am benchmarking row vs col similarity flow on 60M x 10M matrices...
>>
>> Details are in this JIRA:
>>
>> https://issues.apache.org/jira/browse/SPARK-4823
>>
>> For testing I am using the Netflix data since the structure is very similar:
>> 50K x 17K, near-dense similarities...
>>
>> There are 17K items, so I have not activated the threshold in
>> colSimilarities yet (it's at 1e-4).
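For context (my own illustration, not from the thread): the threshold in colSimilarities drops column pairs whose cosine similarity falls below the cutoff, which is why raising it shrinks the near-dense output. A toy brute-force sketch of that pruning in plain Python (the real RowMatrix.columnSimilarities(threshold) uses DIMSUM sampling rather than exact computation):

```python
import math

def column_similarities(columns, threshold=0.0):
    """Brute-force cosine similarity between named columns, keeping only
    pairs at or above `threshold` -- a toy stand-in for the pruning that
    columnSimilarities(threshold) achieves via sampling."""
    norms = {k: math.sqrt(sum(x * x for x in v)) for k, v in columns.items()}
    out = {}
    keys = sorted(columns)
    for i, a in enumerate(keys):
        for b in keys[i + 1:]:
            dot = sum(x * y for x, y in zip(columns[a], columns[b]))
            sim = dot / (norms[a] * norms[b]) if norms[a] and norms[b] else 0.0
            if sim >= threshold:
                out[(a, b)] = sim
    return out

# Three toy columns; with threshold=0.6 only the (u, w) pair survives,
# since sim(u, v) = 0.5 and sim(v, w) = 0.0 are pruned.
cols = {"u": [1.0, 0.0, 1.0], "v": [1.0, 1.0, 0.0], "w": [0.0, 0.0, 1.0]}
print(column_similarities(cols, threshold=0.6))
```

Raising the threshold (e.g. to 1e-2 as suggested later in this thread) prunes more pairs during the shuffle, which is where most of the runtime goes on near-dense output.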
>>
>> Running Spark on YARN with 20 nodes, 4 cores and 16 GB each, shuffle threshold 0.6.
>>
>> I keep getting the following from the col similarity code on the 1.2
>> branch. Should I use master?
>>
>> 15/04/10 11:08:36 WARN BlockManagerMasterActor: Removing BlockManager
>> BlockManagerId(5, tblpmidn36adv-hdp.tdc.vzwcorp.com, 44410) with no
>> recent heart beats: 50315ms exceeds 45000ms
>>
>> 15/04/10 11:09:12 ERROR ContextCleaner: Error cleaning broadcast 1012
>> java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
>>   at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>>   at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>>   at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>>   at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>>   at scala.concurrent.Await$.result(package.scala:107)
>>   at org.apache.spark.storage.BlockManagerMaster.removeBroadcast(BlockManagerMaster.scala:137)
>>   at org.apache.spark.broadcast.TorrentBroadcast$.unpersist(TorrentBroadcast.scala:227)
>>   at org.apache.spark.broadcast.TorrentBroadcastFactory.unbroadcast(TorrentBroadcastFactory.scala:45)
>>   at org.apache.spark.broadcast.BroadcastManager.unbroadcast(BroadcastManager.scala:66)
>>   at org.apache.spark.ContextCleaner.doCleanupBroadcast(ContextCleaner.scala:185)
>>   at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:147)
>>   at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:138)
>>   at scala.Option.foreach(Option.scala:236)
>>   at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:138)
>>   at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:134)
>>   at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:134)
>>   at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1468)
>>   at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:133)
>>   at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:65)
>>
>> I know how to increase the 45 s heartbeat timeout to something higher,
>> since this is a compute-heavy job, but on YARN I am not sure how to set
>> that config...
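One way to set that on YARN (a sketch of a hypothetical submission, not confirmed in the thread; the class name and jar are placeholders) is to pass the configs to spark-submit. The 45000 ms in the WARN is the default of spark.storage.blockManagerSlaveTimeoutMs, and the 30-second Futures timeout in the ContextCleaner error corresponds to spark.akka.askTimeout in Spark 1.x:

```shell
# Hypothetical spark-submit invocation -- adjust class, jar, and sizes to your job.
spark-submit \
  --master yarn-cluster \
  --num-executors 20 \
  --executor-cores 4 \
  --executor-memory 16g \
  --conf spark.storage.blockManagerSlaveTimeoutMs=120000 \
  --conf spark.akka.askTimeout=120 \
  --class com.example.ColSimilarityJob \
  col-similarity-job.jar
```

The same `--conf` flags work regardless of cluster manager, so nothing YARN-specific is needed beyond the master URL.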
>>
>> But in any case that's only a warning and should not affect the job...
>>
>> Any idea how to improve the runtime, other than increasing the threshold
>> to 1e-2? I will try that next.
>>
>> Was the Netflix dataset benchmarked for the col-based similarity flow
>> before? The similarity output from this dataset becomes near dense, so it
>> is interesting for stress testing...
>>
>> Thanks.
>>
>> Deb
>>
>
>
