I have a lot of SQL join operations, which are blocking my data writes, and I unpersist the data if it is not useful.
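Concretely, the pattern I mean looks roughly like this (a minimal SparkR sketch; the `events` and `lookup` tables and the `id` column are placeholders for illustration, not the real job):

    library(SparkR)
    sparkR.session()  # session config as discussed below in this thread

    # Cache an intermediate result that feeds several joins.
    df <- sql("SELECT * FROM events")       # hypothetical input table
    lookup <- sql("SELECT * FROM lookup")   # hypothetical second table

    persist(df, "MEMORY_ONLY")

    joined <- join(df, lookup, df$id == lookup$id)
    write.df(joined, path = "out/joined", source = "parquet", mode = "overwrite")

    # Unpersist once the data is no longer useful, freeing executor memory.
    unpersist(df)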
On Oct 24, 2016 7:50 PM, "Mich Talebzadeh" <mich.talebza...@gmail.com> wrote:

> OK, so you are disabling broadcasting, although it is not obvious how
> this helps in this case!
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising
> from such loss, damage or destruction.
>
> On 24 October 2016 at 15:08, Sankar Mittapally
> <sankar.mittapally@creditvidya.com> wrote:
>
>> sc <- sparkR.session(master = "spark://ip-172-31-6-116:7077",
>>   sparkConfig = list(spark.executor.memory = "10g",
>>     spark.app.name = "Testing",
>>     spark.driver.memory = "14g",
>>     spark.executor.extraJavaOption = "-Xms2g -Xmx5g -XX:-UseGCOverheadLimit",
>>     spark.driver.extraJavaOption = "-Xms2g -Xmx5g -XX:-UseGCOverheadLimit",
>>     spark.cores.max = "2",
>>     spark.sql.autoBroadcastJoinThreshold = "-1"))
>>
>> On Mon, Oct 24, 2016 at 7:33 PM, Mich Talebzadeh
>> <mich.talebza...@gmail.com> wrote:
>>
>>> OK, so what is your full launch code now? I mean the equivalent of
>>> spark-submit.
>>>
>>> Dr Mich Talebzadeh
>>>
>>> On 24 October 2016 at 14:57, Sankar Mittapally
>>> <sankar.mittapa...@creditvidya.com> wrote:
>>>
>>>> Hi Mich,
>>>>
>>>> I am able to write the files to storage after adding an extra
>>>> parameter. FYI, this is the one I used:
>>>>
>>>> spark.sql.autoBroadcastJoinThreshold="-1"
>>>>
>>>> On Mon, Oct 24, 2016 at 7:22 PM, Mich Talebzadeh
>>>> <mich.talebza...@gmail.com> wrote:
>>>>
>>>>> Rather strange, as you have plenty of free memory there.
>>>>>
>>>>> Try reducing driver memory to 2GB and executor memory to 2GB and
>>>>> run it again:
>>>>>
>>>>> ${SPARK_HOME}/bin/spark-submit \
>>>>>   --driver-memory 2G \
>>>>>   --num-executors 2 \
>>>>>   --executor-cores 1 \
>>>>>   --executor-memory 2G \
>>>>>   --master spark://IPAddress:7077 \
>>>>>
>>>>> HTH
>>>>>
>>>>> Dr Mich Talebzadeh
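The setting that fixed the write, spark.sql.autoBroadcastJoinThreshold, can also be changed at runtime instead of at session launch. A minimal SparkR sketch (assuming Spark 2.x, where sql() runs against the current session; 10485760 bytes, i.e. 10 MB, is the stock default):

    # Disable automatic broadcast joins: -1 turns them off entirely,
    # forcing the planner to fall back to shuffle-based joins.
    sql("SET spark.sql.autoBroadcastJoinThreshold=-1")

    # ... run the join-heavy queries that previously ran out of heap ...

    # Optionally restore the stock default (10 MB) afterwards.
    sql("SET spark.sql.autoBroadcastJoinThreshold=10485760")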
>>>>> On 24 October 2016 at 13:15, Sankar Mittapally
>>>>> <sankar.mittapa...@creditvidya.com> wrote:
>>>>>
>>>>>> Hi Mich,
>>>>>>
>>>>>> Yes, I am using a standalone-mode cluster. We have two executors
>>>>>> with 10G of memory each, and two workers.
>>>>>>
>>>>>> FYI..
>>>>>>
>>>>>> On Mon, Oct 24, 2016 at 5:22 PM, Mich Talebzadeh
>>>>>> <mich.talebza...@gmail.com> wrote:
>>>>>>
>>>>>>> Sounds like you are running in standalone mode.
>>>>>>>
>>>>>>> Have you checked the UI on port 4040 (the default) to see where
>>>>>>> the memory is going? Why do you need executor memory of 10GB?
>>>>>>>
>>>>>>> How many executors are running, and how many slaves have been
>>>>>>> started?
>>>>>>>
>>>>>>> In standalone mode executors run on workers (UI port 8080).
>>>>>>>
>>>>>>> HTH
>>>>>>>
>>>>>>> Dr Mich Talebzadeh
>>>>>>>
>>>>>>> On 24 October 2016 at 12:19, sankarmittapally
>>>>>>> <sankar.mittapa...@creditvidya.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have a three-node cluster with 30G of memory. I am trying to
>>>>>>>> analyze 200MB of data and I run out of memory every time. This is
>>>>>>>> the command I am using:
>>>>>>>>
>>>>>>>> Driver Memory = 10G
>>>>>>>> Executor memory = 10G
>>>>>>>>
>>>>>>>> sc <- sparkR.session(master = "spark://ip-172-31-6-116:7077",
>>>>>>>>   sparkConfig = list(spark.executor.memory = "10g",
>>>>>>>>     spark.app.name = "Testing",
>>>>>>>>     spark.driver.memory = "14g",
>>>>>>>>     spark.executor.extraJavaOption = "-Xms2g -Xmx5g -XX:MaxPermSize=1024M",
>>>>>>>>     spark.driver.extraJavaOption = "-Xms2g -Xmx5g -XX:MaxPermSize=1024M",
>>>>>>>>     spark.cores.max = "2"))
>>>>>>>>
>>>>>>>> [D 16:43:51.437 NotebookApp] 200 GET
>>>>>>>> /api/contents?type=directory&_=1477289197671 (123.176.38.226) 7.96ms
>>>>>>>> Exception in thread "broadcast-exchange-0" java.lang.OutOfMemoryError: Java heap space
>>>>>>>>   at org.apache.spark.sql.execution.joins.LongToUnsafeRowMap.append(HashedRelation.scala:539)
>>>>>>>>   at org.apache.spark.sql.execution.joins.LongHashedRelation$.apply(HashedRelation.scala:803)
>>>>>>>>   at org.apache.spark.sql.execution.joins.HashedRelation$.apply(HashedRelation.scala:105)
>>>>>>>>   at org.apache.spark.sql.execution.joins.HashedRelationBroadcastMode.transform(HashedRelation.scala:816)
>>>>>>>>   at org.apache.spark.sql.execution.joins.HashedRelationBroadcastMode.transform(HashedRelation.scala:812)
>>>>>>>>   at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1$$anonfun$apply$1.apply(BroadcastExchangeExec.scala:90)
>>>>>>>>   at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1$$anonfun$apply$1.apply(BroadcastExchangeExec.scala:72)
>>>>>>>>   at org.apache.spark.sql.execution.SQLExecution$.withExecutionId(SQLExecution.scala:94)
>>>>>>>>   at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1.apply(BroadcastExchangeExec.scala:72)
>>>>>>>>   at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1.apply(BroadcastExchangeExec.scala:72)
>>>>>>>>   at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>>>>>>>>   at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>>>>>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>>>   at java.lang.Thread.run(Thread.java:745)
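The trace above fails inside BroadcastExchangeExec while building the hash relation for a broadcast join, which is consistent with the fix of setting the threshold to -1. To verify which join strategy the planner actually picks before and after the change, one can inspect the physical plan; a minimal SparkR sketch (df1 and df2 are tiny placeholder SparkDataFrames, not the poster's data):

    library(SparkR)
    sparkR.session()  # assumes a running standalone master, as in this thread

    # Placeholder inputs; the real job reads ~200MB of data instead.
    df1 <- createDataFrame(data.frame(id = 1:3, v = c("a", "b", "c")))
    df2 <- createDataFrame(data.frame(id = 1:3, w = c(10, 20, 30)))

    j <- join(df1, df2, df1$id == df2$id)

    # Under the default threshold the plan typically shows BroadcastHashJoin;
    # with spark.sql.autoBroadcastJoinThreshold=-1 it becomes SortMergeJoin,
    # avoiding the broadcast build that ran out of heap above.
    explain(j)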