Sorry, I wasn’t very clear (it looks like Pavan’s response was dropped from the list
for some reason as well).
I am assuming that:
1) the first map is CPU bound
2) the second map is heavily memory bound
To be specific, let’s say you are using 4 m3.2xlarge instances, which have 8 CPUs
and 30 GB of RAM each.
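To make the numbers concrete (illustrative figures only, ignoring executor/OS overhead): that cluster gives 4 × 8 = 32 cores and roughly 4 × 30 GB = 120 GB of RAM in total. If the memory-heavy map runs with 32 partitions, 32 tasks run at once and each has on the order of 120 GB / 32 ≈ 3.75 GB available; repartitioning down to, say, 8 partitions means at most 8 tasks run concurrently, roughly 15 GB each, while the CPU-bound map can still be run earlier with many more partitions to keep all 32 cores busy.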
Hi,
How would that help?! Why would you do that?
Jacek
On 17 Jul 2016 7:19 a.m., "Pedro Rodriguez" wrote:
> You could call map on an RDD which has “many” partitions, then call
> repartition/coalesce to drastically reduce the number of partitions so that
> your second map job has fewer things running.
You could call map on an RDD which has “many” partitions, then call
repartition/coalesce to drastically reduce the number of partitions so that
your second map job has fewer things running.
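Something along these lines (a sketch only; the function names, data, and partition counts are made up, and the master/deploy settings would come from spark-submit). Note that it is repartition, or coalesce with shuffle = true, that forces a shuffle and therefore lets the two maps run with different parallelism:

    import org.apache.spark.{SparkConf, SparkContext}

    object TwoPhaseMap {
      // Placeholders for the two maps in the thread: the first is CPU heavy,
      // the second allocates a lot of memory per record.
      def cpuHeavy(x: Int): Int = x * x
      def memoryHungry(x: Int): Array[Int] = Array.fill(1000)(x)

      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("two-phase-map"))

        // Many partitions so the CPU-bound map keeps every core busy.
        val input = sc.parallelize(1 to 1000000, numSlices = 96)
        val phase1 = input.map(cpuHeavy)

        // repartition (or coalesce(..., shuffle = true)) inserts a shuffle, so
        // the memory-heavy map runs in a second stage with at most 16
        // concurrent tasks, each with more memory to itself.
        val phase2 = phase1.repartition(16).map(memoryHungry)

        phase2.count()
        sc.stop()
      }
    }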
—
Pedro Rodriguez
PhD Student in Large-Scale Machine Learning | CU Boulder
Systems Oriented Data Scientist
Hi,
My understanding is that these two map functions will end up as a job
with one stage (as if you wrote the two maps as a single map), so you
really need as many vcores and as much memory as possible for map1 and
map2. I initially thought about dynamic allocation of executors, which
may or may not help you.
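To illustrate the single-stage point (a minimal local sketch; the data and functions are made up): two back-to-back maps produce a lineage with no shuffle, so Spark pipelines them into one stage, whereas inserting a repartition adds a ShuffledRDD and therefore a stage boundary:

    import org.apache.spark.{SparkConf, SparkContext}

    object SingleStageDemo {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("single-stage-demo").setMaster("local[*]"))

        val rdd = sc.parallelize(1 to 1000, numSlices = 8)

        // Two consecutive maps: no shuffle, so they are pipelined into one stage.
        val pipelined = rdd.map(_ * 2).map(_ + 1)
        println(pipelined.toDebugString)   // chain of MapPartitionsRDDs, no ShuffledRDD

        // Inserting a shuffle (repartition) is what creates a second stage.
        val twoStages = rdd.map(_ * 2).repartition(4).map(_ + 1)
        println(twoStages.toDebugString)   // shows a ShuffledRDD -> stage boundary

        twoStages.count()
        sc.stop()
      }
    }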