Re: Weird error during serialization

2016-04-09 Thread SURAJ SHETH
x : [x[0],getRows(x[1])]).cache()\ .groupBy(lambda x : x[0].split('\t')[1]).mapValues(lambda x : list(x)).cache() text1.count() Thanks and Regards, Suraj Sheth On Sun, Apr 10, 2016 at 1:19 AM, Ted Yu wrote: > The value was out of the range of integer. > > Which
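
A likely cause, consistent with the "out of the range of integer" reply: PySpark's framed serializer writes each record's pickled length as a 32-bit integer, so materializing an entire group as a single list (the mapValues(lambda x : list(x)) step above) can overflow it if any one group exceeds roughly 2 GB. Below is a minimal sketch of the pattern plus a safer per-key aggregation; getRows, the toy input, and the tab-delimited layout are assumptions reconstructed from the snippet, not the original code:

    from pyspark import SparkContext

    sc = SparkContext(appName="groupby-sketch")

    # Hypothetical stand-in for the getRows() helper in the original post.
    def getRows(line):
        return line.split('\t')

    lines = sc.parallelize(["a\tk1\t1", "b\tk1\t2", "c\tk2\t3"])  # toy input
    pairs = lines.map(lambda line: (line, getRows(line)))

    # Pattern from the thread: collect each group into one list. If a single
    # group's pickled size exceeds the 32-bit length field (~2 GB), the
    # serializer fails with exactly this kind of out-of-range integer error.
    grouped = pairs.groupBy(lambda x: x[0].split('\t')[1]) \
                   .mapValues(list).cache()
    print(grouped.count())

    # Safer alternative: reduce per key so no single value grows unbounded.
    counts = lines.map(lambda l: (l.split('\t')[1], 1)) \
                  .reduceByKey(lambda a, b: a + b)
    print(counts.collect())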

Weird error during serialization

2016-04-09 Thread SURAJ SHETH
n$1.(PythonRDD.scala:207) at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:125) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) Thanks and Regards, Suraj Sheth

Yarn Spark on EMR

2015-11-15 Thread SURAJ SHETH
Hi, The Yarn UI on port 18080 stops receiving updates about Spark jobs/tasks immediately after it starts. We see only one completed task in the UI, while the others appear to have received no resources, when in reality more than 5 tasks have completed. Hadoop - Amazon 2.6, Spark - 1.5. Thanks and Regards, Suraj Sheth
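
For what it's worth, 18080 is conventionally the Spark History Server port rather than the YARN ResourceManager UI, and the history server only refreshes from event logs written by the application, so a page that freezes right after startup often means event logging isn't enabled or the log directory isn't the one the server scans. A hedged sketch of enabling event logging from the application side; the HDFS path below is an assumption, not an EMR default:

    from pyspark import SparkConf, SparkContext

    # Sketch: write event logs where the history server can find them.
    # The directory is hypothetical; it must match the directory the
    # history server is configured to scan on your EMR cluster.
    conf = (SparkConf()
            .setAppName("event-log-sketch")
            .set("spark.eventLog.enabled", "true")
            .set("spark.eventLog.dir", "hdfs:///var/log/spark/apps"))

    sc = SparkContext(conf=conf)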

Re: Networking issues with Spark on EC2

2015-09-25 Thread SURAJ SHETH
Thanks and Regards, Suraj Sheth On Sat, Sep 26, 2015 at 10:36 AM, Natu Lauchande wrote: > Hi, > > Are you using EMR ? > > Natu > > On Sat, Sep 26, 2015 at 6:55 AM, SURAJ SHETH wrote: > >> Hi Ankur, >> Thanks for the reply. >> This is already done. >> If I wait for a

Re: Networking issues with Spark on EC2

2015-09-25 Thread SURAJ SHETH
Regards, Suraj Sheth On Fri, Sep 25, 2015 at 2:10 AM, Ankur Srivastava < ankur.srivast...@gmail.com> wrote: > Hi Suraj, > > Spark uses a lot of ports to communicate between nodes. Probably your > security group is restrictive and does not allow instances to communicate >
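
Ankur's point about ports can be checked directly: by default the driver, block managers, and other services bind to ephemeral ports, which a restrictive EC2 security group will silently block. A sketch of pinning them to fixed values so the security group can whitelist a known range; the specific port numbers here are arbitrary choices, not defaults:

    from pyspark import SparkConf, SparkContext

    # Sketch: fix Spark's normally-random ports so a security group rule
    # covering, say, 7000-7100 lets the nodes talk to each other.
    conf = (SparkConf()
            .setAppName("fixed-ports-sketch")
            .set("spark.driver.port", "7001")
            .set("spark.blockManager.port", "7005")
            .set("spark.port.maxRetries", "16"))  # keep retries in range

    sc = SparkContext(conf=conf)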

Networking issues with Spark on EC2

2015-09-24 Thread SURAJ SHETH
can be fixed? Thanks and Regards, Suraj Sheth

Re: Running spark over HDFS

2015-04-20 Thread SURAJ SHETH
Hi Madhvi, I think the memory requested by your job, i.e. 2.0 GB, is higher than what is available. Please request 256 MB explicitly while creating the Spark Context and try again. Thanks and Regards, Suraj Sheth
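
A minimal sketch of that suggestion for a PySpark application; the 2.0 GB figure comes from the thread, not a Spark default, and the point is simply to ask for less executor memory than the cluster actually has free:

    from pyspark import SparkConf, SparkContext

    # Sketch: request 256 MB per executor explicitly so the ask fits the
    # memory actually available on the cluster (per the advice above).
    conf = (SparkConf()
            .setAppName("small-memory-sketch")
            .set("spark.executor.memory", "256m"))

    sc = SparkContext(conf=conf)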

Re: Running spark over HDFS

2015-04-20 Thread SURAJ SHETH
I think the memory requested by your job, 2.0 GB, is higher than what is available. Please request 256 MB explicitly while creating the Spark Context and try again. Thanks and Regards, Suraj Sheth On Mon, Apr 20, 2015 at 2:44 PM, madhvi wrote: > PFA screenshot of my cluster UI > > Th

Re: MLlib: issue with increasing maximum depth of the decision tree

2014-08-21 Thread SURAJ SHETH
Hi Sameer, http://apache-spark-user-list.1001560.n3.nabble.com/MLLib-Decision-Tree-not-getting-built-for-5-or-more-levels-maxDepth-5-and-the-one-built-for-3-levelsy-td7401.html Thanks and Regards, Suraj Sheth On Thu, Aug 21, 2014 at 10:52 PM, Sameer Tilak wrote: > Resending this: >

Re: MLlib: Decision Tree not getting built for 5 or more levels (maxDepth=5), and the one built for 3 levels is performing poorly

2014-06-11 Thread SURAJ SHETH
it keeps running for hours, and the amount of free memory available is more than 70%, so it doesn't seem to be a memory issue either. Thanks and Regards, Suraj Sheth On Wed, Jun 11, 2014 at 10:19 PM, filipus wrote: > well I guess your problem is quite unbalanced and due to the information &

MLlib: Decision Tree not getting built for 5 or more levels (maxDepth=5), and the one built for 3 levels is performing poorly

2014-06-11 Thread SURAJ SHETH
est in less than 5 minutes and gives good accuracy. The amount of memory and other resources available to Spark and to Mahout are comparable. Spark had 30 GB * 3 workers = 90 GB of memory in total. Thanks and Regards, Suraj Sheth
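
For reference, a minimal sketch of the call at issue using the MLlib decision-tree API; the toy data below stands in for the real training set, which is an assumption. Each extra level of depth roughly doubles the number of candidate nodes to split, so training cost can rise sharply between maxDepth=3 and maxDepth=5:

    from pyspark import SparkContext
    from pyspark.mllib.regression import LabeledPoint
    from pyspark.mllib.tree import DecisionTree

    sc = SparkContext(appName="tree-depth-sketch")

    # Toy stand-in for the real training data (an assumption).
    data = sc.parallelize([
        LabeledPoint(0.0, [0.0, 1.0]),
        LabeledPoint(1.0, [1.0, 0.0]),
        LabeledPoint(0.0, [0.1, 0.9]),
        LabeledPoint(1.0, [0.9, 0.2]),
    ])

    # maxDepth is the parameter discussed in the thread.
    model = DecisionTree.trainClassifier(
        data, numClasses=2, categoricalFeaturesInfo={},
        impurity='gini', maxDepth=5, maxBins=32)

    print(model.toDebugString())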