Re: One click to run Spark on Kubernetes

2022-02-23 Thread Sarath Annareddy
Hi bo, How do we start? Is there a plan? Onboarding, arch/design diagram, tasks lined up, etc.? Thanks, Sarath. Sent from my iPhone > On Feb 23, 2022, at 10:27 AM, bo yang wrote: > Hi Sarath, thanks for your interest and willingness to contribute! The project > supports lo

Re: One click to run Spark on Kubernetes

2022-02-23 Thread Sarath Annareddy
Hi bo, I am interested in contributing, but I don't have free access to any cloud provider, and I'm not sure how I can get it. I know Google, AWS, and Azure only provide temporary free access, which may not be sufficient. Guidance is appreciated. Sarath. Sent from my iPhone > On Feb 23, 2022, at 2

Unsubscribe

2016-08-15 Thread Sarath Chandra

Issue with wholeTextFiles

2016-03-21 Thread Sarath Chandra
I'm using Hadoop 1.0.4 and Spark 1.2.0. I'm facing a strange issue. I have a requirement to read a small file from HDFS, and all its content has to be read in one shot. So I'm using the spark context's wholeTextFiles API, passing the HDFS URL for the file. When I try this from a spark shell it works
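
A minimal sketch of the wholeTextFiles usage described above, assuming Spark 1.2's Scala API; the HDFS path is a hypothetical stand-in:

    import org.apache.spark.{SparkConf, SparkContext}

    object WholeTextFilesSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("WholeTextFilesSketch"))
        // wholeTextFiles returns an RDD of (filePath, fileContent) pairs,
        // so a single small file yields its entire content as one record.
        val files = sc.wholeTextFiles("hdfs://master:54310/user/hduser/small-file.txt")
        val content: String = files.first()._2
        println(content)
        sc.stop()
      }
    }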

Re: Assign unique link ID

2015-10-31 Thread Sarath Chandra
count and their types. Any ideas how to tackle this? Regards, Sarath. On Sat, Oct 31, 2015 at 4:37 PM, ayan guha wrote: > Can this be a solution? > > 1. Write a function which will take a string and convert it to an md5 hash. > 2. From your base table, generate a string out of all columns yo
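
A sketch of ayan's md5 suggestion, assuming a Spark 1.5+ SQLContext; the table and column names are hypothetical stand-ins for the base table:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.functions.{col, concat_ws, md5}

    object LinkIdSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("LinkIdSketch"))
        val sqlContext = new SQLContext(sc)
        import sqlContext.implicits._

        // Hypothetical base table with two columns.
        val src = Seq(("R1", "John"), ("R2", "Jane")).toDF("RECORD_ID", "NAME")

        // Concatenate all columns into one string, then hash it: rows with
        // identical column values get the same LINK_ID.
        val withLinkId = src.withColumn(
          "LINK_ID", md5(concat_ws("|", col("RECORD_ID"), col("NAME"))))
        withLinkId.show()
        sc.stop()
      }
    }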

Assign unique link ID

2015-10-31 Thread Sarath Chandra
t;SJ").withColumn("LINK_ID", linkIDUDF(src_join("S1.RECORD_ID"),src("S2.RECORD_ID")));* Then in further lines I'm not able to refer to "s1" columns from "src_link" like - *var src_link_s1 = src_link.as <http://src_link.as>("SL").select($"S1.RECORD_ID");* Please guide me. Regards, Sarath.

Re: PermGen Space Error

2015-07-29 Thread Sarath Chandra
verify your executor/driver actually started with this option to rule out a config problem. > On Wed, Jul 29, 2015 at 10:45 AM, Sarath Chandra wrote: > > Yes. > > As mentioned in my mail at the end, I tried with both 256 and 512 opt

Re: PermGen Space Error

2015-07-29 Thread Sarath Chandra
single node mesos cluster on my laptop having 4 CPUs and 12GB RAM. On Wed, Jul 29, 2015 at 2:49 PM, fightf...@163.com wrote: > Hi Sarath, > > Did you try to use and increase spark.executor.extraJavaOptions > -XX:PermSize= -XX:MaxPermSize=

PermGen Space Error

2015-07-29 Thread Sarath Chandra
When I run the same from a spark shell it works fine. As mentioned in some posts and blogs, I tried using the option spark.driver.extraJavaOptions to increase the size; I tried with 256 and 512, but still no luck. Please help me in resolving the space issue. Thanks & Regards, Sarath.
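
A sketch of the PermGen workaround discussed in this thread, assuming Java 7 (PermGen was removed in Java 8); the size is illustrative, not a recommendation. Note that spark.driver.extraJavaOptions set in code cannot affect a driver JVM that has already started; it has to be passed via spark-submit --conf or spark-defaults.conf instead:

    import org.apache.spark.SparkConf

    // Raise the executors' PermGen ceiling (Java 7 only).
    val conf = new SparkConf()
      .setAppName("PermGenSketch")
      .set("spark.executor.extraJavaOptions", "-XX:MaxPermSize=512m")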

MLLib SVMWithSGD is failing for large dataset

2015-04-28 Thread sarath
I am trying to train on a large dataset consisting of 8 million data points and 20 million features using SVMWithSGD, but it is failing after running for some time. I tried increasing num-partitions, driver-memory, executor-memory, and driver-max-resultSize. I also tried reducing the size of the dataset f

MLLib SVMWithSGD : java.lang.OutOfMemoryError: Java heap space

2015-04-16 Thread sarath
Hi, I'm trying to train an SVM on the KDD2010 dataset (available from libsvm), but I'm getting a "java.lang.OutOfMemoryError: Java heap space" error. The dataset is really sparse and has around 8 million data points and 20 million features. I'm using a cluster of 8 nodes (each with 8 cores and 64G RAM)
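
A minimal sketch of training SVMWithSGD on a libsvm-format file with the Spark 1.x MLlib RDD API; the HDFS path is a hypothetical stand-in for the KDD2010 data:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.classification.SVMWithSGD
    import org.apache.spark.mllib.util.MLUtils

    object SvmSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("SvmSketch"))
        // loadLibSVMFile parses the sparse libsvm format into LabeledPoints.
        val data = MLUtils.loadLibSVMFile(sc, "hdfs://master:54310/user/hduser/kdd2010.libsvm")
        // Caching matters: SGD makes one pass over the data per iteration.
        val training = data.cache()
        val model = SVMWithSGD.train(training, 100)
        println("Model has " + model.weights.size + " weights")
        sc.stop()
      }
    }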

Re: Unable to submit spark job to mesos cluster

2015-03-04 Thread Sarath Chandra
h$mVc$sp(Range.scala:141) at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1450) at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:56) at org.apache.spark.SparkEnv$.create(SparkEnv.scala:156) at org.apache.spark.SparkContext.<init>(SparkContext.scala:203) at Test.main

Unable to submit spark job to mesos cluster

2015-03-04 Thread Sarath Chandra
ciliation.execution.utils.ExecutionUtils.<init>(ExecutionUtils.java:130) ... 2 more Regards, Sarath.

Parallel spark jobs on mesos cluster

2014-09-30 Thread Sarath Chandra
") .setSparkHome("/usr/local/spark-1.0.1-bin-hadoop1") .set("spark.executor.memory", "3g") .set("spark.cores.max", "4") .set("spark.task.cpus", "4") .set("spark.executor.uri", "
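
The fragment above, reconstructed as a compilable (spark-shell style) sketch; the Mesos master URL and executor URI are hypothetical. spark.cores.max caps the total cores one job may claim, which is what leaves room for several jobs to run in parallel:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("ParallelJobsOnMesos")
      .setMaster("mesos://master:5050")  // hypothetical master URL
      .setSparkHome("/usr/local/spark-1.0.1-bin-hadoop1")
      .set("spark.executor.memory", "3g")
      .set("spark.cores.max", "4")       // upper bound on this job's total cores
      .set("spark.task.cpus", "4")
      // Hypothetical executor URI:
      .set("spark.executor.uri", "hdfs://master:54310/spark/spark-1.0.1-bin-hadoop1.tgz")
    val sc = new SparkContext(conf)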

Parallel spark jobs on standalone cluster

2014-09-25 Thread Sarath Chandra
change in the behavior. Also, in the spark job submission program I'm calling SparkContext.stop at the end of execution. Sometimes all jobs fail with status "Exited". Please let me know what is going wrong and how to overcome the issue. ~Sarath

Worker state is 'killed'

2014-09-21 Thread Sarath Chandra
illed". And I'm not finding any exceptions being thrown in the logs. What could be going wrong? ... var newLines = lines.flatMap(line => process(line)); newLines.saveAsTextFile(hdfsPath); ... def process(line: String): Array[String] = { ... Array(str1, str2); } ... ~Sarath.

Saving RDD with array of strings

2014-09-21 Thread Sarath Chandra
line)); newLines.saveAsTextFile(hdfsPath); ... ... def myfunc(line: String): Array[String] = { line.split(";"); } Thanks, ~Sarath.
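
A sketch of the pattern in this thread, plus the usual pitfall it runs into: saveAsTextFile on an RDD[Array[String]] writes Java's default Array.toString (something like "[Ljava.lang.String;@1a2b3c"), not the elements. Flatten with flatMap for one output line per element, or mkString to keep one joined line per record; paths are hypothetical (spark-shell style):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("SaveArrayRdd"))
    val lines = sc.textFile("hdfs://master:54310/user/hduser/input")

    def myfunc(line: String): Array[String] = line.split(";")

    // One output line per split token:
    lines.flatMap(line => myfunc(line))
      .saveAsTextFile("hdfs://master:54310/user/hduser/out-flat")
    // Or one output line per input record, re-joined with a delimiter:
    lines.map(line => myfunc(line).mkString(","))
      .saveAsTextFile("hdfs://master:54310/user/hduser/out-joined")
    sc.stop()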

Re: Task not serializable

2014-09-10 Thread Sarath Chandra
Thanks Sean. Please find attached my code. Let me know your suggestions/ideas. Regards, Sarath. On Wed, Sep 10, 2014 at 8:05 PM, Sean Owen wrote: > You mention that you are creating a UserGroupInformation inside your > function, but something is still serializing it. You should sho

Re: Task not serializable

2014-09-10 Thread Sarath Chandra
s inside map method, does it create a new instance for every RDD it is processing? Thanks & Regards, Sarath. On Sat, Sep 6, 2014 at 4:32 PM, Sean Owen wrote: > I disagree that the generally right change is to try to make the > classes serializable. Usually, classes that are not seriali

Re: Task not serializable

2014-09-06 Thread Sarath Chandra
written its contents as an anonymous function inside the map function. This time the execution succeeded. I understood Sean's explanation, but request references to a more detailed explanation and examples of writing efficient spark programs that avoid such pitfalls. ~Sarath On 06-Sep-2014 4:

Re: Task not serializable

2014-09-05 Thread Sarath Chandra
Hi Akhil, I've done this for the classes which are in my scope. But what to do with classes that are out of my scope? For example, org.apache.hadoop.io.Text. Also I'm using several 3rd-party libraries like "jeval". ~Sarath On Fri, Sep 5, 2014 at 7:40 PM, Akhil Das wrote:
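
A sketch of the usual way around non-serializable third-party classes such as org.apache.hadoop.io.Text: convert them to serializable types (e.g. String) before they cross a closure boundary, or construct the offending object inside the task rather than capturing it (spark-shell style):

    import org.apache.hadoop.io.Text
    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("SerializationSketch"))
    val rdd = sc.parallelize(Seq("a", "b", "c"))

    val upper = rdd.mapPartitions { iter =>
      // Created per task on the executor, so it is never serialized.
      val text = new Text()
      iter.map { s =>
        text.set(s)
        text.toString.toUpperCase
      }
    }
    upper.collect().foreach(println)
    sc.stop()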

Task not serializable

2014-09-05 Thread Sarath Chandra
.hadoop.io.Text. How to overcome these exceptions? ~Sarath.

Re: Simple record matching using Spark SQL

2014-07-17 Thread Sarath Chandra
Added the below 2 lines just before the sql query line - ... file1_schema.count; file2_schema.count; ... and it started working. But I couldn't get the reason. Can someone please explain? What was happening earlier and what is happening with the addition of these 2 lines? ~Sarath O

Re: Simple record matching using Spark SQL

2014-07-17 Thread Sarath Chandra
I'm killing it by pressing Ctrl+C in the terminal. But the same code runs perfectly when executed from the spark shell. ~Sarath On Thu, Jul 17, 2014 at 1:05 PM, Sonal Goyal wrote: > Hi Sarath, > > Are you explicitly stopping the context? > > sc.stop() > > Best Re

Re: Simple record matching using Spark SQL

2014-07-17 Thread Sarath Chandra
Hi Michael, Soumya, Can you please check and let me know what the issue is? What am I missing? Let me know if you need any logs to analyze. ~Sarath On Wed, Jul 16, 2014 at 8:24 PM, Sarath Chandra < sarathchandra.jos...@algofusiontech.com> wrote: > Hi Michael, > > Tried it.

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
ATH $CONFIG_OPTS test.Test4 spark://master:7077 "/usr/local/spark-1.0.1-bin-hadoop1" hdfs://master:54310/user/hduser/file1.csv hdfs://master:54310/user/hduser/file2.csv ~Sarath On Wed, Jul 16, 2014 at 8:14 PM, Michael Armbrust wrote: > What if you just run something like: > sc.te

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
2014 at 7:59 PM, Soumya Simanta wrote: > Can you try submitting a very simple job to the cluster? > On Jul 16, 2014, at 10:25 AM, Sarath Chandra < > sarathchandra.jos...@algofusiontech.com> wrote: > Yes, it is appearing on the Spark UI, and remains there wit

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
Yes, it is appearing on the Spark UI, and remains there with state "RUNNING" till I press Ctrl+C in the terminal to kill the execution. Barring the statements to create the spark context, if I copy-paste the lines of my code into the spark shell, it runs perfectly, giving the desired output. ~

Re: Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
anything going wrong; all are info messages. What else do I need to check? ~Sarath On Wed, Jul 16, 2014 at 7:23 PM, Soumya Simanta wrote: > Check your executor logs for the output, or if your data is not big, collect > it in the driver and print it. > On Jul 16, 2014, at 9:21 AM

Simple record matching using Spark SQL

2014-07-16 Thread Sarath Chandra
I'm forcibly killing it. But the same program works well when executed from a spark shell. What is going wrong? What am I missing? ~Sarath
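
A minimal sketch of the record-matching setup this thread describes, assuming Spark 1.0.x (hence the createSchemaRDD implicit and registerAsTable); file paths, columns, and the join key are hypothetical:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    case class Record(id: String, name: String)

    object RecordMatchSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("RecordMatchSketch"))
        val sqlContext = new SQLContext(sc)
        import sqlContext.createSchemaRDD

        // Parse each csv file into Records and register it as a table.
        def load(path: String) =
          sc.textFile(path).map(_.split(",")).map(a => Record(a(0), a(1)))

        load("hdfs://master:54310/user/hduser/file1.csv").registerAsTable("file1")
        load("hdfs://master:54310/user/hduser/file2.csv").registerAsTable("file2")

        // Match records across the two files on the hypothetical id key.
        val matched = sqlContext.sql(
          "SELECT f1.id, f1.name FROM file1 f1 JOIN file2 f2 ON f1.id = f2.id")
        matched.collect().foreach(println)
        sc.stop()
      }
    }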