Re: Kafka streams vs Spark streaming

2017-10-11 Thread Sachin Mittal
> > Regards > Sab > > On 11 Oct 2017 1:44 pm, "Sachin Mittal" wrote: > >> Kafka streams has a lower learning curve and if your source data is in >> kafka topics it is pretty simple to integrate it with. >> It can run like a library inside your main pr

Re: Kafka streams vs Spark streaming

2017-10-11 Thread Sachin Mittal
a topic. And when we add more machines, rebalancing > auto distributes the partitions among the new stream threads. > > Regards > Sab > > On 11 Oct 2017 1:44 pm, "Sachin Mittal" wrote: > >> Kafka streams has a lower learning curve and if your source data is in

Re: Kafka streams vs Spark streaming

2017-10-11 Thread Sachin Mittal
Kafka Streams has a lower learning curve, and if your source data is already in Kafka topics it is pretty simple to integrate with. It can run like a library inside your main programs. So compared to Spark Streaming: 1. It is much simpler to implement. 2. It is not as heavy on hardware as Spark. On th
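The "runs like a library" point above can be sketched as follows. This is illustrative only: it assumes the kafka-streams jar is on the classpath and a broker at localhost:9092; the topic names and application id are made up, and serde and error handling are minimal.

```scala
import java.util.Properties
import org.apache.kafka.common.serialization.Serdes
import org.apache.kafka.streams.{KafkaStreams, StreamsBuilder, StreamsConfig}

object EmbeddedStreamsApp {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-app")        // hypothetical app id
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")    // assumed broker
    props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass)
    props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass)

    // Build a trivial topology: read a topic, transform values, write another topic.
    val builder = new StreamsBuilder()
    builder.stream[String, String]("input-topic")
      .mapValues(v => v.toUpperCase)
      .to("output-topic")

    // The streams engine runs as threads inside this ordinary JVM process --
    // there is no separate cluster to submit to, which is the point made above.
    val streams = new KafkaStreams(builder.build(), props)
    streams.start()
    sys.addShutdownHook(streams.close())
  }
}
```

Because the processing lives in a plain `main()`, scaling out is just starting more copies of the same process; Kafka's rebalancing spreads the topic partitions across them, as the reply above notes.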

Re: How can we connect RDD from previous job to next job

2016-08-29 Thread Sachin Mittal
D alive. > > On Mon, Aug 29, 2016 at 5:30 AM, Sachin Mittal wrote: > > Hi, > > I would need some thoughts or inputs or any starting point to achieve > > following scenario. > > I submit a job using spark-submit with a certain set of parameters. > > > > It r

How can we connect RDD from previous job to next job

2016-08-28 Thread Sachin Mittal
Hi, I would need some thoughts, inputs, or any starting point to achieve the following scenario. I submit a job using spark-submit with a certain set of parameters. It reads data from a source, does some processing on RDDs, generates some output, and completes. Then I submit the same job again with ne
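An RDD itself cannot outlive the driver that created it, so the usual pattern for this scenario is to persist the RDD's *data* to durable storage at the end of one run and reload it at the start of the next. A minimal sketch, assuming a live `SparkContext` named `sc` (provided by spark-submit) and an HDFS path both runs can reach; the path and element type are made up:

```scala
// Run 1: compute, then persist the RDD's data before the job completes.
val results = sc.parallelize(Seq(("a", 1), ("b", 2)))   // stand-in for real processing
results.saveAsObjectFile("hdfs:///myapp/run-1/results")

// Run 2 (a later spark-submit of the same jar): reload it as the new input.
val previous = sc.objectFile[(String, Int)]("hdfs:///myapp/run-1/results")
```

`saveAsObjectFile`/`objectFile` use Java serialization; for larger data a columnar format via the DataFrame API, or `rdd.checkpoint()` within one application, may be better fits.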

Re: How to export a project to a JAR in Scala IDE for eclipse Correctly?

2016-07-26 Thread Sachin Mittal
Why don't you install sbt and try sbt assembly to create a Scala jar? You can then pass this jar to your spark-submit jobs. In case there are additional dependencies, these can be passed via the --jars (comma-separated jar paths) option to spark-submit. On Wed, Jul 27, 2016 at 11:53 AM, wrote: > Hi the
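The `sbt assembly` suggestion needs the sbt-assembly plugin enabled once per project. A sketch of the plugin file, with an illustrative version number (pick whichever release matches your sbt version):

```scala
// project/plugins.sbt -- enables the `sbt assembly` task, which builds a
// single fat jar containing your classes plus non-"provided" dependencies.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")
```

After `sbt assembly`, the fat jar lands under `target/scala-*/` and is passed to spark-submit as the application jar; any remaining external jars go in `--jars a.jar,b.jar` as the reply describes.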

Re: Understanding spark concepts cluster, master, slave, job, stage, worker, executor, task

2016-07-20 Thread Sachin Mittal
... >> 3) Yes, if you have HT, it doubles. My servers have 12 cores, but HT, so >> it makes 24. >> 4) From my understanding: Slave is the logical computational unit and >> Worker is really the one doing the job. >> 5) Dunnoh >> 6) Dunnoh >> >> On Jul 20,

Re: Building standalone spark application via sbt

2016-07-20 Thread Sachin Mittal
appen often) > > Btw did you get the NoClassDefFoundException at compile time or run > time? If at run time, what is your Spark version and what are the Spark > library versions you used in your sbt? > Are you using a Spark version pre 1.4? > > kr > marco > > > > >

Understanding spark concepts cluster, master, slave, job, stage, worker, executor, task

2016-07-20 Thread Sachin Mittal
Hi, I was able to build and run my Spark application via spark-submit. I have understood some of the concepts by going through the resources at https://spark.apache.org but a few doubts still remain. I have a few specific questions and would be glad if someone could shed some light on them. So I submi
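One relationship that ties several of these concepts together can be shown with plain arithmetic (a sketch of the usual Spark scheduling model, not an official formula): each stage of a job launches one task per RDD partition, and the cluster can run `executors × cores-per-executor` tasks at once, so a stage finishes in a number of "waves" of tasks.

```scala
// Illustrative model of Spark-style stage scheduling numbers.
object SchedulingMath {
  // Total parallel task slots across the cluster.
  def taskSlots(numExecutors: Int, coresPerExecutor: Int): Int =
    numExecutors * coresPerExecutor

  // A stage launches one task per partition of the RDD it computes.
  def tasksForStage(numPartitions: Int): Int =
    numPartitions

  // How many rounds ("waves") of tasks a stage needs: ceil(partitions / slots).
  def wavesForStage(numPartitions: Int, numExecutors: Int, coresPerExecutor: Int): Int = {
    val slots = taskSlots(numExecutors, coresPerExecutor)
    (numPartitions + slots - 1) / slots   // ceiling division on Ints
  }
}
```

For example, a 200-partition stage on 4 executors with 12 cores each has 48 task slots and therefore runs in 5 waves; this is also why hyper-threading "doubling" the core count, mentioned in the reply above, changes throughput.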

Re: Building standalone spark application via sbt

2016-07-20 Thread Sachin Mittal
>> http://stackoverflow.com/questions/28459333/how-to-build-an-uber-jar-fat-jar-using-sbt-within-intellij-idea >> >> under three answers the top one. >> >> I started reading the official SBT tutorial >> <http://www.scala-sbt.org/0.13/tutorial/>.

Re: Building standalone spark application via sbt

2016-07-20 Thread Sachin Mittal
everal other jars. Here’s the > list of dependencies: > https://github.com/apache/spark/blob/master/core/pom.xml#L35 > > Whether you need spark-sql depends on whether you will use the DataFrame > API. Without spark-sql, you will just have the RDD API. > > On Jul 19, 2016, at 7:0
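The spark-core vs spark-sql distinction in that reply maps to a short build.sbt fragment. Versions here are illustrative (pick the release matching your cluster); `"provided"` keeps these jars out of the assembled fat jar, since the cluster supplies them at run time:

```scala
// build.sbt sketch: spark-core alone gives you the RDD API; add spark-sql
// only if you use the DataFrame/Dataset API.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.3" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.6.3" % "provided"
)
```

Marking Spark artifacts `"provided"` also avoids the version-mismatch `NoClassDefFoundError`s at run time discussed elsewhere in this thread.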

Building standalone spark application via sbt

2016-07-19 Thread Sachin Mittal
Hi, Can someone please guide me on which jars I need to place in the lib folder of my project to build a standalone Scala application via sbt. Note I need to provide static dependencies and cannot download the jars using libraryDependencies, so I need to provide all the jars upfront. So far I f
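For the static-jars constraint described above, sbt's unmanaged-dependency mechanism already does the work: every jar dropped into the project's `lib/` folder is put on the classpath automatically, with no `libraryDependencies` entries needed. A build.sbt sketch (names and versions are illustrative):

```scala
// build.sbt for a project whose dependencies are all static (unmanaged) jars.
name := "standalone-spark-app"
scalaVersion := "2.10.6"   // match the Scala version your Spark build was compiled against

// Optional: use a folder other than the default lib/ for unmanaged jars.
unmanagedBase := baseDirectory.value / "jars"
```

In this setup the jars to copy in are typically spark-core (and spark-sql if needed) plus their transitive dependencies; mismatched Scala or Spark versions between these jars and the cluster are a common source of runtime linkage errors.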