Discover SparkUI port for spark streaming job running in cluster mode

2015-12-14 Thread Ashish Nigam
Hi, I run a Spark Streaming job in cluster mode. This means the driver can run on any data node, and the Spark UI can come up on any dynamic port. At present, I find out the port by looking at container logs, which contain something like this - server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:
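One way around the dynamic port (a sketch, not from the thread; the app name, port, and batch interval are placeholders): pin the driver UI to a known port via spark.ui.port. If that port is taken, Spark retries successive ports up to spark.port.maxRetries times, so the search stays bounded. Alternatively, the YARN ResourceManager's tracking URL for the application proxies to the driver UI wherever it ends up.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Fix the UI port instead of discovering it from container logs.
    val conf = new SparkConf()
      .setAppName("my-streaming-job")
      .set("spark.ui.port", "4050")       // bind the driver UI to a known port
      .set("spark.port.maxRetries", "8")  // fall back to 4051..4057 if taken
    val ssc = new StreamingContext(conf, Seconds(10))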

Re: spark streaming job fails to restart after checkpointing due to DStream initialization errors

2015-06-26 Thread Ashish Nigam
Make sure you're following the docs regarding setting up a streaming checkpoint. Post your code if you can't get it figured out. > On Fri, Jun 26, 2015 at 3:45 PM, Ashish Nigam wrote: >> I bring up a Spark Streaming job that uses Kafka as input s
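The setup the docs call for looks like the sketch below (the checkpoint path, app name, and batch interval are placeholders). The key point: every DStream and all its transformations must be defined inside the factory passed to StreamingContext.getOrCreate, because on a restart from an existing checkpoint the factory is skipped and the graph is rebuilt from checkpointed metadata; anything defined outside it is never re-created.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val checkpointDir = "hdfs:///user/spark/checkpoints/kafka-job" // placeholder

    def createContext(): StreamingContext = {
      val conf = new SparkConf().setAppName("kafka-streaming-job")
      val ssc = new StreamingContext(conf, Seconds(10))
      ssc.checkpoint(checkpointDir)
      // define the Kafka input DStream and all transformations/outputs here
      ssc
    }

    // Fresh start: runs createContext. Restart: recovers from checkpointDir.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()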

spark streaming job fails to restart after checkpointing due to DStream initialization errors

2015-06-26 Thread Ashish Nigam
I bring up a Spark Streaming job that uses Kafka as the input source, with no data to process, and then shut it down and bring it back up again. This time the job does not start because it complains that the DStream is not initialized. 15/06/26 01:10:44 ERROR yarn.ApplicationMaster: User class threw exception: org.apac
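One thing worth checking in this stop/start sequence (an assumption about the scenario, not stated in the thread): an abrupt shutdown can leave the checkpoint written mid-batch. Stopping gracefully lets queued batches finish before the final checkpoint, so recovery starts from a consistent graph.

    // Sketch: graceful shutdown of a checkpointed job; ssc is the running
    // StreamingContext from the snippet above.
    sys.addShutdownHook {
      // Wait for in-flight batches to complete before tearing down.
      ssc.stop(stopSparkContext = true, stopGracefully = true)
    }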

Re: spark streaming - checkpointing - looking at old application directory and failure to start streaming context

2015-06-11 Thread Ashish Nigam
Any idea why this happens? On Wed, Jun 10, 2015 at 9:28 AM, Ashish Nigam wrote: > BTW, I am using Spark Streaming version 1.2.0. > On Wed, Jun 10, 2015 at 9:26 AM, Ashish Nigam wrote: >> I did not change the driver program. I just shut down the context and again

Re: spark streaming - checkpointing - looking at old application directory and failure to start streaming context

2015-06-10 Thread Ashish Nigam
Jun 10, 2015 at 9:18 AM, Akhil Das wrote: > Delete the checkpoint directory, you might have modified your driver program. > Thanks, Best Regards > On Wed, Jun 10, 2015 at 9:44 PM, Ashish Nigam wrote: >> Hi, If checkpoint data is already pres
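A minimal sketch of the workaround suggested here, using the Hadoop FileSystem API (the path is a placeholder). Note this throws away all checkpointed state, including offsets and windowed data, so the next run starts from scratch.

    import java.net.URI
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    val checkpointDir = "hdfs:///user/spark/checkpoints/kafka-job" // placeholder
    val fs = FileSystem.get(new URI(checkpointDir), new Configuration())
    fs.delete(new Path(checkpointDir), true) // recursive delete of stale state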

spark streaming - checkpointing - looking at old application directory and failure to start streaming context

2015-06-10 Thread Ashish Nigam
Hi, If checkpoint data is already present in HDFS, the driver fails to load because it performs a lookup on the previous application's directory. As that folder already exists, it fails to start the context. The failed job's application id was application_1432284018452_0635, and the job was performing a lookup on application

Re: Unable to find org.apache.spark.sql.catalyst.ScalaReflection class

2015-02-28 Thread Ashish Nigam
Also, can the Scala version play any role here? I am using Scala 2.11.5, but all Spark packages have a dependency on Scala 2.11.2. Just wanted to make sure that the Scala version is not an issue here. On Sat, Feb 28, 2015 at 9:18 AM, Ashish Nigam wrote: > Hi, I wrote a very simple program in sc
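Scala 2.11.x releases are binary compatible with one another, so 2.11.5 user code against 2.11.2-built Spark jars should normally be fine; the more likely culprit is a mismatched scala-reflect on the classpath. A build.sbt sketch that keeps everything on one Scala lineage (the Spark version is an assumption for early 2015):

    scalaVersion := "2.11.5"

    libraryDependencies ++= Seq(
      // %% appends the Scala binary version (_2.11) to the artifact name,
      // so Spark, scala-reflect, and user code resolve consistently.
      // "provided" assumes the job is launched via spark-submit.
      "org.apache.spark" %% "spark-core" % "1.2.1" % "provided",
      "org.apache.spark" %% "spark-sql"  % "1.2.1" % "provided"
    )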

Re: Tools to manage workflows on Spark

2015-02-28 Thread Ashish Nigam
Thanks, Ashish! Is Oozie integrated with Spark? I know it can accommodate some Hadoop jobs. > On Sat, Feb 28, 2015 at 6:07 PM, Ashish Nigam wrote: > Qiang, Did you look at Oozie? We use Oozie to run Spark jobs in pro

Re: Tools to manage workflows on Spark

2015-02-28 Thread Ashish Nigam
Qiang, Did you look at Oozie? We use Oozie to run Spark jobs in production. > On Feb 28, 2015, at 2:45 PM, Qiang Cao wrote: > Hi Everyone, We need to deal with workflows on Spark. In our scenario, each workflow consists of multiple processing steps. Among different steps, there could
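For context (a sketch, not from the thread): Oozie 4.2 added a dedicated spark action; before that, a shell or java action wrapping spark-submit was the usual route. A minimal workflow definition using the spark action might look like this, with every name and path hypothetical:

    <workflow-app name="spark-wf" xmlns="uri:oozie:workflow:0.5">
      <start to="spark-step"/>
      <action name="spark-step">
        <spark xmlns="uri:oozie:spark-action:0.1">
          <job-tracker>${jobTracker}</job-tracker>
          <name-node>${nameNode}</name-node>
          <master>yarn-cluster</master>
          <name>my-spark-step</name>
          <class>com.example.MyJob</class>
          <jar>${nameNode}/apps/my-job/my-job.jar</jar>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
      </action>
      <kill name="fail">
        <message>Spark step failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
      </kill>
      <end name="end"/>
    </workflow-app>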

Re: Unable to find org.apache.spark.sql.catalyst.ScalaReflection class

2015-02-28 Thread Ashish Nigam
as in the classpath? > Cheers > On Sat, Feb 28, 2015 at 9:18 AM, Ashish Nigam wrote: > Hi, I wrote a very simple program in Scala to convert an existing RDD to a SchemaRDD. But the createSchemaRDD function i

Unable to find org.apache.spark.sql.catalyst.ScalaReflection class

2015-02-28 Thread Ashish Nigam
Hi, I wrote a very simple program in Scala to convert an existing RDD to a SchemaRDD. But the createSchemaRDD function is throwing an exception: Exception in thread "main" scala.ScalaReflectionException: class org.apache.spark.sql.catalyst.ScalaReflection in JavaMirror with primordial classloader with boot
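For reference, the 1.2-era pattern the post describes probably resembled the sketch below (a reconstruction, not the poster's code; Person and the sample rows are placeholders). The ScalaReflectionException usually points at scala-reflect being missing from or mismatched on the driver classpath rather than at the code itself; note also that the case class should be defined at the top level so the reflection-based conversion can see it.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    case class Person(name: String, age: Int)

    object SchemaRDDDemo {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("schema-rdd-demo"))
        val sqlContext = new SQLContext(sc)
        // In Spark 1.2.x this import provides the implicit RDD -> SchemaRDD
        // conversion backed by catalyst's ScalaReflection.
        import sqlContext.createSchemaRDD

        val people = sc.parallelize(Seq(Person("alice", 30), Person("bob", 25)))
        people.registerTempTable("people")
        sqlContext.sql("SELECT name FROM people WHERE age > 26")
          .collect().foreach(println)
        sc.stop()
      }
    }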