Re: Programmatically get status of job (WAITING/RUNNING)

2017-12-07 Thread Qiao, Richard
For the question in your example, the answer is yes. “For example, if an application wanted 4 executors (spark.executor.instances=4) but the spark cluster can only provide 1 executor. This means that I will only receive 1 onExecutorAdded event. Will the application state change to RUNNI
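
A minimal sketch, assuming a Scala application, of counting onExecutorAdded events with a SparkListener; the expected count of 4 and the println are only for illustration:

import org.apache.spark.scheduler.{SparkListener, SparkListenerExecutorAdded}

// Counts executor registrations as the cluster hands them out.
class ExecutorCountListener(expected: Int) extends SparkListener {
  @volatile private var added = 0
  override def onExecutorAdded(event: SparkListenerExecutorAdded): Unit = {
    added += 1
    println(s"Executors added so far: $added of $expected")
  }
}

// Register on the driver before the work starts:
// sc.addSparkListener(new ExecutorCountListener(4))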

Re: Programmatically get status of job (WAITING/RUNNING)

2017-12-07 Thread Qiao, Richard
For #2, do you mean “RUNNING” showing in the “Driver” table? If so, that is not a problem: the driver does run even while no executor is available, and that combination is itself a status you can catch – driver running while no executors. Comparing #1 and #3, my understanding of “submitted” is “the jar is submi
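
If this is the standalone cluster manager, one way to catch the “driver running, no executors” state from outside is to poll the master’s REST status endpoint; a rough sketch, where the host, port and driver ID are placeholders:

// Placeholder master host/port and submission ID.
val statusUrl = "http://spark-master:6066/v1/submissions/status/driver-20171207123456-0001"
val json = scala.io.Source.fromURL(statusUrl).mkString
println(json) // the "driverState" field reports SUBMITTED, RUNNING, FINISHED, ...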

Re: Do I need to do .collect inside forEachRDD

2017-12-07 Thread Qiao, Richard
: "Qiao, Richard" Cc: Gerard Maas , "user @spark" Subject: Re: Do I need to do .collect inside forEachRDD Hi Richard, I had tried your sample code now and several times in the past as well. The problem seems to be kafkaProducer is not serializable. so I get "Task not seria

Re: Do I need to do .collect inside forEachRDD

2017-12-07 Thread Qiao, Richard
ali Date: Thursday, December 7, 2017 at 2:30 AM To: Gerard Maas Cc: "Qiao, Richard" , "user @spark" Subject: Re: Do I need to do .collect inside forEachRDD @Richard I had pasted the two versions of the code below and I still couldn't figure out why it wouldn'
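
Another way to avoid shipping the producer from the driver is to create it inside foreachPartition on the executor; a sketch in which the DStream, broker address and topic name are all placeholders:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.spark.streaming.dstream.DStream

def writeToKafka(dstream: DStream[String]): Unit = {
  dstream.foreachRDD { rdd =>
    rdd.foreachPartition { partition =>
      // Built here, on the executor, so nothing non-serializable crosses the wire.
      val props = new Properties()
      props.put("bootstrap.servers", "localhost:9092")
      props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      val producer = new KafkaProducer[String, String](props)
      partition.foreach(msg => producer.send(new ProducerRecord[String, String]("out-topic", msg)))
      producer.close()
    }
  }
}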

Re: unable to connect to cluster 2.2.0

2017-12-06 Thread Qiao, Richard
Are you now building your app using Spark 2.2 or 2.1? Best Regards Richard From: Imran Rajjad Date: Wednesday, December 6, 2017 at 2:45 AM To: "user @spark" Subject: unable to connect to cluster 2.2.0 Hi, Recently upgraded from 2.1.1 to 2.2.0. My Streaming job seems to have broken
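
If the job was built against 2.1.1, a likely fix is simply rebuilding it against the cluster’s version; a build.sbt sketch, assuming a Scala 2.11 build and only the core and streaming modules:

// build.sbt – keep the compile-time Spark version aligned with the 2.2.0 cluster
val sparkVersion = "2.2.0"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % sparkVersion % "provided",
  "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided"
)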

Re: Do I need to do .collect inside forEachRDD

2017-12-05 Thread Qiao, Richard
In the 2nd case, is there any producer error thrown in the executor’s log? Best Regards Richard From: kant kodali Date: Tuesday, December 5, 2017 at 4:38 PM To: "Qiao, Richard" Cc: "user @spark" Subject: Re: Do I need to do .collect inside forEachRDD Reads from Kafka and
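
One way to make producer failures visible in the executor log, assuming the plain Kafka producer API is used inside the foreachRDD, is to attach a callback to each send; the producer, topic and message names here are placeholders:

import org.apache.kafka.clients.producer.{Callback, KafkaProducer, ProducerRecord, RecordMetadata}

def sendLogged(producer: KafkaProducer[String, String], msg: String): Unit =
  producer.send(new ProducerRecord[String, String]("out-topic", msg), new Callback {
    override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit = {
      // Runs on the executor, so a failure lands in the executor's log, not the driver's.
      if (exception != null) println(s"Kafka send failed: ${exception.getMessage}")
    }
  })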

Re: Do I need to do .collect inside forEachRDD

2017-12-05 Thread Qiao, Richard
Where do you check the output result for both cases? Sent from my iPhone > On Dec 5, 2017, at 15:36, kant kodali wrote: > > Hi All, > > I have a simple stateless transformation using Dstreams (stuck with the old > API for one of the Applications). The pseudo code is rough like this > > dstream

Re: Access to Applications metrics

2017-12-04 Thread Qiao, Richard
It works to collect job-level metrics through the Jolokia Java agent. Best Regards Richard From: Nick Dimiduk Date: Monday, December 4, 2017 at 6:53 PM To: "user@spark.apache.org" Subject: Re: Access to Applications metrics Bump. On Wed, Nov 15, 2017 at 2:28 PM, Nick Dimiduk <ndimi...@gmail.com>
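
A rough sketch of attaching the Jolokia JVM agent to the executors so their JMX metrics can be scraped over HTTP; the jar path, port and app name are placeholders, and a driver-side agent would typically be passed via --driver-java-options on spark-submit instead:

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("metrics-demo") // placeholder
  .set("spark.executor.extraJavaOptions",
       "-javaagent:/opt/jolokia/jolokia-jvm-agent.jar=port=8778,host=0.0.0.0")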

Re: Add snappy support for spark in Windows

2017-12-04 Thread Qiao, Richard
Junfeng, if you haven’t done so yet, it is worth trying to start your local Spark with the Hadoop Windows support package (hadoop.dll, winutils.exe, etc.) in HADOOP_HOME. Best Regards Richard From: Junfeng Chen Date: Monday, December 4, 2017 at 3:53 AM To: "Qiao, Richard" Cc: "user@s
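
For a local run on Windows, a minimal sketch of pointing Spark at the Hadoop support binaries before the SparkContext is created; C:\hadoop is a placeholder path whose bin folder is assumed to contain winutils.exe and hadoop.dll:

// Alternatively, set the HADOOP_HOME environment variable and add %HADOOP_HOME%\bin to PATH.
System.setProperty("hadoop.home.dir", "C:\\hadoop")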

Re: Add snappy support for spark in Windows

2017-12-04 Thread Qiao, Richard
A common mistake is that the path is not accessible by the workers/executors. Best regards Richard Sent from my iPhone On Dec 3, 2017, at 22:32, Junfeng Chen <darou...@gmail.com> wrote: I am working on importing a snappy-compressed JSON file into a Spark RDD or Dataset. However I meet t

Re: Dynamic Resource allocation in Spark Streaming

2017-12-03 Thread Qiao, Richard
Sourav: I’m using Spark Streaming 2.1.0 and can confirm spark.dynamicAllocation.enabled is enough. Best Regards Richard From: Sourav Mazumder Date: Sunday, December 3, 2017 at 12:31 PM To: user Subject: Dynamic Resource allocation in Spark Streaming Hi, I see the following jir
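
A sketch of the relevant settings, assuming the external shuffle service is available on the cluster; the min/max executor counts are placeholders:

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true") // required by dynamic allocation
  .set("spark.dynamicAllocation.minExecutors", "1")
  .set("spark.dynamicAllocation.maxExecutors", "10")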

Re: [Spark streaming] No assigned partition error during seek

2017-12-01 Thread Qiao, Richard
In your case, it looks like two versions of Kafka end up in the same JVM at runtime, so there is a version conflict. Regarding “I don’t find the spark async commit useful for our needs”, do you mean code like the line below? kafkaDStream.asInstanceOf[CanCommitOffsets].commitAsync(ranges) B
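
For reference, the usual shape of the async commit in the spark-streaming-kafka-0-10 integration, sketched with kafkaDStream as the placeholder stream and the processing step elided:

import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges}

kafkaDStream.foreachRDD { rdd =>
  val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  // ... process rdd here ...
  kafkaDStream.asInstanceOf[CanCommitOffsets].commitAsync(ranges)
}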