Thanks all. To reiterate: stages inside a job can be run in parallel as long
as (a) there is no sequential dependency between them, and (b) the job has
sufficient resources.
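For example, a minimal sketch (my own illustration, not the code discussed
below): a join of two independently shuffled RDDs produces one job whose two
shuffle-map stages have no dependency on each other, so the scheduler may run
them concurrently when slots are free.

val a = sc.parallelize(1 to 100000).map(x => (x % 100, 1)).reduceByKey(_ + _)
val b = sc.parallelize(1 to 100000).map(x => (x % 100, 2)).reduceByKey(_ + _)
a.join(b).count()  // one job; the stages for a and b can run in parallel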
However, my code was launching 2 jobs, and they are sequential, as you
rightly pointed out.
The issue which I was trying to highlight with that
Thanks a lot.
Thanks & Regards
Saroj Kumar Choudhury
Tata Consultancy Services
(UNIT-I)- KALINGA PARK
IT/ITES SPECIAL ECONOMIC ZONE (SEZ),PLOT NO. 35,
CHANDAKA INDUSTRIAL ESTATE, PATIA,
Bhubaneswar - 751 024,Orissa
India
Ph:- +91 674 664 5154
Mailto: saro...@tcs.com
Website: http://www.tcs.com
__
I think my words were also misunderstood. My point is that they will not be
submitted together, since they are part of one thread.
val spark = SparkSession.builder()
  .appName("practice")
  .config("spark.scheduler.mode", "FAIR")
  .enableHiveSupport()
  .getOrCreate()
val sc = spark.sparkContext
sc.parallelize(1 to 10).count()  // hypothetical completion; the original is truncated here
My words caused a misunderstanding.
Step 1: A is submitted to Spark.
Step 2: B is submitted to Spark.
Spark gets two independent jobs. The FAIR scheduler is used to schedule A and B.
Jeffrey's code did not cause two submits.
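For contrast, a minimal sketch (my own illustration, assuming an existing
SparkContext sc) of submitting A and B from two separate threads, so that
Spark receives both jobs at once:

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

// Each Future runs on its own thread, so both actions (jobs A and B)
// reach the scheduler together and FAIR can interleave their tasks.
val jobA = Future { sc.parallelize(1 to 1000000).count() }
val jobB = Future { sc.parallelize(1 to 1000000).count() }
Await.result(Future.sequence(Seq(jobA, jobB)), Duration.Inf)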
---Original---
From: "Pralabh Kumar"
Date: 2017/6/27 12:09:27
To: "??"<1
Hi
I don't think Spark will receive two submits; it will execute one submit and
then the next one. If the application is multithreaded, and two threads call
submit at the same time, then they will run in parallel, provided the
scheduler is FAIR and task slots are available.
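As a hedged sketch of the FAIR side of this (the pool name "poolA" is
hypothetical, and an existing SparkContext sc is assumed), each submitting
thread can also tag its jobs with a scheduler pool:

// Run inside the submitting thread; jobs from this thread go to poolA.
sc.setLocalProperty("spark.scheduler.pool", "poolA")
sc.parallelize(1 to 100).count()
// Reset so later jobs from this thread use the default pool again.
sc.setLocalProperty("spark.scheduler.pool", null)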
I think the Spark cluster receives two submits, A and B.
The FAIR scheduler is used to schedule A and B.
I am not sure about this.
---Original---
From: "Bryan Jeffrey"
Date: 2017/6/27 08:55:42
To: "satishl";
Cc: "user";
Subject: Re: Question about Parallel Stages in Spark
Hello.
The driver is running
thank you
---Original---
From: "Ted Yu"
Date: 2017/6/27 10:18:18
To: "??"<1427357...@qq.com>;
Cc: "user";"dev";
Subject: Re: how to mention others in JIRA comment please?
You can find the JIRA handle of the person you want to mention by going to
a JIRA where that person has commented.
E.g., suppose you want to find the handle for Joseph.
You can go to:
https://issues.apache.org/jira/browse/SPARK-6635
and click on his name in comment:
https://issues.apache.org/jira/secure/V
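(Once you have the handle, the mention syntax in a JIRA comment is, if I
remember correctly, [~handle] rather than a bare @name.)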
Hi all,
How do I mention others in a JIRA comment, please?
I added @ before other members' names, but it didn't work.
Could you help me, please?
Thanks
Fei Shao
Hi Kodali,
I am puzzled by the statement
"Kafka Streaming can indeed do map, reduce, join and window operations".
Do you mean that Kafka has APIs like map, or that Kafka doesn't have such
APIs but can still do these operations?
As I remember, Kafka does not have APIs like map and so on.
---Original---
From: "kant kodali"
Hi Owen,
Could you help me check this issue, please?
Is it a potential bug or not?
Thanks
Fei Shao
---Original---
From: "??"<1427357...@qq.com>
Date: 2017/6/25 21:44:41
To: "user";"dev";
Subject: Re: issue about the windows slice of stream
Hi all,
Let me add more i
Hello.
The driver is running the individual operations in series, but each operation
is parallelized internally. If you want them to run in parallel, you need to
provide the driver a mechanism to thread the job scheduling out:
val rdd1 = sc.parallelize(1 to 10)
val rdd2 = sc.parallelize(1 to 200)  // closing paren added; the original is truncated here
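A hedged sketch of such a mechanism (my own completion, not the truncated
original): wrap each action in a Future so both jobs are pending at once and
the scheduler can interleave their stages.

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

// Each action is submitted from its own thread.
val first  = Future { rdd1.foreach(x => println(s"first $x")) }
val second = Future { rdd2.foreach(x => println(s"second $x")) }
Await.result(Future.sequence(Seq(first, second)), Duration.Inf)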
Hi SRK,
What are the slideDuration and the parent duration in your code, please?
You can search for "issue about the windows slice of stream" in the mailing list.
Perhaps they are related.
---Original---
From: "SRK"
Date: 2017/6/27 03:53:22
To: "user";
Subject: Spark Streaming reduceByKeyAndWindow with inver
For the below code, since rdd1 and rdd2 don't depend on each other, I was
expecting that both the first and second printlns would be interwoven.
However, the Spark job runs all "first" statements first and then all
"second" statements next, in serial fashion. I have set
spark.scheduler.mode = FAIR.
Thanks. I saw it earlier but did not know whether this is the official way of
doing Spark with ZeroMQ. Thanks, I will have a look.
- Aashish
On Mon, Jun 26, 2017 at 3:01 PM Shixiong(Ryan) Zhu
wrote:
> It's moved to http://bahir.apache.org/
>
> You can find the documentation there.
>
> On Mon, Jun 26, 2017 at
Unfortunately, the way reduceByKeyAndWindow is implemented, it does iterate
through all the counts. To have something more efficient, you may have to
implement your own windowing logic using mapWithState. Something like

eventDStream.flatMap { event =>
  // find the windows each event maps to, and r
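A hedged sketch of how that might continue (Event, the durations, and the
counting logic are my own guesses, not the truncated original; eventDStream
is assumed to be a DStream[Event], and mapWithState needs ssc.checkpoint set):

import org.apache.spark.streaming.{State, StateSpec}

case class Event(key: String, timestamp: Long)

val windowMs = 60 * 60 * 1000L // 1-hour window (hypothetical)
val slideMs  = 5 * 60 * 1000L  // 5-minute slide (hypothetical)

// Start times (aligned to slideMs) of every window that contains ts;
// assumes windowMs is a multiple of slideMs.
def windowsFor(ts: Long): Seq[Long] = {
  val lastStart = ts - ts % slideMs
  (lastStart to (ts - windowMs + slideMs) by -slideMs).filter(_ >= 0)
}

// One running count per (key, windowStart): only keys present in the
// current batch are touched, unlike reduceByKeyAndWindow's full scan.
val updateCount = (key: (String, Long), one: Option[Long], state: State[Long]) => {
  val count = state.getOption.getOrElse(0L) + one.getOrElse(0L)
  state.update(count)
  (key, count)
}

// A real job would also add a StateSpec timeout to expire old windows.
val countsPerWindow = eventDStream
  .flatMap(e => windowsFor(e.timestamp).map(w => ((e.key, w), 1L)))
  .mapWithState(StateSpec.function(updateCount))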
Hi,
We use the reduceByKeyAndWindow-with-inverse-function feature in our Streaming
job to calculate rolling counts for the past hour and for the past 24 hours.
It seems that the functionality iterates over all the keys in the window even
though they are not present in the current batch, causing th
It's moved to http://bahir.apache.org/
You can find the documentation there.
On Mon, Jun 26, 2017 at 11:58 AM, Aashish Chaudhary <
aashish.chaudh...@kitware.com> wrote:
> Hi there,
>
> I am a beginner when it comes to Spark streaming. I was looking for some
> examples related to ZeroMQ and Spark and real
Hi there,
I am a beginner when it comes to Spark streaming. I was looking for some
examples related to ZeroMQ and Spark and realized that ZeroMQUtils is no
longer present in Spark 2.x.
I would appreciate it if someone could shed some light on the history and on
what I could do to use ZeroMQ with Spark St
Hi Swetha,
We dealt with this issue a couple of years ago and solved it. The key insight
here was that adding to a HashSet and removing from a HashSet are actually
not inverse operations of each other.
For example, if you added a key K1 in batch1 and then again added that same
key K1 durin
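To spell out that insight with a tiny sketch (my own illustration, continuing
the truncated example above):

val s1 = Set.empty[String] + "K1"   // batch 1 adds K1
val s2 = s1 + "K1"                  // batch 2 adds K1 again; still Set("K1")
val s3 = s2 - "K1"                  // "inverse" of batch 1 leaving the window
// s3 == Set(): the K1 from batch 2 was lost, so the removal over-corrected.
// Keeping multiplicities (a count per key) is invertible; a Set is not.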
First Spark project.
I have a Java method that returns a Dataset<Row>. I want to convert this to a
Dataset<StatusChangeDB>, where the object is named StatusChangeDB. I have
created a POJO, StatusChangeDB.java, and coded it with all the query objects
found in the mySQL table.
I then create an Encoder and convert the Datas
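A hedged sketch of the usual pattern (written in Scala, which the rest of
this thread uses; the JDBC options are hypothetical and StatusChangeDB is the
POJO described above), using a bean encoder to go from untyped rows to the
POJO type:

import org.apache.spark.sql.{Dataset, Encoders, Row}

val rows: Dataset[Row] = spark.read
  .format("jdbc")
  .option("url", "jdbc:mysql://localhost/mydb")  // hypothetical connection
  .option("dbtable", "status_change")            // hypothetical table
  .load()

// Column names must match the POJO's getter/setter names.
val typed: Dataset[StatusChangeDB] = rows.as(Encoders.bean(classOf[StatusChangeDB]))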
For SHC documentation, please refer to the README in the SHC GitHub repo,
which is kept up-to-date.
On Mon, Jun 26, 2017 at 5:46 AM, ayan guha wrote:
> Thanks all, I have found the correct version of the package. Probably the
> HDP documentation is a little behind.
>
> Best
> Ayan
>
> On Mon, 26 Jun 2017 at 2:16 pm,
Thanks all, I have found the correct version of the package. Probably the HDP
documentation is a little behind.
Best
Ayan
On Mon, 26 Jun 2017 at 2:16 pm, Mahesh Sawaiker <
mahesh_sawai...@persistent.com> wrote:
> Ayan,
>
> The location of the logging class was moved from Spark 1.6 to Spark 2.0.
>
> Looks
On 25 Jun 2017, at 20:57, kant kodali <kanth...@gmail.com> wrote:
impressive! I need to learn more about scala.
What I mean by stripping away the conditional check in Java is this:

static final boolean isLogInfoEnabled = false;

public void logMessage(String message) {
    if (isLogInfoEnabled)        // constant-folded away by the compiler/JIT
        log.info(message);       // hypothetical body; the original is truncated here
}
Hi, all!
I have code that serializes an RDD with Kryo and saves it as a sequence file.
It works fine in 1.5.1 but, after switching to 2.1.1, it does not work.
I am trying to serialize an RDD of Tuple2<> (obtained from a PairRDD).
1. RDD consists of different heterogeneous objects (aggregates, like
HLL, QTr
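For reference, a hedged sketch of the kind of Kryo-to-SequenceFile code I
mean (my own reconstruction, not the failing code itself; the RDD and the
path are hypothetical):

import org.apache.hadoop.io.{BytesWritable, NullWritable}
import org.apache.spark.serializer.KryoSerializer

val conf = sc.getConf
val rdd = sc.parallelize(Seq(("a", 1), ("b", 2)))  // stand-in for the real PairRDD

rdd.mapPartitions { iter =>
  // Build the serializer inside the task: serializer instances must not
  // be captured in the closure.
  val ser = new KryoSerializer(conf).newInstance()
  iter.map(t => (NullWritable.get(), new BytesWritable(ser.serialize(t).array())))
}.saveAsSequenceFile("/tmp/kryo-rdd")  // hypothetical path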