From: Haopu Wang
Date: 2015-06-19 18:47
To: Enno Shioji; Tathagata Das
Cc: prajod.vettiyat...@wipro.com; Cody Koeninger; bit1...@163.com;
Jordan Pilat; Will Briggs; Ashish Soni; ayan guha;
user@spark.apache.org; Sateesh Kavuri; Spark Enthusiast; Sabarish
Sasidharan
Subject: RE: RE: Spark or Storm

My question is not directly related: about the "exactly-once" semantics, the
document (copied below) said Spark Streaming gives exactly-once semantics,
but actually from my test result, with checkpointing enabled, ...
From: Enno Shioji
To: Tathagata Das
Cc: prajod.vettiyat...@wipro.com; Cody Koeninger; bit1...@163.com;
Jordan Pilat; Will Briggs; Ashish Soni; ayan guha;
user@spark.apache.org; Sateesh Kavuri; Spark Enthusiast; Sabarish
Sasidharan
Subject: Re: RE: Spark or Storm

Fair enough, on second thought, just saying that it should be idempotent ...
> ... use of checkpoints to persist the Kafka offsets in Spark Streaming
> itself, and not in ZooKeeper.
>
> Also this statement: ".. This allows one to build a Spark Streaming +
> Kafka pipelines with end-to-end exactly-once semantics (if ... idempotent
> or transactional)."
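For concreteness, a minimal sketch of the setup that statement describes
(direct stream, offsets checkpointed by Spark Streaming itself rather than
ZooKeeper). It assumes the Spark 1.x spark-streaming-kafka direct API; the
broker address, topic name, and checkpoint path are placeholders:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object DirectStreamSketch {
  val checkpointDir = "hdfs:///tmp/direct-stream-checkpoint"  // placeholder path

  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("direct-stream-sketch")
    val ssc = new StreamingContext(conf, Seconds(5))
    // Offsets are tracked by Spark Streaming itself and saved in the
    // checkpoint directory, not in ZooKeeper.
    ssc.checkpoint(checkpointDir)

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("events"))

    stream.map(_._2).count().print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // On restart, getOrCreate rebuilds the context from the checkpoint,
    // including the offset ranges of unfinished batches, which are replayed.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}

Note that the replay on restart is exactly why the "(if ... idempotent or
transactional)" condition matters: the data is re-read, so the writes must
tolerate repetition.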
From: Cody Koeninger [mailto:c...@koeninger.org]
Sent: 18 June 2015 19:38
To: bit1...@163.com
Cc: Prajod S Vettiyattil (WT01 - BAS); jrpi...@gmail.com;
eshi...@gmail.com; wrbri...@gmail.com; asoni.le...@gmail.com; ayan guha;
user; sateesh.kav...@gmail.com; sparkenthusi...@yahoo.in;
sabarish.sasidha...@manthan.com
Subject: Re: RE: Spark or Storm
That general description is accurate, but not really a specific issue of
the direct stream. It applies to anything consuming from Kafka (or, as
Matei already said, any streaming system really). You can't have exactly-once
semantics unless you know something more about how you're storing results.
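As an illustration of "knowing something more about how you're storing
results", here is a hypothetical sketch (not from this thread) of one such
strategy: derive each record's key from its Kafka offset range, so replaying
a batch overwrites rows instead of appending duplicates. upsert() is an
invented placeholder for an insert-or-replace against your own store:

import org.apache.spark.TaskContext
import org.apache.spark.streaming.dstream.DStream
import org.apache.spark.streaming.kafka.{HasOffsetRanges, OffsetRange}

object IdempotentSink {
  // Hypothetical placeholder: an insert-or-replace against your own store.
  def upsert(key: (String, Int, Long, Int), value: String): Unit = ???

  // Pass the stream returned by createDirectStream directly; after any
  // transformation the HasOffsetRanges cast below no longer works.
  def save(stream: DStream[(String, String)]): Unit = {
    stream.foreachRDD { rdd =>
      val ranges: Array[OffsetRange] = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      rdd.foreachPartition { iter =>
        // Each partition of a direct-stream RDD corresponds to exactly one
        // Kafka offset range.
        val range = ranges(TaskContext.get.partitionId)
        iter.zipWithIndex.foreach { case ((_, value), i) =>
          // The key is deterministic across replays of the same offset range,
          // so re-running the batch overwrites rows instead of appending.
          upsert((range.topic, range.partition, range.fromOffset, i), value)
        }
      }
    }
  }
}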
I am wondering how the direct stream API ensures end-to-end exactly-once
semantics. I think there are two things involved:
1. From the Spark Streaming end, the driver will replay the offset range when
it's down and restarted, which means that the new tasks will process some
already-processed data.
2. ...
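Regarding point 1, one common answer is transactional output: commit the
batch's results together with the offset ranges they cover in a single
transaction, and skip any range that was already committed. A hypothetical
sketch follows; the Database trait and its two methods are invented
placeholders, not a real API:

import org.apache.spark.streaming.dstream.DStream
import org.apache.spark.streaming.kafka.{HasOffsetRanges, OffsetRange}

// Invented placeholder: stands in for your own JDBC or transactional
// store layer.
trait Database {
  def alreadyCommitted(ranges: Seq[OffsetRange]): Boolean
  def commitAtomically(results: collection.Map[String, Long],
                       ranges: Seq[OffsetRange]): Unit
}

object TransactionalSink {
  // Pass the stream returned by createDirectStream directly, or the
  // HasOffsetRanges cast fails.
  def save(stream: DStream[(String, String)], db: Database): Unit = {
    stream.foreachRDD { rdd =>
      val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges.toSeq
      // countByValue() collects this batch's counts to the driver.
      val results = rdd.map(_._2).countByValue()
      if (!db.alreadyCommitted(ranges)) {
        // Results and offsets are written in one transaction: when the
        // restarted driver replays an already-committed range, the check
        // above skips it, so nothing is duplicated.
        db.commitAtomically(results, ranges)
      }
    }
  }
}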