Hi Burak,

That makes sense. So it boils down to this: the transformations actually run
when a later action triggers them, and that is when they can be restarted.
Thanks for your answers.
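
For the archive, here is how I now picture it, as a plain-Python sketch. This
is not Spark code: Accumulator, LazyMap, run_demo and the fail_partition
parameter are names I made up to simulate a lazy map, an action, and a task
that dies midway and is rerun. Real Spark scheduling is more involved.

```python
class Accumulator:
    """Toy stand-in for a shared counter (hypothetical, not Spark's class)."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n


class LazyMap:
    """Toy lazy transformation: stores the function, runs nothing yet."""

    def __init__(self, partitions, fn):
        self.partitions = partitions
        self.fn = fn

    def collect(self, fail_partition=None):
        """Toy action: one task per partition; a failed task is rerun once."""
        results = []
        for idx, part in enumerate(self.partitions):
            attempt = 0
            while True:
                attempt += 1
                out = []
                try:
                    for i, x in enumerate(part):
                        out.append(self.fn(x))
                        # Simulate losing the executor after the first
                        # record of the first attempt.
                        if idx == fail_partition and attempt == 1 and i == 0:
                            raise RuntimeError("executor lost")
                    break
                except RuntimeError:
                    # The partial output is discarded, but the accumulator
                    # update made by the failed attempt is not undone.
                    continue
            results.extend(out)
        return results


def run_demo():
    acc = Accumulator()

    def count_and_double(x):
        acc.add(1)  # side effect inside a transformation
        return x * 2

    mapped = LazyMap([[1, 2], [3, 4]], count_and_double)
    before = acc.value  # nothing has run yet: the map is lazy
    result = mapped.collect(fail_partition=1)
    return before, result, acc.value


before, result, after = run_demo()
print(before, result, after)
```

The accumulator stays at 0 until the action runs. The collected result then
holds four elements, but the accumulator ends at 5, because the record
processed by the failed attempt was counted again on the rerun.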

Best,
Wei

2015-06-24 15:06 GMT-07:00 Burak Yavuz <[email protected]>:

> Hi Wei,
>
> During the action, all the transformations before it run in order,
> leading up to the action. If you update an accumulator in any of these
> transformations, you won't get exactly-once semantics, because the
> transformation may be restarted elsewhere.
>
> Best,
> Burak
>
> On Wed, Jun 24, 2015 at 2:25 PM, Wei Zhou <[email protected]> wrote:
>
>> Hi Burak,
>>
>> Thanks for your quick reply. I guess what confuses me is that, because of
>> laziness, an accumulator won't be updated until an action is used, so a
>> transformation such as a map won't update the accumulator by itself. How,
>> then, would restarting the transformation end up updating the accumulator
>> more than once?
>>
>> Best,
>> Wei
>>
>> 2015-06-24 13:23 GMT-07:00 Burak Yavuz <[email protected]>:
>>
>>> Hi Wei,
>>>
>>> For example, when a straggler executor gets killed in the middle of a
>>> map operation and its task is restarted on a different instance, the
>>> accumulator will be updated more than once.
>>>
>>> Best,
>>> Burak
>>>
>>> On Wed, Jun 24, 2015 at 1:08 PM, Wei Zhou <[email protected]> wrote:
>>>
>>>> Quoting from the Spark Programming Guide:
>>>>
>>>> "For accumulator updates performed inside *actions only*, Spark
>>>> guarantees that each task’s update to the accumulator will only be applied
>>>> once, i.e. restarted tasks will not update the value. In transformations,
>>>> users should be aware of that each task’s update may be applied more than
>>>> once if tasks or job stages are re-executed."
>>>>
>>>> Can anyone give me a possible scenario in which an accumulator might be
>>>> updated more than once during a transformation? Thanks.
>>>>
>>>> Regards,
>>>> Wei
>>>>
>>>
>>>
>>
>
