I think what's weird is that none of the three stages (alignment, sync
checkpoint, async checkpoint) takes much time.
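
To put rough numbers on that, here is a small sketch of the arithmetic. The values and field names below are made up for illustration, not taken from Julio's screenshot; the only assumption is that the end-to-end duration of a subtask covers everything from trigger to acknowledgement, so whatever the three stages do not explain is time spent elsewhere.

```python
def unaccounted_ms(end_to_end_ms, alignment_ms, sync_ms, async_ms):
    """Return the part of the end-to-end time not covered by the three stages."""
    return end_to_end_ms - (alignment_ms + sync_ms + async_ms)


# Hypothetical values for the slow subtask, in milliseconds.
slow_subtask = {
    "end_to_end_ms": 55 * 60 * 1000,  # the 55 minutes reported in the thread
    "alignment_ms": 2_000,            # assumed
    "sync_ms": 500,                   # assumed
    "async_ms": 30_000,               # assumed
}

gap = unaccounted_ms(**slow_subtask)
share = gap / slow_subtask["end_to_end_ms"]
print(f"unaccounted: {gap} ms ({share:.1%} of end-to-end)")
```

If the real numbers look anything like this, nearly all of the time falls outside the three stages, which is why the per-stage durations alone do not explain the slow subtask.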

On Tue, Sep 18, 2018 at 3:20 PM Till Rohrmann <trohrm...@apache.org> wrote:

> This behavior seems very odd, Julio. Could you share the debug logs
> of all Flink processes so we can see why things are taking so long?
>
> The checkpoint size of task #8 is twice as big as the second biggest
> checkpoint, but this should not increase the checkpoint time by a
> factor of 8.
>
> Cheers,
> Till
>
> On Mon, Sep 17, 2018 at 5:25 AM Renjie Liu <liurenjie2...@gmail.com>
> wrote:
>
>> Hi, Julio:
>> Does this happen frequently? Which state backend do you use? The async
>> and sync checkpoint durations seem normal compared to the others, so it
>> seems that most of the time is spent acking the checkpoint.
>>
>> On Sun, Sep 16, 2018 at 9:24 AM vino yang <yanghua1...@gmail.com> wrote:
>>
>>> Hi Julio,
>>>
>>> Yes, fifty-five minutes does seem really long.
>>> However, it scales linearly with the time and size of the previous task
>>> adjacent to it in the diagram.
>>> I think your real concern is why Flink accesses HDFS so slowly.
>>> You can enable DEBUG logging to see if you can find any clues, or post the
>>> log to the mailing list so others can help analyze the problem.
>>>
>>> Thanks, vino.
>>>
>>> Julio Biason <julio.bia...@azion.com> wrote on Sat, Sep 15, 2018 at 7:03 AM:
>>>
>>>> (Just an addendum: Although it's not a huge problem -- we can always
>>>> increase the checkpoint timeout -- this anomalous situation makes me
>>>> think there is something wrong in our pipeline or in our cluster, and that
>>>> this is what is making the checkpoint creation go crazy.)
>>>>
>>>> On Fri, Sep 14, 2018 at 8:00 PM, Julio Biason <julio.bia...@azion.com>
>>>> wrote:
>>>>
>>>>> Hey guys,
>>>>>
>>>>> On our pipeline, we have a single slot that is taking longer to
>>>>> create the checkpoint than the other slots, and we are wondering what
>>>>> could be causing it.
>>>>>
>>>>> The operator in question is the window metric -- the only element in
>>>>> the pipeline that actually uses state. While the other slots take 7
>>>>> minutes to create the checkpoint, this one -- and only this one -- takes
>>>>> 55 minutes.
>>>>>
>>>>> Is there something I should look at to understand what's going on?
>>>>>
>>>>> (We are storing all checkpoints in HDFS, in case that helps.)
>>>>>
>>>>> --
>>>>> *Julio Biason*, Software Engineer
>>>>> *AZION*  |  Deliver. Accelerate. Protect.
>>>>> Office: +55 51 3083 8101  |  Mobile: +55 51 99907 0554
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> *Julio Biason*, Software Engineer
>>>> *AZION*  |  Deliver. Accelerate. Protect.
>>>> Office: +55 51 3083 8101  |  Mobile: +55 51 99907 0554
>>>>
>>> --
>> Liu, Renjie
>> Software Engineer, MVAD
>>
> --
Liu, Renjie
Software Engineer, MVAD
