Hi Vino,

Yes, the job runs successfully; however, no checkpoints succeed. I will
update the source.
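
As a rough illustration of why the collection source matters here: the Job Manager message quoted further down ("... is not being executed at the moment. Aborting checkpoint.") suggests that a checkpoint is only started while every source task is still running. The sketch below is a toy model in plain Java, not Flink code. The class name, the enum, and the method `canTriggerCheckpoint` are hypothetical names chosen for illustration, and the real coordinator logic is more involved.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model (NOT Flink code) of the "is every source still running?" check. */
public class CheckpointTriggerModel {

    /** Simplified task states; this is an assumption, real Flink has more states. */
    enum TaskState { RUNNING, FINISHED }

    /** Returns true only if every source task is still running. */
    static boolean canTriggerCheckpoint(List<TaskState> sourceTasks) {
        for (TaskState state : sourceTasks) {
            if (state != TaskState.RUNNING) {
                // Models the "not being executed at the moment. Aborting checkpoint." case.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A fromCollection source emits its elements and then finishes,
        // usually long before the first checkpoint interval elapses.
        List<TaskState> withCollectionSource =
                Arrays.asList(TaskState.RUNNING, TaskState.FINISHED);

        // Unbounded sources (for example a Kafka consumer) keep running.
        List<TaskState> allUnbounded =
                Arrays.asList(TaskState.RUNNING, TaskState.RUNNING);

        System.out.println("with finished collection source: "
                + canTriggerCheckpoint(withCollectionSource));
        System.out.println("all sources running: "
                + canTriggerCheckpoint(allUnbounded));
    }
}
```

Under this model, replacing the "fromCollection" stream with a source that does not finish is what lets the check pass.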

Regards,
Vinay Patil


On Fri, Jul 27, 2018 at 2:00 PM vino yang <yanghua1...@gmail.com> wrote:

> Hi Vinay,
>
> Oh! You are using a collection source? That's the problem. Please use a general
> source like Kafka or others. Maybe your job had already stopped before a
> checkpoint could be triggered.
>
> Thanks, vino.
>
> 2018-07-27 16:07 GMT+08:00 Vinay Patil <vinay18.pa...@gmail.com>:
>
>> Hi Vino,
>>
>> Yes, I am enabling checkpointing in the code as follows:
>>
>> StreamExecutionEnvironment env =
>> StreamExecutionEnvironment.createRemoteEnvironment("<job_manager_host>", <job_manager_port>, getJobConfiguration(), jarPath);
>>
>>
>> env.enableCheckpointing(1000);
>>
>> env.setStateBackend(new
>> FsStateBackend("file:///<shared_mount_point_location>"));
>>
>> env.getCheckpointConfig().setMinPauseBetweenCheckpoints(1000);
>>
>>
>> In the getJobConfiguration method I have set HA-related properties such as
>> HA_STORAGE_PATH, HA_ZOOKEEPER_QUORUM, HA_ZOOKEEPER_ROOT, HA_MODE, HA_JOB_MANAGER_PORT_RANGE, HA_CLUSTER_ID.
>>
>>
>> I can see an error in the Job Manager logs saying "Collection Source
>> is not being executed at the moment. Aborting checkpoint." In the pipeline I
>> have a stream initialized using "fromCollection". I think I will have to
>> get rid of this.
>>
>> What do you suggest?
>>
>> Regards,
>> Vinay Patil
>>
>>
>> On Thu, Jul 26, 2018 at 12:04 PM vino yang <yanghua1...@gmail.com> wrote:
>>
>>> Hi Vinay:
>>>
>>> Did you call the checkpoint configuration API described in this
>>> documentation [1]?
>>>
>>> Can you share your job program and the JM log? Or does the JM log contain
>>> a message matching the pattern "Triggering checkpoint {} @ {} for job {}."?
>>>
>>> [1]:
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/state/checkpointing.html#enabling-and-configuring-checkpointing
>>>
>>> Thanks, vino.
>>>
>>> 2018-07-25 19:43 GMT+08:00 Chesnay Schepler <ches...@apache.org>:
>>>
>>>> Can you provide us with the job code?
>>>>
>>>> I assume that checkpointing runs properly if you submit the same job to
>>>> a normal cluster?
>>>>
>>>>
>>>> On 25.07.2018 13:15, Vinay Patil wrote:
>>>>
>>>> No error in the logs. That is why I am not able to understand why
>>>> checkpoints are not getting triggered.
>>>>
>>>> Regards,
>>>> Vinay Patil
>>>>
>>>>
>>>> On Wed, Jul 25, 2018 at 4:44 PM Vinay Patil <vinay18.pa...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Chesnay,
>>>>>
>>>>> No error in the logs. That is why I am not able to understand why
>>>>> checkpoints are not getting triggered.
>>>>>
>>>>> Regards,
>>>>> Vinay Patil
>>>>>
>>>>>
>>>>> On Wed, Jul 25, 2018 at 4:36 PM Chesnay Schepler <ches...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Please check the job- and taskmanager logs for anything suspicious.
>>>>>>
>>>>>> On 25.07.2018 12:33, Vinay Patil wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am starting the cluster using a bootstrap application in which I
>>>>>> call the Job Manager and Task Manager main classes to form the cluster.
>>>>>> The HA cluster is formed correctly and I am able to submit jobs to it
>>>>>> using RemoteExecutionEnvironment, but when I enable checkpointing in code
>>>>>> I do not see any checkpoints triggered in the Flink UI.
>>>>>>
>>>>>> Am I missing any configuration that must be set for the
>>>>>> RemoteExecutionEnvironment for checkpointing to work?
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Vinay Patil
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>
