Hi Vishal,

The znode /flink_test/da_15/leader/rest_server_lock should exist as long as
your Flink 1.5 cluster is running. In 1.4 this znode will not be created. Are
you sure that the znode does not exist? Unfortunately you only attached the
output of "ls /flink_test/da_15".
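
To make the path question concrete, here is a minimal sketch of how the
absolute znode path is put together. It assumes the path printed in the log
is relative to high-availability.zookeeper.path.root followed by
high-availability.cluster-id, which is what the absolute path above implies.
The helper name ha_znode_path is made up for illustration and is not part of
Flink.

```python
def ha_znode_path(root: str, cluster_id: str, relative: str) -> str:
    """Join ZooKeeper path segments into one absolute path.

    Hypothetical helper (not a Flink API). Leading and trailing
    slashes on each segment are tolerated and normalized away.
    """
    parts = [segment.strip("/") for segment in (root, cluster_id, relative)]
    return "/" + "/".join(part for part in parts if part)


if __name__ == "__main__":
    # Values taken from the configuration and log lines quoted below.
    print(ha_znode_path("/flink_test", "/da_15", "/leader/rest_server_lock"))
```

Running "ls /flink_test/da_15/leader" in the ZooKeeper CLI should then show
whether rest_server_lock is there.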

Can you share the complete JobManager log files from a cluster that is
(re-)starting?

Best,
Gary

On Thu, Jun 28, 2018 at 4:10 PM, Vishal Santoshi <[email protected]>
wrote:

> I am not seeing rest_server_lock. Is it transient (an ephemeral znode)
> that exists only for the duration of the CLI command?
>
>
> [zk: localhost:2181(CONNECTED) 2] ls /flink_test/da_15
>
> [jobgraphs, leader, checkpoints, leaderlatch, checkpoint-counter]
>
>
> The logs say
>
> 2018-06-28 14:02:56 INFO  ZooKeeperLeaderRetrievalService:100 - Starting
> ZooKeeperLeaderRetrievalService /leader/rest_server_lock.
>
> 2018-06-28 14:02:56 INFO  ZooKeeperLeaderRetrievalService:100 - Starting
> ZooKeeperLeaderRetrievalService /leader/dispatcher_lock.
>
> Is this a relative path, given the following settings?
>
> high-availability.zookeeper.path.root: /flink_test
>
> high-availability.cluster-id: /da_15
>
>
> I do not see /leader/rest_server_lock during the CLI run, nor before or
> after it.
>
> I am a little stumped. I do not see the above log lines on 1.4, so I am not
> sure whether /leader/rest_server_lock comes from the new code.
>
>
> On Thu, Jun 28, 2018 at 3:30 AM, Christophe Jolif <[email protected]>
> wrote:
>
>> Chesnay,
>>
>> Do you have a rough idea of the 1.5.1 timeline?
>>
>> Thanks,
>> --
>> Christophe
>>
>> On Mon, Jun 25, 2018 at 4:22 PM, Chesnay Schepler <[email protected]>
>> wrote:
>>
>>> The watermark issue is known and will be fixed in 1.5.1.
>>>
>>>
>>> On 25.06.2018 15:03, Vishal Santoshi wrote:
>>>
>>> Thank you....
>>>
>>> One addition
>>>
>>> I do not see WM info on the UI (attached).
>>>
>>> Is this a known issue? The same pipe on our production has the WM (in
>>> fact, we never had an issue with watermarks not appearing). Am I missing
>>> something?
>>>
>>> On Mon, Jun 25, 2018 at 4:15 AM, Fabian Hueske <[email protected]>
>>> wrote:
>>>
>>>> Hi Vishal,
>>>>
>>>> 1. I don't think a rolling update is possible. Flink 1.5.0 changed the
>>>> process orchestration and how the processes communicate. IMO, the way to
>>>> go is to start a Flink 1.5.0 cluster, take a savepoint of the running job,
>>>> start from the savepoint on the new cluster and shut the old job down.
>>>> 2. Savepoints should be compatible.
>>>> 3. You can keep the slot configuration as before.
>>>> 4. As I said before, mixing 1.5 and 1.4 processes does not work (or at
>>>> least, it was not considered a design goal and nobody checked whether
>>>> it is possible).
>>>>
>>>> Best, Fabian
>>>>
>>>>
>>>> 2018-06-23 13:38 GMT+02:00 Vishal Santoshi <[email protected]>:
>>>>
>>>>>
>>>>> 1.
>>>>> Has anyone done a rolling upgrade from 1.4 to 1.5, or is it even
>>>>> possible? I am not sure we can. It seems that the JM cannot recover jobs
>>>>> and fails with this exception:
>>>>>
>>>>> Caused by: java.io.InvalidClassException:
>>>>> org.apache.flink.runtime.jobgraph.tasks.CheckpointCoordinatorConfiguration;
>>>>> local class incompatible: stream classdesc serialVersionUID =
>>>>> -647384516034982626, local class serialVersionUID = 2
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2.
>>>>> Can an SP taken on 1.4 be resumed on 1.5 (pretty basic, but no harm in asking)?
>>>>>
>>>>>
>>>>>
>>>>> 3.
>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.5/release-notes/flink-1.5.html#update-configuration-for-reworked-job-deployment
>>>>> For taskmanager.numberOfTaskSlots: what would be the desired setting in a
>>>>> standalone (non-Mesos/YARN) cluster?
>>>>>
>>>>>
>>>>> 4. I suspended all jobs and installed 1.5 on the JM (the TMs are still
>>>>> running 1.4). The JM refuses to start with:
>>>>>
>>>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]:
>>>>> 2018-06-23 11:34:23 ERROR JobManager:116 - Failed to recover job
>>>>> 454cd84a519f3b50e88bcb378d8a1330.
>>>>>
>>>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]:
>>>>> java.lang.InstantiationError: org.apache.flink.runtime.blob.BlobKey
>>>>>
>>>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: at
>>>>> sun.reflect.GeneratedSerializationConstructorAccessor51.newInstance(Unknown
>>>>> Source)
>>>>>
>>>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: at
>>>>> java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>>>>>
>>>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: at
>>>>> java.io.ObjectStreamClass.newInstance(ObjectStreamClass.java:1079)
>>>>>
>>>>> Jun
>>>>> .....
>>>>>
>>>>>
>>>>>
>>>>> Any feedback would be highly appreciated...
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>> --
>> Christophe
>>
>
>
