OK, a few things:

2018-06-26 13:31:29 INFO  CliFrontend:282 -  Starting Command Line Client
(Version: 1.5.0, Rev:c61b108, Date:24.05.2018 @ 14:54:44 UTC)

...

2018-06-26 13:31:31 INFO  ClientCnxn:876 - Socket connection established to
zk-f1fb95b9.bf2.tumblr.net/10.246.218.17:2181, initiating session

2018-06-26 13:31:31 DEBUG ClientCnxn:949 - Session establishment request
sent on zk-f1fb95b9.bf2.tumblr.net/10.246.218.17:2181

2018-06-26 13:31:31 INFO  ClientCnxn:1299 - Session establishment complete
on server zk-f1fb95b9.bf2.tumblr.net/10.246.218.17:2181, sessionid =
0x35add547801ea07, negotiated timeout = 40000

2018-06-26 13:31:31 INFO  RestClient:119 - Rest client endpoint started.

2018-06-26 13:31:31 INFO  ZooKeeperLeaderRetrievalService:100 - Starting
ZooKeeperLeaderRetrievalService /leader/rest_server_lock.

2018-06-26 13:31:31 DEBUG ClientCnxn:843 - Reading reply
sessionid:0x35add547801ea07, packet:: clientPath:null serverPath:null
finished:false header:: 1,3  replyHeader:: 1,60416530560,0  request::
'/flink_test,F  response::
s{47265479496,47265479496,1489163688703,1489163688703,0,2,0,0,0,2,60416492885}


2018-06-26 13:31:31 DEBUG ClientCnxn:843 - Reading reply
sessionid:0x35add547801ea07, packet:: clientPath:null serverPath:null
finished:false header:: 2,3  replyHeader:: 2,60416530560,0  request::
'/flink_test/da_15,F  response::
s{60416492885,60416492885,1529755199131,1529755199131,0,5,0,0,0,5,60416521584}


2018-06-26 13:31:31 INFO  ZooKeeperLeaderRetrievalService:100 - Starting
ZooKeeperLeaderRetrievalService /leader/dispatcher_lock.

2018-06-26 13:31:31 DEBUG ClientCnxn:843 - Reading reply
sessionid:0x35add547801ea07, packet:: clientPath:null serverPath:null
finished:false header:: 3,3  replyHeader:: 3,60416530560,0  request::
'/flink_test,F  response::
s{47265479496,47265479496,1489163688703,1489163688703,0,2,0,0,0,2,60416492885}


2018-06-26 13:31:31 DEBUG ClientCnxn:843 - Reading reply
sessionid:0x35add547801ea07, packet:: clientPath:null serverPath:null
finished:false header:: 4,3  replyHeader:: 4,60416530560,0  request::
'/flink_test/da_15,F  response::
s{60416492885,60416492885,1529755199131,1529755199131,0,5,0,0,0,5,60416521584}


2018-06-26 13:31:31 DEBUG ClientCnxn:843 - Reading reply
sessionid:0x35add547801ea07, packet:: clientPath:null serverPath:null
finished:false header:: 5,3  replyHeader:: 5,60416530560,0  request::
'/flink_test/da_15/leader,F  response::
s{60416492887,60416492887,1529755199191,1529755199191,0,1,0,0,0,1,60416492888}


2018-06-26 13:31:31 DEBUG ClientCnxn:843 - Reading reply
sessionid:0x35add547801ea07, packet::
clientPath:/flink_test/da_15/leader/rest_server_lock
serverPath:/flink_test/da_15/leader/rest_server_lock finished:false
header:: 6,3  replyHeader:: 6,60416530560,-101  request::
'/flink_test/da_15/leader/rest_server_lock,T  response::

2018-06-26 13:31:31 DEBUG ClientCnxn:843 - Reading reply
sessionid:0x35add547801ea07, packet:: clientPath:null serverPath:null
finished:false header:: 7,3  replyHeader:: 7,60416530560,0  request::
'/flink_test,F  response::
s{47265479496,47265479496,1489163688703,1489163688703,0,2,0,0,0,2,60416492885}


2018-06-26 13:31:31 DEBUG ClientCnxn:843 - Reading reply
sessionid:0x35add547801ea07, packet:: clientPath:null serverPath:null
finished:false header:: 8,3  replyHeader:: 8,60416530560,0  request::
'/flink_test/da_15,F  response::
s{60416492885,60416492885,1529755199131,1529755199131,0,5,0,0,0,5,60416521584}


2018-06-26 13:31:31 DEBUG ClientCnxn:843 - Reading reply
sessionid:0x35add547801ea07, packet:: clientPath:null serverPath:null
finished:false header:: 9,3  replyHeader:: 9,60416530560,0  request::
'/flink_test/da_15/leader,F  response::
s{60416492887,60416492887,1529755199191,1529755199191,0,1,0,0,0,1,60416492888}


2018-06-26 13:31:31 DEBUG ClientCnxn:843 - Reading reply
sessionid:0x35add547801ea07, packet::
clientPath:/flink_test/da_15/leader/dispatcher_lock
serverPath:/flink_test/da_15/leader/dispatcher_lock finished:false header::
10,3  replyHeader:: 10,60416530560,-101  request::
'/flink_test/da_15/leader/dispatcher_lock,T  response::

2018-06-26 13:31:31 INFO  CliFrontend:914 - Waiting for response...

Waiting for response...

2018-06-26 13:31:44 DEBUG ClientCnxn:742 - Got ping response for sessionid:
0x35add547801ea07 after 0ms

2018-06-26 13:31:58 DEBUG ClientCnxn:742 - Got ping response for sessionid:
0x35add547801ea07 after 0ms

2018-06-26 13:32:01 INFO  RestClient:123 - Shutting down rest endpoint.

2018-06-26 13:32:01 INFO  RestClient:140 - Rest endpoint shutdown complete.

2018-06-26 13:32:01 INFO  ZooKeeperLeaderRetrievalService:117 - Stopping
ZooKeeperLeaderRetrievalService /leader/rest_server_lock.

2018-06-26 13:32:01 INFO  ZooKeeperLeaderRetrievalService:117 - Stopping
ZooKeeperLeaderRetrievalService /leader/dispatcher_lock.

2018-06-26 13:32:01 DEBUG CuratorFrameworkImpl:282 - Closing

2018-06-26 13:32:01 INFO  CuratorFrameworkImpl:821 -
backgroundOperationsLoop exiting

2018-06-26 13:32:01 DEBUG CuratorZookeeperClient:199 - Closing

2018-06-26 13:32:01 DEBUG ConnectionState:115 - Closing

2018-06-26 13:32:01 DEBUG ZooKeeper:673 - Closing session: 0x35add547801ea07

2018-06-26 13:32:01 DEBUG ClientCnxn:1370 - Closing client for session:
0x35add547801ea07

2018-06-26 13:32:01 DEBUG ClientCnxn:843 - Reading reply
sessionid:0x35add547801ea07, packet:: clientPath:null serverPath:null
finished:false header:: 11,-11  replyHeader:: 11,60416530561,0  request::
null response:: null

2018-06-26 13:32:01 DEBUG ClientCnxn:1354 - Disconnecting client for
session: 0x35add547801ea07

2018-06-26 13:32:01 INFO  ZooKeeper:684 - Session: 0x35add547801ea07 closed

2018-06-26 13:32:01 INFO  ClientCnxn:519 - EventThread shut down for
session: 0x35add547801ea07

2018-06-26 13:32:01 DEBUG ClientCnxn:1146 - An exception was thrown while
closing send thread for session 0x35add547801ea07 : Unable to read
additional data from server sessionid 0x35add547801ea07, likely server has
closed socket

2018-06-26 13:32:01 ERROR CliFrontend:891 - Error while running the command.

org.apache.flink.util.FlinkException: Failed to retrieve job list.

at org.apache.flink.client.cli.CliFrontend.listJobs(CliFrontend.java:429)

at
org.apache.flink.client.cli.CliFrontend.lambda$list$0(CliFrontend.java:412)


On Tue, Jun 26, 2018 at 5:43 AM, zhangminglei <18717838...@163.com> wrote:

> By the way, this is in an HA setup.
>
> On Jun 26, 2018, at 5:39 PM, zhangminglei <18717838...@163.com> wrote:
>
> Hi, Gary Yao
>
> I once found that the IP address [jobmanager.rpc.address] had changed
> from 10.208.73.129 to localhost. I think that could cause this issue.
> What do you think?
>
> Cheers
> Minglei
>
> On Jun 26, 2018, at 4:53 PM, Gary Yao <g...@data-artisans.com> wrote:
>
> Hi Vishal,
>
> Could it be that you are not using the 1.5.0 client? The stacktrace you
> posted does not reference valid lines of code in the release-1.5.0-rc6 tag.
>
> If you have an HA setup, the host and port of the leading JM will be looked
> up from ZooKeeper before job submission. Therefore, the flink-conf.yaml
> used by the client must have the same ZooKeeper configuration as used by
> the Flink cluster.
>
> Best,
> Gary
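
One way to check the point above is to compare the high-availability keys of the two flink-conf.yaml files. Below is a minimal Python sketch, assuming both files use plain `key: value` lines. The function names are hypothetical. The values in the usage note are assumptions: the ZooKeeper hostname is invented, and /flink_test and /da_15 are only inferred from the znode paths in the log at the top of this thread.

```python
def ha_settings(conf_text):
    """Collect the high-availability* keys from flink-conf.yaml style text."""
    settings = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if ":" not in line:
            continue
        key, value = line.split(":", 1)  # split once: values may contain ':'
        if key.strip().startswith("high-availability"):
            settings[key.strip()] = value.strip()
    return settings

def ha_mismatches(client_conf, cluster_conf):
    """Return {key: (client_value, cluster_value)} for every HA key that differs."""
    client, cluster = ha_settings(client_conf), ha_settings(cluster_conf)
    return {
        key: (client.get(key), cluster.get(key))
        for key in sorted(set(client) | set(cluster))
        if client.get(key) != cluster.get(key)
    }
```

For example, a client file with `high-availability.cluster-id: /default` compared against a cluster file with `high-availability.cluster-id: /da_15` (and otherwise identical `high-availability.zookeeper.quorum: zk-1.example.com:2181` and `high-availability.zookeeper.path.root: /flink_test` entries) would report only the cluster-id key. An empty result means the HA settings agree.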
>
> On Mon, Jun 25, 2018 at 5:32 PM, Vishal Santoshi <
> vishal.santo...@gmail.com> wrote:
>
>> I think all I need to add is
>>
>> web.port: 8081
>> rest.port: 8081
>>
>> to the JM flink conf ?
>>
>> On Mon, Jun 25, 2018 at 10:46 AM, Vishal Santoshi <
>> vishal.santo...@gmail.com> wrote:
>>
>>> Another issue I saw with flink cli...
>>>
>>> org.apache.flink.client.program.ProgramInvocationException: The program
>>> execution failed: JobManager did not respond within 120000 ms
>>> at org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:524)
>>> at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:103)
>>> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:456)
>>> at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
>>> at org.apach
>>>
>>> This was a simple submission, and it does succeed through the UI.
>>>
>>> Has there been a regression in the CLI? I could not find any documentation
>>> about it.
>>>
>>> I have an HA JM setup.
>>>
>>>
>>>
>>>
>>> On Mon, Jun 25, 2018 at 10:22 AM, Chesnay Schepler <ches...@apache.org>
>>> wrote:
>>>
>>>> The watermark issue is known and will be fixed in 1.5.1.
>>>>
>>>>
>>>> On 25.06.2018 15:03, Vishal Santoshi wrote:
>>>>
>>>> Thank you....
>>>>
>>>> One addition
>>>>
>>>> I do not see watermark (WM) info in the UI (attached).
>>>>
>>>> Is this a known issue? The same pipeline in our production has the WM (in
>>>> fact, we never had an issue with watermarks not appearing). Am I missing
>>>> something?
>>>>
>>>> On Mon, Jun 25, 2018 at 4:15 AM, Fabian Hueske <fhue...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Vishal,
>>>>>
>>>>> 1. I don't think a rolling update is possible. Flink 1.5.0 changed the
>>>>> process orchestration and how the processes communicate. IMO, the way to
>>>>> go is to start a Flink 1.5.0 cluster, take a savepoint of the running job,
>>>>> start from the savepoint on the new cluster, and shut the old job down.
>>>>> 2. Savepoints should be compatible.
>>>>> 3. You can keep the slot configuration as before.
>>>>> 4. As I said before, mixing 1.5 and 1.4 processes does not work (or at
>>>>> least, it was not considered a design goal, and nobody checked whether it
>>>>> is possible).
>>>>>
>>>>> Best, Fabian
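
The migration order in point 1 above can be sketched as three Flink CLI calls: `savepoint`, `run -s`, and `cancel`. Below is a minimal Python sketch that only builds the command lines and does not run anything. The function name, the install paths, and the example arguments are hypothetical. It also assumes each client is configured to talk to its own cluster.

```python
def migration_commands(job_id, savepoint_dir, savepoint_path, jar,
                       old_flink="flink-1.4/bin/flink",
                       new_flink="flink-1.5.0/bin/flink"):
    """Build the CLI calls for a savepoint-based move to a new cluster.

    savepoint_path is the path printed by the first command, so in practice
    it is only known after step 1 has completed.
    """
    return [
        # 1. Take a savepoint of the job running on the old cluster.
        [old_flink, "savepoint", job_id, savepoint_dir],
        # 2. Start the job on the new cluster from that savepoint (detached).
        [new_flink, "run", "-d", "-s", savepoint_path, jar],
        # 3. Shut the old job down once the new one is running.
        [old_flink, "cancel", job_id],
    ]
```

Each list can be passed to `subprocess.run(cmd, check=True)` one at a time. Running them manually and checking the new job in the UI before step 3 is the more cautious route.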
>>>>>
>>>>>
>>>>> 2018-06-23 13:38 GMT+02:00 Vishal Santoshi <vishal.santo...@gmail.com>:
>>>>>
>>>>>>
>>>>>> 1.
>>>>>> Can anyone do, or has anyone done, a rolling upgrade from 1.4 to 1.5? I am
>>>>>> not sure we can. It seems that the JM cannot recover jobs and fails with
>>>>>> this exception:
>>>>>>
>>>>>> Caused by: java.io.InvalidClassException:
>>>>>> org.apache.flink.runtime.jobgraph.tasks.CheckpointCoordinatorConfiguration;
>>>>>> local class incompatible: stream classdesc serialVersionUID =
>>>>>> -647384516034982626, local class serialVersionUID = 2
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2.
>>>>>> Does a savepoint (SP) taken on 1.4 resume on 1.5 (pretty basic, but no
>>>>>> harm asking)?
>>>>>>
>>>>>>
>>>>>>
>>>>>> 3.
>>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.5/release-notes/flink-1.5.html#update-configuration-for-reworked-job-deployment
>>>>>> The taskmanager.numberOfTaskSlots: what would be the desired setting in a
>>>>>> standalone (non-Mesos/YARN) cluster?
>>>>>>
>>>>>>
>>>>>> 4. I suspended all jobs and installed 1.5 on the JM (the TMs are still
>>>>>> running 1.4). The JM refuses to start with
>>>>>>
>>>>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]:
>>>>>> 2018-06-23 11:34:23 ERROR JobManager:116 - Failed to recover job
>>>>>> 454cd84a519f3b50e88bcb378d8a1330.
>>>>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]:
>>>>>> java.lang.InstantiationError: org.apache.flink.runtime.blob.BlobKey
>>>>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: at
>>>>>> sun.reflect.GeneratedSerializationConstructorAccessor51.newInstance(Unknown
>>>>>> Source)
>>>>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: at
>>>>>> java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>>>>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: at
>>>>>> java.io.ObjectStreamClass.newInstance(ObjectStreamClass.java:1079)
>>>>>> Jun
>>>>>> .....
>>>>>>
>>>>>>
>>>>>>
>>>>>> Any feedback would be highly appreciated...
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
>
>
