By the way, this was in an HA setup.
> On Jun 26, 2018, at 5:39 PM, zhangminglei <18717838...@163.com> wrote:
>
> Hi, Gary Yao
>
> I once found that the IP address [ jobmanager.rpc.address ] had changed, from
> 10.208.73.129 to localhost. I think that would cause the issue. What do you
> think?
>
> Cheers
> Minglei
>
>> On Jun 26, 2018, at 4:53 PM, Gary Yao <g...@data-artisans.com
>> <mailto:g...@data-artisans.com>> wrote:
>>
>> Hi Vishal,
>>
>> Could it be that you are not using the 1.5.0 client? The stacktrace you posted
>> does not reference valid lines of code in the release-1.5.0-rc6 tag.
>>
>> If you have an HA setup, the host and port of the leading JM will be looked up
>> from ZooKeeper before job submission. Therefore, the flink-conf.yaml used by
>> the client must have the same ZooKeeper configuration as used by the Flink
>> cluster.
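
One way to check the point above is to diff the HA-related keys of the client's flink-conf.yaml against the cluster's. The following is only a minimal sketch: the key list is an assumption chosen for illustration (extend it to whatever your setup uses), the helper names are hypothetical, and the parser only handles flat `key: value` lines with `#` comments.

```python
# Sketch: report HA-related settings that differ between two flink-conf.yaml
# files. HA_KEYS is an illustrative selection, not an exhaustive list.
HA_KEYS = [
    "high-availability",
    "high-availability.zookeeper.quorum",
    "high-availability.zookeeper.path.root",
    "high-availability.cluster-id",
]


def parse_flink_conf(text):
    """Parse flat 'key: value' lines, ignoring blanks and '#' comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        # Split on the first ':' only, so values such as 'zk1:2181' survive.
        key, sep, value = line.partition(":")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def ha_mismatches(client_text, cluster_text, keys=HA_KEYS):
    """Return {key: (client_value, cluster_value)} for every differing key."""
    client = parse_flink_conf(client_text)
    cluster = parse_flink_conf(cluster_text)
    return {
        key: (client.get(key), cluster.get(key))
        for key in keys
        if client.get(key) != cluster.get(key)
    }
```

Feed it the contents of both files, e.g. `ha_mismatches(open(client_path).read(), open(cluster_path).read())`, where both paths are placeholders. An empty result means the listed keys agree. A key missing on one side shows up with `None` for that side.
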
>>
>> Best,
>> Gary
>>
>> On Mon, Jun 25, 2018 at 5:32 PM, Vishal Santoshi <vishal.santo...@gmail.com
>> <mailto:vishal.santo...@gmail.com>> wrote:
>> I think all I need to add is
>>
>> web.port: 8081
>> rest.port: 8081
>>
>> to the JM flink conf?
>>
>> On Mon, Jun 25, 2018 at 10:46 AM, Vishal Santoshi <vishal.santo...@gmail.com
>> <mailto:vishal.santo...@gmail.com>> wrote:
>> Another issue I saw with flink cli...
>>
>> org.apache.flink.client.program.ProgramInvocationException: The program execution failed: JobManager did not respond within 120000 ms
>>     at org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:524)
>>     at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:103)
>>     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:456)
>>     at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
>> at org.apach
>>
>> This was a simple submission and it does succeed through the UI.
>>
>> Has there been a regression in the CLI? I could not find any documentation
>> on it.
>>
>> I have an HA JM setup.
>>
>>
>>
>>
>> On Mon, Jun 25, 2018 at 10:22 AM, Chesnay Schepler <ches...@apache.org
>> <mailto:ches...@apache.org>> wrote:
>> The watermark issue is known and will be fixed in 1.5.1.
>>
>>
>> On 25.06.2018 15:03, Vishal Santoshi wrote:
>>> Thank you....
>>>
>>> One addition
>>>
>>> I do not see WM info on the UI ( Attached )
>>>
>>> Is this a known issue? The same pipeline in our production has the WM (in
>>> fact, we never had an issue with watermarks not appearing). Am I missing
>>> something?
>>>
>>> On Mon, Jun 25, 2018 at 4:15 AM, Fabian Hueske <fhue...@gmail.com
>>> <mailto:fhue...@gmail.com>> wrote:
>>> Hi Vishal,
>>>
>>> 1. I don't think a rolling update is possible. Flink 1.5.0 changed the
>>> process orchestration and how the processes communicate. IMO, the way to go
>>> is to start a Flink 1.5.0 cluster, take a savepoint of the running job, start
>>> from the savepoint on the new cluster and shut the old job down.
>>> 2. Savepoints should be compatible.
>>> 3. You can keep the slot configuration as before.
>>> 4. As I said before, mixing 1.5 and 1.4 processes does not work (or at
>>> least, it was not considered a design goal and nobody checked whether
>>> it is possible).
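
The migration described in point 1 could be scripted roughly as below. This is a sketch under assumptions: the helper names and every path passed to them are hypothetical, and it only builds argument lists for the `flink` CLI subcommands `savepoint`, `run -s` and `cancel` without executing anything. The savepoint path is whatever the `savepoint` command reports, so the steps have to be run one at a time.

```python
# Sketch: build (not run) the CLI invocations for a savepoint-based move
# from the old cluster to the new one. All arguments are placeholders.

def savepoint_cmd(old_flink, job_id, savepoint_dir):
    """Step 1: take a savepoint of the job running on the old cluster."""
    return [old_flink, "savepoint", job_id, savepoint_dir]


def resume_cmd(new_flink, savepoint_path, job_jar):
    """Step 2: start the job on the new cluster from that savepoint."""
    # -d submits in detached mode; -s names the savepoint to restore from.
    return [new_flink, "run", "-d", "-s", savepoint_path, job_jar]


def cancel_cmd(old_flink, job_id):
    """Step 3: shut the old job down once the new one is running."""
    return [old_flink, "cancel", job_id]
```

Each list can be handed to `subprocess.run(cmd, check=True)`. Keeping the old and new `flink` binaries as separate arguments makes it explicit which client talks to which cluster, which matters here because each client reads its own flink-conf.yaml.
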
>>>
>>> Best, Fabian
>>>
>>>
>>> 2018-06-23 13:38 GMT+02:00 Vishal Santoshi <vishal.santo...@gmail.com
>>> <mailto:vishal.santo...@gmail.com>>:
>>>
>>> 1.
>>> Has anyone done a rolling upgrade from 1.4 to 1.5, and is it even possible? I
>>> am not sure we can. It seems that the JM cannot recover jobs, failing with
>>> this exception:
>>>
>>> Caused by: java.io.InvalidClassException:
>>> org.apache.flink.runtime.jobgraph.tasks.CheckpointCoordinatorConfiguration;
>>> local class incompatible: stream classdesc serialVersionUID =
>>> -647384516034982626, local class serialVersionUID = 2
>>>
>>>
>>>
>>> 2.
>>> Does a savepoint (SP) taken on 1.4 resume on 1.5 (pretty basic, but no harm
>>> in asking)?
>>>
>>>
>>>
>>> 3.
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.5/release-notes/flink-1.5.html#update-configuration-for-reworked-job-deployment
>>> Regarding taskmanager.numberOfTaskSlots: what would be the desired setting
>>> in a standalone (non-Mesos/YARN) cluster?
>>>
>>>
>>> 4. I suspended all jobs and installed 1.5 on the JM (the TMs are still
>>> running 1.4). The JM refuses to start, with:
>>>
>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: 2018-06-23 11:34:23 ERROR JobManager:116 - Failed to recover job 454cd84a519f3b50e88bcb378d8a1330.
>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: java.lang.InstantiationError: org.apache.flink.runtime.blob.BlobKey
>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: at sun.reflect.GeneratedSerializationConstructorAccessor51.newInstance(Unknown Source)
>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: at java.io.ObjectStreamClass.newInstance(ObjectStreamClass.java:1079)
>>> Jun
>>> .....
>>>
>>>
>>>
>>> Any feedback would be highly appreciated...
>>>
>>>
>>>
>>
>>
>>
>>
>