Re: Few question about upgrade from 1.4 to 1.5 flink ( some very basic )

zhangminglei Tue, 26 Jun 2018 02:40:47 -0700

Hi, Gary Yao

I once noticed that the IP address [jobmanager.rpc.address] had changed from
10.208.73.129 to localhost. I think that could cause this issue. What do you
think?


Cheers
Minglei

> On June 26, 2018, at 4:53 PM, Gary Yao <g...@data-artisans.com> wrote:
> 
> Hi Vishal,
> 
> Could it be that you are not using the 1.5.0 client? The stacktrace you posted
> does not reference valid lines of code in the release-1.5.0-rc6 tag. 
> 
> If you have an HA setup, the host and port of the leading JM will be looked up
> from ZooKeeper before job submission. Therefore, the flink-conf.yaml used by
> the client must have the same ZooKeeper configuration as used by the Flink
> cluster.
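> 
> As a rough sketch of what that shared configuration might look like (the
> ZooKeeper hosts, paths and cluster id below are placeholders, not values from
> this thread), the HA entries in the client's flink-conf.yaml would mirror the
> ones on the cluster:
> 
> # Assumed values for illustration only; copy the real ones from the cluster
> high-availability: zookeeper
> high-availability.zookeeper.quorum: zk1.example.com:2181,zk2.example.com:2181
> high-availability.zookeeper.path.root: /flink
> high-availability.cluster-id: /my-cluster
> high-availability.storageDir: hdfs:///flink/ha/
> 
> If any of these entries differ between client and cluster, the client may
> fail to find the leading JM in ZooKeeper.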
> 
> Best,
> Gary
> 
> On Mon, Jun 25, 2018 at 5:32 PM, Vishal Santoshi <vishal.santo...@gmail.com> wrote:
> I think all I need to add is 
> 
> web.port: 8081
> rest.port: 8081
> 
> to the JM flink conf ? 
> 
> On Mon, Jun 25, 2018 at 10:46 AM, Vishal Santoshi <vishal.santo...@gmail.com> wrote:
> Another issue I saw with flink cli...
> 
> org.apache.flink.client.program.ProgramInvocationException: The program execution failed: JobManager did not respond within 120000 ms
>       at org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:524)
>       at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:103)
>       at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:456)
>       at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
>       at org.apach
> 
> This was a simple submission, and it does succeed through the UI.
> 
> Has there been a regression in the CLI? I could not find any documentation
> about it.
> 
> I have a HA JM setup.
> 
> 
> 
> 
> On Mon, Jun 25, 2018 at 10:22 AM, Chesnay Schepler <ches...@apache.org> wrote:
> The watermark issue is known and will be fixed in 1.5.1.
> 
> 
> On 25.06.2018 15:03, Vishal Santoshi wrote:
>> Thank you....  
>> 
>> One addition
>> 
>> I do not see watermark (WM) info on the UI (attached).
>> 
>> Is this a known issue? The same pipeline in our production environment shows
>> the WM (in fact, we never had an issue with watermarks not appearing). Am I
>> missing something?
>> 
>> On Mon, Jun 25, 2018 at 4:15 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>> Hi Vishal,
>> 
>> 1. I don't think a rolling update is possible. Flink 1.5.0 changed the 
>> process orchestration and how they communicate. IMO, the way to go is to 
>> start a Flink 1.5.0 cluster, take a savepoint on the running job, start from 
>> the savepoint on the new cluster and shut the old job down.
>> 2. Savepoints should be compatible.
>> 3. You can keep the slot configuration as before.
>> 4. As I said before, mixing 1.5 and 1.4 processes does not work (or at
>> least, it was not considered a design goal and nobody checked whether it
>> is possible).
>> 
>> Best, Fabian
>> 
>> 
>> 2018-06-23 13:38 GMT+02:00 Vishal Santoshi <vishal.santo...@gmail.com>:
>> 
>> 1.
>> Has anyone done a rolling upgrade from 1.4 to 1.5? I am not sure we can.
>> It seems that the JM cannot recover jobs and fails with this exception:
>> 
>> Caused by: java.io.InvalidClassException: org.apache.flink.runtime.jobgraph.tasks.CheckpointCoordinatorConfiguration; local class incompatible: stream classdesc serialVersionUID = -647384516034982626, local class serialVersionUID = 2
>> 
>> 
>> 
>> 2.
>> Does a savepoint (SP) taken on 1.4 resume on 1.5 (pretty basic, but no harm in asking)?
>> 
>> 
>> 
>> 3.
>> https://ci.apache.org/projects/flink/flink-docs-release-1.5/release-notes/flink-1.5.html#update-configuration-for-reworked-job-deployment
>> Regarding taskmanager.numberOfTaskSlots: what would be the desired setting
>> in a standalone (non-Mesos/YARN) cluster?
>> 
>> 
>> 4. I suspended all jobs and installed 1.5 on the JM (the TMs are still
>> running 1.4). The JM refuses to start, with:
>> 
>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: 2018-06-23 11:34:23 ERROR JobManager:116 - Failed to recover job 454cd84a519f3b50e88bcb378d8a1330.
>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: java.lang.InstantiationError: org.apache.flink.runtime.blob.BlobKey
>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: at sun.reflect.GeneratedSerializationConstructorAccessor51.newInstance(Unknown Source)
>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>> Jun 23 07:34:23 flink-ad21ac07.bf2.tumblr.net docker[3395]: at java.io.ObjectStreamClass.newInstance(ObjectStreamClass.java:1079)
>> Jun 
>> .....
>> 
>> 
>> 
>> Any feedback would be highly appreciated...
>> 
>> 
>> 
> 
> 
> 
>
