@Richard Did this work for you previously? From the change, it seems that the port was always set to 0 on YARN even before.
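For context on why a port of 0 conflicts with a firewall rule: by general socket convention, binding to port 0 asks the operating system to pick any free port, so the resulting port cannot be known in advance. The following is a minimal sketch in plain Python, not Flink code; the helper name `bind_ephemeral` is hypothetical and only illustrates the convention.

```python
import socket


def bind_ephemeral(host="127.0.0.1"):
    """Bind a TCP socket to port 0 and return (socket, actual_port).

    Hypothetical helper for illustration; this is not part of Flink.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0: let the OS choose a free port
    sock.listen(1)
    # getsockname() reports the port the OS actually assigned
    return sock, sock.getsockname()[1]


if __name__ == "__main__":
    s, port = bind_ephemeral()
    print("OS-assigned port:", port)
    s.close()
```

The assigned port generally differs between runs, which is why a firewall that only opens a fixed port or range cannot be matched to it.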
> On 28. Mar 2019, at 16:13, Richard Deurwaarder <rich...@xeli.eu> wrote:
>
> -1 (non-binding)
>
> - Ran integration tests locally (1000+) of our flink job, all succeeded.
> - Attempted to run job on hadoop, failed. It failed because we have a firewall in place and we cannot set the rest port to a specific port/port range.
> Unless I am mistaken, it seems like FLINK-11081 broke the possibility of setting a REST port when running on yarn (https://github.com/apache/flink/commit/730eed71ef3f718d61f85d5e94b1060844ca56db#diff-487838863ab693af7008f04cb3359be3R102)
> Code-wise it seems rather straightforward to fix but I am unsure about the reason why this is hard-coded to 0 and what the impact would be.
>
> It would benefit us greatly if a fix for this could make it to 1.8.0.
>
> Regards,
>
> Richard
>
> On Thu, Mar 28, 2019 at 9:54 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
>
>> +1 (binding)
>>
>> Functional checks:
>>
>> - Built Flink from source (`mvn clean verify`) locally, with success
>> - Ran end-to-end tests locally 5 times in a loop, no attempts failed (Hadoop 2.8.4, Scala 2.12)
>> - Manually tested state schema evolution for POJO. Besides the tests that @Congxian already did, additionally tested evolution cases with POJO subclasses + non-registered POJOs.
>> - Manually tested migration of Scala stateful jobs that use case classes / Scala collections as state types, performing the migration across Scala 2.11 to Scala 2.12.
>> - Reviewed release announcement PR
>>
>> Misc / legal checks:
>>
>> - checked checksums and signatures
>> - No binaries in source distribution
>> - Staging area does not seem to have any missing artifacts
>>
>> Cheers,
>> Gordon
>>
>> On Thu, Mar 28, 2019 at 4:52 PM Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
>>
>>> @Shaoxuan
>>>
>>> The drop in the serializerAvro benchmark, as explained earlier in previous voting threads of earlier RCs, was due to a slower job initialization phase caused by slower deserialization of the AvroSerializer.
>>> Piotr also pointed out that after the number of records was increased in the serializer benchmarks, this drop was no longer observable before / after the changes in mid February.
>>> IMO, this is not critical as it does not affect the per-record performance / throughput, and therefore should not block this release.
>>>
>>> On Thu, Mar 28, 2019 at 1:08 AM Aljoscha Krettek <aljos...@fastmail.com> wrote:
>>>
>>>> By now, I'm reasonably sure that the test instabilities on the end-to-end test are only instabilities. I pushed changes to increase timeouts to make the tests more stable. As in any project, there will always be bugs but I think we could release this RC4 and be reasonably sure that it works well.
>>>>
>>>> Now, we only need to have the required number of PMC votes.
>>>>
>>>> On Wed, Mar 27, 2019, at 07:22, Congxian Qiu wrote:
>>>>> +1 (non-binding)
>>>>>
>>>>> • checked signature and checksum ok
>>>>> • mvn clean package -DskipTests ok
>>>>> • Run job on yarn ok
>>>>> • Test state migration with POJO type (both heap and rocksdb) ok
>>>>>   - 1.6 -> 1.8
>>>>>   - 1.7 -> 1.8
>>>>>   - 1.8 -> 1.8
>>>>>
>>>>> Best, Congxian
>>>>> On Mar 27, 2019, 10:26 +0800, vino yang <yanghua1...@gmail.com>, wrote:
>>>>>> +1 (non-binding)
>>>>>>
>>>>>> - checked JIRA release note
>>>>>> - ran "mvn package -DskipTests"
>>>>>> - checked signature and checksum
>>>>>> - started a cluster locally and ran some examples in binary
>>>>>> - checked web site announcement's PR
>>>>>>
>>>>>> Best,
>>>>>> Vino
>>>>>>
>>>>>> Xiaowei Jiang <xiaow...@gmail.com> wrote on Tue, Mar 26, 2019 at 8:20 PM:
>>>>>>
>>>>>>> +1 (non-binding)
>>>>>>>
>>>>>>> - checked checksums and GPG files
>>>>>>> - build from source successfully
>>>>>>> - run end-to-end precommit tests successfully
>>>>>>> - run end-to-end nightly tests successfully
>>>>>>> Xiaowei
>>>>>>> On Tuesday, March 26, 2019, 8:09:19 PM GMT+8, Yu Li <car...@gmail.com> wrote:
>>>>>>>
>>>>>>> +1 (non-binding)
>>>>>>>
>>>>>>> - Checked release notes: OK
>>>>>>> - Checked sums and signatures: OK
>>>>>>> - Source release
>>>>>>>   - contains no binaries: OK
>>>>>>>   - contains no 1.8-SNAPSHOT references: OK
>>>>>>>   - build from source: OK (8u101)
>>>>>>>   - mvn clean verify: OK (8u101)
>>>>>>> - Binary release
>>>>>>>   - no examples appear to be missing
>>>>>>>   - started a cluster; WebUI reachable, example ran successfully
>>>>>>>   - end-to-end test (all but K8S and docker ones): OK (8u101)
>>>>>>> - Repository appears to contain all expected artifacts
>>>>>>>
>>>>>>> Best Regards,
>>>>>>> Yu
>>>>>>>
>>>>>>>
>>>>>>> On Tue, 26 Mar 2019 at 14:28, Kurt Young <ykt...@gmail.com> wrote:
>>>>>>>
>>>>>>>> +1 (non-binding)
>>>>>>>>
>>>>>>>> Checked items:
>>>>>>>> - checked checksums and GPG files
>>>>>>>> - verified that the source archives do not contain any binaries
>>>>>>>> - checked that all POM files point to the same version
>>>>>>>> - build from source successfully
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Kurt
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Mar 26, 2019 at 10:57 AM Shaoxuan Wang <wshaox...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> +1 (non-binding)
>>>>>>>>>
>>>>>>>>> I tested RC4 with the following items:
>>>>>>>>> - Maven Central Repository contains all artifacts
>>>>>>>>> - Built the source with Maven (ensured all source files have Apache headers), and executed built-in tests via "mvn clean verify"
>>>>>>>>> - Manually executed the tests in IntelliJ IDE
>>>>>>>>> - Verify that the quickstarts for Scala and Java are working with the staging repository in IntelliJ
>>>>>>>>> - Checked the benchmark results. The perf regressions of tuple-key-by/statebackend/tumblingWindow are gone, but the regression on serializer still exists.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Shaoxuan
>>>>>>>>>
>>>>>>>>> On Tue, Mar 26, 2019 at 8:06 AM jincheng sun <sunjincheng...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Aljoscha, I think you are right: increasing the timeout config will fix this issue. This depends on the resources of Travis. I would like to share some observations from my testing (not Flink problems) as follows: :-)
>>>>>>>>>>
>>>>>>>>>> During my testing, `mvn clean verify` and `nightly end-to-end test` both consume a lot of machine resources (especially memory/network), and the network bandwidth requirements of `nightly end-to-end test` are also very high.
>>>>>>>>>> In China, I need to use VPN acceleration (100~200Kb before acceleration, 3~4Mb after acceleration). I encountered: ['Avro Confluent Schema Registry nightly end-to-end test' failed after 18 minutes and 15 seconds! Test exited with exit Code 1]. The test took more than 18 minutes and the download failed because the network bandwidth was not enough; it runs smoothly when using VPN acceleration. The overall end-to-end run passed twice. The Docker resource configuration was CPUs: 7, Mem: 28.7G, Swap: 3.5G. See the detailed log here <https://docs.google.com/document/d/1CcyTCyZmMmP57pkKv4drjSuxW61_u78HR3q1fJJODMw/edit?usp=sharing>.
>>>>>>>>>>
>>>>>>>>>> Just now, I checked Travis for your last commit (Increase startup timeout in end-to-end tests); apart from the Cleanup phase, the other phases are successful. See here <https://travis-ci.org/apache/flink/builds/511071777>
>>>>>>>>>>
>>>>>>>>>> In order to verify that our speculation is accurate, I can help by running with 10 and 20 second timeout configs on my repo to see whether the timeout problem recurs 100% of the time. It is already running, we are waiting for the result.
>>>>>>>>>> 10 seconds <https://travis-ci.org/sunjincheng121/flink/builds/511235749>
>>>>>>>>>> 20 seconds <https://travis-ci.org/sunjincheng121/flink/builds/511235598>
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Jincheng
>>>>>>>>>>
>>>>>>>>>> Aljoscha Krettek <aljos...@apache.org> wrote on Tue, Mar 26, 2019 at 1:04 AM:
>>>>>>>>>>
>>>>>>>>>>> Thanks for the testing done so far!
>>>>>>>>>>>
>>>>>>>>>>> There has been quite some flakiness on Travis lately, see here: https://travis-ci.org/apache/flink/branches. I’m a bit hesitant to release in this state. Looking at the tests you can see that all of the end-to-end tests fail because waiting for the dispatcher to come up times out. I also noticed that this usually takes about 5-8 seconds on Travis, so a 10 second timeout might be a bit low. I pushed commits to increase that to 20 secs. Let’s see what will happen.
>>>>>>>>>>>
>>>>>>>>>>> I’ll keep you posted!
>>>>>>>>>>> Aljoscha
>>>>>>>>>>>
>>>>>>>>>>>> On 25. Mar 2019, at 13:13, jincheng sun <sunjincheng...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Great thanks for preparing the RC4 of Flink 1.8.0, Aljoscha!
>>>>>>>>>>>>
>>>>>>>>>>>> +1 (non-binding)
>>>>>>>>>>>>
>>>>>>>>>>>> I checked the functional things as follows (without performance verification):
>>>>>>>>>>>>
>>>>>>>>>>>> 1. Checking Artifacts:
>>>>>>>>>>>>
>>>>>>>>>>>>   1). Download the release source code - SUCCESS
>>>>>>>>>>>>   2). Check Source release flink-1.8.0-src.tgz.sha512 - SUCCESS
>>>>>>>>>>>>   3). Download the released JAR - SUCCESS
>>>>>>>>>>>>   4). Check if checksums and GPG files match the corresponding release files - SUCCESS.
>>>>>>>>>>>>   5). Verify that the source archives do not contain any binaries - SUCCESS.
>>>>>>>>>>>>   6). Build the source with `mvn clean verify -DskipTests` to ensure all source files have Apache headers - SUCCESS
>>>>>>>>>>>>   7). Check that all POM files point to the same version - SUCCESS
>>>>>>>>>>>>   8). Read the `README.md` file to ensure there is nothing unexpected - SUCCESS
>>>>>>>>>>>>
>>>>>>>>>>>> 2. Testing Larger Setups
>>>>>>>>>>>>
>>>>>>>>>>>>   Cluster Environment: 7 nodes, jm 1024m, tm 4096m
>>>>>>>>>>>>   Testing Jobs: WordCount (Batch & Streaming), DataStreamAllroundTestProgram
>>>>>>>>>>>>
>>>>>>>>>>>>   1). Use local & hdfs file systems for checkpoints - SUCCESS
>>>>>>>>>>>>   2). Use hdfs file systems for input/output - SUCCESS
>>>>>>>>>>>>   3). Run examples on YARN (with or without session) - SUCCESS
>>>>>>>>>>>>   4). Test failover and recovery - SUCCESS
>>>>>>>>>>>>   5). Test incremental & non-incremental checkpoint - SUCCESS
>>>>>>>>>>>>   6). Test connector - kafka - SUCCESS
>>>>>>>>>>>>
>>>>>>>>>>>> 3. Testing Functionality
>>>>>>>>>>>>
>>>>>>>>>>>>   1). Built-in tests (linux & mac os)
>>>>>>>>>>>>     - `mvn clean verify` (some test timeout errors and a test case bug, see FLINK-12001 <https://issues.apache.org/jira/browse/FLINK-12001>; none of them are blockers)
>>>>>>>>>>>>     - build for scala 2.11 (mvn clean install -P scala-2.11 -DskipTests) - SUCCESS
>>>>>>>>>>>>     - Run the scripted nightly end-to-end test - SUCCESS
>>>>>>>>>>>>
>>>>>>>>>>>>   2). Quickstarts
>>>>>>>>>>>>     - Verify that the quickstarts for Scala work with the staging repository in IntelliJ - SUCCESS
>>>>>>>>>>>>     - Verify that the quickstarts for Java work with the staging repository in IntelliJ - SUCCESS
>>>>>>>>>>>>
>>>>>>>>>>>>   3). Simple Starter Experience and Use Cases
>>>>>>>>>>>>
>>>>>>>>>>>>     - run all examples from IntelliJ IDE - SUCCESS
>>>>>>>>>>>>     - Start a local cluster and verify that the processes - SUCCESS
>>>>>>>>>>>>       a. Examine the *.out files (should be empty) and the log files (should contain no exceptions)
>>>>>>>>>>>>       b. Test for Linux, MacOS
>>>>>>>>>>>>       c. Shutdown and verify there are no exceptions in the log output (after shutdown)
>>>>>>>>>>>>
>>>>>>>>>>>>     - Verify that the examples are running from both ./bin/flink and from the web-based job submission tool (following items) - SUCCESS
>>>>>>>>>>>>       a. Start multiple task managers in the local cluster
>>>>>>>>>>>>       b. Change the flink-conf.yaml to define more than one task slot (2)
>>>>>>>>>>>>       c. Run the examples with a parallelism > 1
>>>>>>>>>>>>       d. Examine the log output - no error messages should be encountered
>>>>>>>>>>>>
>>>>>>>>>>>> 4. Review the PR
>>>>>>>>>>>>   - [Add 1.8 Release Blog Post] - Just a reminder: update the release date to the correct date before merging.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Jincheng
>>>>>>>>>>>>
>>>>>>>>>>>> Piotr Nowojski <pi...@ververica.com> wrote on Mon, Mar 25, 2019 at 4:11 PM:
>>>>>>>>>>>>
>>>>>>>>>>>>> +1 from my side. Previously spotted performance regression seems to be gone, or mostly gone.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 21 Mar 2019, at 17:52, Aljoscha Krettek <aljos...@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>> Please review and vote on the release candidate 4 for Flink 1.8.0, as follows:
>>>>>>>>>>>>>> [ ] +1, Approve the release
>>>>>>>>>>>>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The complete staging area is available for your review, which includes:
>>>>>>>>>>>>>> * JIRA release notes [1],
>>>>>>>>>>>>>> * the official Apache source release and binary convenience releases to be deployed to dist.apache.org [2], which are signed with the key with fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
>>>>>>>>>>>>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>>>>>>>>>>>>> * source code tag "release-1.8.0-rc4" [5],
>>>>>>>>>>>>>> * website pull request listing the new release [6]
>>>>>>>>>>>>>> * website pull request adding announcement blog post [7].
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The vote will be open for at least 72 hours. It is adopted by majority approval, with at least 3 PMC affirmative votes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Aljoscha
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
>>>>>>>>>>>>>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc4/
>>>>>>>>>>>>>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>>>>>>>>>>>>>> [4] https://repository.apache.org/content/repositories/orgapacheflink-1215
>>>>>>>>>>>>>> [5] https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c650befc10c8bb6cc4b007ae250b7b2173046145
>>>>>>>>>>>>>> [6] https://github.com/apache/flink-web/pull/180
>>>>>>>>>>>>>> [7] https://github.com/apache/flink-web/pull/179
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> P.S. The difference to the previous RCs is small, you can fetch the tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc4" to see the difference in commits. It's fixes for the issues that led to the cancellation of the previous RCs plus smaller fixes. Most verification/testing that was carried out should apply as is to this RC. Any functional verification that you did on previous RCs should therefore easily carry over to this one.