-1. I observed a stable failure of the streaming bucketing end-to-end test case in two different environments (Linux/MacOS) when running with either the shaded hadoop-2.8.3 jar file <https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.8.3-1.8.0/flink-shaded-hadoop2-uber-2.8.3-1.8.0.jar> or the hadoop-2.8.5 dist <http://archive.apache.org/dist/hadoop/core/hadoop-2.8.5/>, while both environments pass with hadoop 2.6.5. For more details, please refer to this comment <https://issues.apache.org/jira/browse/FLINK-11972?focusedCommentId=16797614&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16797614> in FLINK-11972.
Best Regards,
Yu

On Thu, 21 Mar 2019 at 04:25, jincheng sun <sunjincheng...@gmail.com> wrote:

> Thanks for the quick fix Aljoscha! FLINK-11971
> <https://issues.apache.org/jira/browse/FLINK-11971> has been merged.
>
> Cheers,
> Jincheng
>
> Piotr Nowojski <pi...@ververica.com> wrote on Thu, 21 Mar 2019 at 12:29 AM:
>
>> -1 from my side due to a performance regression found in the master
>> branch since Jan 29th.
>>
>> In 10% of JVM forks it was causing a huge performance drop in some of the
>> benchmarks (up to 30-50% reduced throughput), which could mean that one
>> out of 10 task managers could be affected by it. Today we have merged a
>> fix for it [1]. The first benchmark run was promising [2], but we have to
>> wait until tomorrow to make sure that the problem was definitely resolved.
>> If that’s the case, I would recommend including it in 1.8.0, because we
>> really do not know how big a performance regression this issue can be in
>> real-world scenarios.
>>
>> Regarding the second regression from mid February: we have found the
>> responsible commit and this one is probably just a false positive. Because
>> of the nature of some of the benchmarks, they run with a low number of
>> records (300k). The apparent performance regression was caused by a higher
>> initialisation time. When I temporarily increased the number of records to
>> 2M, the regression was gone. Together with Till and Stefan Richter we
>> discussed the potential impact of this longer initialisation time (in the
>> case of said benchmarks the initialisation time increased from 70ms to
>> 120ms) and we think that it’s not a critical issue and doesn’t have to
>> block the release. Nevertheless there might be some follow-up work for
>> this.
>>
>> [1] https://github.com/apache/flink/pull/8020
>> [2] http://codespeed.dak8s.net:8000/timeline/?ben=tumblingWindow&env=2
>>
>> Piotr Nowojski
>>
>> On 20 Mar 2019, at 10:09, Aljoscha Krettek <aljos...@apache.org> wrote:
>>
>> Thanks Jincheng!
>> It would be very good to fix those but as you said, I
>> would say they are not blockers.
>>
>> On 20. Mar 2019, at 09:47, Kurt Young <ykt...@gmail.com> wrote:
>>
>> +1 (non-binding)
>>
>> Checked items:
>> - checked checksums and GPG files
>> - verified that the source archives do not contain any binaries
>> - checked that all POM files point to the same version
>> - built from source successfully
>>
>> Best,
>> Kurt
>>
>>
>> On Wed, Mar 20, 2019 at 2:12 PM jincheng sun <sunjincheng...@gmail.com>
>> wrote:
>>
>>> Hi Aljoscha & all,
>>>
>>> When I ran the `end-to-end` tests for RC3 on Mac OS, I found the
>>> following two problems:
>>>
>>> 1. The check of the `minikube status` output is not robust enough. The
>>> strings returned differ between versions and platforms, which causes the
>>> following misjudgment: when the
>>> `Command: start_kubernetes_if_not_ruunning failed` error occurs,
>>> minikube has actually started successfully. The root cause is a bug in
>>> the `test_kubernetes_embedded_job.sh` script. See
>>> FLINK-11971 <https://issues.apache.org/jira/browse/FLINK-11971> for
>>> details.
>>>
>>> 2. One difference between 1.8.0 and 1.7.x is that 1.8.x no longer
>>> bundles the `hadoop-shaded` JAR in the dist. This causes the end-to-end
>>> tests to fail because `Hadoop`-related classes cannot be found, for
>>> example: `java.lang.NoClassDefFoundError:
>>> Lorg/apache/hadoop/fs/FileSystem`. So we need to improve the end-to-end
>>> test script, or state explicitly in the README that the end-to-end tests
>>> need `flink-shaded-hadoop2-uber-XXXX.jar` on the classpath. See
>>> FLINK-11972 <https://issues.apache.org/jira/browse/FLINK-11972> for
>>> details.
>>>
>>> I think this is not a blocker for release-1.8.0, but I think it would be
>>> better to include those commits in release-1.8 if we still have
>>> performance-related bugs that need to be fixed.
>>>
>>> What do you think?
>>>
>>> Best,
>>> Jincheng
>>>
>>>
>>> Aljoscha Krettek <aljos...@apache.org> wrote on Tue, 19 Mar 2019 at 7:58 PM:
>>>
>>>> Hi All,
>>>>
>>>> The release process for Flink 1.8.0 is currently ongoing. Please have a
>>>> look at the thread, in case you’re interested in checking your
>>>> applications against this next release of Apache Flink and
>>>> participating in the process.
>>>>
>>>> Best,
>>>> Aljoscha
>>>>
>>>> Begin forwarded message:
>>>>
>>>> *From: *Aljoscha Krettek <aljos...@apache.org>
>>>> *Subject: **[VOTE] Release 1.8.0, release candidate #3*
>>>> *Date: *19. March 2019 at 12:52:50 CET
>>>> *To: *d...@flink.apache.org
>>>> *Reply-To: *d...@flink.apache.org
>>>>
>>>> Hi everyone,
>>>> Please review and vote on the release candidate 3 for Flink 1.8.0, as
>>>> follows:
>>>> [ ] +1, Approve the release
>>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>>
>>>>
>>>> The complete staging area is available for your review, which includes:
>>>> * JIRA release notes [1],
>>>> * the official Apache source release and binary convenience releases to
>>>> be deployed to dist.apache.org <http://dist.apache.org/> [2], which
>>>> are signed with the key with fingerprint
>>>> F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
>>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>>> * source code tag "release-1.8.0-rc3" [5],
>>>> * website pull request listing the new release [6],
>>>> * website pull request adding announcement blog post [7].
>>>>
>>>> The vote will be open for at least 72 hours. It is adopted by majority
>>>> approval, with at least 3 PMC affirmative votes.
>>>>
>>>> Thanks,
>>>> Aljoscha
>>>>
>>>> [1]
>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
>>>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc3/
>>>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>>>> [4]
>>>> https://repository.apache.org/content/repositories/orgapacheflink-1214
>>>> [5]
>>>> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=b505c0822edd2aed7fa22ed75eca40dca1a9de42
>>>> [6] https://github.com/apache/flink-web/pull/180
>>>> [7] https://github.com/apache/flink-web/pull/179
>>>>
>>>> P.S. The difference to the previous RCs 1 and 2 is very small; you can
>>>> fetch the tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc3" to
>>>> see the difference in commits. It's fixes for the issues that led to the
>>>> cancellation of the previous RCs plus smaller fixes. Most
>>>> verification/testing that was carried out should apply as is to this RC.
>>>> Any functional verification that you did on previous RCs should therefore
>>>> easily carry over to this one.
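
The reasoning behind the mid-February false positive in Piotr's message can be illustrated with a little arithmetic. The sketch below is not taken from the Flink benchmarks. It models a run as a fixed initialisation time plus per-record work and assumes a hypothetical constant cost of 1 µs per record; only the 70 ms, 120 ms, 300k and 2M figures come from the thread, and the function name `throughput_drop` is made up for illustration.

```python
def throughput_drop(records, init_old_ms, init_new_ms, per_record_us=1.0):
    """Relative throughput loss when only the fixed start-up cost grows.

    Model (an assumption, not Flink's benchmark code):
        run time = initialisation time + records * per-record cost
    per_record_us is a hypothetical value chosen for illustration.
    """
    work_ms = records * per_record_us / 1000.0  # per-record work, in ms
    old_total_ms = init_old_ms + work_ms
    new_total_ms = init_new_ms + work_ms
    # throughput = records / total time, so the ratio of throughputs
    # equals the inverse ratio of the total run times
    return 1.0 - old_total_ms / new_total_ms


if __name__ == "__main__":
    for n in (300_000, 2_000_000):
        drop = throughput_drop(n, init_old_ms=70, init_new_ms=120)
        print(f"{n:>9} records: {drop:.1%} lower throughput")
```

Under the assumed 1 µs per record, the same 50 ms of extra initialisation appears as a drop of about 12% at 300k records and about 2% at 2M records. The real per-record cost of the benchmarks is not stated in the thread, so only the trend is meaningful: the longer the run, the less the start-up cost shows up in the measured throughput.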