OK Ufuk, it's not so urgent indeed ;) Thanks anyway

On Mon, Dec 19, 2016 at 11:18 AM, Ufuk Celebi wrote:

> -1 Changing the memory consumption between minor releases should not
> happen.
>
> The good news: Robert ran a test with the latest 1.1 branch that
> contains a fix for the changed RocksDB memory configuration and
> reported stable behaviour.
>
> @Flavio: I agree, but since we're already very late with this bugfix
> release I would not like to wait for the PR to be merged. We can
> include it in 1.1.5, which can follow very soon imo. I hope that's OK
> for you.
>
> On Fri, Dec 16, 2016 at 11:07 AM, Gyula Fóra wrote:
> > @Robert
> >
> > - I am not sure if the RocksDB problems are closely related to the
> >   version upgrade; I have been experiencing similar problems for
> >   months. This is usually not a huge problem on YARN I think, it
> >   mostly hurts in standalone clusters.
> > - Also, the YARN memory limits are tricky to configure nicely, as
> >   that depends a lot on how RocksDB handles native memory. It seems
> >   to grow quite a lot over time.
> >
> > On Fri, Dec 16, 2016 at 10:56 AM, Flavio Pompermaier wrote:
> >
> >> I personally think that it should be quite important to have a fix
> >> also for the ES connector
> >> (https://issues.apache.org/jira/browse/FLINK-5122).
> >>
> >> Best,
> >> Flavio
> >>
> >> On Fri, Dec 16, 2016 at 10:43 AM, Robert Metzger wrote:
> >>
> >> > I'm not sure if we can release the release candidate like this,
> >> > because I'm running into two issues probably related to a recent
> >> > RocksDB version upgrade.
> >> >
> >> > This is my list of points so far:
> >> >
> >> > - Checked the staging repository. Quickstarts and Hadoop 1 / 2
> >> >   are okay.
> >> > - Built a job against the staging repository.
> >> > - Binaries deploy on a kerberized HA YARN / HDFS setup. Ran the
> >> >   KMeans and WordCount batch jobs.
> >> > - Executed a heavy, misbehaved streaming job for a few hours.
> >> >   While running that job, I found that:
> >> >   - Not all checkpoint directories are cleaned up in HDFS (I use
> >> >     the async RocksDB state backend).
> >> >   - Segfaults from RocksDB (8 segfaults in ~3 hrs, but they were
> >> >     all happening in the last minutes).
> >> >   - "beyond physical memory limits" container killings from YARN
> >> >     (I know we can configure this, I just wonder if we should
> >> >     change the default value).
> >> >   - The segfaults and memory limits caused the job to not run
> >> >     anymore in the end, because it was in a constant retry loop.
> >> > - This is a non-blocking issue I found during the testing:
> >> >   https://issues.apache.org/jira/browse/FLINK-5345
> >> > - This is also a non-blocking issue for 1.1.4 (fixed for 1.2):
> >> >   https://issues.apache.org/jira/browse/FLINK-4631
> >> >
> >> > Let me know if we should release anyways or fix these issues
> >> > first.
> >> >
> >> > On Tue, Dec 13, 2016 at 11:04 PM, Ufuk Celebi wrote:
> >> >
> >> > > Dear Flink community,
> >> > >
> >> > > Please vote on releasing the following candidate as Apache
> >> > > Flink version 1.1.4.
> >> > >
> >> > > The commit to be voted on:
> >> > > 2cd6579 (http://git-wip-us.apache.org/repos/asf/flink/commit/2cd6579)
> >> > >
> >> > > Branch:
> >> > > release-1.1.4-rc3
> >> > > (https://git1-us-west.apache.org/repos/asf/flink/repo?p=flink.git;a=shortlog;h=refs/heads/release-1.1.4-rc3)
> >> > >
> >> > > The release artifacts to be voted on can be found at:
> >> > > http://people.apache.org/~uce/flink-1.1.4-rc3/
> >> > >
> >> > > The release artifacts are signed with the key with fingerprint
> >> > > 9D403309:
> >> > > http://www.apache.org/dist/flink/KEYS
> >> > >
> >> > > The staging repository for this release can be found at:
> >> > > https://repository.apache.org/content/repositories/orgapacheflink-1109
> >> > >
> >> > > -------------------------------------------------------------
> >> > >
> >> > > The voting time is at least three days and the vote passes if
> >> > > a majority of at least three +1 PMC votes are cast. The vote
> >> > > ends earliest on Friday, December 16th, 2016, at 11 PM (CET)/
> >> > > 2 PM (PST).
> >> > >
> >> > > [ ] +1 Release this package as Apache Flink 1.1.4
> >> > > [ ] -1 Do not release this package, because ...
