Re: Contributing Code Guide - Clarification

2019-07-03 Thread Stephan Ewen
My personal take is the following: - These improvements are welcome in general. Improving code quality is a good idea. - At the same time, these refactorings can easily introduce new bugs - I would focus on issues where there is clearly duplication without need, but the code and refactoring

Re: [VOTE] Migrate to sponsored Travis account

2019-07-04 Thread Stephan Ewen
+1 to move to a private Travis account. I can confirm that Ververica will sponsor a Travis CI plan that is equivalent or a bit higher than the previous ASF quota (10 concurrent build queues) Best, Stephan On Thu, Jul 4, 2019 at 10:46 AM Chesnay Schepler wrote: > I've raised a JIRA >

Re: [DISCUSS] Flink framework and user log separation

2019-07-04 Thread Stephan Ewen
Is that something that can just be done by the right logging framework and configuration? Like having a log framework with two targets, one filtered on "org.apache.flink" and the other one filtered on "my.company.project" or so? On Fri, Mar 1, 2019 at 3:44 AM vino yang wrote: > Hi Jamie Grier,
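A minimal sketch of the "two targets, each filtered by logger name" idea from the message above. It uses only the JDK's java.util.logging so that it is self-contained; this is an assumption made for illustration and not the logging backend Flink ships with. The class name LogSeparationSketch, the in-memory ListHandler, and the logger names under the two prefixes are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class LogSeparationSketch {

    /** In-memory log target; stands in for a file appender in this sketch. */
    public static final class ListHandler extends Handler {
        public final List<String> lines = new ArrayList<>();

        ListHandler(String loggerPrefix) {
            // Accept only records whose logger name starts with the given prefix.
            setFilter(record -> record.getLoggerName() != null
                    && record.getLoggerName().startsWith(loggerPrefix));
        }

        @Override
        public void publish(LogRecord record) {
            // isLoggable applies the handler level and the filter set above.
            if (isLoggable(record)) {
                lines.add(record.getLoggerName() + ": " + record.getMessage());
            }
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    /** Attaches a prefix-filtered target to the root logger and returns it. */
    public static ListHandler attach(String loggerPrefix) {
        ListHandler handler = new ListHandler(loggerPrefix);
        Logger.getLogger("").addHandler(handler);
        return handler;
    }

    public static void main(String[] args) {
        ListHandler framework = attach("org.apache.flink");
        ListHandler user = attach("my.company.project");

        // Hypothetical logger names, one under each prefix.
        Logger.getLogger("org.apache.flink.runtime.Task").info("task started");
        Logger.getLogger("my.company.project.MyJob").info("record processed");

        System.out.println("framework target: " + framework.lines);
        System.out.println("user target: " + user.lines);
    }
}
```

Both targets hang off the root logger, so the separation lives entirely in configuration and needs no change in the code that emits the log statements, which is the point the message raises.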

Re: [DISCUSS] Create "flink-playgrounds" repository

2019-07-11 Thread Stephan Ewen
ache.org/confluence/display/FLINK/FLIP-42%3A+Rework+Flink+Documentation > > > [2] https://issues.apache.org/jira/browse/FLINK-12749 > > > > > > > > > -- > > > > > > Konstantin Knauf | Solutions Architect > > > > > > +49 160 91394525 > > > > > > > > > Planned Absences: 10.08.2019 - 31.08.2019, 05.09. - 06.09.2010 > > > > > > > > > -- > > > > > > Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany > > > > > > -- > > > > > > Ververica GmbH > > > Registered at Amtsgericht Charlottenburg: HRB 158244 B > > > Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen > > > > > >

Re: [DISCUSS] ARM support for Flink

2019-07-11 Thread Stephan Ewen
I think an ARM release would be cool. To actually support that properly, we would need something like an ARM profile for the CI builds (at least in the nightly tests), otherwise ARM support would probably be broken frequently. Maybe that could be a way to start? Create a Travis CI ARM build (if po

Re: [ANNOUNCE] Feature freeze for Apache Flink 1.9.0 release

2019-07-11 Thread Stephan Ewen
Number (6) is not a feature but a bug fix, so no need to block on that... On Thu, Jul 11, 2019 at 4:27 PM Kurt Young wrote: > Hi Chesnay, > > Here is the JIRA list I have collected, all of them are under reviewing: > > 1. Hive UDF support (FLINK-13024, FLINK-13225) > 2. Partition prune support (

Re: [DISCUSS] Create "flink-playgrounds" repository

2019-07-12 Thread Stephan Ewen
whole `apache/flink` repository, which is a > > bit > > > overwhelming. The Java/Scala quickstarts use Maven archetypes. Is this > > what > > > you are suggesting? I think, this would be an option, but it seems > > strange > > > to manage a pur

Re: [DISCUSS] Publish the PyFlink into PyPI

2019-07-23 Thread Stephan Ewen
Hi! Sorry for the late involvement. Here are some thoughts from my side: Definitely +1 to publishing to PyPI, even if it is a binary release. Community growth into other communities is great, and if this is the natural way to reach developers in the Python community, let's do it. This is not abou

Re: [DISCUSS] Flink client api enhancement for downstream project

2019-07-23 Thread Stephan Ewen
only use the interface ClusterClient which > > >>> should be > > >>> > > in flink-clients while the concrete implementation class could be > > in > > >>> > > flink-runtime. > > >>> > > > > >&

Re: [DISCUSS] Adopting a Code Style and Quality Guide

2019-07-23 Thread Stephan Ewen
y evaluate naming test methods according to these > conventions <https://stackoverflow.com/a/1594049> ? > > Thanks > > On Mon, Jun 24, 2019 at 11:38 AM Stephan Ewen wrote: > > > I think it makes sense to also start individual [DISCUSS] threads about > > e

Re: [DISCUSS] Allow at-most-once delivery in case of failures

2019-07-23 Thread Stephan Ewen
Hi all! This is an interesting discussion for sure. Concerning user requests for changes modes, I also hear the following quite often: - reduce the expensiveness of checkpoint alignment (unaligned checkpoints) to make checkpoints fast/stable under high backpressure - more fine-grained failove

Re: Probing of simple repartition hash join

2019-07-25 Thread Stephan Ewen
Hi! The join implementations used for the DataSet API and for the Blink Planner are quite intricate. They make use of these custom memory segments, to operate as much as possible on bytes, to control JVM memory utilization and to save serialization costs. That makes the implementation super compli

Re: [DISCUSS] FLIP-27: Refactor Source Interface

2019-07-25 Thread Stephan Ewen
ver it would be good to >>> aim >>> > > with those changes for 1.9. >>> > > >>> > > Piotrek >>> > > >>> > > > On 20 Jan 2019, at 16:08, Biao Liu wrote: >>> > > > >>> > > > Hi community,

Re: Fine Grained Recovery / FLIP-1

2019-07-26 Thread Stephan Ewen
Hi Thomas! For Batch, this should be working in release 1.9. For streaming, it is a bit more tricky, mainly because of the fact that you have to deal with downstream correctness. Either a recovery still needs to reset downstream tasks (which means on average half of the tasks) or needs to wait be

Re: [DISCUSS] ARM support for Flink

2019-07-29 Thread Stephan Ewen
> [2]: https://zuul-ci.org/docs/zuul/ > [3]: https://status.openlabtesting.org/projects > [4]: > https://status.openlabtesting.org/build/2aa33f1a87854679b70f36bd6f75a890 > [5]: https://github.com/theopenlab/flink/pull/1 > > > Stephan Ewen wrote on Thu, Jul 11, 2019 at 9:56 PM: > > > I think an ARM release

Re: [DISCUSS] Removing the flink-mapr-fs module

2019-07-29 Thread Stephan Ewen
+1 to remove it One should still be able to use MapR in the same way as any other vendor Hadoop distribution. On Mon, Jul 29, 2019 at 12:22 PM JingsongLee wrote: > +1 for removing it. We never run mvn clean test success in China with > mapr-fs... > Best, Jingsong Lee > > > -

Re: [DISCUSS] Removing the flink-mapr-fs module

2019-07-29 Thread Stephan Ewen
> On 29. Jul 2019, at 15:16, Simon Su wrote: > > > > +1 to remove it. > > > > > > Thanks, > > SImon > > > > > > On 07/29/2019 21:00,Till Rohrmann wrote: > > +1 to remove it. > > > > On Mon, Jul 29, 2019 at 1:27 PM Stephan Ew

Re: [DISCUSS] Removing the flink-mapr-fs module

2019-07-30 Thread Stephan Ewen
how it only occur to > the mapr-fs module. > So +1 to remove. > > On Mon, Jul 29, 2019 at 12:47 PM Stephan Ewen wrote: > > > It should be fairly straightforward to rewrite the code to not have a > MapR > > dependency. > > Only one class from the MapR dependency i

Re: Reading RocksDB contents from outside of Flink

2019-07-30 Thread Stephan Ewen
Hi! Are you looking for online access or offline access? For online access, you can do key lookups via queryable state. For offline access, you can read and write RocksDB state using the new state processor API in Flink 1.9 https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/state_pr

Re: Flink Kafka Issues

2019-07-30 Thread Stephan Ewen
Is the watermarking configured per-partition in Kafka, or per source? If it is configured per partition, then a late (trailing) or early (leading) partition would not affect the watermark as a whole. There would not be any dropping of late data, simply a delay in the results until the latest parti

Re: [DISCUSS] ARM support for Flink

2019-07-31 Thread Stephan Ewen
the 3rd Drony CI, what we can help is very limited. > AFAIK, Drony use container for CI test, which may not satisfy some > requirements. And OpenLab use VM for test. > > Need Flink core team's decision and reply. > > Thanks. > > > Stephan Ewen wrote on Mon, Jul 29, 2019 at 6:05 PM:

Re: [DISCUSS] Removing the flink-mapr-fs module

2019-07-31 Thread Stephan Ewen
This has been fixed in master and 1.9, see https://issues.apache.org/jira/browse/FLINK-13499 On Tue, Jul 30, 2019 at 3:28 PM Stephan Ewen wrote: > I will open a PR later today, changing the module to use reflection rather > than a hard MapR dependency. > > On Tue, Jul 30, 2019 at

Re: [VOTE] Publish the PyFlink into PyPI

2019-08-01 Thread Stephan Ewen
+1 (binding) On Thu, Aug 1, 2019 at 9:52 AM Dian Fu wrote: > Hi Jincheng, > > Thanks a lot for driving this. > +1 (non-binding). > > Regards, > Dian > > > On Aug 1, 2019, at 3:24 PM, jincheng sun wrote: > > > > Hi all, > > > > Publish the PyFlink into PyPI is very important for our user, Please vote > > on

Re: [DISCUSS] ARM support for Flink

2019-08-01 Thread Stephan Ewen
step, making them run stable and successful and then adding more > modules if needed. > > [1]: https://etherpad.net/p/flink_arm64_support > [2]: https://issues.apache.org/jira/browse/FLINK-13448 > [3]: https://github.com/apps/theopenlab-ci > > Regards > wangxiyuan > > St

[DISCUSS] Backport FLINK-13326 to 1.9 release

2019-08-01 Thread Stephan Ewen
Hi all! I would like to backport a minor change from 'master' to 'release-1.9'. It is a very minor change. I am checking here because this is not technically a bug fix, but a way of exposing the raw keyed state stream in tasks a bit differently. It would unblock some work in a project that tries to

Re: [DISCUSS] Backport FLINK-13326 to 1.9 release

2019-08-01 Thread Stephan Ewen
ault > > > tolerance of streaming iterations sounds like a very valuable thing to > > > unblock with this release. > > > > > > – Ufuk > > > > > > > > > On Thu, Aug 1, 2019 at 11:02 AM Stephan Ewen wrote: > > > > > >

Re: [DISCUSS] Flink project bylaws

2019-08-05 Thread Stephan Ewen
I added a clarification to the table, clarifying that the current phrasing means that committers do not need another +1 for their commits. On Mon, Jul 29, 2019 at 2:11 PM Fabian Hueske wrote: > Hi Becket, > > Thanks a lot for pushing this forward and addressing the feedback. > I'm very happy wit

[DISCUSS] Merging new features post-feature-freeze

2019-08-08 Thread Stephan Ewen
Hi all! I would like to bring this topic up, because we saw quite a few "secret" post-feature-freeze feature merges. The latest example was https://issues.apache.org/jira/browse/FLINK-13225 I would like to make sure that we are all on the same page on what a feature freeze means and how to handle

Re: [DISCUSS] Merging new features post-feature-freeze

2019-08-08 Thread Stephan Ewen
t to me, seems I have some > >>> misunderstandings with this comparing to other community members. But > as > >>> you > >>> pointed out in the jira and also in this mail, I think your > understanding > >>> makes sense > >>> to me. >

Re: [DISCUSS] Flink Docker Playgrounds

2019-08-08 Thread Stephan Ewen
I remember that Patrick (who maintained the docker-flink images so far) frequently raised the point that it's good practice to have the images decoupled from the project release cycle. Changes to the images can be done frequently and released fast that way. In addition, one typically supports image

Re: [DISCUSS] Repository split

2019-08-11 Thread Stephan Ewen
Thank you all for the good discussion. I was one of the folks who were thinking about such a repository split together with Chesnay, but due to lack of prior experience, happy to hear all the points that Let's investigate a bit what would be alternatives to this that solve the two problems. (1) B

Re: [DISCUSS] Repository split

2019-08-12 Thread Stephan Ewen
Just in case we decide to pursue the repo split in the end, some thoughts on Chesnay's questions: (1) Git History We can also use "git filter-branch" to rewrite the history to only contain the connectors. It changes commit hashes, but not sure that this is a problem. The commit hashes are still v

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-12 Thread Stephan Ewen
Hi Gyula! Thanks for reporting this. Can you try to simply build Flink without Hadoop and then export HADOOP_CLASSPATH pointing to your Cloudera libs? That is the recommended way these days. Best, Stephan On Mon, Aug 12, 2019 at 10:48 AM Gyula Fóra wrote: > Thanks Dawid, > > In the meantime I als

Re: [DISCUSS] Integrate new SourceReader with Mailbox Model in StreamTask

2019-08-12 Thread Stephan Ewen
+1 to looking at the Source Reader interface as converged with respect to its integration with the runtime. Especially the semantics around the availability future and "emitNext" seem to have reached consensus. On Sat, Aug 10, 2019 at 10:51 PM zhijiang wrote: > > Hi all, > > As mentioned in FLIP-

Re: Watermarking in Src and Timestamping downstream

2019-08-12 Thread Stephan Ewen
Do you know what part of the code happens to block off your watermark? Maybe a method that is overridden in AbstractStreamOperator in your code? On Sat, Aug 10, 2019 at 4:06 AM Roshan Naik wrote: > Have streaming use cases where it is useful & easier to generate the > watermark in the Source (vi

Re: [VOTE] Flink Project Bylaws

2019-08-13 Thread Stephan Ewen
+1 On Tue, Aug 13, 2019 at 12:22 PM Maximilian Michels wrote: > +1 It's good that we formalize this. > > On 13.08.19 10:41, Fabian Hueske wrote: > > +1 for the proposed bylaws. > > Thanks for pushing this Becket! > > > > Cheers, Fabian > > > > On Mon, Aug 12, 2019 at 16:31, Robert Me

Re: Apache flink 1.7.2 security issues

2019-08-13 Thread Stephan Ewen
Hi! Thank you for reporting this! At the moment, the Flink REST endpoint is not secure in the way that you can expose it publicly. After all, you can submit Flink jobs to it which by definition support executing arbitrary code. Given that access to the REST endpoint allows by design arbitrary cod

Re: [DISCUSS] Drop stale class Program

2019-08-14 Thread Stephan Ewen
+1 to drop it. It's one of the oldest pieces of legacy. On Wed, Aug 14, 2019 at 12:07 PM Aljoscha Krettek wrote: > Hi, > > I would be in favour of removing Program (and the code paths that support > it) for Flink 1.10. Most users of Flink don’t actually know it exists and > it is only making ou

Re: Checkpointing under backpressure

2019-08-14 Thread Stephan Ewen
Hi all! Yes, the first proposal of "unaligned checkpoints" (probably two years back now) drew a major inspiration from Chandy Lamport, as did actually the original checkpointing algorithm. "Logging data between first and last barrier" versus "barrier jumping over buffer and storing those buffers"

Re: [DISCUSS] FLIP-52: Remove legacy Program interface.

2019-08-14 Thread Stephan Ewen
+1 the "main" method is the overwhelming default. getting rid of "two ways to do things" is a good idea. On Wed, Aug 14, 2019 at 1:42 PM Kostas Kloudas wrote: > Hi all, > > As discussed in [1] , the Program interface seems to be outdated and > there seems to be > no objection to remove it. > >

Re: Checkpointing under backpressure

2019-08-14 Thread Stephan Ewen
n and it would simplify things by a lot. > Everything can be “staged” upon alignment including replacing channels and > tasks. > > -Paris > > > On 14 Aug 2019, at 13:39, Stephan Ewen wrote: > > > > Hi all! > > > > Yes, the first proposal of "unaligned

Re: Checkpointing under backpressure

2019-08-14 Thread Stephan Ewen
k input operation. > Otherwise causality can be violated. This also means dataflow recovery will > be expected to be slower to the one employed on an aligned snapshot. > > - Same as with state capture, markers should be forwarded upon > first marker received on input. No

Re: Checkpointing under backpressure

2019-08-15 Thread Stephan Ewen
ls, we might be wasting CPU cycles of up > > stream > > > tasks. If we succeed in designing new checkpointing mechanism to not > > > disrupt/block regular data processing (% the extra IO cost for logging > > the > > > in-flight records), that would be a huge i

Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

2019-08-15 Thread Stephan Ewen
+1 for this feature. I think this will be appreciated by users, as a way to use the HeapStateBackend with a safety-net against OOM errors. And having had major production exposure is great. From the implementation plan, it looks like this exists purely in a new module and does not require any cha

Re: [DISCUSS] FLIP-48: Pluggable Intermediate Result Storage

2019-08-15 Thread Stephan Ewen
Sorry for the late response. So many FLIPs these days. I am a bit unsure about the motivation here, and whether this needs to be a part of Flink. It sounds like this can be perfectly built around Flink as a minimal library on top of it, without any change in the core APIs or runtime. The proposal to

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-18 Thread Stephan Ewen
Hello all, > > > > > > > > > > >>>>>>>>>>>> > > > > > > > > > > >>>>>>>>>>>> I noticed the PubSub example jar is not > > in

Re: [VOTE] FLIP-50: Spill-able Heap State Backend

2019-08-18 Thread Stephan Ewen
+1 On Sun, Aug 18, 2019 at 3:31 PM Till Rohrmann wrote: > +1 > > On Fri, Aug 16, 2019 at 4:54 PM Yu Li wrote: > > > Hi All, > > > > Since we have reached a consensus in the discussion thread [1], I'd like > to > > start the voting for FLIP-50 [2]. > > > > This vote will be open for at least 72

Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-08-18 Thread Stephan Ewen
I like the idea of enhancing the configuration and to do early validation. I feel that some of the ideas in the FLIP seem a bit ad hoc, though. For example, having a boolean "isList" is a clear indication of not having thought through the type/category system. Also, having a more clear category sy

Re: [DISCUSS] Update our Roadmap

2019-08-18 Thread Stephan Ewen
I could help with that. On Fri, Aug 16, 2019 at 2:36 PM Robert Metzger wrote: > Flink 1.9 is feature freezed and almost released. > I guess it makes sense to update the roadmap on the website again. > > Who feels like having a good overview of what's coming up? > > On Tue, May 7, 2019 at 4:33 PM

Re: [DISCUSS] Release flink-shaded 8.0

2019-08-18 Thread Stephan Ewen
Are we fine with the current Netty version, or would we want to bump it? On Fri, Aug 16, 2019 at 10:30 AM Chesnay Schepler wrote: > Hello, > > I would like to kick off the next flink-shaded release next week. There > are 2 ongoing efforts that are blocked on this release: > > * [FLINK-13467] J

Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-08-18 Thread Stephan Ewen
> Configuration that means they can be bound to the generic parameter of > Configuration. You can have a RangeValidator Comparable/Number>. I don't think the type hierarchy in the ConfigOption > has anything to do with the validation logic. Could you elaborate a bit > more what did you mea

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-19 Thread Stephan Ewen
formance regression but regardless the > > > regression I vote +1 > > > > > > Have verified following things > > > > > > - Jobs running on YARN x (Session & Per Job) with high-availability > > > enabled. > > > - Simulate JM and TM fai

Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-08-19 Thread Stephan Ewen
About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently: We have the following two options: (1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-19 Thread Stephan Ewen
tability). > > > I will do some tests around sql and blink planner if the RC3 include > this > > > fix. > > > > > > But if the community against to include it, I'm also fine with having > it > > in > > > the next minor release. >

Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-08-19 Thread Stephan Ewen
he configuration need > to be updated. > > For option 1.1, it has the similar problem as 1.2, if the exceeded direct > memory does not reach the max direct memory limit specified by the > dedicated parameter. I think it is slightly better than 1.2, only because > we can tune the paramet

Re: [DISCUSS][CODE STYLE] Usage of Java Optional

2019-08-19 Thread Stephan Ewen
For the use of optional in private methods: It sounds fine to me, because it means it is strictly class-internal (between methods and helper methods) and does not leak beyond that. On Mon, Aug 19, 2019 at 5:53 PM Andrey Zagrebin wrote: > Hi all, > > Sorry for not getting back to this discussion
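A short sketch of what "Optional in a private method" can look like: the Optional is produced by a private helper and consumed inside the same class. The class OptionalSketch and its methods are hypothetical and not taken from the Flink code base. The public method here simply happens not to expose Optional; the snippet above does not settle whether public signatures should.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class OptionalSketch {

    private final Map<String, String> settings = new HashMap<>();

    /** Stores a raw string value and returns this object for chaining. */
    public OptionalSketch put(String key, String value) {
        settings.put(key, value);
        return this;
    }

    /** Public entry point: the caller supplies a default and gets a plain int back. */
    public int getInt(String key, int defaultValue) {
        return lookup(key).map(Integer::parseInt).orElse(defaultValue);
    }

    /** Class-internal helper: the Optional never leaves this class. */
    private Optional<String> lookup(String key) {
        return Optional.ofNullable(settings.get(key));
    }

    public static void main(String[] args) {
        OptionalSketch conf = new OptionalSketch().put("parallelism", "4");
        System.out.println(conf.getInt("parallelism", 1)); // prints 4
        System.out.println(conf.getInt("missing", 1));     // prints 1
    }
}
```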

Re: [DISCUSS][CODE STYLE] Create collections always with initial capacity

2019-08-19 Thread Stephan Ewen
@Andrey Will you open a PR to add this to the code style? On Mon, Aug 19, 2019 at 11:51 AM Andrey Zagrebin wrote: > Hi All, > > It looks like this proposal has an approval and we can conclude this > discussion. > Additionally, I agree with Piotr we should really force the proven good > reasoning

Re: [DISCUSS][CODE STYLE] Breaking long function argument lists and chained method calls

2019-08-19 Thread Stephan Ewen
I personally prefer not to break lines with few parameters. It just feels unnecessarily clumsy to parse the breaks if there are only two or three arguments with short names. So +1 - for a hard line length limit - allowing arguments on the same line if below that limit - with consistent argum

Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-08-19 Thread Stephan Ewen
configured some direct memory for the user > libraries. > > > If the library actually use more direct memory then configured, which > > > cannot be cleaned by GC because they are still in use, may lead to > > overuse > > > of the total container memory.

Re: [DISCUSS][CODE STYLE] Usage of Java Optional

2019-08-20 Thread Stephan Ewen
I think Dawid raised a very good point here. One of the outcomes should be that we are consistent in our recommendations and requests during PR reviews. Otherwise we'll just confuse contributors. So I would be +1 for someone to use Optional in a private method if they believe it is helpful -1

Re: [DISCUSS] Flink client api enhancement for downstream project

2019-08-20 Thread Stephan Ewen
> >> >> >> > >> >> >> > >> >> >> Flavio Pompermaier wrote on Wed, Jul 31, 2019 at 8:12 PM: > >> >> >> > >> >> >>> Just one note on my side: it is not clear to me whether the > client > >>

Re: [DISCUSS] Upgrade kinesis connector to Apache 2.0 License and include it in official release

2019-08-20 Thread Stephan Ewen
Just FYI - Becket, Aljoscha, and I are working on fleshing out the remaining details of FLIP-27 (new source API). We will share this as soon as we have made some progress on some of the details. The Kinesis connector would be one of the first that we would try to also implement in that new API, a

Re: [VOTE] Flink Project Bylaws

2019-08-20 Thread Stephan Ewen
; > > > >> rmetz...@apache.org> > > > > >>>>>> wrote: > > > > >>>>>> > > > > >>>>>>> +1 (binding) > > > > >>>>>>> > > &

Re: flink release-1.8.0 Flink-avro unit tests failed

2019-08-20 Thread Stephan Ewen
Thanks, looks like you diagnosed it correctly: environment-specific encoding settings. Could you open a ticket (maybe a PR) to set the encoding and make the test stable across environments? On Mon, Aug 19, 2019 at 9:46 PM Ethan Li wrote: > It’s probably the encoding problem. The environment I r

Re: [VOTE] Apache Flink 1.9.0, release candidate #3

2019-08-20 Thread Stephan Ewen
+1 (binding) - Downloaded the binary release tarball - started a standalone cluster with four nodes - ran some examples through the Web UI - checked the logs - created a project from the Java quickstarts maven archetype - ran a multi-stage DataSet job in batch mode - killed a TaskManager a

Re: [VOTE] FLIP-52: Remove legacy Program interface.

2019-08-21 Thread Stephan Ewen
+1 On Wed, Aug 21, 2019 at 1:07 PM Kostas Kloudas wrote: > Hi all, > > Following the FLIP process, this is a voting thread dedicated to the > FLIP-52. > As shown from the corresponding discussion thread [1], we seem to agree > that > the Program interface can be removed, so let's make it also of

Re: CiBot Update

2019-08-22 Thread Stephan Ewen
Nice, thanks! On Thu, Aug 22, 2019 at 3:59 AM Zili Chen wrote: > Thanks for your announcement. Nice work! > > Best, > tison. > > > vino yang wrote on Thu, Aug 22, 2019 at 8:14 AM: > > > +1 for "@flinkbot run travis", it is very convenient. > > > > Chesnay Schepler wrote on Wed, Aug 21, 2019 at 9:12 PM: > > > > > Hi everyon

Re: [DISCUSS] Release flink-shaded 8.0

2019-08-22 Thread Stephan Ewen
ter active channels reusing a file > > descriptor (#9149) > > - Prefer direct io buffers if direct buffers pooled (#9167) > > > > Netty 4.1.38.Final > > - Prevent ByteToMessageDecoder from overreading when !isAutoRead (#9252) > > - Correctly take length of ByteBufInp

[DISCUSS] Use Java's Duration instead of Flink's Time

2019-08-23 Thread Stephan Ewen
Hi all! Many parts of the code use Flink's "Time" class. The Time really is a "time interval" or a "Duration". Since Java 8, there is a Java class "Duration" that is nice and flexible to use. I would suggest we start using Java Duration instead and drop Time as much as possible in the runtime fro
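A small sketch of the kind of API java.time.Duration offers for expressing a time interval. It uses only JDK classes; the helper names fromLegacy and remainingMillis are hypothetical and only illustrate how a (size, unit) style interval could be bridged and consumed. Which Flink call sites would change is not shown here.

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

public class DurationSketch {

    /**
     * Bridges an old-style (size, unit) pair into a Duration.
     * Note: going through milliseconds drops any sub-millisecond precision.
     */
    public static Duration fromLegacy(long size, TimeUnit unit) {
        return Duration.ofMillis(unit.toMillis(size));
    }

    /** Returns the time left before the timeout expires, never below zero. */
    public static long remainingMillis(Duration timeout, Duration elapsed) {
        Duration remaining = timeout.minus(elapsed);
        return remaining.isNegative() ? 0L : remaining.toMillis();
    }

    public static void main(String[] args) {
        Duration timeout = fromLegacy(10, TimeUnit.SECONDS);
        Duration parsed = Duration.parse("PT1M30S"); // ISO-8601: 1 minute 30 seconds

        System.out.println(timeout.toMillis());                               // 10000
        System.out.println(parsed.getSeconds());                              // 90
        System.out.println(remainingMillis(timeout, Duration.ofSeconds(4)));  // 6000
        System.out.println(remainingMillis(timeout, Duration.ofSeconds(11))); // 0
    }
}
```

Duration carries its own arithmetic (plus, minus, compareTo) and parsing, so callers do not need to pass a separate unit alongside the value.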

[DISCUSS] Add ARM CI build to Flink (information-only)

2019-08-23 Thread Stephan Ewen
Hi all! As part of the Flink on ARM effort, there is a pull request that triggers a build on OpenLabs CI for each push and runs tests on ARM machines. Currently that build is roughly equivalent to what the "core" and "tests" profiles do on Travis. The result will be posted to the PR comments, sim

Re: [DISCUSS] Add ARM CI build to Flink (information-only)

2019-08-26 Thread Stephan Ewen
> testing. You just need to creat a Test Request issue in openlab[1]. >> Then >> > we'll create ARM VMs for you, you can login and do the thing you want. >> > >> > Does it make sense? >> > >> > [1]: http://114.115.168.52:8081/#/overview >>

Re: [DISCUSS] Use Java's Duration instead of Flink's Time

2019-08-26 Thread Stephan Ewen
> I like the idea unify usage of time/duration api. We actually > > > > > > > use at least five different classes for this purposes(see > below). > > > > > > > > > > > > > > One thing I'd like to pick up is that duration con

Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-08-27 Thread Stephan Ewen
hanges. > > > > > >- Removed open question regarding MemorySegment allocation. As > > >discussed, we exclude this topic from the scope of this FLIP. > > >- Updated content about JVM direct memory parameter according to > > recent > > >discuss

Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-08-29 Thread Stephan Ewen
ke "key=12" can be represented by a list "keys=12;13". But > > >> we don't want to go further; esp. no nesting. A dedicated list option > > >> would start making this more complicated such as > > >> "ListOption(ObjectOption(ListOption(In

Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-08-29 Thread Stephan Ewen
h. > > By reducing the distributed cached files, it could make launching a > > taskmanager faster. > > > > Stephan gives a good suggestion that we could move the logic into > > "GlobalConfiguration.loadConfig()" method. > > Maybe the client could also benefit from this. Different

Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-08-29 Thread Stephan Ewen
fer from > the values with which the JVM is started, it should be possible to > recompute them in the Flink process in order to set the values. > > > > On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen wrote: > > > When computing the values in the JVM process after it started

Re: [DISCUSS] Simplify Flink's cluster level RestartStrategy configuration

2019-08-30 Thread Stephan Ewen
+1 in general What is the default in batch, though? No restarts? I always found that somewhat uncommon. Should we also change that part, if we are changing the default anyways? On Fri, Aug 30, 2019 at 2:35 PM Till Rohrmann wrote: > Hi everyone, > > I wanted to discuss how to simplify Flink's c

Re: instable checkpointing after migration to flink 1.8

2019-08-30 Thread Stephan Ewen
Hi all! A thought would be that this has something to do with timers. Does the task with that behavior use timers (windows, or process function)? If that is the case, some theories to check: - Could it be a timer firing storm coinciding with a checkpoint? Currently, that storm synchronously fir

Re: Potential block size issue with S3 binary files

2019-09-01 Thread Stephan Ewen
witching to dev list), > > On Aug 29, 2019, at 2:52 AM, Stephan Ewen wrote: > > That is a good point. > > Which way would you suggest to go? Not relying on the FS block size at > all, but using a fix (configurable) block size? > > > There’s value to not requiring a f

Re: [VOTE] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-09-03 Thread Stephan Ewen
+1 to the proposal in general A few things seem to be a bit out of sync with the latest discussions though. The section about JVM Parameters states that the -XX:MaxDirectMemorySize value is set to Task Off-heap Memory, Shuffle Memory and JVM Overhead. The way I understand the last discussion con

Re: [VOTE] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-09-04 Thread Stephan Ewen
which cases. > > > > Cheers, > > Till > > > > On Tue, Sep 3, 2019 at 2:34 PM Andrey Zagrebin > > wrote: > > > > > Thanks for starting the vote Xintong > > > > > > Also +1 for the proposed FLIP-49. > > > > > > @Stephan

Re: [VOTE] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-09-04 Thread Stephan Ewen
fers to "shuffle" and "other > > > network memory", or only "shuffle"? > > > > > > I guess what you mean is only "shuffle"? Because currently > > > "taskmanager.network.memory" refers to shuffle buffers only, which

Re: [VOTE] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-09-04 Thread Stephan Ewen
Let's not block on config key names, just go ahead and we figure this out concurrently or on the PR later. On Wed, Sep 4, 2019 at 3:48 PM Stephan Ewen wrote: > Maybe to clear up confusion about my suggestion: > > I would vote to keep the name of the co

Re: [VOTE] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-09-09 Thread Stephan Ewen
em later in PR, does it mean > we > >> will need to update the FLIP document? If so, it seems we need another > >> vote > >> after the modification according to current bylaw? Or maybe we could > just > >> create a subpage under the FLIP and only re-vot

Re: [DISCUSS] Support notifyOnMaster for notifyCheckpointComplete

2019-09-10 Thread Stephan Ewen
Hi all! I think it would be time to rethink the Sink API as a whole, like we did with the Source API in FLIP-27. It would be nice to have proper design that handles all this consistently, rather than adding one more hook. For example: - For batch, you can already use the existing "finalize on m

Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-10 Thread Stephan Ewen
Hi all! Nice to see this lively discussion about the Pulsar connector. Some thoughts on the open questions: ## Contribute to Flink or maintain as a community package Looks like the discussion is more going towards contribution. I think that is good, especially if we think that we want to build a

[DISCUSS] Drop older versions of Kafka Connectors (0.9, 0.10) for Flink 1.10

2019-09-11 Thread Stephan Ewen
Hi all! We still maintain connectors for Kafka 0.8 and 0.9 in Flink. I would suggest to drop those with Flink 1.10 and start supporting only Kafka 0.10 onwards. Are there any concerns about this, or still a significant number of users of these versions? Best, Stephan

Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-12 Thread Stephan Ewen
> connectors, we have to do that anyways. But it would be good to avoid > introducing a new connector with the same problem. > > Thanks, > > Jiangjie (Becket) Qin > > On Tue, Sep 10, 2019 at 6:51 PM Stephan Ewen wrote: > > > Hi all! > > > > Nice to see th

Re: [DISCUSS] Use Java's Duration instead of Flink's Time

2019-09-12 Thread Stephan Ewen
pull request (not > perfect > though :\) > > We can do the final removal at once when prepare for 2.0 though. > > Best, > tison. > > [1]https://issues.apache.org/jira/browse/FLINK-14068 > > > Stephan Ewen wrote on Tue, Aug 27, 2019 at 1:19 AM: > > > Seems everyone is i

Re: Checkpoint metrics.

2019-09-12 Thread Stephan Ewen
Hi Jamie! Did you mean to attach a screenshot? If yes, you need to share that through a different channel; the mailing list does not support attachments, unfortunately. Seth is right about how the time is measured. One important bit to add to the interpretation: - For non-source tasks, the time inclu

Re: How stable is FlinkSQL.

2019-09-13 Thread Stephan Ewen
Can you share some more details? - are you running batch SQL or streaming SQL - are you running the original Flink SQL engine or the new Blink SQL engine (since 1.9) Best, Stephan On Fri, Sep 13, 2019 at 3:24 PM srikanth flink wrote: > Hi there, > > I'm trying to get some hands on with Fl

Re: [DISCUSS] FLIP-60: Restructure the Table API & SQL documentation

2019-09-16 Thread Stephan Ewen
There are also some other efforts to restructure the docs, which have so far resulted in more quickstarts and more concepts. IIRC the goal is to have a big section on concepts for the whole system: streaming concepts, time, order, etc. The API docs would be really more about an API specif

Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-19 Thread Stephan Ewen
peaking, removing the old connector code is a > backwards > > > > > incompatible change which requires a major version bump, i.e. Flink > > > 2.x. > > > > > Given that we don't have a clear plan on when to have the next > major > > > >

Re: [DISCUSS] have separate Flink distributions with built-in Hive dependencies

2020-02-05 Thread Stephan Ewen
h flink-connector-hive jar spacial > > too. > > CC: Rui Li > > > > I think the best system to integrate with hive is presto, which only > > connects hive metastore through thrift protocol. But I understand that it > > costs a lot to rewrite the code. > > > >

Re: [VOTE] Release 1.10.0, release candidate #1

2020-02-05 Thread Stephan Ewen
Should we make these a blocker? I am not sure - we could also clearly state in the release notes how to restore the old behavior, if your setup assumes that behavior. Release candidates for this release have been out since mid December, it is a bit unfortunate that these things have been raised so

Re: [DISCUSS] have separate Flink distributions with built-in Hive dependencies

2020-02-06 Thread Stephan Ewen
://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore > [3] https://github.com/prestodb/presto-hive-apache > > Best, > Jingsong Lee > > On Wed, Feb 5, 2020 at 10:15 PM Stephan Ewen wrote: > >> Some thoughts

Re: [VOTE] FLIP-27 - Refactor Source Interface

2020-02-07 Thread Stephan Ewen
+1 (binding) (belated) Quick addendum to clarify some questions from recent discussions in other threads: - The core interfaces (Source, SourceReader, Enumerator) and the core architecture (Enumerator as coordinators on the JobManager, SourceReaders in Tasks) seem to have no open questions -

Re: [DISCUSS] Does removing deprecated interfaces needs another FLIP

2020-02-07 Thread Stephan Ewen
I would also agree with the above. Changing a stable API and deprecating stable methods would need a FLIP in my opinion. But then executing the removal of previously deprecated methods would be fine in my understanding. On Fri, Feb 7, 2020 at 11:17 AM Kurt Young wrote: > Thanks for the clarifi

Re: [DISCUSS] Drop connectors for Elasticsearch 2.x and 5.x

2020-02-11 Thread Stephan Ewen
+1 to drop ES 2.x - unsure about 5.x (makes sense to get more user input for that one). @Itamar - if you would be interested in contributing a "universal" or "cross version" ES connector, that could be very interesting. Do you know if there are known performance issues or feature restrictions with

Re: [DISCUSS] have separate Flink distributions with built-in Hive dependencies

2020-02-11 Thread Stephan Ewen
n we put flink-connector-hive.jar into flink/lib, it > should clean and no dependencies. > > Best, > Jingsong Lee > > On Thu, Feb 6, 2020 at 7:13 PM Stephan Ewen wrote: > >> Hi Jingsong! >> >> This sounds that with two pre-bundled versions (hive 1.2.1 and hive
