Re: [QUESTION] thread model in Flink makes me confused
That would definitely be awesome (and useful for us as well)! +1

On Thu, May 12, 2016 at 7:38 AM, Aljoscha Krettek wrote:

> I favor the one-cluster-per-job approach. If this becomes the dominant
> approach to doing things we could also think about introducing a separate
> component that would allow monitoring the jobs in these per-job clusters as
> is now possible when running multiple jobs in a single cluster.
>
> On Thu, 12 May 2016 at 01:59 Wright, Eron wrote:
>
> > One option is to use a separate cluster (JobManager + TaskManagers) for
> > each job. This is fairly straightforward with the YARN support - "flink
> > run" can launch a cluster for a job and tear it down afterwards.
> >
> > Of course this means you must deploy YARN. That doesn't necessarily
> > imply HDFS, though a Hadoop-compatible filesystem (HCFS) is needed to
> > support the YARN staging directory.
> >
> > This approach also facilitates richer scheduling and multi-user scenarios.
> >
> > One downside is the loss of a unified web UI to view all jobs.
> >
> > > On May 11, 2016, at 8:32 AM, Jark Wu wrote:
> > >
> > > As far as I know, Flink uses a thread model, which means one TaskManager
> > > process may run many operator threads from different jobs. Tasks from
> > > different jobs therefore compete for memory and CPU within the same
> > > process. In the worst case, a bad job eats most of the CPU and memory,
> > > which may lead to an OOM, and then the well-behaved job dies too. There
> > > is another problem: tasks from different jobs write their logs into the
> > > same file (the TaskManager log file), which makes debugging harder.
> > >
> > > As far as I know, Storm spawns workers for every job, and the tasks in
> > > one worker all belong to the same job. So I'm confused about the purpose
> > > or advantages of Flink's design. One more question: are there any tips
> > > for solving the issues above? Or any suggestions for implementing a
> > > design similar to Storm's?
> > >
> > > Thank you in advance for any answers!
> > >
> > > Regards,
> > > Jark Wu
Re: [QUESTION] thread model in Flink makes me confused
Funny you should say that, because in a recent discussion with Stephan and Jamie, we talked about reworking the web UI to talk to numerous job managers. I've been looking into it as part of the Mesos work (FLINK-1984). I'll start a new thread about it soon.

> On May 11, 2016, at 10:38 PM, Aljoscha Krettek wrote:
>
> I favor the one-cluster-per-job approach. If this becomes the dominant
> approach to doing things we could also think about introducing a separate
> component that would allow monitoring the jobs in these per-job clusters as
> is now possible when running multiple jobs in a single cluster.
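The one-cluster-per-job option discussed in this thread can be sketched as a small shell script. This is only a minimal sketch, assuming the Flink 1.0-era YARN client (`flink run -m yarn-cluster -yn <containers>`); later Flink versions changed these options. `FLINK_HOME` and the jar paths are hypothetical, and the `DRY_RUN` switch is an illustration aid so the script can be tried without a YARN cluster.

```shell
#!/bin/sh
# Sketch: one short-lived YARN cluster per job (Flink 1.0-era CLI assumed).
# FLINK_HOME and the jar paths are hypothetical placeholders.
# DRY_RUN=1 (the default here) prints the command instead of running it.
FLINK_HOME="${FLINK_HOME:-/opt/flink}"
DRY_RUN="${DRY_RUN:-1}"

# -m yarn-cluster : start a dedicated JobManager + TaskManagers on YARN for this job
# -yn 2           : number of TaskManager containers to request
submit_per_job() {
  job_jar="$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$FLINK_HOME/bin/flink run -m yarn-cluster -yn 2 $job_jar"
  else
    "$FLINK_HOME/bin/flink" run -m yarn-cluster -yn 2 "$job_jar"
  fi
}

# Each job gets its own TaskManager processes, so its tasks no longer share
# a JVM (or a TaskManager log file) with tasks of another job.
submit_per_job /path/to/job-a.jar
submit_per_job /path/to/job-b.jar
```

The trade-off is the one already named above: each job has its own JobManager, so there is no single web UI listing all jobs.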
Re: [RESULT] [VOTE] Release Apache Flink 1.0.3 (RC3)
Thanks Ufuk :-)

On Wed, May 11, 2016 at 5:16 PM, Stephan Ewen wrote:

> Thanks for pushing this release Ufuk!
>
> On Wed, May 11, 2016 at 5:12 PM, Fabian Hueske wrote:
>
> > Thanks Ufuk!
> >
> > 2016-05-11 16:39 GMT+02:00 Ufuk Celebi:
> >
> > > This vote has passed with 3 binding +1 votes. Thanks to everyone who
> > > contributed and tested the release candidate.
> > >
> > > +1s:
> > > Gyula Fora (binding)
> > > Fabian Hueske (binding)
> > > Ufuk Celebi (binding)
> > >
> > > There are no 0s or -1s.
> > >
> > > I'll go ahead and finalize and package this release.
> > >
> > > On Mon, May 9, 2016 at 10:24 AM, Ufuk Celebi wrote:
> > > > Dear Flink community,
> > > >
> > > > Please vote on releasing the following candidate as Apache Flink
> > > > version 1.0.3.
> > > >
> > > > The commit to be voted on:
> > > > f3a6b5f1e8d85d10e1449e2f96291408b781
> > > >
> > > > Branch:
> > > > release-1.0.3-rc3 (see
> > > > https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-1.0.3-rc3)
> > > >
> > > > The release artifacts to be voted on can be found at:
> > > > http://home.apache.org/~uce/flink-1.0.3-rc3/
> > > >
> > > > The release artifacts are signed with the key with fingerprint
> > > > 9D403309:
> > > > http://www.apache.org/dist/flink/KEYS
> > > >
> > > > The staging repository for this release can be found at:
> > > > https://repository.apache.org/content/repositories/orgapacheflink-1096
> > > >
> > > > -
> > > >
> > > > The vote is open for the next 48 hours and passes if a majority of at
> > > > least three +1 PMC votes are cast.
> > > >
> > > > The vote ends on Wednesday May 11, 2016.
> > > >
> > > > [ ] +1 Release this package as Apache Flink 1.0.3
> > > > [ ] -1 Do not release this package because ...
> > > >
> > > > ===
> > > >
> > > > The following commits have been added since the 1.0.2 release
> > > > (excluding docs):
> > > >
> > > > * 4d3dcb1 - [FLINK-3860] [connector-wikiedits] Add retry loop to
> > > >   WikipediaEditsSourceTest (5 days ago)
> > > > * f1d34b1 - [FLINK-3790] [streaming] Use proper hadoop config in
> > > >   rolling sink (12 hours ago)
> > > > * 4a34f6f - [FLINK-3835] [optimizer] Add input id to JSON plan to
> > > >   resolve ambiguous input names. (2 days ago)
> > > > * d8feb15 - [hotfix] OptionSerializer.duplicate to respect stateful
> > > >   element serializer (3 days ago)
> > > > * 7062b0a - [FLINK-3803] [runtime] Pass CheckpointStatsTracker to
> > > >   ExecutionGraph (3 days ago)
> > > > * f80f6d6 - [FLINK-3678] [dist, docs] Make Flink logs directory
> > > >   configurable (4 days ago)
> > > > * 344a55e - [hotfix] [cep] Make cep window border treatment consistent
> > > >   (9 days ago)
Re: How to specify dependencies for an application that needs to use modified version of Flink
Hi Saiph,

You can enter the flink directory and run `mvn clean install -DskipTest=true` to install all the modules (including flink-streaming-java) into your local .m2 repository. After that, change the version of your app's Flink dependencies to the version of your Flink build, such as "1.1-SNAPSHOT". Finally, reimport your app project.

- Jark Wu

> On May 12, 2016, at 2:33 AM, Saiph Kappa wrote:
>
> Hi,
>
> I'm performing some modifications on Flink (current trunk version). I want
> a Scala app (sbt based) to use that modified version. I'm only modifying
> the flink-streaming-java module; what is the typical way to specify the
> dependencies for my application in this case? Should I copy all jars to the
> lib folder of my app, or build a big fat jar? How do the devs here do it?
>
> Thanks.
Re: How to specify dependencies for an application that needs to use modified version of Flink
Sorry for mistyping the command. You can enter flink/flink-streaming-java and run `mvn clean package install -DskipTests=true`. It will install only the flink-streaming-java module.

> On May 12, 2016, at 10:02 AM, Jark wrote:
>
> Hi Saiph,
> You can enter the flink directory and run `mvn clean install -DskipTest=true`
> to install all the modules (including flink-streaming-java) into your local
> .m2 repository. After that, change the version of your app's Flink
> dependencies to the version of your Flink build, such as "1.1-SNAPSHOT".
> Finally, reimport your app project.
>
> - Jark Wu
[ANNOUNCE] Flink 1.0.3 Released
The Flink PMC is pleased to announce the availability of Flink 1.0.3.

The official release announcement:
http://flink.apache.org/news/2016/05/11/release-1.0.3.html

Release binaries:
http://apache.openmirror.de/flink/flink-1.0.3/

Please update your Maven dependencies to the new 1.0.3 version and update your binaries.

On behalf of the Flink PMC, I would like to thank everybody who contributed to the release.
Re: How to specify dependencies for an application that needs to use modified version of Flink
Since FLINK-1827 was merged, you can also skip test compilation with -Dmaven.test.skip=true if you don't want to waste time and resources :)

On 12 May 2016 10:06, "Jark" wrote:

> Sorry for mistyping the command. You can enter flink/flink-streaming-java
> and run `mvn clean package install -DskipTests=true`. It will install only
> the flink-streaming-java module.
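The build steps suggested in this thread can be put together as a short script. This is a minimal sketch, not a tested recipe: `FLINK_SRC` is a hypothetical checkout location, and the `DRY_RUN` switch is only an illustration aid that prints the Maven command instead of running it. As far as I understand Maven, installing a single module only works if its parent and sibling SNAPSHOT artifacts can already be resolved, so one full install from the root directory is probably needed first.

```shell
#!/bin/sh
# Sketch: install only the modified flink-streaming-java module into ~/.m2.
# FLINK_SRC is a hypothetical checkout location.
# DRY_RUN=1 (the default here) prints the Maven command instead of running it.
FLINK_SRC="${FLINK_SRC:-$HOME/flink}"
DRY_RUN="${DRY_RUN:-1}"

# -DskipTests=true compiles the tests but does not run them;
# -Dmaven.test.skip=true (see the reply above) also skips compiling them.
install_module() {
  module_dir="$FLINK_SRC/$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "(cd $module_dir && mvn clean install -DskipTests=true)"
  else
    (cd "$module_dir" && mvn clean install -DskipTests=true)
  fi
}

install_module flink-streaming-java
```

On the sbt side, the local Maven repository is not consulted by default, so the app's `build.sbt` would presumably need `resolvers += Resolver.mavenLocal` in addition to the Flink dependency version being set to the locally built one (for example "1.1-SNAPSHOT").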
Re: [PROPOSAL] Structure the Flink Open Source Development
Hello,

There are at least three Gábors in the Flink community, :) so assuming that the Gábor in the list of maintainers of the DataSet API is referring to me, I'll be happy to do it. :)

Best,
Gábor G.

2016-05-10 11:24 GMT+02:00 Stephan Ewen:
> Hi everyone!
>
> We propose to establish some lightweight structures in the Flink open
> source community and development process, to help us better handle the
> increased interest in Flink (mailing list and pull requests), while not
> overwhelming the committers, and giving users and contributors a good
> experience.
>
> This proposal is triggered by the observation that we are reaching the
> limits of where the current community can support users and guide new
> contributors. The below proposal is based on observations and ideas from
> Till, Robert, and me.
>
>
> Goals
>
>
> We try to achieve the following:
>
>   - Pull requests get handled in a timely fashion
>   - New contributors are better integrated into the community
>   - The community feels empowered on the mailing list.
>     But questions that need the attention of someone who has deep
>     knowledge of a certain part of Flink get that attention.
>   - At the same time, the committers that are knowledgeable about many
>     core parts do not get completely overwhelmed.
>   - We don't overlook threads that report critical issues.
>   - We always have a pretty good overview of what the status of certain
>     parts of the system is.
>       -> What are often encountered known issues
>       -> What are the most frequently requested features
>
>
> Problems
>
>
> Looking into the process, there are two big issues:
>
> (1) Up to now, we have been relying on the fact that everything just
> "organizes itself", driven by best effort. That assumes that everyone
> feels equally responsible for every part, question, and contribution. At
> the current state, this is impossible to maintain; it overwhelms the
> committers and contributors.
>
> Example: Pull requests are picked up by whoever wants to pick them up.
> Pull requests that are a lot of work, have little chance of getting in, or
> relate to less active components are sometimes not picked up. When
> contributors are pretty loaded already, it may happen that no one
> eventually feels responsible to pick up a pull request, and it falls
> through the cracks.
>
> (2) There is no good overview of what are known shortcomings, efforts, and
> requested features for different parts of the system. This information
> exists in various people's heads, but is not easily accessible for new
> people. The Flink JIRA is not well maintained, so it is not easy to draw
> insights from it.
>
>
> ===
> The Proposal
> ===
>
> Since we are building a parallel system, the natural solution seems to be:
> partition the workload ;-)
>
> We propose to define a set of components for Flink. Each component is
> maintained or tracked by one or more people - let's call them maintainers.
> It is important to note that we don't suggest the maintainers as an
> authoritative role, but simply as committers or contributors that visibly
> step up for a certain component, and mainly track and drive the efforts
> pertaining to that component.
>
> It is also important to realize that we do not want to suggest that people
> get less involved with certain parts and components, because they are not
> the maintainers. We simply want to make sure that each pull request or
> question or contribution has in the end one person (or a small set of
> people) responsible for catching and tracking it, if it was not worked on
> by the pro-active community.
>
> For some components, having multiple maintainers will be helpful. In that
> case, one maintainer should be the "chair" or "lead" and make sure that no
> issue of that component gets lost between the multiple maintainers.
>
>
> A maintainer's role is:
> -
>
>   - Have an overview of which of the open pull requests relate to their
>     component
>   - Drive the pull requests relating to the component to resolution
>       => Moderate the decision whether the feature should be merged
>       => Make sure the pull request gets a shepherd.
>          In many cases, the maintainers would shepherd themselves.
>       => In case the shepherd becomes inactive, the maintainers need to
>          find a new shepherd.
>
>   - Have an overview of what are the known issues of their component
>   - Have an overview of what are the frequently requested features of
>     their component
>
>   - Have an overview of which contributors are doing very good work in
>     their component, would be candidates for committers, and should be
>     mentored towards that.
>
>   - Resolve email threads that have been brought to their attention,
>     because deeper component knowledge is required for that thread.
>
> A maintainer's role is NOT:
> --
>
>   - Review all pull requests of that c
Re: [PROPOSAL] Structure the Flink Open Source Development
+1 for the proposal
@ggevay: I do think that it refers to you. :)

On Thu, May 12, 2016 at 10:40 AM, Gábor Gévay wrote:

> Hello,
>
> There are at least three Gábors in the Flink community, :) so
> assuming that the Gábor in the list of maintainers of the DataSet API
> is referring to me, I'll be happy to do it. :)
>
> Best,
> Gábor G.
Re: Intellij code style
If you're interested, I created an Eclipse version that should follow the Flink coding rules. Should I create a new JIRA for it?

On Thu, May 5, 2016 at 6:02 PM, Dawid Wysakowicz wrote:

> I opened a JIRA, https://issues.apache.org/jira/browse/FLINK-3870, and
> created PRs for both flink and flink-web.
>
> https://github.com/apache/flink/pull/1963
> https://github.com/apache/flink-web/pull/20
>
> I would be thankful for a review.
>
> 2016-05-04 11:00 GMT+02:00 Fabian Hueske:
>
> > Yes, please open a JIRA. Thanks!
> >
> > 2016-05-04 10:16 GMT+02:00 Dawid Wysakowicz:
> >
> > > Sure, will open a PR shortly. Shall I create a JIRA issue?
> > >
> > > 2016-05-04 9:28 GMT+02:00 Fabian Hueske:
> > >
> > > > +1 for adding a template to the tools folder and linking it from the
> > > > coding guidelines!
> > > >
> > > > 2016-05-04 6:08 GMT+02:00 Henry Saputra:
> > > >
> > > > > We could actually put this in the tools directory of the source
> > > > > repo and refer to it from the contribution guide.
> > > > >
> > > > > @Dawid want to try to send a pull request for it?
> > > > >
> > > > > On Thursday, April 28, 2016, Theodore Vasiloudis <theodoros.vasilou...@gmail.com> wrote:
> > > > >
> > > > > > Do we plan to include something like this in the contribution
> > > > > > guide as well?
> > > > > >
> > > > > > On Thu, Apr 28, 2016 at 3:16 PM, Stefano Baghino <stefano.bagh...@radicalbit.io> wrote:
> > > > > >
> > > > > > > Awesome Dawid! Thanks for taking the time to do this. :)
> > > > > > >
> > > > > > > On Thu, Apr 28, 2016 at 1:45 PM, Dawid Wysakowicz <wysakowicz.da...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I tried to create a code style that would follow the Flink
> > > > > > > > code style. It may not be "production" ready, but I think it
> > > > > > > > can be a good start. Hope it will be useful for someone. I
> > > > > > > > will also be glad for any comments on it.
> > > > > > > >
> > > > > > > > 2016-04-10 13:59 GMT+02:00 Stephan Ewen:
> > > > > > > >
> > > > > > > > > I don't know how close Phoenix' code style is to Flink's
> > > > > > > > > de-facto code style. I would create one that reflects
> > > > > > > > > Flink's de-facto code style, so that the formatter does not
> > > > > > > > > change everything...
> > > > > > > > >
> > > > > > > > > On Sun, Apr 10, 2016 at 4:40 AM, Naveen Madhire <vmadh...@umail.iu.edu> wrote:
> > > > > > > > >
> > > > > > > > > > Apache Phoenix has one code template which contributors
> > > > > > > > > > use. Do you think one can use the same for Flink, maybe
> > > > > > > > > > with some more modifications?
> > > > > > > > > >
> > > > > > > > > > https://github.com/apache/phoenix/blob/master/dev/PhoenixCodeTemplate.xml
> > > > > > > > > >
> > > > > > > > > > On Sat, Apr 9, 2016 at 11:00 AM, Stephan Ewen <se...@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > > > Actually, it would be amazing to create a code style
> > > > > > > > > > > profile for download, so that all contributors would use
> > > > > > > > > > > that.
> > > > > > > > > > >
> > > > > > > > > > > Same thing actually for IntelliJ inspections: a set of
> > > > > > > > > > > inspections we want to have active and where we strive
> > > > > > > > > > > for zero warnings.
> > > > > > > > > > >
> > > > > > > > > > > On Sat, Apr 9, 2016 at 10:00 AM, Robert Metzger <rmetz...@apache.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Dawid,
> > > > > > > > > > > >
> > > > > > > > > > > > we don't have an automated formatter for IntelliJ.
> > > > > > > > > > > > However, you can use the "Checkstyle" plugin of
> > > > > > > > > > > > IntelliJ to mark checkstyle violations in the IDE.
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Apr 8, 2016 at 12:30 PM, Dawid Wysakowicz <wysakowicz.da...@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I am currently working on some issues and have been
> > > > > > > > > > > > > wondering if you have IntelliJ code style settings
> > > > > > > > > > > > > available that follow your coding guidelines (I
> > > > > > > > > > > > > tried to look on the wikis but could not find any).
> > > > > > > > > > > > > If not, could someone share their own? I would be
> > > > > > > > > > > > > grateful.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regards
> > > > > > > > > > > > > Dawid Wysakow
Re: Intellij code style
Yes, please open a pull request for that.

On Thu, May 12, 2016 at 11:40 AM, Flavio Pompermaier wrote:

> If you're interested, I created an Eclipse version that should follow the
> Flink coding rules. Should I create a new JIRA for it?
Re: [PROPOSAL] Structure the Flink Open Source Development
Yes, Gabor Gevay, that did refer to you!

Sorry for the ambiguity...

On Thu, May 12, 2016 at 10:46 AM, Márton Balassi wrote:

> +1 for the proposal
> @ggevay: I do think that it refers to you. :)
>
> On Thu, May 12, 2016 at 10:40 AM, Gábor Gévay wrote:
>
> > Hello,
> >
> > There are at least three Gábors in the Flink community, :) so assuming that the Gábor in the list of maintainers of the DataSet API is referring to me, I'll be happy to do it. :)
> >
> > Best,
> > Gábor G.
> >
> > 2016-05-10 11:24 GMT+02:00 Stephan Ewen:
> > > Hi everyone!
> > >
> > > We propose to establish some lightweight structures in the Flink open source community and development process, to help us better handle the increased interest in Flink (mailing list and pull requests), while not overwhelming the committers, and giving users and contributors a good experience.
> > >
> > > This proposal is triggered by the observation that we are reaching the limits of where the current community can support users and guide new contributors. The below proposal is based on observations and ideas from Till, Robert, and me.
> > >
> > >
> > > Goals
> > >
> > > We try to achieve the following
> > >
> > > - Pull requests get handled in a timely fashion
> > > - New contributors are better integrated into the community
> > > - The community feels empowered on the mailing list. But questions that need the attention of someone that has deep knowledge of a certain part of Flink get their attention.
> > > - At the same time, the committers that are knowledgeable about many core parts do not get completely overwhelmed.
> > > - We don't overlook threads that report critical issues.
> > > - We always have a pretty good overview of what the status of certain parts of the system are.
> > >   -> What are often encountered known issues
> > >   -> What are the most frequently requested features
> > >
> > >
> > > Problems
> > >
> > > Looking into the process, there are two big issues:
> > >
> > > (1) Up to now, we have been relying on the fact that everything just "organizes itself", driven by best effort. That assumes that everyone feels equally responsible for every part, question, and contribution. At the current state, this is impossible to maintain; it overwhelms the committers and contributors.
> > >
> > > Example: Pull requests are picked up by whoever wants to pick them up. Pull requests that are a lot of work, have little chance of getting in, or relate to less active components are sometimes not picked up. When contributors are pretty loaded already, it may happen that no one eventually feels responsible to pick up a pull request, and it falls through the cracks.
> > >
> > > (2) There is no good overview of what are known shortcomings, efforts, and requested features for different parts of the system. This information exists in various people's heads, but is not easily accessible for new people. The Flink JIRA is not well maintained; it is not easy to draw insights from that.
> > >
> > >
> > > ===
> > > The Proposal
> > > ===
> > >
> > > Since we are building a parallel system, the natural solution seems to be: partition the workload ;-)
> > >
> > > We propose to define a set of components for Flink. Each component is maintained or tracked by one or more people - let's call them maintainers. It is important to note that we don't suggest the maintainers as an authoritative role, but simply as committers or contributors that visibly step up for a certain component, and mainly track and drive the efforts pertaining to that component.
> > >
> > > It is also important to realize that we do not want to suggest that people get less involved with certain parts and components, because they are not the maintainers. We simply want to make sure that each pull request or question or contribution has in the end one person (or a small set of people) responsible for catching and tracking it, if it was not worked on by the pro-active community.
> > >
> > > For some components, having multiple maintainers will be helpful. In that case, one maintainer should be the "chair" or "lead" and make sure that no issue of that component gets lost between the multiple maintainers.
> > >
> > >
> > > A maintainer's role is:
> > >
> > > - Have an overview of which of the open pull requests relate to their component
> > > - Drive the pull requests relating to the component to resolution
> > >   => Moderate the decision whether the feature should be merged
> > >   => Make sure the pull request gets a shepherd.
> > >      In many cases, t
Re: Intellij code style
Do I also need to open a JIRA, or just the PR?

On Thu, May 12, 2016 at 12:03 PM, Stephan Ewen wrote:

> Yes, please open a pull request for that.
>
> On Thu, May 12, 2016 at 11:40 AM, Flavio Pompermaier wrote:
>
> > If you're interested, I created an Eclipse version that should follow the Flink coding rules. Should I create a new JIRA for it?
> >
> > On Thu, May 5, 2016 at 6:02 PM, Dawid Wysakowicz <wysakowicz.da...@gmail.com> wrote:
> >
> > > I opened a JIRA, https://issues.apache.org/jira/browse/FLINK-3870, and created PRs for both flink and flink-web.
> > >
> > > https://github.com/apache/flink/pull/1963
> > > https://github.com/apache/flink-web/pull/20
> > >
> > > I would be thankful for a review.
> > >
> > > 2016-05-04 11:00 GMT+02:00 Fabian Hueske:
> > >
> > > > Yes, please open a JIRA. Thanks!
> > > >
> > > > 2016-05-04 10:16 GMT+02:00 Dawid Wysakowicz <wysakowicz.da...@gmail.com>:
> > > >
> > > > > Sure, I will open a PR shortly. Shall I create a JIRA issue?
> > > > >
> > > > > 2016-05-04 9:28 GMT+02:00 Fabian Hueske:
> > > > >
> > > > > > +1 for adding a template to the tools folder and linking it from the coding guidelines!
> > > > > >
> > > > > > 2016-05-04 6:08 GMT+02:00 Henry Saputra:
> > > > > >
> > > > > > > We could actually put this in the tools directory of the source repo and refer to it from the contribution guide.
> > > > > > >
> > > > > > > @Dawid, do you want to try sending a pull request for it?
> > > > > > >
> > > > > > > On Thursday, April 28, 2016, Theodore Vasiloudis <theodoros.vasilou...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Do we plan to include something like this in the contribution guide as well?
> > > > > > > >
> > > > > > > > On Thu, Apr 28, 2016 at 3:16 PM, Stefano Baghino <stefano.bagh...@radicalbit.io> wrote:
> > > > > > > >
> > > > > > > > > Awesome Dawid! Thanks for taking the time to do this. :)
> > > > > > > > >
> > > > > > > > > On Thu, Apr 28, 2016 at 1:45 PM, Dawid Wysakowicz <wysakowicz.da...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > I tried to create a code style that would follow the Flink code style. It may not be "production" ready, but I think it can be a good start. Hope it will be useful for someone. Also, I will be glad for any comments on that.
> > > > > > > > > >
> > > > > > > > > > 2016-04-10 13:59 GMT+02:00 Stephan Ewen <se...@apache.org>:
> > > > > > > > > >
> > > > > > > > > > > I don't know how close Phoenix' code style is to Flink's de-facto code style. I would create one that reflects Flink's de-facto code style, so that the formatter does not change everything...
> > > > > > > > > > >
> > > > > > > > > > > On Sun, Apr 10, 2016 at 4:40 AM, Naveen Madhire <vmadh...@umail.iu.edu> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Apache Phoenix has one code template which contributors use. Do you think one can use the same for Flink, maybe with some more modifications?
> > > > > > > > > > > >
> > > > > > > > > > > > https://github.com/apache/phoenix/blob/master/dev/PhoenixCodeTemplate.xml
> > > > > > > > > > > >
> > > > > > > > > > > > On Sat, Apr 9, 2016 at 11:00 AM, Stephan Ewen <se...@apache.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Actually, it would be amazing to create a code style profile for download, so that all contributors would use that.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Same thing actually for IntelliJ inspections: A set of inspections we want to have active and where we strive for zero warnings.
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sat, Apr 9, 2016 at 10:00 AM, Robert Metzger <rmetz...@apache.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Dawid,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > we don't have an automated formatter for intelliJ. However, you can use the "Checkstyle" plugin of IntelliJ to mark checkstyle violations
Re: [PROPOSAL] Structure the Flink Open Source Development
+1 for the proposal

On May 12, 2016 12:13 PM, "Stephan Ewen" wrote:

> Yes, Gabor Gevay, that did refer to you!
>
> Sorry for the ambiguity...
Re: [PROPOSAL] Structure the Flink Open Source Development
+1 from my side.

Happy to be the maintainer for Storm-Compatibility (at least I guess it's me, even though the correct spelling would be with two 't's :P)

-Matthias

On 05/12/2016 12:56 PM, Till Rohrmann wrote:
> +1 for the proposal
> On May 12, 2016 12:13 PM, "Stephan Ewen" wrote:
>
>> Yes, Gabor Gevay, that did refer to you!
>>
>> Sorry for the ambiguity...
Re: [PROPOSAL] Structure the Flink Open Source Development
Big +1 from my side. I think this will help the community grow and prosper big time!

On Thu, May 12, 2016 at 1:27 PM, Matthias J. Sax wrote:

> +1 from my side.
>
> Happy to be the maintainer for Storm-Compatibility (at least I guess it's me, even though the correct spelling would be with two 't's :P)
>
> -Matthias
Re: [PROPOSAL] Structure the Flink Open Source Development
Yes, Matthias, that was supposed to be you.
Sorry from another guy who frequently has his name misspelled ;-)

On Thu, May 12, 2016 at 1:27 PM, Matthias J. Sax wrote:

> +1 from my side.
>
> Happy to be the maintainer for Storm-Compatibility (at least I guess it's me, even though the correct spelling would be with two 't's :P)
>
> -Matthias
Re: [PROPOSAL] Structure the Flink Open Source Development
Hey Stephan!

Thanks to you and the others who started this. I really like the proposal and I'm happy to see my name on some components. So, +1.

I'd say let's wait until the end of the week/beginning of next week to see if there is any disagreement with the proposal in the community (doesn't look like it so far ;-)). Then we can continue to execute this. :-)

– Ufuk

On Thu, May 12, 2016 at 1:52 PM, Stephan Ewen wrote:
> Yes, Matthias, that was supposed to be you.
> Sorry from another guy who frequently has his name misspelled ;-)
Re: [PROPOSAL] Structure the Flink Open Source Development
+1 for the initiative. A better process will improve the quality of Flink development and give us more time to focus.

Could we have another category, "Infrastructure"? This would cover things like CI, nightly deployment of snapshots/documentation, and ASF Infra communication. Robert and I could be the initial maintainers for that.

On Thu, May 12, 2016 at 1:52 PM, Stephan Ewen wrote:
> Yes, Matthias, that was supposed to be you.
> Sorry from another guy who frequently has his name misspelled ;-)
Dataset split/demultiplex
Hi folks,

Is there any way in the DataSet API to split a DataSet[A] into a DataSet[A] and a DataSet[B]? The use case is a custom filter component that we want to implement. We want to direct the input elements for which the predicate returns false to a separate output. We also want to direct input elements that throw an exception to another output (a demultiplexer-like component).

Thank you in advance...
[jira] [Created] (FLINK-3899) Document window processing with Reduce/FoldFunction + WindowFunction
Fabian Hueske created FLINK-3899:

             Summary: Document window processing with Reduce/FoldFunction + WindowFunction
                 Key: FLINK-3899
                 URL: https://issues.apache.org/jira/browse/FLINK-3899
             Project: Flink
          Issue Type: Improvement
          Components: Documentation, Streaming
    Affects Versions: 1.1.0
            Reporter: Fabian Hueske

The streaming documentation does not describe how windows can be processed with FoldFunction or ReduceFunction and a subsequent WindowFunction. This combination allows for eager window aggregation (only a single element is kept in the window) and access of the Window object, e.g., to have access to the window's start and end time.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Dataset split/demultiplex
Hello,

You can split a DataSet into two DataSets with two filters:

val xs: DataSet[A] = ...
val split1: DataSet[A] = xs.filter(f1)
val split2: DataSet[A] = xs.filter(f2)

where f1 and f2 are true for those elements that should go into the first and second DataSets respectively. So far, the splits will just contain elements from the input DataSet, but you can of course apply some map after one of the filters.

Does this help?

Best,
Gábor

2016-05-12 16:03 GMT+02:00 CPC :
> Hi folks,
>
> Is there any way in dataset api to split Dataset[A] to Dataset[A] and
> Dataset[B] ? Use case belongs to a custom filter component that we want to
> implement. We will want to direct input elements whose result is false
> after we apply the predicate. Actually we want to direct input elements
> that throw exception to another output as well(demultiplexer like
> component).
>
> Thank you in advance...
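The two-filter split described above can be tried out without a cluster. The following is only a minimal sketch: a plain Scala Seq stands in for the DataSet (an assumption made so the snippet runs on its own; filter and map are called in the same way on a DataSet), and Event, Dropped and TwoFilterSplit are hypothetical names that do not come from the thread. Choosing f2 as the negation of f1 makes the two outputs a partition of the input, and the map after the second filter turns the A elements into B elements.

```scala
// Hypothetical element types standing in for A and B.
final case class Event(url: String, status: Int)   // plays the role of A
final case class Dropped(url: String)              // plays the role of B

object TwoFilterSplit {
  // f1 selects the elements for the first output; f2 is its negation,
  // so every element lands in exactly one of the two outputs.
  val f1: Event => Boolean = e => e.status < 400
  val f2: Event => Boolean = e => !f1(e)

  // Seq is a stand-in for DataSet here; the input is traversed once per filter.
  def split(xs: Seq[Event]): (Seq[Event], Seq[Dropped]) = {
    val split1 = xs.filter(f1)
    val split2 = xs.filter(f2).map(e => Dropped(e.url)) // map A => B after the filter
    (split1, split2)
  }
}
```

Note that each predicate is evaluated over the whole input, so the work of the filter is done twice.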
Re: Dataset split/demultiplex
Hi Gabor,

Yes, functionally this helps. But in this case I am processing each element twice and sending the whole data set to two different operators. What I am trying to achieve is something like the DataStream split functionality, or a little bit more. In a filter-like scenario I want to do the pseudo operation below:

def function(iter: Iterator[URLOutputData],
             trueEvents: Collector[URLOutputData],
             falseEvents: Collector[URLOutputData],
             errEvents: Collector[URLOutputData]): Unit = {
  iter.foreach { i =>
    try {
      // evaluate the predicate once and route the element accordingly
      if (predicate(i))
        trueEvents.collect(i)
      else
        falseEvents.collect(i)
    } catch {
      // elements for which the predicate throws go to the error output
      case _: Exception => errEvents.collect(i)
    }
  }
}

Another case: suppose I have an input set of web events that come from different web apps, and I want to split the data set based on application category.

Thanks,

On 12 May 2016 at 17:28, Gábor Gévay wrote:
> Hello,
>
> You can split a DataSet into two DataSets with two filters:
>
> val xs: DataSet[A] = ...
> val split1: DataSet[A] = xs.filter(f1)
> val split2: DataSet[A] = xs.filter(f2)
>
> where f1 and f2 are true for those elements that should go into the
> first and second DataSets respectively. So far, the splits will just
> contain elements from the input DataSet, but you can of course apply
> some map after one of the filters.
>
> Does this help?
>
> Best,
> Gábor
>
> 2016-05-12 16:03 GMT+02:00 CPC :
> > Hi folks,
> >
> > Is there any way in dataset api to split Dataset[A] to Dataset[A] and
> > Dataset[B] ? Use case belongs to a custom filter component that we want
> > to implement. We will want to direct input elements whose result is false
> > after we apply the predicate. Actually we want to direct input elements
> > that throw exception to another output as well(demultiplexer like
> > component).
> >
> > Thank you in advance...
Re: [PROPOSAL] Structure the Flink Open Source Development
tl;dr: +1 I also like the proposal a lot. Our community is growing at a quite fast pace and we need to have some structure in place to still keep track of everything going on. I'm happy to see that the proposal mentions cleaning up our JIRA. This is something that has been annoying me for quite a while, but its too big to do it alone. If maintainers could take care of their components, we should have covered already a lot there. One question regarding the "chair" or "lead" role for components: Is the first name in the list of maintainers the lead? I would actually suggest to wait until all proposed maintainers agreed to the proposal. It doesn't make sense to make somebody a maintainer of something if they disagree or are not aware of it. On Thu, May 12, 2016 at 2:13 PM, Maximilian Michels wrote: > +1 for the initiative. With a better process we will improve the > quality of the Flink development and give us more time to focus. > > Could we have another category "Infrastructure"? This would concern > things like CI, nightly deployment of snapshots/documentation, ASF > Infra communication. Robert and me could be the initial maintainers > for that. > > On Thu, May 12, 2016 at 1:52 PM, Stephan Ewen wrote: > > Yes, Matthias, that was supposed to be you. > > Sorry from another guy who frequently has his name misspelled ;-) > > > > On Thu, May 12, 2016 at 1:27 PM, Matthias J. Sax > wrote: > > > >> +1 from my side. > >> > >> Happy to be the maintainer for Storm-Compatibiltiy (at least I guess > >> it's me, even the correct spelling would be with two 't' :P) > >> > >> -Matthias > >> > >> On 05/12/2016 12:56 PM, Till Rohrmann wrote: > >> > +1 for the proposal > >> > On May 12, 2016 12:13 PM, "Stephan Ewen" wrote: > >> > > >> >> Yes, Gabor Gevay, that did refer to you! > >> >> > >> >> Sorry for the ambiguity... 
> >> >> > >> >> On Thu, May 12, 2016 at 10:46 AM, Márton Balassi < > >> balassi.mar...@gmail.com > >> >>> > >> >> wrote: > >> >> > >> >>> +1 for the proposal > >> >>> @ggevay: I do think that it refers to you. :) > >> >>> > >> >>> On Thu, May 12, 2016 at 10:40 AM, Gábor Gévay > >> wrote: > >> >>> > >> Hello, > >> > >> There are at least three Gábors in the Flink community, :) so > >> assuming that the Gábor in the list of maintainers of the DataSet > API > >> is referring to me, I'll be happy to do it. :) > >> > >> Best, > >> Gábor G. > >> > >> > >> > >> 2016-05-10 11:24 GMT+02:00 Stephan Ewen : > >> > Hi everyone! > >> > > >> > We propose to establish some lightweight structures in the Flink > open > >> > source community and development process, > >> > to help us better handle the increased interest in Flink (mailing > >> >> list > >> and > >> > pull requests), while not overwhelming the > >> > committers, and giving users and contributors a good experience. > >> > > >> > This proposal is triggered by the observation that we are reaching > >> >> the > >> > limits of where the current community can support > >> > users and guide new contributors. The below proposal is based on > >> > observations and ideas from Till, Robert, and me. > >> > > >> > > >> > Goals > >> > > >> > > >> > We try to achieve the following > >> > > >> > - Pull requests get handled in a timely fashion > >> > - New contributors are better integrated into the community > >> > - The community feels empowered on the mailing list. > >> > But questions that need the attention of someone that has deep > >> > knowledge of a certain part of Flink get their attention. > >> > - At the same time, the committers that are knowledgeable about > >> >> many > >> core > >> > parts do not get completely overwhelmed. > >> > - We don't overlook threads that report critical issues. > >> > - We always have a pretty good overview of what the status of > >> >> certain > >> > parts of the system are. 
> >> > -> What are often encountered known issues > >> > -> What are the most frequently requested features > >> > > >> > > >> > > >> > Problems > >> > > >> > > >> > Looking into the process, there are two big issues: > >> > > >> > (1) Up to now, we have been relying on the fact that everything > just > >> > "organizes itself", driven by best effort. That assumes > >> > that everyone feels equally responsible for every part, question, > and > >> > contribution. At the current state, this is impossible > >> > to maintain, it overwhelms the committers and contributors. > >> > > >> > Example: Pull requests are picked up by whoever wants to pick them > >> >> up. > >> Pull > >> > requests that are a lot of work, have little > >> > chance of getting in, or relate to less active components are > >> >> sometimes > >> not > >> >
Re: Dataset split/demultiplex
Hi,
I agree that this would be very nice. Unfortunately, Flink only allows one output from an operation right now. Maybe we can extend this somehow in the future.

Cheers,
Aljoscha

On Thu, 12 May 2016 at 17:27 CPC wrote:
> Hi Gabor,
>
> Yes functionally this helps. But in this case i am processing an element
> twice and sending whole data to two different operator . What i am trying
> to achieve is like datastream split like functionality or a little bit
> more:
> In filter like scenario i want to do below pseudo operation:
>
> def function(iter: Iterator[URLOutputData], trueEvents: Collector[URLOutputData],
>     falseEvents: Collector[URLOutputData], errEvents: Collector[URLOutputData]) {
>   iter.foreach {
>     i =>
>       try {
>         if (predicate(i))
>           trueEvents.collect(i)
>         else
>           falseEvents.collect(i)
>       } catch {
>         case _ => errEvents.collect(i)
>       }
>   }
> }
>
> Another case could be,suppose i have an input set of web events comes from
> different web apps and i want to split dataset based on application
> category
>
> Thanks,
>
> On 12 May 2016 at 17:28, Gábor Gévay wrote:
> > Hello,
> >
> > You can split a DataSet into two DataSets with two filters:
> >
> > val xs: DataSet[A] = ...
> > val split1: DataSet[A] = xs.filter(f1)
> > val split2: DataSet[A] = xs.filter(f2)
> >
> > where f1 and f2 are true for those elements that should go into the
> > first and second DataSets respectively. So far, the splits will just
> > contain elements from the input DataSet, but you can of course apply
> > some map after one of the filters.
> >
> > Does this help?
> >
> > Best,
> > Gábor
> >
> > 2016-05-12 16:03 GMT+02:00 CPC :
> > > Hi folks,
> > >
> > > Is there any way in dataset api to split Dataset[A] to Dataset[A] and
> > > Dataset[B] ? Use case belongs to a custom filter component that we want
> > > to implement. We will want to direct input elements whose result is false
> > > after we apply the predicate. Actually we want to direct input elements
> > > that throw exception to another output as well(demultiplexer like
> > > component).
> > >
> > > Thank you in advance...
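Given the one-output-per-operation limit mentioned in this reply, one possible workaround is to evaluate the predicate exactly once per element in a map that attaches a routing tag, and then derive each output with a cheap filter on that tag. The following is only a sketch under stated assumptions: a plain Scala Seq stands in for the DataSet so the snippet runs on its own, and Route, Tagged and Demux are hypothetical names, not Flink API. The tagged data is still read once per output, so this avoids re-running (and re-failing) the predicate, not the extra passes over the data.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical tag describing where an element should be routed.
sealed trait Route
case object TrueOut  extends Route   // predicate returned true
case object FalseOut extends Route   // predicate returned false
case object ErrOut   extends Route   // predicate threw an exception

final case class Tagged[A](route: Route, value: A)

object Demux {
  // Evaluate the predicate once per element and record the outcome as a tag.
  def tag[A](predicate: A => Boolean)(a: A): Tagged[A] =
    Try(predicate(a)) match {
      case Success(true)  => Tagged(TrueOut, a)
      case Success(false) => Tagged(FalseOut, a)
      case Failure(_)     => Tagged(ErrOut, a)
    }

  // Seq stands in for DataSet: one map to tag, then one filter per output.
  def split[A](xs: Seq[A])(predicate: A => Boolean): (Seq[A], Seq[A], Seq[A]) = {
    val tagged = xs.map(a => tag(predicate)(a))
    def select(r: Route): Seq[A] = tagged.filter(_.route == r).map(_.value)
    (select(TrueOut), select(FalseOut), select(ErrOut))
  }
}
```

For example, Demux.split(Seq("10", "3", "x"))(s => s.toInt > 5) routes "10" to the first output, "3" to the second, and "x" (whose toInt throws) to the third.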
Re: [PROPOSAL] Structure the Flink Open Source Development
All maintainer candidates are only proposals so far. No indication of lead or anything so far. Let's first see if we agree on the structure proposed here, and if we take the components as suggested here or if we refine the list. Am 12.05.2016 17:45 schrieb "Robert Metzger" : > tl;dr: +1 > > I also like the proposal a lot. Our community is growing at a quite fast > pace and we need to have some structure in place to still keep track of > everything going on. > > I'm happy to see that the proposal mentions cleaning up our JIRA. This is > something that has been annoying me for quite a while, but its too big to > do it alone. If maintainers could take care of their components, we should > have covered already a lot there. > > One question regarding the "chair" or "lead" role for components: Is the > first name in the list of maintainers the lead? > > I would actually suggest to wait until all proposed maintainers agreed to > the proposal. It doesn't make sense to make somebody a maintainer of > something if they disagree or are not aware of it. > > > > > On Thu, May 12, 2016 at 2:13 PM, Maximilian Michels > wrote: > > > +1 for the initiative. With a better process we will improve the > > quality of the Flink development and give us more time to focus. > > > > Could we have another category "Infrastructure"? This would concern > > things like CI, nightly deployment of snapshots/documentation, ASF > > Infra communication. Robert and me could be the initial maintainers > > for that. > > > > On Thu, May 12, 2016 at 1:52 PM, Stephan Ewen wrote: > > > Yes, Matthias, that was supposed to be you. > > > Sorry from another guy who frequently has his name misspelled ;-) > > > > > > On Thu, May 12, 2016 at 1:27 PM, Matthias J. Sax > > wrote: > > > > > >> +1 from my side. 
> > >> > > >> Happy to be the maintainer for Storm-Compatibiltiy (at least I guess > > >> it's me, even the correct spelling would be with two 't' :P) > > >> > > >> -Matthias > > >> > > >> On 05/12/2016 12:56 PM, Till Rohrmann wrote: > > >> > +1 for the proposal > > >> > On May 12, 2016 12:13 PM, "Stephan Ewen" wrote: > > >> > > > >> >> Yes, Gabor Gevay, that did refer to you! > > >> >> > > >> >> Sorry for the ambiguity... > > >> >> > > >> >> On Thu, May 12, 2016 at 10:46 AM, Márton Balassi < > > >> balassi.mar...@gmail.com > > >> >>> > > >> >> wrote: > > >> >> > > >> >>> +1 for the proposal > > >> >>> @ggevay: I do think that it refers to you. :) > > >> >>> > > >> >>> On Thu, May 12, 2016 at 10:40 AM, Gábor Gévay > > >> wrote: > > >> >>> > > >> Hello, > > >> > > >> There are at least three Gábors in the Flink community, :) so > > >> assuming that the Gábor in the list of maintainers of the DataSet > > API > > >> is referring to me, I'll be happy to do it. :) > > >> > > >> Best, > > >> Gábor G. > > >> > > >> > > >> > > >> 2016-05-10 11:24 GMT+02:00 Stephan Ewen : > > >> > Hi everyone! > > >> > > > >> > We propose to establish some lightweight structures in the Flink > > open > > >> > source community and development process, > > >> > to help us better handle the increased interest in Flink > (mailing > > >> >> list > > >> and > > >> > pull requests), while not overwhelming the > > >> > committers, and giving users and contributors a good experience. > > >> > > > >> > This proposal is triggered by the observation that we are > reaching > > >> >> the > > >> > limits of where the current community can support > > >> > users and guide new contributors. The below proposal is based on > > >> > observations and ideas from Till, Robert, and me. 
> > >> > > > >> > > > >> > Goals > > >> > > > >> > > > >> > We try to achieve the following > > >> > > > >> > - Pull requests get handled in a timely fashion > > >> > - New contributors are better integrated into the community > > >> > - The community feels empowered on the mailing list. > > >> > But questions that need the attention of someone that has > deep > > >> > knowledge of a certain part of Flink get their attention. > > >> > - At the same time, the committers that are knowledgeable > about > > >> >> many > > >> core > > >> > parts do not get completely overwhelmed. > > >> > - We don't overlook threads that report critical issues. > > >> > - We always have a pretty good overview of what the status of > > >> >> certain > > >> > parts of the system are. > > >> > -> What are often encountered known issues > > >> > -> What are the most frequently requested features > > >> > > > >> > > > >> > > > >> > Problems > > >> > > > >> > > > >> > Looking into the process, there are two big issues: > > >> > > > >> > (1) Up to now, we have been relying on the fact that everything > > just > > >> > "organizes itself", driven by be
[jira] [Created] (FLINK-3900) Set nullCheck=true as default in TableConfig
Flavio Pompermaier created FLINK-3900:
--------------------------------------

             Summary: Set nullCheck=true as default in TableConfig
                 Key: FLINK-3900
                 URL: https://issues.apache.org/jira/browse/FLINK-3900
             Project: Flink
          Issue Type: Improvement
          Components: Table API
    Affects Versions: 1.0.2
            Reporter: Flavio Pompermaier
            Priority: Minor

As discussed with Fabian, TableConfig should use nullCheck=true as default to allow for null values in the data

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [PROPOSAL] Structure the Flink Open Source Development
+1 The ideas seem good and the proposed number of components seems reasonable. With this, we should also then cleanup the JIRA to make it actually usable. On Thu, 12 May 2016 at 18:09 Stephan Ewen wrote: > All maintainer candidates are only proposals so far. No indication of lead > or anything so far. > > Let's first see if we agree on the structure proposed here, and if we take > the components as suggested here or if we refine the list. > Am 12.05.2016 17:45 schrieb "Robert Metzger" : > > > tl;dr: +1 > > > > I also like the proposal a lot. Our community is growing at a quite fast > > pace and we need to have some structure in place to still keep track of > > everything going on. > > > > I'm happy to see that the proposal mentions cleaning up our JIRA. This is > > something that has been annoying me for quite a while, but its too big to > > do it alone. If maintainers could take care of their components, we > should > > have covered already a lot there. > > > > One question regarding the "chair" or "lead" role for components: Is the > > first name in the list of maintainers the lead? > > > > I would actually suggest to wait until all proposed maintainers agreed to > > the proposal. It doesn't make sense to make somebody a maintainer of > > something if they disagree or are not aware of it. > > > > > > > > > > On Thu, May 12, 2016 at 2:13 PM, Maximilian Michels > > wrote: > > > > > +1 for the initiative. With a better process we will improve the > > > quality of the Flink development and give us more time to focus. > > > > > > Could we have another category "Infrastructure"? This would concern > > > things like CI, nightly deployment of snapshots/documentation, ASF > > > Infra communication. Robert and me could be the initial maintainers > > > for that. > > > > > > On Thu, May 12, 2016 at 1:52 PM, Stephan Ewen > wrote: > > > > Yes, Matthias, that was supposed to be you. 
> > > > Sorry from another guy who frequently has his name misspelled ;-) > > > > > > > > On Thu, May 12, 2016 at 1:27 PM, Matthias J. Sax > > > wrote: > > > > > > > >> +1 from my side. > > > >> > > > >> Happy to be the maintainer for Storm-Compatibiltiy (at least I guess > > > >> it's me, even the correct spelling would be with two 't' :P) > > > >> > > > >> -Matthias > > > >> > > > >> On 05/12/2016 12:56 PM, Till Rohrmann wrote: > > > >> > +1 for the proposal > > > >> > On May 12, 2016 12:13 PM, "Stephan Ewen" > wrote: > > > >> > > > > >> >> Yes, Gabor Gevay, that did refer to you! > > > >> >> > > > >> >> Sorry for the ambiguity... > > > >> >> > > > >> >> On Thu, May 12, 2016 at 10:46 AM, Márton Balassi < > > > >> balassi.mar...@gmail.com > > > >> >>> > > > >> >> wrote: > > > >> >> > > > >> >>> +1 for the proposal > > > >> >>> @ggevay: I do think that it refers to you. :) > > > >> >>> > > > >> >>> On Thu, May 12, 2016 at 10:40 AM, Gábor Gévay > > > > >> wrote: > > > >> >>> > > > >> Hello, > > > >> > > > >> There are at least three Gábors in the Flink community, :) so > > > >> assuming that the Gábor in the list of maintainers of the > DataSet > > > API > > > >> is referring to me, I'll be happy to do it. :) > > > >> > > > >> Best, > > > >> Gábor G. > > > >> > > > >> > > > >> > > > >> 2016-05-10 11:24 GMT+02:00 Stephan Ewen : > > > >> > Hi everyone! > > > >> > > > > >> > We propose to establish some lightweight structures in the > Flink > > > open > > > >> > source community and development process, > > > >> > to help us better handle the increased interest in Flink > > (mailing > > > >> >> list > > > >> and > > > >> > pull requests), while not overwhelming the > > > >> > committers, and giving users and contributors a good > experience. > > > >> > > > > >> > This proposal is triggered by the observation that we are > > reaching > > > >> >> the > > > >> > limits of where the current community can support > > > >> > users and guide new contributors. 
The below proposal is based > on > > > >> > observations and ideas from Till, Robert, and me. > > > >> > > > > >> > > > > >> > Goals > > > >> > > > > >> > > > > >> > We try to achieve the following > > > >> > > > > >> > - Pull requests get handled in a timely fashion > > > >> > - New contributors are better integrated into the community > > > >> > - The community feels empowered on the mailing list. > > > >> > But questions that need the attention of someone that has > > deep > > > >> > knowledge of a certain part of Flink get their attention. > > > >> > - At the same time, the committers that are knowledgeable > > about > > > >> >> many > > > >> core > > > >> > parts do not get completely overwhelmed. > > > >> > - We don't overlook threads that report critical issues. > > > >> > - We always have a pretty good overview of what the status > of > > > >> >> certain > > > >> >>>
[jira] [Created] (FLINK-3901) Create a RowCsvInputFormat to use as default CSV IF in Table API
Flavio Pompermaier created FLINK-3901:
--------------------------------------

             Summary: Create a RowCsvInputFormat to use as default CSV IF in Table API
                 Key: FLINK-3901
                 URL: https://issues.apache.org/jira/browse/FLINK-3901
             Project: Flink
          Issue Type: Improvement
    Affects Versions: 1.0.2
            Reporter: Flavio Pompermaier
            Priority: Minor

At the moment the Table API reads CSVs using the TupleCsvInputFormat, which is limited to 25 fields and in its null handling. A new input format (IF) producing Row objects is necessary to avoid those limitations.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3902) Discarded FileSystem checkpoints are lingering around
Ufuk Celebi created FLINK-3902:
-------------------------------

             Summary: Discarded FileSystem checkpoints are lingering around
                 Key: FLINK-3902
                 URL: https://issues.apache.org/jira/browse/FLINK-3902
             Project: Flink
          Issue Type: Bug
          Components: Distributed Runtime
    Affects Versions: 1.0.2
            Reporter: Ufuk Celebi

A user reported that checkpoints with {{FSStateBackend}} are not properly cleaned up.

{code}
2016-05-10 12:21:06,559 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1084791727_11053122 10.10.113.10:50010
2016-05-10 12:21:06,559 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.delete from 10.10.113.9:49233 Call#12337 Retry#0
org.apache.hadoop.fs.PathIsNotEmptyDirectoryException: `/flink/checkpoints_test/570d6e67d571c109daab468e5678402b/chk-62 is non empty': Directory is not empty
    at org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp.delete(FSDirDeleteOp.java:85)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3712)
{code}

{code}
2016-05-10 12:20:22,636 [Checkpoint Timer] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 62 @ 1462875622636
2016-05-10 12:20:32,507 [flink-akka.actor.default-dispatcher-240088] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 62 (in 9843 ms)
2016-05-10 12:20:52,637 [Checkpoint Timer] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 63 @ 1462875652637
2016-05-10 12:21:06,563 [flink-akka.actor.default-dispatcher-240028] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 63 (in 13909 ms)
2016-05-10 12:21:22,636 [Checkpoint Timer] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 64 @ 1462875682636
{code}

Running the same program with the {{RocksDBBackend}} works as expected and clears the old checkpoints properly.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [PROPOSAL] Structure the Flink Open Source Development
For what it's worth, this is very close to how HBase attempts to manage the community load. We break out components (in Jira), with a list of named component maintainers. Actually, having components alone has given a Big Bang for the buck because when properly labeled, it makes it really easy for part-timers to channel their efforts with precision. As a flink user, I'm +1 for this proposal as well :) On Thursday, May 12, 2016, Aljoscha Krettek wrote: > +1 > > The ideas seem good and the proposed number of components seems reasonable. > With this, we should also then cleanup the JIRA to make it actually usable. > > On Thu, 12 May 2016 at 18:09 Stephan Ewen > > wrote: > > > All maintainer candidates are only proposals so far. No indication of > lead > > or anything so far. > > > > Let's first see if we agree on the structure proposed here, and if we > take > > the components as suggested here or if we refine the list. > > Am 12.05.2016 17:45 schrieb "Robert Metzger" >: > > > > > tl;dr: +1 > > > > > > I also like the proposal a lot. Our community is growing at a quite > fast > > > pace and we need to have some structure in place to still keep track of > > > everything going on. > > > > > > I'm happy to see that the proposal mentions cleaning up our JIRA. This > is > > > something that has been annoying me for quite a while, but its too big > to > > > do it alone. If maintainers could take care of their components, we > > should > > > have covered already a lot there. > > > > > > One question regarding the "chair" or "lead" role for components: Is > the > > > first name in the list of maintainers the lead? > > > > > > I would actually suggest to wait until all proposed maintainers agreed > to > > > the proposal. It doesn't make sense to make somebody a maintainer of > > > something if they disagree or are not aware of it. > > > > > > > > > > > > > > > On Thu, May 12, 2016 at 2:13 PM, Maximilian Michels > > > > wrote: > > > > > > > +1 for the initiative. 
With a better process we will improve the > > > > quality of the Flink development and give us more time to focus. > > > > > > > > Could we have another category "Infrastructure"? This would concern > > > > things like CI, nightly deployment of snapshots/documentation, ASF > > > > Infra communication. Robert and me could be the initial maintainers > > > > for that. > > > > > > > > On Thu, May 12, 2016 at 1:52 PM, Stephan Ewen > > > wrote: > > > > > Yes, Matthias, that was supposed to be you. > > > > > Sorry from another guy who frequently has his name misspelled ;-) > > > > > > > > > > On Thu, May 12, 2016 at 1:27 PM, Matthias J. Sax > > > > > wrote: > > > > > > > > > >> +1 from my side. > > > > >> > > > > >> Happy to be the maintainer for Storm-Compatibiltiy (at least I > guess > > > > >> it's me, even the correct spelling would be with two 't' :P) > > > > >> > > > > >> -Matthias > > > > >> > > > > >> On 05/12/2016 12:56 PM, Till Rohrmann wrote: > > > > >> > +1 for the proposal > > > > >> > On May 12, 2016 12:13 PM, "Stephan Ewen" > > > wrote: > > > > >> > > > > > >> >> Yes, Gabor Gevay, that did refer to you! > > > > >> >> > > > > >> >> Sorry for the ambiguity... > > > > >> >> > > > > >> >> On Thu, May 12, 2016 at 10:46 AM, Márton Balassi < > > > > >> balassi.mar...@gmail.com > > > > >> >>> > > > > >> >> wrote: > > > > >> >> > > > > >> >>> +1 for the proposal > > > > >> >>> @ggevay: I do think that it refers to you. :) > > > > >> >>> > > > > >> >>> On Thu, May 12, 2016 at 10:40 AM, Gábor Gévay < > gga...@gmail.com > > > > > > > >> wrote: > > > > >> >>> > > > > >> Hello, > > > > >> > > > > >> There are at least three Gábors in the Flink community, :) > so > > > > >> assuming that the Gábor in the list of maintainers of the > > DataSet > > > > API > > > > >> is referring to me, I'll be happy to do it. :) > > > > >> > > > > >> Best, > > > > >> Gábor G. > > > > >> > > > > >> > > > > >> > > > > >> 2016-05-10 11:24 GMT+02:00 Stephan Ewen >: > > > > >> > Hi everyone! 
> > > > >> > > > > > >> > We propose to establish some lightweight structures in the > > Flink > > > > open > > > > >> > source community and development process, > > > > >> > to help us better handle the increased interest in Flink > > > (mailing > > > > >> >> list > > > > >> and > > > > >> > pull requests), while not overwhelming the > > > > >> > committers, and giving users and contributors a good > > experience. > > > > >> > > > > > >> > This proposal is triggered by the observation that we are > > > reaching > > > > >> >> the > > > > >> > limits of where the current community can support > > > > >> > users and guide new contributors. The below proposal is > based > > on > > > > >> > observations and ideas from Till, Robert, and me. > > > > >> > > > > > >> > > > > > >> > Goals > > > > >> > > > > > >> > > > > > >> > We try to achieve t
Re: Dataset split/demultiplex
Hi,

If it just requires implementing a custom operator (I mean, it does not require changes to the network stack or other engine-level changes), I can try to implement it, since I have been working on the optimizer and plan generation for a month. We are also going to implement our ETL framework on Flink, and this kind of scenario is a good fit and a common requirement in ETL-like flows. If you can tell me which parts of the project I should look at, I can try it.

Thanks

On May 12, 2016 6:54 PM, "Aljoscha Krettek" wrote:

> Hi,
> I agree that this would be very nice. Unfortunately, Flink only allows
> one output from an operation right now. Maybe we can extend this somehow
> in the future.
>
> Cheers,
> Aljoscha
>
> On Thu, 12 May 2016 at 17:27 CPC wrote:
>
> > Hi Gabor,
> >
> > Yes, functionally this helps. But in this case I am processing each
> > element twice and sending the whole data to two different operators.
> > What I am trying to achieve is something like the DataStream split
> > functionality, or a little bit more. In a filter-like scenario I want
> > to do the pseudo-operation below:
> >
> >   def function(iter: Iterator[URLOutputData],
> >                trueEvents: Collector[URLOutputData],
> >                falseEvents: Collector[URLOutputData],
> >                errEvents: Collector[URLOutputData]): Unit = {
> >     iter.foreach { i =>
> >       try {
> >         if (predicate(i)) trueEvents.collect(i)
> >         else falseEvents.collect(i)
> >       } catch {
> >         // elements whose predicate throws go to the error output
> >         case _: Exception => errEvents.collect(i)
> >       }
> >     }
> >   }
> >
> > Another case could be: suppose I have an input set of web events that
> > come from different web apps, and I want to split the dataset based on
> > application category.
> >
> > Thanks,
> >
> > On 12 May 2016 at 17:28, Gábor Gévay wrote:
> >
> > > Hello,
> > >
> > > You can split a DataSet into two DataSets with two filters:
> > >
> > > val xs: DataSet[A] = ...
> > > val split1: DataSet[A] = xs.filter(f1)
> > > val split2: DataSet[A] = xs.filter(f2)
> > >
> > > where f1 and f2 are true for those elements that should go into the
> > > first and second DataSets respectively. So far, the splits will just
> > > contain elements from the input DataSet, but you can of course apply
> > > some map after one of the filters.
> > >
> > > Does this help?
> > >
> > > Best,
> > > Gábor
> > >
> > > 2016-05-12 16:03 GMT+02:00 CPC :
> > > > Hi folks,
> > > >
> > > > Is there any way in the DataSet API to split a DataSet[A] into a
> > > > DataSet[A] and a DataSet[B]? The use case is a custom filter
> > > > component that we want to implement. We want to direct the input
> > > > elements for which the predicate returns false to a second output.
> > > > Actually, we also want to direct the input elements that throw an
> > > > exception to another output (a demultiplexer-like component).
> > > >
> > > > Thank you in advance...
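
A minimal sketch of the demultiplexing idea discussed in this thread, using plain Scala collections rather than the Flink DataSet API. It evaluates the predicate once per element, tags the element with the output it belongs to, and then selects each output by tag, so the predicate itself is not re-run for every output. The names `DemuxSketch`, `Route`, `tagger` and `select` are made up for this illustration and are not Flink classes; it is only an assumption that the same tag-then-filter pattern carries over to a `map` followed by one `filter` per output on a DataSet.

```scala
import scala.util.{Failure, Success, Try}

object DemuxSketch {
  // Which logical output an element belongs to (hypothetical names).
  sealed trait Route
  case object Matched extends Route
  case object Unmatched extends Route
  case object Errored extends Route

  // Evaluate the predicate exactly once and tag the element with its route.
  // Try captures non-fatal exceptions thrown by the predicate.
  def tagger[A](predicate: A => Boolean): A => (Route, A) = elem =>
    Try(predicate(elem)) match {
      case Success(true)  => (Matched, elem)
      case Success(false) => (Unmatched, elem)
      case Failure(_)     => (Errored, elem)
    }

  // Select the elements that were tagged with the given route.
  def select[A](tagged: Seq[(Route, A)], route: Route): Seq[A] =
    tagged.collect { case (r, elem) if r == route => elem }

  def main(args: Array[String]): Unit = {
    val input = Seq("10", "3", "abc", "42")
    // toInt throws NumberFormatException for "abc".
    val tagged = input.map(tagger[String](_.toInt > 5))
    println(select(tagged, Matched))
    println(select(tagged, Unmatched))
    println(select(tagged, Errored))
  }
}
```

Compared with the two-filter approach, this still reads the tagged collection once per output, but a possibly expensive or throwing predicate is evaluated only once per element.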
[jira] [Created] (FLINK-3903) Homebrew Installation
Eron Wright created FLINK-3903:
----------------------------------

             Summary: Homebrew Installation
                 Key: FLINK-3903
                 URL: https://issues.apache.org/jira/browse/FLINK-3903
             Project: Flink
          Issue Type: Task
          Components: Documentation, release
            Reporter: Eron Wright
            Assignee: Ufuk Celebi
            Priority: Minor

Recently I submitted a formula for apache-flink to the [homebrew|http://brew.sh/] project. Homebrew simplifies installation on Mac:

{code}
$ brew install apache-flink
...
$ flink --version
Version: 1.0.2, Commit ID: d39af15
{code}

Updates to the formula are ad hoc at the moment. I opened this issue to formalize updating homebrew into the release process. I drafted a procedure doc here:
[https://gist.github.com/EronWright/b62bd3b192a15be4c200a2542f7c9376]

Please also consider updating the website documentation to suggest homebrew as an alternate installation method for Mac users.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
Re: [RESULT] [VOTE] Release Apache Flink 1.0.3 (RC3)
FYI the brew formula has been updated to 1.0.3.

$ brew info apache-flink
apache-flink: stable 1.0.3, HEAD
Scalable batch and stream data processing
https://flink.apache.org/
Not installed
From: https://github.com/Homebrew/homebrew-core/blob/master/Formula/apache-flink.rb

> On May 12, 2016, at 12:58 AM, Till Rohrmann wrote:
>
> Thanks Ufuk :-)
>
> On Wed, May 11, 2016 at 5:16 PM, Stephan Ewen wrote:
>
>> Thanks for pushing this release Ufuk!
>>
>> On Wed, May 11, 2016 at 5:12 PM, Fabian Hueske wrote:
>>
>>> Thanks Ufuk!
>>>
>>> 2016-05-11 16:39 GMT+02:00 Ufuk Celebi :
>>>
>>>> This vote has passed with 3 binding +1 votes. Thanks to everyone who
>>>> contributed and tested the release candidate.
>>>>
>>>> +1s:
>>>> Gyula Fora (binding)
>>>> Fabian Hueske (binding)
>>>> Ufuk Celebi (binding)
>>>>
>>>> There are no 0s or -1s.
>>>>
>>>> I'll go ahead and finalize and package this release.
>>>>
>>>> On Mon, May 9, 2016 at 10:24 AM, Ufuk Celebi wrote:
>>>>> Dear Flink community,
>>>>>
>>>>> Please vote on releasing the following candidate as Apache Flink
>>>>> version 1.0.3.
>>>>>
>>>>> The commit to be voted on:
>>>>> f3a6b5f1e8d85d10e1449e2f96291408b781
>>>>>
>>>>> Branch:
>>>>> release-1.0.3-rc3 (see
>>>>> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-1.0.3-rc3 )
>>>>>
>>>>> The release artifacts to be voted on can be found at:
>>>>> http://home.apache.org/~uce/flink-1.0.3-rc3/
>>>>>
>>>>> The release artifacts are signed with the key with fingerprint 9D403309:
>>>>> http://www.apache.org/dist/flink/KEYS
>>>>>
>>>>> The staging repository for this release can be found at:
>>>>> https://repository.apache.org/content/repositories/orgapacheflink-1096
>>>>>
>>>>> -------------------------------------------------------------
>>>>>
>>>>> The vote is open for the next 48 hours and passes if a majority of at
>>>>> least three +1 PMC votes are cast.
>>>>>
>>>>> The vote ends on Wednesday May 11, 2016.
>>>>>
>>>>> [ ] +1 Release this package as Apache Flink 1.0.3
>>>>> [ ] -1 Do not release this package because ...
>>>>>
>>>>> ===================================
>>>>>
>>>>> The following commits have been added since the 1.0.2 release
>>>>> (excluding docs):
>>>>>
>>>>> * 4d3dcb1 - [FLINK-3860] [connector-wikiedits] Add retry loop to
>>>>>   WikipediaEditsSourceTest (5 days ago)
>>>>> * f1d34b1 - [FLINK-3790] [streaming] Use proper hadoop config in
>>>>>   rolling sink (12 hours ago)
>>>>> * 4a34f6f - [FLINK-3835] [optimizer] Add input id to JSON plan to
>>>>>   resolve ambiguous input names. (2 days ago)
>>>>> * d8feb15 - [hotfix] OptionSerializer.duplicate to respect stateful
>>>>>   element serializer (3 days ago)
>>>>> * 7062b0a - [FLINK-3803] [runtime] Pass CheckpointStatsTracker to
>>>>>   ExecutionGraph (3 days ago)
>>>>> * f80f6d6 - [FLINK-3678] [dist, docs] Make Flink logs directory
>>>>>   configurable (4 days ago)
>>>>> * 344a55e - [hotfix] [cep] Make cep window border treatment consistent
>>>>>   (9 days ago)
flink Kafka connector
Hi,

I am trying to use the Flink Kafka connector, and I notice that every time I restart my application it re-reads the last message on the Kafka topic. So if the latest offset on the topic is 10, then when the application is restarted, message 10 is read again.

Why is this the behavior? I would assume that the last message has already been read and its offset committed. I require that messages from the topic that have already been processed are not reprocessed. Any insight would be helpful.

Thanks,
Arun Balan
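
One possible explanation, offered as an assumption and not as a diagnosis confirmed anywhere in this thread: Kafka's convention is that a committed offset names the next message to read, i.e. the last processed offset plus one. If one component stores the last processed offset itself while another resumes reading at the stored value, exactly one message is replayed after a restart, which matches the symptom described above. The plain-Scala sketch below only models that arithmetic; `OffsetSketch`, `commitNext`, `commitLast` and `resumeFrom` are hypothetical names and not part of the Flink or Kafka APIs.

```scala
object OffsetSketch {
  // Kafka convention: the committed offset is the next offset to read.
  def commitNext(lastProcessed: Long): Long = lastProcessed + 1

  // Off-by-one variant: the last processed offset is stored as-is.
  def commitLast(lastProcessed: Long): Long = lastProcessed

  // On restart, reading resumes at the committed offset (inclusive) and
  // continues up to, but not including, the log end offset.
  def resumeFrom(committed: Long, logEnd: Long): List[Long] =
    (committed until logEnd).toList

  def main(args: Array[String]): Unit = {
    val lastProcessed = 10L // latest message on the topic
    val logEnd = 11L        // offset the next produced message would get
    println(resumeFrom(commitNext(lastProcessed), logEnd)) // nothing replayed
    println(resumeFrom(commitLast(lastProcessed), logEnd)) // message 10 replayed
  }
}
```

Whether this is what actually happens depends on the connector version and on where the offsets are stored, so it is worth comparing the stored offset with the last offset the job processed.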
Re: Intellij code style
Please create a JIRA issue for this and send the PR with the JIRA issue number.

Regards,
Chiwan Park

> On May 12, 2016, at 7:15 PM, Flavio Pompermaier wrote:
>
> Do I also need to open a JIRA issue, or just the PR?
>
> On Thu, May 12, 2016 at 12:03 PM, Stephan Ewen wrote:
>
>> Yes, please open a pull request for that.
>>
>> On Thu, May 12, 2016 at 11:40 AM, Flavio Pompermaier wrote:
>>
>>> If you're interested, I created an Eclipse version that should follow
>>> the Flink coding rules. Should I create a new JIRA for it?
>>>
>>> On Thu, May 5, 2016 at 6:02 PM, Dawid Wysakowicz <wysakowicz.da...@gmail.com> wrote:
>>>
>>>> I opened a JIRA issue: https://issues.apache.org/jira/browse/FLINK-3870
>>>> and created PRs for both flink and flink-web:
>>>> https://github.com/apache/flink/pull/1963
>>>> https://github.com/apache/flink-web/pull/20
>>>>
>>>> I would be thankful for a review.
>>>>
>>>> 2016-05-04 11:00 GMT+02:00 Fabian Hueske :
>>>>
>>>>> Yes, please open a JIRA. Thanks!
>>>>>
>>>>> 2016-05-04 10:16 GMT+02:00 Dawid Wysakowicz <wysakowicz.da...@gmail.com> :
>>>>>
>>>>>> Sure, I will open a PR shortly. Shall I create a JIRA issue?
>>>>>>
>>>>>> 2016-05-04 9:28 GMT+02:00 Fabian Hueske :
>>>>>>
>>>>>>> +1 for adding a template to the tools folder and linking it from
>>>>>>> the coding guidelines!
>>>>>>>
>>>>>>> 2016-05-04 6:08 GMT+02:00 Henry Saputra :
>>>>>>>
>>>>>>>> We could actually put this in the tools directory of the source
>>>>>>>> repo and refer to it from the contribution guide.
>>>>>>>>
>>>>>>>> @Dawid want to try to send a pull request for it?
>>>>>>>>
>>>>>>>> On Thursday, April 28, 2016, Theodore Vasiloudis <theodoros.vasilou...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Do we plan to include something like this in the contribution
>>>>>>>>> guide as well?
>>>>>>>>>
>>>>>>>>> On Thu, Apr 28, 2016 at 3:16 PM, Stefano Baghino <stefano.bagh...@radicalbit.io> wrote:
>>>>>>>>>
>>>>>>>>>> Awesome Dawid! Thanks for taking the time to do this. :)
>>>>>>>>>>
>>>>>>>>>> On Thu, Apr 28, 2016 at 1:45 PM, Dawid Wysakowicz <wysakowicz.da...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I tried to create a code style that would follow the Flink
>>>>>>>>>>> code style. It may not be "production" ready, but I think it
>>>>>>>>>>> can be a good start. I hope it will be useful for someone.
>>>>>>>>>>> I will also be glad for any comments on it.
>>>>>>>>>>>
>>>>>>>>>>> 2016-04-10 13:59 GMT+02:00 Stephan Ewen <se...@apache.org>:
>>>>>>>>>>>
>>>>>>>>>>>> I don't know how close Phoenix' code style is to Flink's
>>>>>>>>>>>> de-facto code style. I would create one that reflects Flink's
>>>>>>>>>>>> de-facto code style, so that the formatter does not change
>>>>>>>>>>>> everything...
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Apr 10, 2016 at 4:40 AM, Naveen Madhire <vmadh...@umail.iu.edu> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Apache Phoenix has one code template which contributors use.
>>>>>>>>>>>>> Do you think one can use the same for Flink, maybe with some
>>>>>>>>>>>>> more modifications?
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://github.com/apache/phoenix/blob/master/dev/PhoenixCodeTemplate.xml
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Apr 9, 2016 at 11:00 AM, Stephan Ewen <se...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Actually, it would be amazing to create a code style profile
>>>>>>>>>>>>>> for download, so that all contributors would use that.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Same thing actually for IntelliJ inspections: a set of
>>>>>>>>>>>>>> inspections we want to have active and where we strive for
>>>>>>>>>>>>>> zero warnings.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Apr 9, 2016 at 10:00 AM, Robert Metzger <rmetz...@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Dawid,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> we don't have an automated formatter for IntelliJ. However,
>>>>>>>>>>>>>>> you can use the "Checkstyle" plugin of IntelliJ to mark
>>>>>>>>>>>>>>> checkstyle violations in the IDE.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Apr 8, 2016 at 12:30 PM, Dawid Wysakowicz <wysakowicz.da...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am currently working on some issues and have been
>>>>>>>>>>>>>>>> wondering if you have settings for IntelliJ code style
>>>>>>>>>>>>>>>> that would follow your coding guidelines av
Re: [PROPOSAL] Structure the Flink Open Source Development
Thanks for the great suggestion. +1 for this proposal.

Regards,
Chiwan Park

> On May 13, 2016, at 1:44 AM, Nick Dimiduk wrote:
>
> For what it's worth, this is very close to how HBase attempts to manage the
> community load. We break out components (in Jira), with a list of named
> component maintainers. Actually, having components alone has given a Big
> Bang for the buck because when properly labeled, it makes it really easy
> for part-timers to channel their efforts with precision.
>
> As a Flink user, I'm +1 for this proposal as well :)
>
> On Thursday, May 12, 2016, Aljoscha Krettek wrote:
>
>> +1
>>
>> The ideas seem good and the proposed number of components seems reasonable.
>> With this, we should also then clean up the JIRA to make it actually usable.
>>
>> On Thu, 12 May 2016 at 18:09 Stephan Ewen wrote:
>>
>>> All maintainer candidates are only proposals so far. No indication of
>>> lead or anything so far.
>>>
>>> Let's first see if we agree on the structure proposed here, and if we
>>> take the components as suggested here or if we refine the list.
>>>
>>> On 12.05.2016 at 17:45, "Robert Metzger" wrote:
>>>
>>>> tl;dr: +1
>>>>
>>>> I also like the proposal a lot. Our community is growing at a quite
>>>> fast pace and we need to have some structure in place to still keep
>>>> track of everything going on.
>>>>
>>>> I'm happy to see that the proposal mentions cleaning up our JIRA. This
>>>> is something that has been annoying me for quite a while, but it's too
>>>> big to do alone. If maintainers could take care of their components,
>>>> we should have covered a lot there already.
>>>>
>>>> One question regarding the "chair" or "lead" role for components: Is
>>>> the first name in the list of maintainers the lead?
>>>>
>>>> I would actually suggest waiting until all proposed maintainers have
>>>> agreed to the proposal. It doesn't make sense to make somebody a
>>>> maintainer of something if they disagree or are not aware of it.
>>>>
>>>> On Thu, May 12, 2016 at 2:13 PM, Maximilian Michels wrote:
>>>>
>>>>> +1 for the initiative. With a better process we will improve the
>>>>> quality of the Flink development and give us more time to focus.
>>>>>
>>>>> Could we have another category "Infrastructure"? This would concern
>>>>> things like CI, nightly deployment of snapshots/documentation, ASF
>>>>> Infra communication. Robert and I could be the initial maintainers
>>>>> for that.
>>>>>
>>>>> On Thu, May 12, 2016 at 1:52 PM, Stephan Ewen wrote:
>>>>>> Yes, Matthias, that was supposed to be you.
>>>>>> Sorry from another guy who frequently has his name misspelled ;-)
>>>>>>
>>>>>> On Thu, May 12, 2016 at 1:27 PM, Matthias J. Sax wrote:
>>>>>>
>>>>>>> +1 from my side.
>>>>>>>
>>>>>>> Happy to be the maintainer for Storm-Compatibility (at least I
>>>>>>> guess it's me, even though the correct spelling would be with two
>>>>>>> 't's :P)
>>>>>>>
>>>>>>> -Matthias
>>>>>>>
>>>>>>> On 05/12/2016 12:56 PM, Till Rohrmann wrote:
>>>>>>>> +1 for the proposal
>>>>>>>>
>>>>>>>> On May 12, 2016 12:13 PM, "Stephan Ewen" wrote:
>>>>>>>>
>>>>>>>>> Yes, Gabor Gevay, that did refer to you!
>>>>>>>>>
>>>>>>>>> Sorry for the ambiguity...
>>>>>>>>>
>>>>>>>>> On Thu, May 12, 2016 at 10:46 AM, Márton Balassi <balassi.mar...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> +1 for the proposal
>>>>>>>>>> @ggevay: I do think that it refers to you. :)
>>>>>>>>>>
>>>>>>>>>> On Thu, May 12, 2016 at 10:40 AM, Gábor Gévay <gga...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> There are at least three Gábors in the Flink community, :) so
>>>>>>>>>>> assuming that the Gábor in the list of maintainers of the
>>>>>>>>>>> DataSet API is referring to me, I'll be happy to do it. :)
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Gábor G.
>>>>>>>>>>>
>>>>>>>>>>> 2016-05-10 11:24 GMT+02:00 Stephan Ewen :
>>>>>>>>>>>> Hi everyone!
>>>>>>>>>>>>
>>>>>>>>>>>> We propose to establish some lightweight structures in the
>>>>>>>>>>>> Flink open source community and development process, to help
>>>>>>>>>>>> us better handle the increased interest in Flink (mailing list
>>>>>>>>>>>> and pull requests), while not overwhelming the committers, and
>>>>>>>>>>>> giving users and contributors a good experience.
>>>>>>>>>>>>
>>>>>>>>>>>> This proposal is triggered by the observation that we are
>>>>>>>>>>>> reaching the limits of where the current community can support
>>>>>>>>>>>> users and guide new contributors. The below proposal is based
>>>>>>>>>>>> on observations and ideas from Till, Robert, and me.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Goals
>>>>>>>>>>>>
>>>>>>>>>>>> We try to achieve the following
>>>>>>>>>>>>
>>>>>>>>>>>> - Pull requests get handled in a timel