[jira] [Created] (FLINK-10361) Elasticsearch (v6.3.1) sink end-to-end test unstable

2018-09-18 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-10361:
-

 Summary: Elasticsearch (v6.3.1) sink end-to-end test unstable
 Key: FLINK-10361
 URL: https://issues.apache.org/jira/browse/FLINK-10361
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.6.0, 1.7.0
Reporter: Till Rohrmann
 Fix For: 1.7.0
 Attachments: flink-elasticsearch-logs.tgz

The Elasticsearch (v6.3.1) sink end-to-end test is unstable. When run on an
Amazon instance, it failed with the following exception in the logs:
{code}
2018-09-17 20:46:04,856 INFO  org.apache.flink.runtime.taskmanager.Task - Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) (cb23fdd9df0d4e09270b2ae9970efbac) switched from RUNNING to FAILED.
java.io.IOException: Connection refused
    at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:728)
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:235)
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:198)
    at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:522)
    at org.elasticsearch.client.RestHighLevelClient.ping(RestHighLevelClient.java:275)
    at org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge.createClient(Elasticsearch6ApiCallBridge.java:81)
    at org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge.createClient(Elasticsearch6ApiCallBridge.java:47)
    at org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkBase.open(ElasticsearchSinkBase.java:296)
    at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
    at org.apache.flink.streaming.api.operators.StreamSink.open(StreamSink.java:48)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:424)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:290)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:171)
    at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348)
    at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192)
    at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
    ... 1 more
{code}

I assume we should harden the test a bit against transient connection problems,
e.g. by retrying the initial connection.
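
One way to do that (a sketch only, not the actual fix) would be to wrap the
flaky operation, e.g. the initial ping against the Elasticsearch cluster, in a
bounded retry loop:

{code}
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

// Retry `op` up to `attempts` times, sleeping `delayMs` between tries.
@tailrec
def retry[T](attempts: Int, delayMs: Long)(op: => T): T =
  Try(op) match {
    case Success(result) => result
    case Failure(_) if attempts > 1 =>
      Thread.sleep(delayMs)
      retry(attempts - 1, delayMs)(op)
    case Failure(e) => throw e
  }

// Hypothetical usage: retry(attempts = 5, delayMs = 500)(client.ping())
{code}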





[jira] [Created] (FLINK-10362) Bundled S3 connectors load wrong Hadoop config

2018-09-18 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-10362:


 Summary: Bundled S3 connectors load wrong Hadoop config
 Key: FLINK-10362
 URL: https://issues.apache.org/jira/browse/FLINK-10362
 Project: Flink
  Issue Type: Bug
  Components: FileSystem
Affects Versions: 1.6.0
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.7.0


The bundled S3 connectors internally build on Hadoop's file system support, but
are packaged so as not to expose that at all - from the user's perspective,
they are Hadoop-independent.

Hence, they should not take external Hadoop configurations into account,
but only the Flink configuration.
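
A minimal sketch of the intended behaviour (the helper name and the s3.* to
fs.s3a.* key mirroring below are illustrative assumptions, not the actual
connector code):

{code}
import org.apache.flink.configuration.Configuration
import org.apache.hadoop.conf.{Configuration => HadoopConfig}

def hadoopConfigFromFlink(flinkConfig: Configuration): HadoopConfig = {
  // loadDefaults = false: ignore any core-site.xml / hdfs-site.xml on the
  // classpath; only Flink's own configuration is consulted.
  val hadoopConfig = new HadoopConfig(false)
  flinkConfig.keySet().forEach { key =>
    if (key.startsWith("s3.")) {
      // Mirror "s3.*" Flink keys onto "fs.s3a.*" Hadoop keys.
      hadoopConfig.set("fs.s3a." + key.substring(3), flinkConfig.getString(key, null))
    }
  }
  hadoopConfig
}
{code}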





[jira] [Created] (FLINK-10363) S3 FileSystem factory prints secrets into logs

2018-09-18 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-10363:


 Summary: S3 FileSystem factory prints secrets into logs
 Key: FLINK-10363
 URL: https://issues.apache.org/jira/browse/FLINK-10363
 Project: Flink
  Issue Type: Bug
  Components: FileSystem
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.7.0, 1.5.4, 1.6.2


The file system factory logs all values it applies from the Flink configuration.
That frequently includes access keys, which must not leak into logs.
The loader should log only the keys, not the values.
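
A minimal sketch of that fix direction (the object and method names below are
illustrative, not the actual factory code):

{code}
import org.apache.flink.configuration.Configuration
import org.slf4j.LoggerFactory

object S3FactoryLogging {
  private val LOG = LoggerFactory.getLogger(getClass)

  // Log which keys are applied, but never their values.
  def logAppliedKeys(flinkConfig: Configuration, prefix: String): Unit = {
    flinkConfig.keySet().forEach { key =>
      if (key.startsWith(prefix)) {
        // Only the key is logged; the value may be an access key or secret.
        LOG.info("Adding Flink config entry for {}", key)
      }
    }
  }
}
{code}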





[jira] [Created] (FLINK-10364) Test instability in NonHAQueryableStateFsBackendITCase#testMapState

2018-09-18 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-10364:
---

 Summary: Test instability in NonHAQueryableStateFsBackendITCase#testMapState
 Key: FLINK-10364
 URL: https://issues.apache.org/jira/browse/FLINK-10364
 Project: Flink
  Issue Type: Bug
  Components: Queryable State, Tests
Affects Versions: 1.5.4
Reporter: Nico Kruber


{code}
Tests run: 12, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 14.365 sec <<< FAILURE! - in org.apache.flink.queryablestate.itcases.NonHAQueryableStateFsBackendITCase
testMapState(org.apache.flink.queryablestate.itcases.NonHAQueryableStateFsBackendITCase)  Time elapsed: 0.253 sec  <<< ERROR!
java.lang.NullPointerException: null
    at org.apache.flink.queryablestate.itcases.AbstractQueryableStateTestBase.testMapState(AbstractQueryableStateTestBase.java:840)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
{code}
from https://api.travis-ci.org/v3/job/429797943/log.txt





[jira] [Created] (FLINK-10365) Consolidate shaded Hadoop classes for filesystems

2018-09-18 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-10365:


 Summary: Consolidate shaded Hadoop classes for filesystems
 Key: FLINK-10365
 URL: https://issues.apache.org/jira/browse/FLINK-10365
 Project: Flink
  Issue Type: Improvement
  Components: FileSystem
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.7.0


We currently have three bundled/shaded filesystem connectors that build on
top of Hadoop's classes, and more will probably come. Each of them re-builds the
shaded Hadoop module, including creating the relocated config, adapting native
code loading, etc.

We should factor that out into a single base project to avoid duplicating the work.





Watermark alignment during unit tests

2018-09-18 Thread Евгений Юшин
Hi devs

During the work on https://issues.apache.org/jira/browse/FLINK-10050 I've
found unstable behaviour in unit tests for unioned streams (which are used
in CoGroupedStream/JoinedStream under the hood).
Let's assume we have late elements in one of the streams. The thing is that we
have no guarantees about which source will be read first, nor about the order
in which watermark alignment will be applied. So the following example produces
different results across invocations:

import org.apache.flink.streaming.api.functions.source.SourceFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.watermark.Watermark
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time

val s1 = env.addSource(new SourceFunction[(String, String)] {
  override def run(ctx: SourceFunction.SourceContext[(String, String)]): Unit = {
    ctx.collectWithTimestamp(("a", "a1"), 1)
    // wmAllignmentLock.synchronized {
    //   wmAllignmentLock.wait()
    // }
    ctx.emitWatermark(new Watermark(4))
    ctx.collectWithTimestamp(("a", "a2"), 2)
  }

  override def cancel(): Unit = {}
})

val s2 = env.addSource(new SourceFunction[(String, String)] {
  override def run(ctx: SourceFunction.SourceContext[(String, String)]): Unit = {
    ctx.collectWithTimestamp(("a", "b1"), 1)
    ctx.emitWatermark(new Watermark(4))
    // wmAllignmentLock.synchronized {
    //   wmAllignmentLock.notifyAll()
    // }
  }

  override def cancel(): Unit = {}
})

val joined = s1.join(s2).where(_._1).equalTo(_._1)
  .window(TumblingEventTimeWindows.of(Time.milliseconds(3)))
  .apply((x, y) => s"$x:$y")
For some invocations (when Flink decides to process the 2nd source before
the 1st), ("a", "a2") is considered late and dropped; and vice versa.
Here is the rate over 1000 invocations:
Run JOIN periodic
iteration [50] contains late total = 22, this iter = 22
iteration [100] contains late total = 51, this iter = 29
iteration [150] contains late total = 78, this iter = 27
iteration [200] contains late total = 101, this iter = 23
iteration [250] contains late total = 124, this iter = 23
iteration [300] contains late total = 155, this iter = 31
iteration [350] contains late total = 184, this iter = 29
iteration [400] contains late total = 210, this iter = 26
iteration [450] contains late total = 233, this iter = 23
iteration [500] contains late total = 256, this iter = 23
iteration [550] contains late total = 274, this iter = 18
iteration [600] contains late total = 303, this iter = 29
iteration [650] contains late total = 338, this iter = 35
iteration [700] contains late total = 367, this iter = 29
iteration [750] contains late total = 393, this iter = 26
iteration [800] contains late total = 415, this iter = 22
iteration [850] contains late total = 439, this iter = 24
iteration [900] contains late total = 459, this iter = 20
iteration [950] contains late total = 484, this iter = 25
iteration [1000] contains late total = 502, this iter = 18
contains late = 502


It doesn't matter whether a periodic or punctuated watermark assigner is used.
Likewise, the synchronization mechanism (commented out in the code snippet
above) doesn't help to align the records in a particular order.

While this behaviour is totally fine for the production case, I just
wonder how to write a stable unit test scenario that covers late-element
processing.
I didn't find any suitable test harness in the utils.

Any feedback is appreciated!

Regards,
Eugen


[jira] [Created] (FLINK-10366) Create a shared base for S3 file systems

2018-09-18 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-10366:


 Summary: Create a shared base for S3 file systems
 Key: FLINK-10366
 URL: https://issues.apache.org/jira/browse/FLINK-10366
 Project: Flink
  Issue Type: Improvement
  Components: FileSystem
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.7.0


We are adding extensions to the FileSystem logic on top of what Hadoop offers,
such as entropy injection and recoverable writers.

To be able to share the code across the Presto and Hadoop s3a implementations,
we should rebase them both onto a common shared project.





Re: Watermark alignment during unit tests

2018-09-18 Thread Kostas Kloudas
Hi Eugen,

It is true that this can be difficult for IT cases, and it should be improved
in Flink's testing infrastructure. But for this specific PR, what you need to
check is whether the allowedLateness parameter is propagated correctly
throughout the translation process. The window operator with allowed lateness
(which is applied next) is covered by other tests.

In this case I would recommend doing the joining of the streams “manually”, i.e.:

input1.coGroup(input2)
  .where(keySelector1)
  .equalTo(keySelector2)
  .window(windowAssigner)

and then, from the resulting WithWindow, get the allowed lateness and verify
that it is the value that you provided.

This will cover the propagation and make sure that nobody breaks it in the 
future.
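
A minimal sketch of such a propagation test (assumptions: the allowedLateness
setter on WithWindow is the one being added in FLINK-10050, and the
getAllowedLateness accessor below is hypothetical; reading the value back may
require a package-private helper or reflection):

import org.apache.flink.streaming.api.windowing.time.Time

val lateness = Time.milliseconds(100)

val withWindow = input1.coGroup(input2)
  .where(keySelector1)
  .equalTo(keySelector2)
  .window(windowAssigner)
  .allowedLateness(lateness) // setter added by the FLINK-10050 PR

// Verify the parameter survived the translation unchanged.
assert(withWindow.getAllowedLateness == lateness) // hypothetical accessor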

Cheers,
Kostas





Re: CEP & checkpoints/savepoints

2018-09-18 Thread Dawid Wysakowicz
Hi Ron,

Fabian is absolutely right. CEP library uses standard checkpointing
mechanisms of Flink. You do not need any additional configuration.

The only consideration to keep in mind is that if you change some
conditions in your Pattern and restart from a checkpoint/savepoint, you
might get some outdated results, because the pattern may already have
progressed with the old condition. Consider a pattern A B: if some event
matched A under the old condition, we are already looking for B. So if we
take a savepoint, change the condition for A, and restart, there might be
some matches that do not satisfy the new A condition, because the state
machine is already in state B.
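
For illustration, a minimal sketch of such an A B pattern with Flink CEP
(the Event type and the conditions are placeholders, not from the original
discussion):

import org.apache.flink.cep.scala.pattern.Pattern

case class Event(kind: String)

val pattern = Pattern.begin[Event]("A").where(_.kind == "A")
  .next("B").where(_.kind == "B")

// If an event already matched "A" under the old condition before the
// savepoint, the restored state machine is waiting for "B"; changing the
// "A" condition afterwards does not retract that partial match.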

Regards,

Dawid


On 17/09/18 22:24, Fabian Hueske wrote:
> Hi Ron,
>
> The CEP library is built on top of the DataStream / ProcessFunction
> API and holds all necessary state (the state of the pattern matching
> state machine) in regular keyed MapState.
> Hence, CEP does not require a dedicated configuration for checkpoints
> and savepoints, besides the regular application checkpoint configuration.
>
> That's also why there's no dedicated documentation about this subject.
>
> @Dawid or Klou, please correct me if I'm wrong.
>
> Best, Fabian
>
> 2018-09-17 19:09 GMT+02:00 Ron Crocker:
>
> I’m working with CEP to detect when something stops reporting
> (which is very simple), but I need to show the team that the jobs
> will survive being shutdown and restarted without either a)
> declaring that everything stopped reporting (false positives) or
> b) missing things that have indeed stopped reporting (false
> negatives).
>
> There seems to be NO documentation regarding CEP and
> checkpoints/savepoints. Am I just missing it? Or is it something
> so simple that it should be obvious?
>
> Our graph is fairly straightforward - keyed stream using event
> time and a Pattern that is essentially a report followed by a
> report within a time window, and we use the timed-out side output
> as the “events” indicating “thing stopped reporting”. It seems
> that we need to checkpoint/savepoint the pattern state along with
> the normal things checkpointed (e.g., Kafka offsets).
>
> For now, I should be able to sell an assertion from you that it
> works, but official documentation would help.
>
> Ron
> —
> Ron Crocker
> Distinguished Engineer & Architect
> ( ( •)) New Relic
> rcroc...@newrelic.com 
> M: +1 630 363 8835
>
>





Re: [PROPOSAL] [community] A more structured approach to reviews and contributions

2018-09-18 Thread Stephan Ewen
On the template discussion, some thoughts:

*PR Template*

I think the PR template went well. We can rethink the "checklist" at the
bottom, but all other parts turned out helpful in my opinion.

With the amount of contributions, it helps to ask the contributor to take on a
little more work in order for the reviewer to be more efficient.
I would suggest keeping that mindset: whenever we find a way for the
contributor to prepare things such that reviews become more efficient, we
should do that. In my experience, most contributors are willing to put in
some extra minutes if it helps their PR get merged faster.

*Review Template*

I think it would be helpful to have this checklist. It does not matter in
which form, be that as a text template or as labels.

The most important thing is to make explicit which questions have been
answered in the review.
Currently there are a lot of "+1"s on pull requests, which mean "code quality
is fine" but leave all other questions unanswered.
The contributors then rightfully wonder why the PR does not get merged.



On Tue, Sep 18, 2018 at 7:26 AM, 陈梓立  wrote:

> Hi all interested,
>
> Within the document there is a heated discussion about how the PR
> template/review template should be.
>
> Here I share my opinion:
>
> 1. For the review template, we actually don't need to post a review
> template comment at all. GitHub has a tag system and only committers can
> add tags, which we can make use of. That is, tag the PR as
> waiting-for-proposal-approved, waiting-for-code-review,
> waiting-for-benchmark or block-by-author, and so on. Asfbot could mirror
> the GitHub tag state to the corresponding JIRA, and we always regard JIRA
> as the main discussion board.
>
> 2. For the PR template, the greeting message is redundant. Just emphasizing
> that an associated JIRA is important, and how to format the title, is
> enough. Besides, the "Does this pull request potentially affect one of the
> following parts" part and "Documentation" should be covered by "What is the
> purpose of the change" and "Brief change log". For those two parts, users
> always answer no, and they would be aware if they really made changes
> there. For example, even if a pull request requires documentation, its
> owner might not add it at first. The PR template is a guide, not something
> one has to learn.
>
> To sum up: (1) take advantage of GitHub's tag system to track review
> progress; (2) make the template more concise to avoid burdening mature
> contributors and forcing newcomers to learn too much.
>
> Best,
> tison.
>
>
> Rong Rong wrote on Tue, Sep 18, 2018 at 7:05 AM:
>
> > > Thanks for putting the review contribution doc together, Stephan! This
> > > will definitely help the community to make the review process better.
> > >
> > > From my experience this will benefit both the contributor and reviewer
> > > sides! Thus +1 for putting it into practice as well.
> > >
> > > --
> > > Rong
> >
> > On Mon, Sep 17, 2018 at 10:18 AM Stephan Ewen  wrote:
> >
> > > Hi!
> > >
> > > Thank you for the encouraging feedback so far.
> > >
> > > The overall goal is definitely to make the contribution process better
> > > and get fewer pull requests that are disregarded.
> > >
> > > There are various reasons for the disregarded pull requests, one being
> > > that fewer committers really participate in reviews beyond the component
> > > they are currently very involved with. This is a separate issue and I am
> > > thinking about how to encourage more activity there.
> > >
> > > The other reason was lack of structure and lack of decision making,
> > > which is what I am trying to fix first here.
> > > A follow-up to this will definitely be to improve the contribution
> > > guide as well.
> > >
> > > Best,
> > > Stephan
> > >
> > >
> > > On Mon, Sep 17, 2018 at 12:05 PM, Zhijiang(wangzhijiang999) <
> > > wangzhijiang...@aliyun.com.invalid> wrote:
> > >
> > > > From my personal experience as a contributor for three years, I have
> > > > had a better experience in contributing and reviewing than before,
> > > > although we still have some points for further progress.
> > > >
> > > > I reviewed the proposal doc, and it gives very constructive and
> > > > meaningful guidance which could help both contributors and reviewers.
> > > > I agree with the above suggestions and hope they can be practiced well!
> > > >
> > > > Best,
> > > > Zhijiang
> > > > --
> > > > From: Till Rohrmann
> > > > Sent: Monday, September 17, 2018, 16:27
> > > > To: dev
> > > > Subject: Re: [PROPOSAL] [community] A more structured approach to
> > > > reviews and contributions
> > > >
> > > > Thanks for writing this up Stephan. I like the steps and hope that it
> > > > will help the community to make the review process better. Thus, +1 for
> > > > putting your proposal to practice.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Mon, Sep 17, 2018 at 10:00 AM Stephan Ewen wrote:
> > > >
> > > > > Hi Flink community members!
> > > > >

Re: [PROPOSAL] [community] A more structured approach to reviews and contributions

2018-09-18 Thread 陈梓立
Maybe a little rearrangement of the process would help.

(1). Does the contribution describe itself well?
  (1.1) By whom should this contribution be given attention? This is often
shown by its title, "[FLINK-XXX] [module]", where the module part gives
the hint.
  (1.2) What is the purpose of this contribution? Covered by the PR template.
Even on JIRA, an issue should cover these points.

(2). Is there consensus on the contribution?
This follows (1), because we first need to be clear about the purpose of the
contribution. At this stage reviewers could cc the module maintainer
as a supplement to (1.1). Reviewers might also ask the contributor to
clarify the purpose, sharpening (1.2).

(3). Is the implementation architecturally sound and does it fit the code style?
This follows (2): only after a consensus do we talk about the concrete
implementation, which prevents spending time and effort in vain.

In addition, ideally a "+1" comment or approval means that the purpose of the
contribution is supported by the reviewer and the quality of the
implementation (if there is one) is fine, so the reviewer votes for a
consensus.

Best,
tison.



Re: [PROPOSAL] [community] A more structured approach to reviews and contributions

2018-09-18 Thread 陈梓立
Putting some good cases here might be helpful.

See how a contribution to the runtime module was proposed, discussed,
implemented and merged, from https://github.com/apache/flink/pull/5931 to
https://github.com/apache/flink/pull/6132.

1. #5931 fixes a bug, but leaves points that could be improved. Here sihuazhou
and shuai-xu share their considerations and request a review (of the proposal)
by Stephan, Till and Gary, our committers.
2. After discussion, all people involved reach a consensus. So the rest of the
work is to implement it.
3. sihuazhou gives an implementation in #6132; Till reviews it, finds it
somewhat off on the "architectural" aspect, and suggests implementation-level
changes.
4. After those implementation-level comments are addressed, the PR gets merged.

I think this is quite a good example of how our review process should go.

Best,
tison.

