Re: [DISCUSS] Start a user...@flink.apache.org mailing list for the Chinese-speaking community?

2019-06-21 Thread vino yang
Hi Hequn,

Thanks for reporting this case.

According to the QQ mail team's reply, the cause is also the *bounce attack*,
so this mail address has been intercepted, and it is an IP-level interception.

Today, the QQ mail team has unblocked this email address, so it can receive
follow-up emails from the Apache mail server normally.

If this email address still does not work normally in the future, please
report it here again.

Best,
Vino


Hequn Cheng wrote on Fri, Jun 21, 2019 at 2:39 PM:

> Hi Vino,
>
> Great thanks for your help.
>
> > So if someone reports that they can't receive the email from the Apache
> > mail server, they can provide more detailed information to the QQ mailbox
> > to facilitate locating the problem.
>
> I just got one piece of feedback.
> A user (173855...@qq.com) reports that he can't receive the emails from the
> Chinese-speaking mailing list. He had subscribed successfully on
> 2019-05-10. Everything went well until 2019-05-10, and no more emails have
> come from the mailing list since.
>
> Best, Hequn
>
> On Fri, Jun 21, 2019 at 12:56 PM vino yang  wrote:
>
> > Hi Kurt,
> >
> > I have copied my reply to the Jira issue of INFRA[1].
> >
> > Within my ability, I am happy to help coordinate and push this problem forward.
> >
> > Best,
> > Vino
> >
> > [1]: https://issues.apache.org/jira/browse/INFRA-18249
> >
> > Kurt Young wrote on Fri, Jun 21, 2019 at 12:11 PM:
> >
> > > Hi vino,
> > >
> > > Thanks for your effort. Could you also share this information with
> apache
> > > INFRA? Maybe we can find a workable solution together.
> > > > You can try to leave comments in this JIRA:
> > > > https://issues.apache.org/jira/browse/INFRA-18249
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Fri, Jun 21, 2019 at 11:45 AM vino yang 
> > wrote:
> > >
> > > > Hi all,
> > > >
> > > > The main reason is that *the Apache mail server has been abused to
> > > > cause a bounce attack on the QQ mailbox*.
> > > >
> > > > Detailed description: third parties forged the domain name of the QQ
> > > > mailbox to send spam to the Apache mail server. The Apache mail server
> > > > did not perform a correct check and mistakenly thought the spammers
> > > > were from the QQ mailbox instead of third parties, so these spam emails
> > > > were bounced back to the QQ mail server, and the large number of
> > > > bounces to the QQ mailbox server amounted to a bounce attack.
> > > > Therefore, the anti-spam system of the QQ mailbox automatically applied
> > > > its interception strategy. Besides bounce emails, some normal emails
> > > > were also blocked.
> > > >
> > > > At present, the QQ mailbox temporarily uses a more relaxed anti-spam
> > > > strategy for the Apache mail server. However, if the QQ mail server
> > > > continues to receive a large number of bounce emails, it will again
> > > > take effective interception measures. Historically, not all emails from
> > > > the Apache mail server have been intercepted by the QQ mailbox; most of
> > > > the rejections were part of the bounce attack.
> > > >
> > > > So if someone reports that they can't receive email from the Apache
> > > > mail server, they can provide more detailed information to the QQ
> > > > mailbox team to facilitate locating the problem.
> > > >
> > > > The attached file contains a sample of spam that was rejected and
> > > > returned to the QQ mailbox.
> > > >
> > > > Best,
> > > > Vino
> > > >
> > > > vino yang wrote on Fri, Jun 21, 2019 at 10:16 AM:
> > > >
> > > >> Hi Robert,
> > > >>
> > > >> Yes, the QQ mail product belongs to Tencent, and I work at Tencent.
> > > >>
> > > >> I am contacting the QQ mail team and trying to find out the reason.
> > > >> Once I get the reply and explanation, I will sync here.
> > > >>
> > > >> Best,
> > > >> Vino.
> > > >>
> > > >> Robert Metzger wrote on Thu, Jun 20, 2019 at 10:59 PM:
> > > >>
> > > >>> Thanks a lot!
> > > >>>
> > > >>> qq.com belongs to Tencent, right?
> > > >>> As far as I know, we have some active contributors working at Tencent
> > > >>> (Vino Yang). Maybe he or other employees from Tencent following this
> > > >>> mailing list could help make a connection to the QQ teams to resolve
> > > >>> that problem?
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Thu, Jun 20, 2019 at 4:43 PM Kurt Young 
> wrote:
> > > >>>
> > > >>> > From INFRA's response: "Yes, they aggressively rate limit us, and
> > > >>> > all our efforts to contact them have gone unanswered. We recommend
> > > >>> > people use other providers."
> > > >>> >
> > > >>> > I think the only way is to tell users not to use qq.com mail when
> > > >>> > using the Apache mailing list.
> > > >>> >
> > > >>> > Best,
> > > >>> > Kurt
> > > >>> >
> > > >>> >
> > > >>> > On Thu, Jun 20, 2019 at 10:23 PM Kurt Young 
> > > wrote:
> > > >>> >
> > > >>> > > Thanks Robert, I left a comment in the JIRA you gave and see
> what
> > > >>> will
> > > >>> > > happen.
> > > >>> > >
> > > >>> > > Best,
> > > >>> > > Kurt
> > > >>> > >
> > > >>> > >

[jira] [Created] (FLINK-12923) Introduce a Task termination future

2019-06-21 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-12923:


 Summary: Introduce a Task termination future
 Key: FLINK-12923
 URL: https://issues.apache.org/jira/browse/FLINK-12923
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Task
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.9.0


Introduce a termination future into the Task that is completed once run()
exits. This is useful for ensuring that resources are not cleaned up while a
Task is still being canceled asynchronously.
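
A minimal sketch of the idea (simplified; this is not the actual Task class,
just an illustration of the pattern):

{code:java}
import java.util.concurrent.CompletableFuture;

public class TaskSketch implements Runnable {

    private final CompletableFuture<Void> terminationFuture = new CompletableFuture<>();

    @Override
    public void run() {
        try {
            // ... actual task work and cancellation handling ...
        } finally {
            // Completed exactly once, even if run() exits exceptionally.
            terminationFuture.complete(null);
        }
    }

    /** Cleanup code can chain onto this future instead of racing an asynchronous cancel. */
    public CompletableFuture<Void> getTerminationFuture() {
        return terminationFuture;
    }
}
{code}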



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12924) Introduce basic type inference interfaces

2019-06-21 Thread Timo Walther (JIRA)
Timo Walther created FLINK-12924:


 Summary: Introduce basic type inference interfaces
 Key: FLINK-12924
 URL: https://issues.apache.org/jira/browse/FLINK-12924
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Timo Walther
Assignee: Timo Walther


This issue introduces basic interfaces for performing type inference. It
includes validation of input types and deduction of accumulator or return
types. Argument type inference (i.e. deducing a type for {{NULL}}) will be
introduced in a later step. The interfaces replace the need for
{{PlannerExpression}}.

A rough design document can be found here, but it needs some updates.
Introduction of annotations and basic type extraction will come later:
[https://docs.google.com/document/d/1RM_-XvD25AldUOl7xRSvA3sascWBa1-Jze3i5mvb7WU/edit#heading=h.kwyjplavecx4]
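
To make the scope concrete, a hypothetical sketch of what such interfaces could
look like (all names are invented for illustration and are not the proposed
Flink API):

{code:java}
import java.util.List;
import java.util.Optional;

/** Hypothetical shape of a basic type inference interface. */
public interface TypeInferenceSketch {

    /** Validates the types of the actual call arguments. */
    boolean validateInputTypes(List<Class<?>> argumentTypes);

    /** Deduces the return type from the validated argument types. */
    Class<?> inferReturnType(List<Class<?>> argumentTypes);

    /** Deduces an intermediate accumulator type, if the function is an aggregate. */
    default Optional<Class<?>> inferAccumulatorType(List<Class<?>> argumentTypes) {
        return Optional.empty();
    }
}
{code}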



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12925) Docker embedded job end-to-end test fails

2019-06-21 Thread Alex (JIRA)
Alex created FLINK-12925:


 Summary: Docker embedded job end-to-end test fails
 Key: FLINK-12925
 URL: https://issues.apache.org/jira/browse/FLINK-12925
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: Alex


The test fails at one of the Docker image build steps:
{code}
Step 16/22 : ADD $job_artifacts/* $FLINK_JOB_ARTIFACTS_DIR/
ADD failed: no source files were specified
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Releasing Flink 1.8.1

2019-06-21 Thread jincheng sun
Hi All,

The last blocker (FLINK-12863) of the 1.8.1 release has been fixed!
But please also report any issues you think are blockers!
I will also do a final check; if no new problems are found, I will
prepare RC1 as soon as possible! :)

Cheers,
Jincheng

jincheng sun wrote on Mon, Jun 17, 2019 at 9:24 AM:

> Hi Till,
>
> Thank you for this timely discovery! @Till Rohrmann 
>
> We will start the release process after FLINK-12863
> is resolved!
>
> Best,
> Jincheng
>
> Till Rohrmann wrote on Sun, Jun 16, 2019 at 10:03 PM:
>
>> Hi Jincheng,
>>
>> I just realized that we introduced a race condition with FLINK-11059. We
>> first need to fix this problem [1] before we can start the release
>> process.
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-12863
>>
>> Cheers,
>> Till
>>
>> On Sat, Jun 15, 2019 at 3:48 PM jincheng sun 
>> wrote:
>>
>> > I am very happy to say that all the blocker and critical issues for
>> > the 1.8.1 release have been solved!!
>> >
>> > Great thanks to Aitozi, Yu Li, Congxian Qiu, Yun Tang, shuai.xu,
>> JiangJie
>> > Qin and zhijiang for the quick fix!
>> > Great thanks to tzulitai, Aljoscha, Till and pnowojski for the
>> > high-priority review!
>> >
>> > Great thanks to all of you for the help (fixes or reviews) in promoting
>> > the 1.8.1 release. Thank you!!!
>> >
>> > I will prepare the first RC of release 1.8.1 as soon as possible :)
>> >
>> > Cheers,
>> > Jincheng
>> >
>> > jincheng sun wrote on Fri, Jun 14, 2019 at 9:23 AM:
>> >
>> > > Hi all,
>> > >
>> > > After the recent efforts of all of you, all the Blocker and Critical
>> > > issues will be completed soon! :)
>> > > There are only 2 issues left, and they are also almost done:
>> > >
>> > > [Blocker]
>> > > - FLINK-12297: work by @Aitozi, being
>> > > reviewed by @Aljoscha Krettek! [almost done]
>> > > [Critical]
>> > > - FLINK-11059: work by @shuai-xu, being
>> > > reviewed by @Till Rohrmann! [almost done]
>> > >
>> > > The detail can be found here:
>> > >
>> > >
>> >
>> https://docs.google.com/document/d/1858C7HdyDPIxxm2Rvnu4bYahq0Tr9NaY1mj7BzQi_0w/edit?usp=sharing
>> > >
>> > > Great thanks to all of you for the help (fixes or reviews) in
>> > > promoting the 1.8.1 release. Thank you!!!
>> > >
>> > > I think all the Blocker and Critical issues will be finished today.
>> > > I'll do the last check before preparing the RC1 of release 1.8.1, and
>> > > then the first RC of 1.8.1 will be coming :)
>> > >
>> > > Best,
>> > > Jincheng
>> > >
>> > > jincheng sun wrote on Mon, Jun 10, 2019 at 5:24 AM:
>> > >
>> > >> Hi all,
>> > >> I am here to quickly update the progress of the issues that need to
>> > >> be tracked (going well):
>> > >>
>> > >> [Blocker]
>> > >> - FLINK-12297: work by @Aitozi, being reviewed by @Aljoscha Krettek!
>> > >> - FLINK-11107: work by @Myasuka, being reviewed by @Tzu-Li (Gordon) Tai
>> > >>
>> > >> [Critical]
>> > >> - FLINK-11059: work by @shuai-xu, being reviewed by @Till Rohrmann!
>> > >>
>> > >> [Nice to have]
>> > >> - FLINK-10455: work by @becketqin; someone is needed to volunteer to
>> > >> review the PR
>> > >>
>> > >> The detail can be found here:
>> > >>
>> > >>
>> >
>> https://docs.google.com/document/d/1858C7HdyDPIxxm2Rvnu4bYahq0Tr9NaY1mj7BzQi_0w/edit?usp=sharing
>> > >>
>> > >> Great thanks to all of you for the help (fixes or reviews) in
>> > >> promoting the 1.8.1 release. Thank you!!!
>> > >>
>> > >> I hope to prepare the first RC of release 1.8.1 on Thursday, and
>> > >> FLINK-12297, FLINK-11107 and FLINK-11059 should be merged before
>> > >> the RC1.
>> > >> If the relevant PRs can't be merged, please let me know, and we will
>> > >> put more energy into solving them! :)
>> > >>
>> > >> Best,
>> > >> Jincheng
>> > >>
>> > >> jincheng sun wrote on Wed, Jun 5, 2019 at 5:33 PM:
>> > >>
>> > >>> I am here to quickly update the progress of the issues that need to
>> > >>> be tracked (going well):
>> > >>>
>> > >>> [Blocker]
>> > >>> - FLINK-12296 [done]
>> > >>> - FLINK-11987 [done]
>> > >>> - FLINK-12297: being reviewed by @Aljoscha Krettek!
>> > >>> - FLINK-11107: being reviewed by @Tzu-Li (Gordon) Tai
>> > >>>
>> > >>> [Critical]
>> > >>> - FLINK-10455 

Flink Elasticsearch Sink Issue

2019-06-21 Thread Ramya Ramamurthy
Hi,

We use Kafka -> Flink -> Elasticsearch in our project.
The data is not getting flushed to Elasticsearch until the next batch
arrives.
E.g.: if the first batch contains 1000 packets, it gets pushed to
Elasticsearch only after the next batch arrives [irrespective of reaching the
batch time limit].
Below are the sink configurations we currently use.

esSinkBuilder.setBulkFlushMaxActions(2000); // 2K records
esSinkBuilder.setBulkFlushMaxSizeMb(5); // 5 MB once
esSinkBuilder.setBulkFlushInterval(60000); // 1 minute once
esSinkBuilder.setBulkFlushBackoffRetries(3); // Retry three times if bulk fails
esSinkBuilder.setBulkFlushBackoffDelay(5000); // Retry after 5 seconds
esSinkBuilder.setBulkFlushBackoff(true);

Sink code:

List<HttpHost> httpHosts = new ArrayList<>();
//httpHosts.add(new HttpHost("10.128.0.34", 9200, "http"));
httpHosts.add(new HttpHost("192.168.80.171", 9200, "http"));

ElasticsearchSink.Builder<Row> esSinkBuilder = new ElasticsearchSink.Builder<>(
    httpHosts,
    new ElasticsearchSinkFunction<Row>() {

        private IndexRequest createIndexRequest(byte[] document, String indexDate) {
            return new IndexRequest(esIndex + indexDate, esType)
                .source(document, XContentType.JSON);
        }

        @Override
        public void process(Row r, RuntimeContext runtimeContext, RequestIndexer requestIndexer) {
            byte[] byteArray = serializationSchema.serialize(r);

            ObjectMapper mapper = new ObjectMapper();
            ObjectWriter writer = mapper.writer();

            try {
                JsonNode jsonNode = mapper.readTree(byteArray);

                long tumbleStart = jsonNode.get("fseen").asLong();

                ZonedDateTime utc = Instant.ofEpochMilli(tumbleStart).atZone(ZoneOffset.UTC);
                String indexDate = DateTimeFormatter.ofPattern("yyyy.MM.dd").format(utc);

                byte[] document = writer.writeValueAsBytes(jsonNode);

                requestIndexer.add(createIndexRequest(document, indexDate));
            } catch (Exception e) {
                System.out.println("In the error block");
            }
        }
    }
);

Has anyone faced this issue? Any help would be appreciated !!

Thanks,


Re: Flink Elasticsearch Sink Issue

2019-06-21 Thread miki haiat
Did you set some checkpoint configuration?



[jira] [Created] (FLINK-12926) Main thread checking in some tests fails

2019-06-21 Thread Zhu Zhu (JIRA)
Zhu Zhu created FLINK-12926:
---

 Summary: Main thread checking in some tests fails
 Key: FLINK-12926
 URL: https://issues.apache.org/jira/browse/FLINK-12926
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.9.0
Reporter: Zhu Zhu
 Attachments: mainThreadCheckFailure.log

Currently, all JM-side job-changing actions are expected to be taken in the
JobMaster main thread.

In current Flink tests, many cases tend to use the test main thread as the JM
main thread. This can lead to 2 issues:

1. TestingComponentMainThreadExecutorServiceAdapter is a direct executor, so if
it is invoked from any other thread, it will break the main thread checking and
fail the submitted action (as in the attached log [^mainThreadCheckFailure.log]).

2. The test main thread does not support other actions queued in its executor,
as the test will end once the current test thread action (the currently running
test body) is done.

In my observation, most cases that start ExecutionGraph.scheduleForExecution()
will encounter this issue. Cases include ExecutionGraphRestartTest,
FailoverRegionTest, ConcurrentFailoverStrategyExecutionGraphTest,
GlobalModVersionTest, ExecutionGraphDeploymentTest, etc.

One solution in my mind is to create a ScheduledExecutorService for those
tests, use it as the main thread, and run the test body in this thread.
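
For illustration, a minimal sketch of that solution using plain JDK types
(invented names, not actual Flink test code):

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

public class MainThreadTestSketch {

    public static void main(String[] args) throws Exception {
        // A dedicated single-threaded executor acts as the "JM main thread".
        ScheduledExecutorService mainThread = Executors.newSingleThreadScheduledExecutor();
        try {
            // Run the test body inside that thread and block until it finishes,
            // so main-thread checks pass and queued follow-up actions still run
            // on the same thread.
            mainThread.submit(() -> {
                // ... schedule the ExecutionGraph and assert state transitions here ...
            }).get();
        } finally {
            mainThread.shutdown();
        }
    }
}
{code}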

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Start a user...@flink.apache.org mailing list for the Chinese-speaking community?

2019-06-21 Thread Hequn Cheng
Hi vino,

Thanks a lot for unblocking the email address. I have told the user about
this.
Hope things can get better.

Best, Hequn


Re: Flink Elasticsearch Sink Issue

2019-06-21 Thread Ramya Ramamurthy
Yes, we do maintain checkpoints:
env.enableCheckpointing(30000);

But we assumed it is for Kafka consumer offsets. Not sure how this is
useful in this case? Can you please elaborate on this?

~Ramya.





[jira] [Created] (FLINK-12927) YARNSessionCapacitySchedulerITCase failed due to non prohibited exception

2019-06-21 Thread Yun Tang (JIRA)
Yun Tang created FLINK-12927:


 Summary: YARNSessionCapacitySchedulerITCase failed due to non 
prohibited exception
 Key: FLINK-12927
 URL: https://issues.apache.org/jira/browse/FLINK-12927
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.9.0
Reporter: Yun Tang


YARNSessionCapacitySchedulerITCase fails due to a non-prohibited exception.

[https://api.travis-ci.org/v3/job/548491542/log.txt]
{code:java}
2019-06-21 08:22:27,313 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- Reduce (SUM(1), 
at main(WordCount.java:79) (2/2) (a1708bb0544633b4e57e8bb84a1a48f3) switched 
from RUNNING to FAILED.
org.apache.flink.util.FlinkException: 0283de7d26d7fb08895955bfb75db496 is no 
longer allocated by job 8f8dced4fb89f8e5cb05d9286683ecaf.
org.apache.flink.util.FlinkException: 0283de7d26d7fb08895955bfb75db496 is no 
longer allocated by job 8f8dced4fb89f8e5cb05d9286683ecaf.
at 
org.apache.flink.runtime.taskexecutor.TaskExecutor.freeNoLongerUsedSlots(TaskExecutor.java:1475)
at 
org.apache.flink.runtime.taskexecutor.TaskExecutor.syncSlotsWithSnapshotFromJobMaster(TaskExecutor.java:1436)
at 
org.apache.flink.runtime.taskexecutor.TaskExecutor.access$3200(TaskExecutor.java:141)
at 
org.apache.flink.runtime.taskexecutor.TaskExecutor$JobManagerHeartbeatListener.lambda$reportPayload$1(TaskExecutor.java:1691)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:190)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
at akka.actor.ActorCell.invoke(ActorCell.scala:561)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
at akka.dispatch.Mailbox.run(Mailbox.scala:225)
at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2019-06-21 08:22:27,333 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- Job Flink Java 
Job at Fri Jun 21 08:22:16 UTC 2019 (8f8dced4fb89f8e5cb05d9286683ecaf) switched 
from state RUNNING to FAILING.
org.apache.flink.util.FlinkException: 0283de7d26d7fb08895955bfb75db496 is no 
longer allocated by job 8f8dced4fb89f8e5cb05d9286683ecaf.
at 
org.apache.flink.runtime.taskexecutor.TaskExecutor.freeNoLongerUsedSlots(TaskExecutor.java:1475)
at 
org.apache.flink.runtime.taskexecutor.TaskExecutor.syncSlotsWithSnapshotFromJobMaster(TaskExecutor.java:1436)
at 
org.apache.flink.runtime.taskexecutor.TaskExecutor.access$3200(TaskExecutor.java:141)
at 
org.apache.flink.runtime.taskexecutor.TaskExecutor$JobManagerHeartbeatListener.lambda$reportPayload$1(TaskExecutor.java:1691)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:190)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Flink Elasticsearch Sink Issue

2019-06-21 Thread Ramya Ramamurthy
By default, flushOnCheckpoint is set to true.
So ideally, based on env.enableCheckpointing(30000); the flush to ES
must be triggered every 30 seconds, though our ES flush timeout is 60
seconds.
If the above assumption is correct, we still do not see packets
getting flushed until the next packet/batch arrives.

Thanks.



Re: [DISCUSS] Adopting a Code Style and Quality Guide

2019-06-21 Thread Robert Metzger
It seems that the discussion around this topic has settled.

I'm going to turn the Google Doc into a markdown file (maybe also multiple,
I'll try out different things) and then open a pull request for the Flink
website.
I'll post a link to the PR here once I'm done.

On Fri, Jun 14, 2019 at 9:36 AM zhijiang 
wrote:

> Thanks for providing this useful guide, which benefits both contributors
> and committers through consistency.
>
> I just reviewed and learned from the whole doc, which covers a lot of
> information. I wish it could be further categorized and put somewhere
> easily traceable in the future.
>
> Best,
> Zhijiang
> --
> From:Robert Metzger 
> Send Time: Friday, June 14, 2019, 14:24
> To:dev 
> Subject:Re: [DISCUSS] Adopting a Code Style and Quality Guide
>
> Thanks a lot for putting this together!
>
> I'm in the process of reworking the "How to contribute" pages [1] and I'm
> happy to add the guide to the Flink website, once the discussion here is
> over.
>
> [1] https://github.com/apache/flink-web/pull/217
>
> On Fri, Jun 14, 2019 at 3:21 AM Kurt Young  wrote:
>
> > Big +1 and thanks for preparing this.
> >
> > I think what's more important is making sure almost all the contributors
> > can follow the same guide; a clear document is definitely a great start.
> > Committers can first try to follow the guide themselves, and spread the
> > standard during code reviewing.
> >
> > Best,
> > Kurt
> >
> >
> > On Thu, Jun 13, 2019 at 8:28 PM Congxian Qiu 
> > wrote:
> >
> > > +1 for this, I think all contributors can benefit from this.
> > >
> > > Best,
> > > Congxian
> > >
> > >
> > > Aljoscha Krettek wrote on Thu, Jun 13, 2019 at 8:14 PM:
> > >
> > > > +1 I think this is a very good effort and should put to rest some
> > > > back-and-forth discussions on PRs and some differences in “style”
> > between
> > > > committers. ;-)
> > > >
> > > > > > On 13. Jun 2019, at 10:21, JingsongLee wrote:
> > > > >
> > > > > > Big +1, the content is very useful and enlightening.
> > > > > > But it's really too long to look at.
> > > > > > +1 for splitting it up and exposing it to contributors.
> > > > > >
> > > > > > I even think it's possible to put its link in the default
> > > > > > description of a pull request, so that the user has to see it when
> > > > > > submitting the code.
> > > > >
> > > > > Best, JingsongLee
> > > > >
> > > > >
> > > > > --
> > > > > From:Piotr Nowojski 
> > > > > Send Time: Thursday, June 13, 2019, 16:03
> > > > > To:dev 
> > > > > Subject:Re: [DISCUSS] Adopting a Code Style and Quality Guide
> > > > >
> > > > > +1 for it and the general content, and thanks to everybody who was
> > > > > involved in creating & writing this down.
> > > > >
> > > > > +1 for splitting it up into some easily navigable topics.
> > > > >
> > > > > Piotrek
> > > > >
> > > > >> On 13 Jun 2019, at 09:54, Stephan Ewen  wrote:
> > > > >>
> > > > >> This should definitely be split up into topics, agreed.
> > > > >> And it should be linked from the "how to contribute" page or the
> > > > >> PR template to make contributors aware.
> > > > >>
> > > > >> On Thu, Jun 13, 2019 at 9:51 AM Zili Chen 
> > > wrote:
> > > > >>
> > > > >>> Thanks for creating this guide, Stephan. It is also
> > > > >>> a good starting point for internals docs.
> > > > >>>
> > > > >>> One suggestion is that we could eventually separate the guide
> > > > >>> into separate files, each of which focuses on a specific
> > > > >>> topic. Besides, adding the guide to our repository should
> > > > >>> make contributors more aware of it.
> > > > >>>
> > > > >>> Best,
> > > > >>> tison.
> > > > >>>
> > > > >>>
> > > > >>> Jeff Zhang wrote on Thu, Jun 13, 2019 at 3:35 PM:
> > > > >>>
> > > >  Thanks for the proposal, Stephan. Big +1 on this.
> > > > 
> > > >  This is definitely helpful and indispensable for Flink's code
> > > >  quality and healthy community growth.
> > > >  It would also make it easier for downstream projects to integrate
> > > >  Flink.
> > > > 
> > > > 
> > > >  Till Rohrmann wrote on Thu, Jun 13, 2019 at 3:29 PM:
> > > > 
> > > > > Thanks for creating this code style and quality guide, Stephan. I
> > > > > think Flink can benefit from such a guide. Thus +1 for introducing
> > > > > and adhering to it.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Thu, Jun 13, 2019 at 5:26 AM Bowen Li 
> > > > wrote:
> > > > >
> > > > >> Hi Stephan,
> > > > >>
> > > > >> Definitely a good guide in principle IMO! I personally already find
> > > > >> it useful in practice to learn from it myself, and also promote and
> > > > >> cultivate a better coding habit around by referencing it. So big +1
> > > > >> to adopt it.
> > > > >>
> > > > >> Also want to highlight that we need to make real use of it, and
> > > > >> keep ensuring people are aw

Re: [DISCUSS] Adopting a Code Style and Quality Guide

2019-06-21 Thread Stephan Ewen
Thanks, everyone, for the positive feedback :-)

@Robert - It probably makes sense to break this down into various pages,
like PR, general code style guide, Java, component specific guides,
formats, etc.

Best,
Stephan



[jira] [Created] (FLINK-12928) Remove old Flink ML docs

2019-06-21 Thread Seth Wiesman (JIRA)
Seth Wiesman created FLINK-12928:


 Summary: Remove old Flink ML docs
 Key: FLINK-12928
 URL: https://issues.apache.org/jira/browse/FLINK-12928
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Seth Wiesman
Assignee: Seth Wiesman


The old Flink ML library has been removed so we should also remove the 
documentation. Existing users can use documentation from a prior version. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12929) scala.StreamExecutionEnvironment.addSource does not propagate TypeInformation

2019-06-21 Thread Fabio Lombardelli (JIRA)
Fabio Lombardelli created FLINK-12929:
-

 Summary: scala.StreamExecutionEnvironment.addSource does not 
propagate TypeInformation
 Key: FLINK-12929
 URL: https://issues.apache.org/jira/browse/FLINK-12929
 Project: Flink
  Issue Type: Bug
Reporter: Fabio Lombardelli


In {{scala.StreamExecutionEnvironment.addSource}} I would expect that
{{typeInfo}} is also passed to {{javaEnv.addSource}} as the second parameter
and not only passed to the {{returns}} method:
{code:java}
  def addSource[T: TypeInformation](function: SourceFunction[T]): DataStream[T] = {
    require(function != null, "Function must not be null.")

    val cleanFun = scalaClean(function)
    val typeInfo = implicitly[TypeInformation[T]]
    asScalaStream(javaEnv.addSource(cleanFun).returns(typeInfo))
  }
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12930) Update Chinese "how to contribute" pages

2019-06-21 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-12930:
--

 Summary: Update Chinese "how to contribute" pages
 Key: FLINK-12930
 URL: https://issues.apache.org/jira/browse/FLINK-12930
 Project: Flink
  Issue Type: Task
  Components: chinese-translation, Project Website
Reporter: Robert Metzger


FLINK-12605 updated the "How to contribute" pages. Thus, we need to update the 
Chinese translation of those pages.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Connectors and NULL handling

2019-06-21 Thread Rong Rong
Hi Aljoscha,

Sorry for the late reply; I think the solution makes sense. Using the NULL
return value to mark a message as corrupted is not a valid approach, since a
NULL value has semantic meaning not just in Kafka but also in a lot of other
contexts.

I was wondering if we can have a more meaningful interface for dealing with
corrupted messages. I am thinking of 2 options off the top of my head:
1. Create some special deserializer attribute (or a special record) to
indicate corrupted messages, like you suggested (see the sketch below); this
way we can not only encode the deserialization error but also allow users to
encode any corruption information for downstream processing.
2. Create a standard fetch error handling API on AbstractFetcher (for
Kafka) and DataFetcher (for Kinesis); this way we can also handle errors
other than deserialization problems, for example even lower-level
exceptions such as CRC check failures.

I think either way will work. Also, as long as there's a way for end users
to extend the error handling for message corruption, it will not
reintroduce the problems these 2 original JIRAs were trying to address.
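
To make option 1 concrete, here is a minimal, standalone sketch (the type and
method names are invented for illustration; this is not an existing Flink API)
of a result wrapper that distinguishes a genuine Kafka NULL value from a
corrupted record:

import java.util.function.Consumer;

public final class DeserializationResult<T> {

    private final T value;           // non-null for a normal record
    private final boolean tombstone; // true for a genuine Kafka NULL (delete marker)
    private final Throwable error;   // non-null for a corrupted record

    private DeserializationResult(T value, boolean tombstone, Throwable error) {
        this.value = value;
        this.tombstone = tombstone;
        this.error = error;
    }

    public static <T> DeserializationResult<T> record(T value) {
        return new DeserializationResult<>(value, false, null);
    }

    public static <T> DeserializationResult<T> kafkaNull() {
        return new DeserializationResult<>(null, true, null);
    }

    public static <T> DeserializationResult<T> corrupted(Throwable cause) {
        return new DeserializationResult<>(null, false, cause);
    }

    // Forces callers to handle all three cases instead of silently dropping one.
    public void match(Consumer<T> onRecord,
                      Runnable onKafkaNull,
                      Consumer<Throwable> onCorrupted) {
        if (error != null) {
            onCorrupted.accept(error);
        } else if (tombstone) {
            onKafkaNull.run();
        } else {
            onRecord.accept(value);
        }
    }
}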

--
Rong

On Tue, Jun 18, 2019 at 2:06 AM Aljoscha Krettek 
wrote:

> Hi All,
>
> Thanks to Gary, I recently came upon an interesting cluster of issues:
>  - https://issues.apache.org/jira/browse/FLINK-3679: Allow Kafka consumer
> to skip corrupted messages
>  - https://issues.apache.org/jira/browse/FLINK-5583: Support flexible
> error handling in the Kafka consumer
>  - https://issues.apache.org/jira/browse/FLINK-11820: SimpleStringSchema
> handle message record which value is null
>
> In light of the last one I’d like to look again at the first two. What
> they introduced is that when the deserialisation schema returns NULL, the
> Kafka consumer (and maybe also the Kinesis consumer) silently drops the
> record. In Kafka NULL values have semantic meaning, i.e. they usually
> encode a DELETE for the key of the message. If SimpleStringSchema returned
> that null, our consumer would silently drop it and we would lose that
> DELETE message. That doesn’t seem right.
>
> I think the right solution for null handling is to introduce a custom
> record type that encodes both Kafka NULL values and the possibility of a
> corrupt message that cannot be deserialised. Something like an Either type.
> It’s then up to the application to handle those cases.
>
> Concretely, I want to discuss whether we should change our consumers to
> not silently drop null records, but instead see them as errors. For
> FLINK-11820, the solution is for users to write their own custom schema
> that handles null values and returns a user-defined types that signals null
> values.
>
> What do you think?
>
> Aljoscha
>
>


Re: [ANNOUNCEMENT] June 2019 Bay Area Apache Flink Meetup

2019-06-21 Thread Xuefu Zhang
Hi all,

As the event is around the corner, please RSVP at meetup.com if you haven't
responded yet. Otherwise, I will see you next Wednesday, June 26.

Regards,
Xuefu

On Mon, Jun 10, 2019 at 7:50 PM Xuefu Zhang  wrote:

> Hi all,
>
> As promised, we planned to have quarterly Flink meetup and now it's about
> the time. Thus, I'm happy to announce that the next Bay Area Apache Flink
> Meetup [1] is scheduled on June 26 at Zendesk, 1019 Market Street in San
> Francisco.
>
> Schedule:
>
> 6:30 - 7:00 PM Networking and Refreshments
> 7:00 - 8:30 PM Short talks
>
> Many thanks go to Zendesk for their sponsorship. At the same time, we are
> open to 2 or 3 short talks. If interested, please let me know.
>
> Thanks,
>
> Xuefu
>
> [1] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/262216929
>>
>


Join in Apache Flink community

2019-06-21 Thread Chance Li
Hi guys,

I want to contribute to Apache Flink. Would you please give me
contributor permission? My JIRA ID: Chance Li, email: chanc...@gmail.com


Regards,
Chance


[jira] [Created] (FLINK-12931) lint-python.sh cannot find flake8

2019-06-21 Thread Bowen Li (JIRA)
Bowen Li created FLINK-12931:


 Summary: lint-python.sh cannot find flake8
 Key: FLINK-12931
 URL: https://issues.apache.org/jira/browse/FLINK-12931
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.9.0
Reporter: Bowen Li
Assignee: sunjincheng
 Fix For: 1.9.0


Hi guys,

I tried to run tests for flink-python with {{./dev/lint-python.sh}} by
following the README, but it reported that it couldn't find flake8, with the
following error:

{code:java}
./dev/lint-python.sh: line 490: /.../flink/flink-python/dev/.conda/bin/flake8: 
No such file or directory
{code}

I've also tried {{./dev/lint-python.sh -f}}, which didn't work either.

I suspect the reason may be that I already have anaconda3 installed and it
somehow conflicts with the miniconda installed by flink-python. I'm not fully
sure about that.

If that's the reason, I think we need to try to resolve the conflict, because
anaconda is a pretty common package that developers install and use. We
shouldn't require devs to uninstall their existing conda environment in order
to develop flink-python and run its tests. It's better if flink-python can have
a well-isolated environment on machines.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12932) support show catalogs and show databases in SQL CLI

2019-06-21 Thread Bowen Li (JIRA)
Bowen Li created FLINK-12932:


 Summary: support show catalogs and show databases in SQL CLI
 Key: FLINK-12932
 URL: https://issues.apache.org/jira/browse/FLINK-12932
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API, Table SQL / Client
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.9.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12933) support "use catalog" and "use database" in SQL CLI

2019-06-21 Thread Bowen Li (JIRA)
Bowen Li created FLINK-12933:


 Summary: support "use catalog" and "use database" in SQL CLI
 Key: FLINK-12933
 URL: https://issues.apache.org/jira/browse/FLINK-12933
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Client
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.9.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12934) add additional dependencies for flink-connector-hive to connect to standalone hive metastore

2019-06-21 Thread Bowen Li (JIRA)
Bowen Li created FLINK-12934:


 Summary: add additional dependencies for flink-connector-hive to 
connect to standalone hive metastore
 Key: FLINK-12934
 URL: https://issues.apache.org/jira/browse/FLINK-12934
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Hive
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.9.0


We've been testing flink-connector-hive connecting to the built-in Hive
metastore. Connecting to a standalone Hive metastore requires these additional
dependencies:


{code:java}
commons-configuration-1.6.jar
commons-logging-1.2.jar
servlet-api-2.5.jar
hadoop-auth-2.7.2.jar
{code}

All dependencies needed for flink-connector-hive to connect to the Hive
metastore for HiveCatalog include:

{code:java}
commons-configuration-1.6.jar
commons-logging-1.2.jar
flink-connector-hive_2.11-1.9-SNAPSHOT.jar
flink-dist_2.11-1.9-SNAPSHOT.jar
flink-hadoop-compatibility_2.11-1.9-SNAPSHOT.jar
flink-sql-client_2.11-1.9-SNAPSHOT.jar
flink-table_2.11-1.9-SNAPSHOT.jar
hadoop-auth-2.7.2.jar
hadoop-common-2.7.2.jar
hadoop-mapreduce-client-core-2.7.2.jar
hive-exec-2.3.4.jar
hive-metastore-2.3.4.jar
log4j-1.2.17.jar
servlet-api-2.5.jar
slf4j-log4j12-1.7.15.jar
{code}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12935) package flink-connector-hive and some flink dependencies into /opt of flink distribution

2019-06-21 Thread Bowen Li (JIRA)
Bowen Li created FLINK-12935:


 Summary: package flink-connector-hive and some flink dependencies 
into /opt of flink distribution
 Key: FLINK-12935
 URL: https://issues.apache.org/jira/browse/FLINK-12935
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Hive
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.9.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Specifying parallelism on join operation

2019-06-21 Thread Roshan Naik
I can't find any place to specify the parallelism for the join here:

stream1.join( stream2 )
       .where( .. )
       .equalTo( .. )
       .window( .. )
       .apply( .. );

How can we specify that?

-roshan


[jira] [Created] (FLINK-12936) Support intersect all and minus all to blink planner

2019-06-21 Thread Jingsong Lee (JIRA)
Jingsong Lee created FLINK-12936:


 Summary: Support intersect all and minus all to blink planner
 Key: FLINK-12936
 URL: https://issues.apache.org/jira/browse/FLINK-12936
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / Planner
Reporter: Jingsong Lee
Assignee: Jingsong Lee


Now we just support intersect and minus; see ReplaceIntersectWithSemiJoinRule
and ReplaceMinusWithAntiJoinRule, which replace them with a null-aware
semi/anti join and a distinct aggregate.

We need to support intersect all and minus all too.

Presto and Spark already support them:

[https://github.com/prestodb/presto/issues/4918]

https://issues.apache.org/jira/browse/SPARK-21274

I think they have a good rewrite design and we can follow it:

1. For intersect all

Input Query
{code:java}
SELECT c1 FROM ut1 INTERSECT ALL SELECT c1 FROM ut2
{code}
Rewritten Query
{code:java}
SELECT c1
FROM (
  SELECT replicate_row(min_count, c1)
  FROM (
    SELECT c1,
           IF (vcol1_cnt > vcol2_cnt, vcol2_cnt, vcol1_cnt) AS min_count
    FROM (
      SELECT c1, count(vcol1) AS vcol1_cnt, count(vcol2) AS vcol2_cnt
      FROM (
        SELECT c1, true AS vcol1, null AS vcol2 FROM ut1
        UNION ALL
        SELECT c1, null AS vcol1, true AS vcol2 FROM ut2
      ) AS union_all
      GROUP BY c1
      HAVING vcol1_cnt >= 1 AND vcol2_cnt >= 1
    )
  )
)
{code}
2. For minus all:

Input Query
{code:java}
SELECT c1 FROM ut1 EXCEPT ALL SELECT c1 FROM ut2
{code}
Rewritten Query
{code:java}
SELECT c1
FROM (
  SELECT replicate_rows(sum_val, c1)
  FROM (
    SELECT c1, sum_val
    FROM (
      SELECT c1, sum(vcol) AS sum_val
      FROM (
        SELECT 1L AS vcol, c1 FROM ut1
        UNION ALL
        SELECT -1L AS vcol, c1 FROM ut2
      ) AS union_all
      GROUP BY union_all.c1
    )
    WHERE sum_val > 0
  )
)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12937) Introduce join reorder planner rules in blink planner

2019-06-21 Thread godfrey he (JIRA)
godfrey he created FLINK-12937:
--

 Summary: Introduce join reorder planner rules in blink planner
 Key: FLINK-12937
 URL: https://issues.apache.org/jira/browse/FLINK-12937
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / Planner
Reporter: godfrey he
Assignee: godfrey he


This issue aims to let the blink planner support join reorder.
`LoptOptimizeJoinRule` in Calcite could meet our requirement for now, so we
could use this rule directly in the blink planner. `JoinToMultiJoinRule`,
`ProjectMultiJoinMergeRule` and `FilterMultiJoinMergeRule` should also be
introduced to support `LoptOptimizeJoinRule`.

Additionally, we add a new rule named `RewriteMultiJoinConditionRule`, which
could apply transitive closure on `MultiJoin` equi-join predicates to
create more optimization possibilities.
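
For illustration, a sketch of how the named Calcite rules could be grouped into
a rule set (this assumes Calcite's standard rule instances; the actual wiring
into the blink planner may differ):

{code:java}
import org.apache.calcite.rel.rules.FilterMultiJoinMergeRule;
import org.apache.calcite.rel.rules.JoinToMultiJoinRule;
import org.apache.calcite.rel.rules.LoptOptimizeJoinRule;
import org.apache.calcite.rel.rules.ProjectMultiJoinMergeRule;
import org.apache.calcite.tools.RuleSet;
import org.apache.calcite.tools.RuleSets;

public class JoinReorderRulesSketch {

    /** First collapse joins into a MultiJoin, then let LoptOptimizeJoinRule reorder it. */
    public static RuleSet joinReorderRules() {
        return RuleSets.ofList(
            JoinToMultiJoinRule.INSTANCE,
            ProjectMultiJoinMergeRule.INSTANCE,
            FilterMultiJoinMergeRule.INSTANCE,
            LoptOptimizeJoinRule.INSTANCE);
    }
}
{code}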



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12938) Translate "Streaming Connectors" page into Chinese

2019-06-21 Thread Jark Wu (JIRA)
Jark Wu created FLINK-12938:
---

 Summary: Translate "Streaming Connectors" page into Chinese
 Key: FLINK-12938
 URL: https://issues.apache.org/jira/browse/FLINK-12938
 Project: Flink
  Issue Type: Sub-task
  Components: chinese-translation, Documentation
Reporter: Jark Wu


Translate the page
"https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/" into
Chinese.

The doc located in "flink/docs/dev/connectors/index.zh.md"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12939) Translate "Apache Kafka Connector" page into Chinese

2019-06-21 Thread Jark Wu (JIRA)
Jark Wu created FLINK-12939:
---

 Summary: Translate "Apache Kafka Connector" page into Chinese
 Key: FLINK-12939
 URL: https://issues.apache.org/jira/browse/FLINK-12939
 Project: Flink
  Issue Type: Sub-task
  Components: chinese-translation, Documentation
Reporter: Jark Wu


Translate the page
"https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/kafka.html"
into Chinese.

The doc located in "flink/docs/dev/connectors/kafka.zh.md"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12940) Translate "Apache Cassandra Connector" page into Chinese

2019-06-21 Thread Jark Wu (JIRA)
Jark Wu created FLINK-12940:
---

 Summary: Translate "Apache Cassandra Connector" page into Chinese
 Key: FLINK-12940
 URL: https://issues.apache.org/jira/browse/FLINK-12940
 Project: Flink
  Issue Type: Sub-task
  Components: chinese-translation, Documentation
Reporter: Jark Wu


Translate the internal page
"https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/cassandra.html"
into Chinese.

 

The doc located in "flink/docs/dev/connectors/cassandra.zh.md"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12941) Translate "Amazon AWS Kinesis Streams Connector" page into Chinese

2019-06-21 Thread Jark Wu (JIRA)
Jark Wu created FLINK-12941:
---

 Summary: Translate "Amazon AWS Kinesis Streams Connector" page 
into Chinese
 Key: FLINK-12941
 URL: https://issues.apache.org/jira/browse/FLINK-12941
 Project: Flink
  Issue Type: Sub-task
  Components: chinese-translation, Documentation
Reporter: Jark Wu


Translate the internal page
"https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/kinesis.html"
into Chinese.

 

The doc located in "flink/docs/dev/connectors/kinesis.zh.md"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12942) Translate "Elasticsearch Connector" page into Chinese

2019-06-21 Thread Jark Wu (JIRA)
Jark Wu created FLINK-12942:
---

 Summary: Translate "Elasticsearch Connector" page into Chinese
 Key: FLINK-12942
 URL: https://issues.apache.org/jira/browse/FLINK-12942
 Project: Flink
  Issue Type: Sub-task
  Components: chinese-translation, Documentation
Reporter: Jark Wu


Translate the internal page
"https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/elasticsearch.html"
into Chinese.

 

The doc located in "flink/docs/dev/connectors/elasticsearch.zh.md"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12943) Translate "HDFS Connector" page into Chinese

2019-06-21 Thread Jark Wu (JIRA)
Jark Wu created FLINK-12943:
---

 Summary: Translate "HDFS Connector" page into Chinese
 Key: FLINK-12943
 URL: https://issues.apache.org/jira/browse/FLINK-12943
 Project: Flink
  Issue Type: Sub-task
  Components: chinese-translation, Documentation
Reporter: Jark Wu


Translate the internal page
"https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/filesystem_sink.html"
into Chinese.

 

The doc located in "flink/docs/dev/connectors/filesystem_sink.zh.md"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12946) Translate "Apache NiFi Connector" page into Chinese

2019-06-21 Thread Jark Wu (JIRA)
Jark Wu created FLINK-12946:
---

 Summary: Translate "Apache NiFi Connector" page into Chinese
 Key: FLINK-12946
 URL: https://issues.apache.org/jira/browse/FLINK-12946
 Project: Flink
  Issue Type: Sub-task
  Components: chinese-translation, Documentation
Reporter: Jark Wu


Translate the internal page
"https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/nifi.html"
into Chinese.

The doc located in "flink/docs/dev/connectors/nifi.zh.md"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12944) Translate "Streaming File Sink" page into Chinese

2019-06-21 Thread Jark Wu (JIRA)
Jark Wu created FLINK-12944:
---

 Summary: Translate "Streaming File Sink" page into Chinese
 Key: FLINK-12944
 URL: https://issues.apache.org/jira/browse/FLINK-12944
 Project: Flink
  Issue Type: Sub-task
  Components: chinese-translation, Documentation
Reporter: Jark Wu


Translate the internal page
"https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/streamfile_sink.html"
into Chinese.

The doc located in "flink/docs/dev/connectors/streamfile_sink.zh.md"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12945) Translate "RabbitMQ Connector" page into Chinese

2019-06-21 Thread Jark Wu (JIRA)
Jark Wu created FLINK-12945:
---

 Summary: Translate "RabbitMQ Connector" page into Chinese
 Key: FLINK-12945
 URL: https://issues.apache.org/jira/browse/FLINK-12945
 Project: Flink
  Issue Type: Sub-task
  Components: chinese-translation, Documentation
Reporter: Jark Wu


Translate the internal page
"https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/rabbitmq.html"
into Chinese.

The doc located in "flink/docs/dev/connectors/rabbitmq.zh.md"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12947) Translate "Twitter Connector" page into Chinese

2019-06-21 Thread Jark Wu (JIRA)
Jark Wu created FLINK-12947:
---

 Summary: Translate "Twitter Connector" page into Chinese
 Key: FLINK-12947
 URL: https://issues.apache.org/jira/browse/FLINK-12947
 Project: Flink
  Issue Type: Sub-task
  Components: chinese-translation, Documentation
Reporter: Jark Wu


Translate the internal page
"https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/twitter.html"
into Chinese.

The doc located in "flink/docs/dev/connectors/twitter.zh.md"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)