Re: [DISCUSS] Support notifyOnMaster for notifyCheckpointComplete

2019-09-06 Thread Dian Fu
Hi Shimin,

It can be guaranteed to be an atomic operation. This is ensured by the RPC 
framework. You could take a look at RpcEndpoint for more details.

Regards,
Dian

> 在 2019年9月6日,下午2:35,shimin yang  写道:
> 
> Hi Fu,
> 
> Thank you for the reminder. I think it would work in my case as long as
> it's an atomic operation.
> 
> Dian Fu  于2019年9月6日周五 下午2:22写道:
> 
>> Hi Jingsong,
>> 
>> Thanks for bringing up this discussion. You can take a look at the
>> GlobalAggregateManager to see if it meets your requirements. It can be
>> obtained via StreamingRuntimeContext#getGlobalAggregateManager().
>> 
>> Regards,
>> Dian
>> 
>>> 在 2019年9月6日,下午1:39,shimin yang  写道:
>>> 
>>> Hi Jingsong,
>>> 
>>> Big fan of this idea. We faced the same problem and resolved it by adding
>>> a distributed lock. It would be nice to have this feature in JobMaster,
>>> which can replace the lock.
>>> 
>>> Best,
>>> Shimin
>>> 
>>> JingsongLee  于2019年9月6日周五 下午12:20写道:
>>> 
 Hi devs:
 
 I am trying to implement a streaming file sink for table[1], similar to
 StreamingFileSink. If the underlying format is a HiveFormat, or any
 format that updates visibility through a metastore, I have to update the
 metastore in notifyCheckpointComplete. However, this operation occurs on
 the task side, which leads to distributed access to the metastore and
 creates a bottleneck.
 
 So I'm curious if we can support notifyOnMaster for
 notifyCheckpointComplete like FinalizeOnMaster.
 
 What do you think?
 
 [1]
 
>> https://docs.google.com/document/d/15R3vZ1R_pAHcvJkRx_CWleXgl08WL3k_ZpnWSdzP7GY/edit?usp=sharing
 
 Best,
 Jingsong Lee
>> 
>> 

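The single-point-of-update idea behind the GlobalAggregateManager suggested above can be sketched as follows. This is a minimal, self-contained illustration, NOT Flink's actual implementation: the class name, the use of a plain `BiFunction`, and the `synchronized` keyword are all stand-ins for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Sketch of the idea behind GlobalAggregateManager: every task sends its
// update to a single JobMaster-side aggregator, which applies updates one
// at a time, so shared state such as "the highest checkpoint committed to
// the metastore" lives in one place instead of being touched by every task.
class GlobalAggregateSketch {
    private final Map<String, Long> aggregates = new HashMap<>();

    // In Flink, updates run on the JobMaster's RPC main thread, which is
    // what makes each update atomic; `synchronized` stands in for that
    // single-threaded execution here.
    synchronized long updateGlobalAggregate(String name, long value,
                                            BiFunction<Long, Long, Long> agg) {
        long updated = agg.apply(aggregates.getOrDefault(name, 0L), value);
        aggregates.put(name, updated);
        return updated;
    }
}
```

In real Flink code the manager is obtained via StreamingRuntimeContext#getGlobalAggregateManager() and takes an AggregateFunction rather than a BiFunction.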


Re: [DISCUSS] Support notifyOnMaster for notifyCheckpointComplete

2019-09-06 Thread shimin yang
Hi Fu,

That'll be nice.

Thanks.

Best,
Shimin

Dian Fu  于2019年9月6日周五 下午3:17写道:

> Hi Shimin,
>
> It can be guaranteed to be an atomic operation. This is ensured by the RPC
> framework. You could take a look at RpcEndpoint for more details.
>
> Regards,
> Dian


Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-06 Thread Dawid Wysakowicz
Hi Xuefu,

Thank you for your answers.

Let me summarize my understanding. In principle we differ only in whether
a temporary function should be identified by a 1-part or a 3-part name. I
can reconfirm that if the community decides it prefers the 1-part
approach, I will commit to that, with the assumption that we will enforce
ONLY 1-part function names (i.e., we will parse the identifier and throw
an exception if a user tries to register e.g. db.temp_func).

My preference, though, is the 3-part approach:

  * there are some functions that it makes no sense to override, e.g.
    CAST; moreover, I'm afraid that allowing such overrides will lead to
    high inconsistency, similar to the Spark issues I mentioned
  * you cannot shadow a fully-qualified function (if a user fully
    qualifies his/her objects in a SQL query, which is often considered
    a good practice)
  * it does not differentiate between functions & temporary functions;
    temporary functions differ only in their life-cycle, while
    registration & usage are exactly the same

> As it can be seen, the proposed concept regarding temp function and
> function resolution is quite simple.

Both approaches are equally simple. I would even say the 3-part approach
is slightly simpler as it does not have to care about some special
built-in functions such as CAST.

I don't want to express my opinion on the differentiation between
built-in functions and "external" built-in functions in this thread, as
it is rather orthogonal, but I also like the modular approach, and I
definitely don't like the special syntax "cat::function". I think it's
better to stick to a standard, or at least to proven solutions from
other systems.

Best,

Dawid
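The 1-part resolution order quoted below (temporary functions first, then built-in functions, then functions in the current catalog) can be sketched in a few lines. This is a hypothetical illustration; the names and types are not Flink's actual FunctionCatalog API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Stream;

// Illustrative sketch of the 1-part lookup order under discussion:
// temporary -> built-in -> catalog functions (current catalog/db).
class FunctionResolver {
    final Map<String, String> temporary = new LinkedHashMap<>();
    final Map<String, String> builtIn = new LinkedHashMap<>();
    final Map<String, String> catalog = new LinkedHashMap<>();

    // Only simple (1-part) names are ambiguous and need this order; a
    // fully qualified cat.db.func reference bypasses it entirely, which
    // is why a 1-part temp function cannot shadow a fully qualified one.
    Optional<String> resolve(String name) {
        return Stream.of(temporary, builtIn, catalog)
                .filter(m -> m.containsKey(name))
                .map(m -> m.get(name))
                .findFirst();
    }
}
```

With this order, registering a temporary function named like a built-in one shadows the built-in for simple-name references, which is exactly the behavior the two proposals disagree about.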

On 05/09/2019 10:12, Xuefu Z wrote:
> Hi Dawid,
>
> Thanks for sharing your thoughts and requests for clarification. I believe
> that I fully understood your proposal, which does have its merit. However,
> it's different from ours. Here are the answers to your questions:
>
> Re #1: yes, the temp functions in the proposal are global and have just
> one-part names, similar to built-in functions. Two- or three-part names are
> not allowed.
>
> Re #2: not applicable as two- or three-part names are disallowed.
>
> Re #3: same as above. Referencing external built-in functions is achieved
> either implicitly (only the built-in functions in the current catalog are
> considered) or via special syntax such as cat::function. However, we are
> looking into the modular approach that Timo suggested, along with other
> feedback received from the community.
>
> Re #4: the resolution order goes like the following in our proposal:
>
> 1. temporary functions
> 2. built-in functions (including those augmented by add-on modules)
> 3. built-in functions in current catalog (this will not be needed if the
> special syntax "cat::function" is required)
> 4. functions in current catalog and db.
>
> If we go with the modular approach and make external built-in functions as
> an add-on module, the 2 and 3 above will be combined. In essence, the
> resolution order is equivalent in the two approaches.
>
> By the way, resolution order matters only for simple name references. For
> names such as db.function (interpreted as current_cat/db/function) or
> cat.db.function, the reference is unambiguous, so no resolution is needed.
>
> As it can be seen, the proposed concept regarding temp function and
> function resolution is quite simple. Additionally, the proposed resolution
> order allows temp function to shadow a built-in function, which is
> important (though not decisive) in our opinion.
>
> I started liking the modular approach as the resolution order will only
> include 1, 2, and 4, which is simpler and more generic. That's why I
> suggested we look more into this direction.
>
> Please let me know if there are further questions.
>
> Thanks,
> Xuefu
>
>
>
>
> On Thu, Sep 5, 2019 at 2:42 PM Dawid Wysakowicz 
> wrote:
>
>> Hi Xuefu,
>>
>> Just wanted to summarize my opinion on the one topic (temporary functions).
>>
>> My preference would be to make temporary functions always 3-part qualified
>> (as a result that would prohibit overriding built-in functions). Having
>> said that if the community decides that it's better to allow overriding
>> built-in functions I am fine with it and can commit to that decision.
>>
>> I wanted to ask if you could clarify a few points for me around that
>> option.
>>
>> 1. Would you enforce temporary functions to always have just a single
>>    name (without db & cat), as Hive does, or would you also allow 3- or
>>    even 2-part identifiers?
>> 2. Assuming 2/3-part paths: how would you register a function from the
>>    following statement: CREATE TEMPORARY FUNCTION db.func? Would that
>>    shadow all functions named 'func' in all databases named 'db' in all
>>    catalogs, or only the function 'func' in database 'db' in the current
>>    catalog?
>> 3. This point is still under discussion, but was mentioned a few
>>

[jira] [Created] (FLINK-13987) add new logs api, see more log files and can see logs by pages

2019-09-06 Thread lining (Jira)
lining created FLINK-13987:
--

 Summary: add new logs api, see more log files and can see logs by 
pages 
 Key: FLINK-13987
 URL: https://issues.apache.org/jira/browse/FLINK-13987
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / REST
Reporter: lining


As log files grow large, the current log API often blocks or doesn't work,
which is unfriendly for users. Since the application runs on a JVM, users
sometimes also need to see GC logs. Therefore, we need a new API.
 * /taskmanagers/taskmanagerid/logs -> list all log files
 * /taskmanagers/taskmanagerid/logs/:filename?start=[start]&count=[count]
 * /jobmanager/logs -> list all log files
 * /jobmanager/logs/:filename?start=[start]&count=[count]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table module

2019-09-06 Thread Xuefu Z
Thanks to Dawid for starting the discussion and for the writeup. It looks
pretty good to me, except that I'm a little concerned about the object
references and string parsing in the code, which seem to be an
anti-pattern in OOP. Have we considered using ObjectIdentifier with
optional catalog and db parts, especially if we are worried about
variable-length arguments or method overloading? It's quite likely that
the result of the string parsing is an ObjectIdentifier instance anyway.

Having string parsing logic in the code is a little dangerous, as it
duplicates part of the DDL/DML parsing, and the two can easily get out of
sync.

Thanks,
Xuefu

On Fri, Sep 6, 2019 at 1:57 PM JingsongLee 
wrote:

> Thanks Dawid, +1 for this approach.
>
> One concern is the removal of registerTableSink & registerTableSource
> in TableEnvironment. There are two alternatives:
> 1. the properties approach (DDL, descriptor).
> 2. from/toDataStream.
>
> #1 can only express properties, not Java state, and for some connectors
> it is difficult to convert all of their state to properties.
> #2 can contain Java state, but cannot use TableSource-related features
> such as projection & filter push-down, partition support, etc.
>
> Any idea about this?
>
> Best,
> Jingsong Lee
>
>
> --
> From:Dawid Wysakowicz 
> Send Time:2019年9月4日(星期三) 22:20
> To:dev 
> Subject:[DISCUSS] FLIP-64: Support for Temporary Objects in Table module
>
> Hi all,
> As part of FLIP-30 a Catalog API was introduced that enables storing table
> meta objects permanently. At the same time the majority of current APIs
> create temporary objects that cannot be serialized. We should clarify the
> creation of meta objects (tables, views, functions) in a unified way.
> Another current problem in the API is that all the temporary objects are
> stored in a special built-in catalog, which is not very intuitive for many
> users, as they must be aware of that catalog to reference temporary objects.
> Lastly, different APIs have different ways of providing object paths:
>
> * String path…,
> * String path, String pathContinued…
> * String name
> We should choose one approach and unify it across all APIs.
> I suggest a FLIP to address the above issues.
> Looking forward to your opinions.
> FLIP link:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
>
>

-- 
Xuefu Zhang

"In Honey We Trust!"


Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table module

2019-09-06 Thread Dawid Wysakowicz
Hi all,

@Jingsong Could you elaborate a bit more on what you mean by

"some Connectors are difficult to convert all states to properties"

All the Flink-provided connectors will definitely be expressible with
properties (in the end you should be able to use them from DDL). I think
a TableSource that is complex enough to handle filter push-down,
partition support, etc. should rather be made available both from DDL &
java/scala code. I'm happy to reconsider adding
registerTemporaryTable(String path, TableSource source) if you have some
concrete examples in mind.


@Xuefu: We also considered the ObjectIdentifier (or actually introducing
a new identifier representation to differentiate between resolved and
unresolved identifiers) with the same concerns. We decided to suggest
the string & parsing logic because of usability.

    tEnv.from("cat.db.table")

is shorter and easier to write than

    tEnv.from(Identifier.for("cat", "db", "name"))

It also implicitly avoids the problem of what happens if a user (e.g.,
one used to other systems) calls the API in the following manner:

    tEnv.from(Identifier.for("db.name"))

I'm happy to revisit it if the general consensus is that it's better to
use the OO approach.

Best,

Dawid
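The usability trade-off discussed above hinges on the string form needing a parser. A hypothetical sketch (not Flink's actual parser) of what that parsing entails: a dotted path is split and missing parts are filled in from the current catalog/database. Quoting and escaping are ignored here, which is exactly the part that risks duplicating the DDL/DML parser in the real system.

```java
// Hypothetical path parser: "tbl" -> [curCat, curDb, tbl],
// "db.tbl" -> [curCat, db, tbl], "cat.db.tbl" -> [cat, db, tbl].
class PathParser {
    static String[] parse(String path, String currentCatalog, String currentDb) {
        String[] parts = path.split("\\.");
        switch (parts.length) {
            case 1: return new String[]{currentCatalog, currentDb, parts[0]};
            case 2: return new String[]{currentCatalog, parts[0], parts[1]};
            case 3: return parts;
            default: throw new IllegalArgumentException("Invalid path: " + path);
        }
    }
}
```

An explicit Identifier-based API would skip this step entirely, at the cost of the more verbose call sites shown above.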

On 06/09/2019 10:00, Xuefu Z wrote:
> Thanks to Dawid for starting the discussion and writeup. It looks pretty
> good to me except that I'm a little concerned about the object reference
> and string parsing in the code, which seems to an anti-pattern to OOP. Have
> we considered using ObjectIdenitifier with optional catalog and db parts,
> esp. if we are worried about arguments of variable length or method
> overloading? It's quite likely that the result of string parsing is an
> ObjectIdentifier instance any way.
>
> Having string parsing logic in the code is a little dangerous as it
> duplicates part of the DDL/DML parsing, and they can easily get out of sync.
>
> Thanks,
> Xuefu




Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-06 Thread Xuefu Z
Hi Dawid,

Thank you for your summary. While the only difference between the two
proposals is one-part vs. three-part naming, the consequences would be
substantial.

To me, there are two major use cases for temporary functions compared to
persistent ones:
1. They are temporary in nature and auto-managed by the session. More
often than not, the admin doesn't even allow users to create persistent
functions.
2. They provide an opportunity to override system built-in functions.

Since built-in functions have one-part names, requiring three-part names
for temporary functions eliminates the overriding opportunity.

One-part naming essentially puts all temp functions under a single
namespace and simplifies function resolution; for example, we don't need
to consider the case of a temp function and a persistent function with
the same name in the same database.

I agree having three parts does have its merits, such as consistency with
other temporary objects (tables) and a minimal difference between temp
and catalog functions. However, there is a slight difference between
tables and functions in that there is no built-in table in SQL, so there
is no need to override one.

I'm not sure I fully agree with the benefits you listed as advantages of
the three-part naming of temp functions:
  -- Allowing overriding of built-in functions is a benefit, and the
solution for disallowing certain overrides shouldn't be to ban overriding
entirely.
  -- Catalog functions are defined by users, and we assume they can
drop/alter them in any way they want. Thus, overriding a catalog function
doesn't seem to be a strong use case that we should be concerned about.
Rather, there are known use cases for overriding built-in functions.

Thus, I would personally prefer one-part names for temporary functions.
In the absence of a SQL standard on this, I would certainly like to get
opinions from others to see if a consensus can eventually be reached.

(To your point on modular approach to support external built-in functions,
we saw the value and are actively looking into it. Thanks for sharing your
opinion on that.)

Thanks,
Xuefu

On Fri, Sep 6, 2019 at 3:48 PM Dawid Wysakowicz 
wrote:

> Hi Xuefu,
>
> Thank you for your answers.
>
> Let me summarize my understanding. In principle we differ only in regards
> to the fact if a temporary function can be only 1-part or only 3-part
> identified. I can reconfirm that if the community decides it prefers the
> 1-part approach I will commit to that, with the assumption that we will
> force ONLY 1-part function names. (We will parse identifier and throw
> exception if a user tries to register e.g. db.temp_func).
>
> My preference is though the 3-part approach:
>
>- there are some functions that it makes no sense to override, e.g.
>CAST, moreover I'm afraid that allowing overriding such will lead to high
>inconsistency, similar to those that I mentioned spark has
>- you cannot shadow a fully-qualified function. (If a user fully
>qualifies his/her objects in a SQL query, which is often considered a good
>practice)
>- it does not differentiate between functions & temporary functions.
>Temporary functions just differ with regards to their life-cycle. The
>registration & usage is exactly the same.
>
> As it can be seen, the proposed concept regarding temp function and
> function resolution is quite simple.
>
> Both approaches are equally simple. I would even say the 3-part approach
> is slightly simpler as it does not have to care about some special built-in
> functions such as CAST.
>
> I don't want to express my opinion on the differentiation between built-in
> functions and "external" built-in functions in this thread as it is rather
> orthogonal, but I also like the modular approach and I definitely don't
> like the special syntax "cat::function". I think it's better to stick to a
> standard or at least other proved solutions from other systems.
>
> Best,
>
> Dawid

Re: [jira] [Created] (FLINK-13987) add new logs api, see more log files and can see logs by pages

2019-09-06 Thread lining jing
Hi folks, I have updated the description. This is the new version:

As the job runs, the log files become large, and the current log API
often blocks or doesn't work, which is unfriendly for users. Since the
application runs on a JVM, users sometimes also need to see GC logs.
Given all that, I list the new APIs below:

   - list all taskmanager log files:
     /taskmanagers/taskmanagerid/logs

     {
       "logs": [
         {
           "name": "taskmanager.log",
           "size": 12529
         }
       ]
     }

   - read a taskmanager log file by range:
     /taskmanagers/taskmanagerid/logs/:filename?start=[start]&count=[count]

     {
       "data": "logcontent",
       "file_size": 342882
     }

   - list all jobmanager log files:
     /jobmanager/logs

     {
       "logs": [
         {
           "name": "jobmanager.log",
           "size": 12529
         }
       ]
     }

   - read a jobmanager log file by range:
     /jobmanager/logs/:filename?start=[start]&count=[count]

     {
       "data": "logcontent",
       "file_size": 342882
     }


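The proposed paginated endpoints above can be exercised by building URLs like the following. This is a hedged sketch: the paths and the start/count query parameters mirror the proposal in this thread, and none of this is an existing Flink REST API.

```java
// Builds the paginated log-file URLs from the proposal above.
class LogUrls {
    static String taskManagerLog(String base, String taskManagerId,
                                 String filename, long start, long count) {
        return base + "/taskmanagers/" + taskManagerId + "/logs/" + filename
                + "?start=" + start + "&count=" + count;
    }

    static String jobManagerLog(String base, String filename,
                                long start, long count) {
        return base + "/jobmanager/logs/" + filename
                + "?start=" + start + "&count=" + count;
    }
}
```

A client would page through a large file by advancing `start` by `count` on each request rather than fetching the whole file at once.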

lining (Jira)  于2019年9月6日周五 下午3:57写道:

> lining created FLINK-13987:
> --
>
>  Summary: add new logs api, see more log files and can see
> logs by pages
>  Key: FLINK-13987
>  URL: https://issues.apache.org/jira/browse/FLINK-13987
>  Project: Flink
>   Issue Type: New Feature
>   Components: Runtime / REST
> Reporter: lining
>
>
> As log files becoming more large, current log api often blocks or don't
> work. It's unfriendly for user. As application runs on jvm, sometime user
> need see log of gc. Above all, we need new api.
>  * /taskmanagers/taskmanagerid/logs -> list all log file
>  * /taskmanagers/taskmanagerid/logs/:filename?start=[start]&count=[count]
>  * /jobmanager/logs -> list all log file
>  * /jobmanager/logs/:filename?start=[start]&count=[count]
>
>
>
> --
> This message was sent by Atlassian Jira
> (v8.3.2#803003)
>


Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-06 Thread Dawid Wysakowicz
I agree the consequences of the decision are substantial. Let's see what
others think.

> -- Catalog functions are defined by users, and we suppose they can
> drop/alter it in any way they want. Thus, overwriting a catalog function
> doesn't seem to be a strong use case that we should be concerned about.
> Rather, there are known use case for overwriting built-in functions.

Users are not always in full control of the catalog functions. There is
also the case where different teams manage the catalog and use the
catalog. As for overriding built-in functions, with the 3-part approach a
user can always use an equally named function from a catalog. E.g., to
override

    SELECT explode(arr) FROM ...

a user can always write:

    SELECT db.explode(arr) FROM ...

Best,

Dawid

On 06/09/2019 10:54, Xuefu Z wrote:
> Hi Dawid,
>
> Thank you for your summary. While the only difference in the two proposals
> is one- or three-part in naming, the consequence would be substantial.
>
> To me, there are two major use cases of temporary functions compared to
> persistent ones:
> 1. Temporary in nature and auto managed by the session. More often than
> not, admin doesn't even allow user to create persistent functions.
> 2. Provide an opportunity to overwriting system built-in functions.
>
> Since built-in functions has one-part name, requiring three-part name for
> temporary functions eliminates the overwriting opportunity.
>
> One-part naming essentially puts all temp functions under a single
> namespace and simplifies function resolution, such as we don't need to
> consider the case of a temp function and a persistent function with the
> same name under the same database.
>
> I agree having three-parts does have its merits, such as consistency with
> other temporary objects (table) and minor difference between temp vs
> catalog functions. However, there is a slight difference between tables and
> function in that there is no built-in table in SQL so there is no need to
> overwrite it.
>
> I'm not sure if I fully agree the benefits you listed as the advantages of
> the three-part naming of temp functions.
>   -- Allowing overwriting built-in functions is a benefit and the solution
> for disallowing certain overwriting shouldn't be totally banning it.
>   -- Catalog functions are defined by users, and we suppose they can
> drop/alter it in any way they want. Thus, overwriting a catalog function
> doesn't seem to be a strong use case that we should be concerned about.
> Rather, there are known use case for overwriting built-in functions.
>
> Thus, personally I would prefer one-part name for temporary functions. In
> lack of SQL standard on this, I certainly like to get opinions from others
> to see if a consensus can be eventually reached.
>
> (To your point on modular approach to support external built-in functions,
> we saw the value and are actively looking into it. Thanks for sharing your
> opinion on that.)
>
> Thanks,
> Xuefu
>

Re: [VOTE] FLIP-61 Simplify Flink's cluster level RestartStrategy configuration

2019-09-06 Thread Till Rohrmann
Thanks a lot for voting. I'm closing the vote now. The vote received:

* +1 votes (3 binding; 3 non-binding):
- Chesnay (binding)
- Till (binding)
- Zhijiang (binding)
- Zhu Zhu
- Zili Chen
- Vino Yang

* 0/-1 votes: none

Thereby, the community has accepted FLIP-61. I will update the FLIP wiki
page accordingly.

Cheers,
Till


On Thu, Sep 5, 2019 at 8:42 AM vino yang  wrote:

> +1 (non-binding)
>
> Zili Chen  于2019年9月5日周四 上午10:55写道:
>
> > +1
> >
> >
> > zhijiang  于2019年9月5日周四 上午12:36写道:
> >
> > > +1
> > > --
> > > From:Till Rohrmann 
> > > Send Time:2019年9月4日(星期三) 13:39
> > > To:dev 
> > > Cc:Zhu Zhu 
> > > Subject:Re: [VOTE] FLIP-61 Simplify Flink's cluster level
> RestartStrategy
> > > configuration
> > >
> > > +1 (binding)
> > >
> > > On Wed, Sep 4, 2019 at 12:39 PM Chesnay Schepler 
> > > wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > On 04/09/2019 11:13, Zhu Zhu wrote:
> > > > > +1 (non-binding)
> > > > >
> > > > > Thanks,
> > > > > Zhu Zhu
> > > > >
> > > > > Till Rohrmann  于2019年9月4日周三 下午5:05写道:
> > > > >
> > > > >> Hi everyone,
> > > > >>
> > > > >> I would like to start the voting process for FLIP-61 [1], which is
> > > > >> discussed and reached consensus in this thread [2].
> > > > >>
> > > > >> Since the change is rather small I'd like to shorten the voting
> > period
> > > > to
> > > > >> 48 hours. Hence, I'll try to close it September 6th, 11:00 am CET,
> > > > unless
> > > > >> there is an objection or not enough votes.
> > > > >>
> > > > >> [1]
> > > > >>
> > > > >>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-61+Simplify+Flink%27s+cluster+level+RestartStrategy+configuration
> > > > >> [2]
> > > > >>
> > > > >>
> > > >
> > >
> >
> https://lists.apache.org/thread.html/e206390127bcbd9b24d9c41a838faa75157e468e01552ad241e3e24b@%3Cdev.flink.apache.org%3E
> > > > >>
> > > > >> Cheers,
> > > > >> Till
> > > > >>
> > > >
> > > >
> > >
> > >
> >
>


Re: [VOTE] FLIP-62: Set default restart delay for FixedDelay- and FailureRateRestartStrategy to 1s

2019-09-06 Thread Till Rohrmann
Thanks a lot for voting. I'm closing the vote now. The vote received:

* +1 votes (4 binding, 5 non-binding):
- Zhu Zhu
- Jingsong
- Chesnay (binding)
- Till (binding)
- David
- Jark (binding)
- Zhijiang (binding)
- Yu
- Vino Yang

* 0/-1 votes: none

Thereby, the community accepted FLIP-62. I'll update the FLIP wiki page
accordingly.

Cheers,
Till

On Fri, Sep 6, 2019 at 4:09 AM vino yang  wrote:

> +1 (non-binding)
>
> Best,
> Vino
>
> Yu Li  于2019年9月6日周五 上午2:13写道:
>
> > +1 (non-binding)
> >
> > Best Regards,
> > Yu
> >
> >
> > On Thu, 5 Sep 2019 at 00:23, zhijiang  .invalid>
> > wrote:
> >
> > > +1
> > >
> > > Best,
> > > Zhijiang
> > > --
> > > From:Jark Wu 
> > > Send Time:2019年9月4日(星期三) 13:45
> > > To:dev 
> > > Subject:Re: [VOTE] FLIP-62: Set default restart delay for FixedDelay-
> and
> > > FailureRateRestartStrategy to 1s
> > >
> > > +1
> > >
> > > Best,
> > > Jark
> > >
> > > > 在 2019年9月4日,19:43,David Morávek  写道:
> > > >
> > > > +1
> > > >
> > > > On Wed, Sep 4, 2019 at 1:38 PM Till Rohrmann 
> > > wrote:
> > > >
> > > >> +1 (binding)
> > > >>
> > > >> On Wed, Sep 4, 2019 at 12:43 PM Chesnay Schepler <
> ches...@apache.org>
> > > >> wrote:
> > > >>
> > > >>> +1 (binding)
> > > >>>
> > > >>> On 04/09/2019 11:18, JingsongLee wrote:
> > >  +1 (non-binding)
> > > 
> > >  default 0 is really not user production friendly.
> > > 
> > >  Best,
> > >  Jingsong Lee
> > > 
> > > 
> > >  --
> > >  From:Zhu Zhu 
> > >  Send Time:2019年9月4日(星期三) 17:13
> > >  To:dev 
> > >  Subject:Re: [VOTE] FLIP-62: Set default restart delay for
> > FixedDelay-
> > > >>> and FailureRateRestartStrategy to 1s
> > > 
> > >  +1 (non-binding)
> > > 
> > >  Thanks,
> > >  Zhu Zhu
> > > 
> > >  Till Rohrmann  于2019年9月4日周三 下午5:06写道:
> > > 
> > > > Hi everyone,
> > > >
> > > > I would like to start the voting process for FLIP-62 [1], which
> > > > is discussed and reached consensus in this thread [2].
> > > >
> > > > Since the change is rather small I'd like to shorten the voting
> > > period
> > > >>> to
> > > > 48 hours. Hence, I'll try to close it September 6th, 11:00 am
> CET,
> > > >>> unless
> > > > there is an objection or not enough votes.
> > > >
> > > > [1]
> > > >
> > > >
> > > >>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-62%3A+Set+default+restart+delay+for+FixedDelay-+and+FailureRateRestartStrategy+to+1s
> > > > [2]
> > > >
> > > >
> > > >>>
> > > >>
> > >
> >
> https://lists.apache.org/thread.html/9602b342602a0181fcb618581f3b12e692ed2fad98c59fd6c1caeabd@%3Cdev.flink.apache.org%3E
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > >>>
> > > >>>
> > > >>
> > >
> > >
> >
>


Is Flink documentation deployment script broken ?

2019-09-06 Thread Jark Wu
Hi all,

I merged several documentation pull requests[1][2][3] days ago.
AFAIK, the documentation deployment is scheduled every day.
However, the changes are still not available on the Flink doc website[4].
The same goes for Till's PR[5], merged 3 days ago.


Best,
Jark

[1]: https://github.com/apache/flink/pull/9545
[2]: https://github.com/apache/flink/pull/9511
[3]: https://github.com/apache/flink/pull/9525
[4]: https://ci.apache.org/projects/flink/flink-docs-master/
[5]: https://github.com/apache/flink/pull/9571


Re: Is Flink documentation deployment script broken ?

2019-09-06 Thread Chesnay Schepler

The scripts are fine, but the buildbot slave is currently down.

I've already opened a ticket with INFRA: 
https://issues.apache.org/jira/browse/INFRA-18986


On 06/09/2019 11:44, Jark Wu wrote:

Hi all,

I merged several documentation pull requests[1][2][3] days ago.
AFAIK, the documentation deployment is scheduled every day.
However, I didn't see the changes are available in the Flink doc website[4]
until now.
The same to Till's PR[5] merged 3 days ago.


Best,
Jark

[1]: https://github.com/apache/flink/pull/9545
[2]: https://github.com/apache/flink/pull/9511
[3]: https://github.com/apache/flink/pull/9525
[4]: https://ci.apache.org/projects/flink/flink-docs-master/
[5]: https://github.com/apache/flink/pull/9571





[jira] [Created] (FLINK-13988) Remove legacy JobManagerMode

2019-09-06 Thread TisonKun (Jira)
TisonKun created FLINK-13988:


 Summary: Remove legacy JobManagerMode
 Key: FLINK-13988
 URL: https://issues.apache.org/jira/browse/FLINK-13988
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: TisonKun
 Fix For: 1.10.0


Indeed it belongs to the pre-FLIP-6 framework.

Also remove its usage in {{JobManagerCliOptions}} and then the unused 
{{JobManagerCliOptions}} itself.

cc [~till.rohrmann]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] Flink Python User-Defined Function for Table API

2019-09-06 Thread Aljoscha Krettek
Hi,

Regarding stateful functions and MapView/DataView/ListView: I think it’s best 
to keep that for a later FLIP and focus on a more basic version. Supporting 
stateful functions, especially with MapView can potentially be very slow so we 
have to see what we can do there.

For the method names, I don’t know. If FLIP-64 passes they have to be changed. 
So we could use the final names right away, but I’m also fine with using the 
old method names for now.

Best,
Aljoscha

> On 5. Sep 2019, at 12:40, jincheng sun  wrote:
> 
> Hi Aljoscha,
> 
> Thanks for your comments!
> 
> Regarding to the FLIP scope, it seems that we have agreed on the design of
> the stateless function support.
> What do you think about starting the development of the stateless function
> support firstly and continue the discussion of stateful function support?
> Or you think we should split the current FLIP into two FLIPs and discuss
> the stateful function support in another thread?
> 
> Currently, the Python DataView/MapView/ListView interfaces design follow
> the Java/Scala naming conversions.
> Of couse, We can continue to discuss whether there are better solutions,
> i.e. using annotations.
> 
> Regarding to the magic logic to support DataView/MapView/ListView, it will
> be done by the framework and is transparent for users.
> Per my understanding, the magic logic is unavoidable no matter what the
> interfaces will be.
> 
> Regarding to the catalog support of python function:1) If it's stored in
> memory as temporary object, just as you said, users can call
> TableEnvironment.register_function(will change to
> register_temporary_function in FLIP-64)
> 2) If it's persisted in external storage, users can call
> Catalog.create_function. There will be no API change per my understanding.
> 
> What do you think?
> Best,Jincheng
> 
> Aljoscha Krettek  于2019年9月5日周四 下午5:32写道:
> 
>> Hi,
>> 
>> Another thing to consider is the Scope of the FLIP. Currently, we try to
>> support (stateful) AggregateFunctions. I have some concerns about whether
>> or not DataView/MapView/ListView is a good interface because it requires
>> quite some magic from the runners to make it work, such as messing with the
>> TypeInformation and injecting objects at runtime. If the FLIP aims for the
>> minimum of ScalarFunctions and the whole execution harness, that should be
>> easier to agree on.
>> 
>> Another point is the naming of the new methods. I think Timo hinted at the
>> fact that we have to consider catalog support for functions. There is
>> ongoing work about differentiating between temporary objects and objects
>> that are stored in a catalog (FLIP-64 [1]). With this in mind, the method
>> for registering functions should be called register_temporary_function()
>> and so on. Unless we want to already think about mixing Python and Java
>> functions in the catalog, which is outside the scope of this FLIP, I think.
>> 
>> Best,
>> Aljoscha
>> 
>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
>> 
>> 
>>> On 5. Sep 2019, at 05:01, jincheng sun  wrote:
>>> 
>>> Hi Aljoscha,
>>> 
>>> That's a good points, so far, most of the code will live in flink-python
>>> module, and the rules and relNodes will be put into the both blink and
>>> flink planner modules, some of the common interface of required by
>> planners
>>> will be placed in flink-table-common. I think you are right, we should
>> try
>>> to ensure the changes of this feature is minimal.  For more detail we
>> would
>>> follow this principle when review the PRs.
>>> 
>>> Great thanks for your questions and remind!
>>> 
>>> Best,
>>> Jincheng
>>> 
>>> 
>>> Aljoscha Krettek  于2019年9月4日周三 下午8:58写道:
>>> 
 Hi,
 
 Things looks interesting so far!
 
 I had one question: Where will most of the support code for this live?
 Will this add the required code to flink-table-common or the different
 runners? Can we implement this in such a way that only a minimal amount
>> of
 support code is required in the parts of the Table API (and Table API
 runners) that  are not python specific?
 
 Best,
 Aljoscha
 
> On 4. Sep 2019, at 14:14, Timo Walther  wrote:
> 
> Hi Jincheng,
> 
> 2. Serializability of functions: "#2 is very convenient for users"
>> means
 only until they have the first backwards-compatibility issue, after that
 they will find it not so convinient anymore and will ask why the
>> framework
 allowed storing such objects in a persistent storage. I don't want to be
 picky about it, but wanted to raise awareness that sometimes it is ok to
 limit use cases to guide users for devloping backwards-compatible
>> programs.
> 
> Thanks for the explanation fo the remaining items. It sounds reasonable
 to me. Regarding the example with `getKind()`, I actually meant
 `org.apache.flink.table.functions.ScalarFunction#getKind` we don't allow
 users to 

[jira] [Created] (FLINK-13989) Remove legacy ClassloadingProps

2019-09-06 Thread TisonKun (Jira)
TisonKun created FLINK-13989:


 Summary: Remove legacy ClassloadingProps
 Key: FLINK-13989
 URL: https://issues.apache.org/jira/browse/FLINK-13989
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: TisonKun
 Fix For: 1.10.0


{{ClassloadingProps}} was only used by the legacy {{JobManager}}, so it can be removed as dead code.

[~till.rohrmann]





Re: [DISCUSS] Flink Python User-Defined Function for Table API

2019-09-06 Thread jincheng sun
Hi,

Sure, to ensure the 1.10 release of Flink, let's split the FLIPs and have
FLIP-58 cover only the stateless part.

Cheers,
Jincheng

Aljoscha Krettek  于2019年9月6日周五 下午5:53写道:

> Hi,
>
> Regarding stateful functions and MapView/DataView/ListView: I think it’s
> best to keep that for a later FLIP and focus on a more basic version.
> Supporting stateful functions, especially with MapView can potentially be
> very slow so we have to see what we can do there.
>
> For the method names, I don’t know. If FLIP-64 passes they have to be
> changed. So we could use the final names right away, but I’m also fine with
> using the old method names for now.
>
> Best,
> Aljoscha
>
> > On 5. Sep 2019, at 12:40, jincheng sun  wrote:
> >
> > Hi Aljoscha,
> >
> > Thanks for your comments!
> >
> > Regarding to the FLIP scope, it seems that we have agreed on the design
> of
> > the stateless function support.
> > What do you think about starting the development of the stateless
> function
> > support firstly and continue the discussion of stateful function support?
> > Or you think we should split the current FLIP into two FLIPs and discuss
> > the stateful function support in another thread?
> >
> > Currently, the Python DataView/MapView/ListView interfaces design follow
> > the Java/Scala naming conversions.
> > Of couse, We can continue to discuss whether there are better solutions,
> > i.e. using annotations.
> >
> > Regarding to the magic logic to support DataView/MapView/ListView, it
> will
> > be done by the framework and is transparent for users.
> > Per my understanding, the magic logic is unavoidable no matter what the
> > interfaces will be.
> >
> > Regarding to the catalog support of python function:1) If it's stored in
> > memory as temporary object, just as you said, users can call
> > TableEnvironment.register_function(will change to
> > register_temporary_function in FLIP-64)
> > 2) If it's persisted in external storage, users can call
> > Catalog.create_function. There will be no API change per my
> understanding.
> >
> > What do you think?
> > Best,Jincheng
> >
> > Aljoscha Krettek  于2019年9月5日周四 下午5:32写道:
> >
> >> Hi,
> >>
> >> Another thing to consider is the Scope of the FLIP. Currently, we try to
> >> support (stateful) AggregateFunctions. I have some concerns about
> whether
> >> or not DataView/MapView/ListView is a good interface because it requires
> >> quite some magic from the runners to make it work, such as messing with
> the
> >> TypeInformation and injecting objects at runtime. If the FLIP aims for
> the
> >> minimum of ScalarFunctions and the whole execution harness, that should
> be
> >> easier to agree on.
> >>
> >> Another point is the naming of the new methods. I think Timo hinted at
> the
> >> fact that we have to consider catalog support for functions. There is
> >> ongoing work about differentiating between temporary objects and objects
> >> that are stored in a catalog (FLIP-64 [1]). With this in mind, the
> method
> >> for registering functions should be called register_temporary_function()
> >> and so on. Unless we want to already think about mixing Python and Java
> >> functions in the catalog, which is outside the scope of this FLIP, I
> think.
> >>
> >> Best,
> >> Aljoscha
> >>
> >> [1]
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
> >>
> >>
> >>> On 5. Sep 2019, at 05:01, jincheng sun 
> wrote:
> >>>
> >>> Hi Aljoscha,
> >>>
> >>> That's a good points, so far, most of the code will live in
> flink-python
> >>> module, and the rules and relNodes will be put into the both blink and
> >>> flink planner modules, some of the common interface of required by
> >> planners
> >>> will be placed in flink-table-common. I think you are right, we should
> >> try
> >>> to ensure the changes of this feature is minimal.  For more detail we
> >> would
> >>> follow this principle when review the PRs.
> >>>
> >>> Great thanks for your questions and remind!
> >>>
> >>> Best,
> >>> Jincheng
> >>>
> >>>
> >>> Aljoscha Krettek  于2019年9月4日周三 下午8:58写道:
> >>>
>  Hi,
> 
>  Things looks interesting so far!
> 
>  I had one question: Where will most of the support code for this live?
>  Will this add the required code to flink-table-common or the different
>  runners? Can we implement this in such a way that only a minimal
> amount
> >> of
>  support code is required in the parts of the Table API (and Table API
>  runners) that  are not python specific?
> 
>  Best,
>  Aljoscha
> 
> > On 4. Sep 2019, at 14:14, Timo Walther  wrote:
> >
> > Hi Jincheng,
> >
> > 2. Serializability of functions: "#2 is very convenient for users"
> >> means
>  only until they have the first backwards-compatibility issue, after
> that
>  they will find it not so convinient anymore and will ask why the
> >> framework
>  allowed storing such objects in a persistent storage. I don't want to
> b

[jira] [Created] (FLINK-13990) Remove JobModificationException

2019-09-06 Thread TisonKun (Jira)
TisonKun created FLINK-13990:


 Summary: Remove JobModificationException
 Key: FLINK-13990
 URL: https://issues.apache.org/jira/browse/FLINK-13990
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: TisonKun
 Fix For: 1.10.0


Given its name, I'm not sure whether the original purpose of 
{{JobModificationException}} is still valid. In any case, none of our code paths 
use this exception. I think it was mainly used in {{Dispatcher}}, but the 
exception handling there has since evolved. We can always add it back once it 
becomes valid again.

Propose to remove it.

cc [~till.rohrmann]





Re: Is Flink documentation deployment script broken ?

2019-09-06 Thread Jark Wu
Thanks Chesnay for reporting this. 


> 在 2019年9月6日,17:47,Chesnay Schepler  写道:
> 
> The scripts are fine, but the buildbot slave is currently down.
> 
> I've already opened a ticket with INFRA: 
> https://issues.apache.org/jira/browse/INFRA-18986
> 
> On 06/09/2019 11:44, Jark Wu wrote:
>> Hi all,
>> 
>> I merged several documentation pull requests[1][2][3] days ago.
>> AFAIK, the documentation deployment is scheduled every day.
>> However, I didn't see the changes are available in the Flink doc website[4]
>> until now.
>> The same to Till's PR[5] merged 3 days ago.
>> 
>> 
>> Best,
>> Jark
>> 
>> [1]: https://github.com/apache/flink/pull/9545
>> [2]: https://github.com/apache/flink/pull/9511
>> [3]: https://github.com/apache/flink/pull/9525
>> [4]: https://ci.apache.org/projects/flink/flink-docs-master/
>> [5]: https://github.com/apache/flink/pull/9571
>> 
> 



[jira] [Created] (FLINK-13991) Add git exclusion for 1.9+ features to 1.8

2019-09-06 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-13991:


 Summary: Add git exclusion for 1.9+ features to 1.8
 Key: FLINK-13991
 URL: https://issues.apache.org/jira/browse/FLINK-13991
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Chesnay Schepler
 Fix For: 1.8.3


Switching from 1.9+ to 1.8 is kind of a pain since git picks up various files 
from later versions as new files, including:
* the zip file of flink-python
* the entire new WebUI
* the copy of the old WebUI
* some generated classes in flink-parquet

This isn't _problematic_ in that sense, but just inconvenient.





Re: [DISCUSS] Flink Python User-Defined Function for Table API

2019-09-06 Thread Dian Fu
Hi all,

Thanks a lot for the discussion here. It makes sense to limit the scope of this 
FLIP to only ScalarFunction. I'll update the FLIP and remove the content 
relating to UDAF.

Thanks,
Dian

> 在 2019年9月6日,下午6:02,jincheng sun  写道:
> 
> Hi,
> 
> Sure, for ensure the 1.10 relesae of flink, let's split the FLIPs, and
> FLIP-58 only do the stateless part.
> 
> Cheers,
> Jincheng
> 
> Aljoscha Krettek  于2019年9月6日周五 下午5:53写道:
> 
>> Hi,
>> 
>> Regarding stateful functions and MapView/DataView/ListView: I think it’s
>> best to keep that for a later FLIP and focus on a more basic version.
>> Supporting stateful functions, especially with MapView can potentially be
>> very slow so we have to see what we can do there.
>> 
>> For the method names, I don’t know. If FLIP-64 passes they have to be
>> changed. So we could use the final names right away, but I’m also fine with
>> using the old method names for now.
>> 
>> Best,
>> Aljoscha
>> 
>>> On 5. Sep 2019, at 12:40, jincheng sun  wrote:
>>> 
>>> Hi Aljoscha,
>>> 
>>> Thanks for your comments!
>>> 
>>> Regarding to the FLIP scope, it seems that we have agreed on the design
>> of
>>> the stateless function support.
>>> What do you think about starting the development of the stateless
>> function
>>> support firstly and continue the discussion of stateful function support?
>>> Or you think we should split the current FLIP into two FLIPs and discuss
>>> the stateful function support in another thread?
>>> 
>>> Currently, the Python DataView/MapView/ListView interfaces design follow
>>> the Java/Scala naming conversions.
>>> Of couse, We can continue to discuss whether there are better solutions,
>>> i.e. using annotations.
>>> 
>>> Regarding to the magic logic to support DataView/MapView/ListView, it
>> will
>>> be done by the framework and is transparent for users.
>>> Per my understanding, the magic logic is unavoidable no matter what the
>>> interfaces will be.
>>> 
>>> Regarding to the catalog support of python function:1) If it's stored in
>>> memory as temporary object, just as you said, users can call
>>> TableEnvironment.register_function(will change to
>>> register_temporary_function in FLIP-64)
>>> 2) If it's persisted in external storage, users can call
>>> Catalog.create_function. There will be no API change per my
>> understanding.
>>> 
>>> What do you think?
>>> Best,Jincheng
>>> 
>>> Aljoscha Krettek  于2019年9月5日周四 下午5:32写道:
>>> 
 Hi,
 
 Another thing to consider is the Scope of the FLIP. Currently, we try to
 support (stateful) AggregateFunctions. I have some concerns about
>> whether
 or not DataView/MapView/ListView is a good interface because it requires
 quite some magic from the runners to make it work, such as messing with
>> the
 TypeInformation and injecting objects at runtime. If the FLIP aims for
>> the
 minimum of ScalarFunctions and the whole execution harness, that should
>> be
 easier to agree on.
 
 Another point is the naming of the new methods. I think Timo hinted at
>> the
 fact that we have to consider catalog support for functions. There is
 ongoing work about differentiating between temporary objects and objects
 that are stored in a catalog (FLIP-64 [1]). With this in mind, the
>> method
 for registering functions should be called register_temporary_function()
 and so on. Unless we want to already think about mixing Python and Java
 functions in the catalog, which is outside the scope of this FLIP, I
>> think.
 
 Best,
 Aljoscha
 
 [1]
 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
 
 
> On 5. Sep 2019, at 05:01, jincheng sun 
>> wrote:
> 
> Hi Aljoscha,
> 
> That's a good points, so far, most of the code will live in
>> flink-python
> module, and the rules and relNodes will be put into the both blink and
> flink planner modules, some of the common interface of required by
 planners
> will be placed in flink-table-common. I think you are right, we should
 try
> to ensure the changes of this feature is minimal.  For more detail we
 would
> follow this principle when review the PRs.
> 
> Great thanks for your questions and remind!
> 
> Best,
> Jincheng
> 
> 
> Aljoscha Krettek  于2019年9月4日周三 下午8:58写道:
> 
>> Hi,
>> 
>> Things looks interesting so far!
>> 
>> I had one question: Where will most of the support code for this live?
>> Will this add the required code to flink-table-common or the different
>> runners? Can we implement this in such a way that only a minimal
>> amount
 of
>> support code is required in the parts of the Table API (and Table API
>> runners) that  are not python specific?
>> 
>> Best,
>> Aljoscha
>> 
>>> On 4. Sep 2019, at 14:14, Timo Walther  wrote:
>>> 
>>> Hi Jincheng,
>>> 
>>> 2. Ser

[jira] [Created] (FLINK-13992) Refactor Optional parameter in InputGateWithMetrics#updateMetrics

2019-09-06 Thread TisonKun (Jira)
TisonKun created FLINK-13992:


 Summary: Refactor Optional parameter in 
InputGateWithMetrics#updateMetrics
 Key: FLINK-13992
 URL: https://issues.apache.org/jira/browse/FLINK-13992
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: TisonKun
 Fix For: 1.10.0


As per the consensus from the community code style discussion, we can refactor 
{{InputGateWithMetrics#updateMetrics}} to avoid the Optional parameter.

cc [~azagrebin]

{code:java}
diff --git a/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/InputGateWithMetrics.java b/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/InputGateWithMetrics.java
index 5d2cfd95c4..e548fbf02b 100644
--- a/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/InputGateWithMetrics.java
+++ b/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/InputGateWithMetrics.java
@@ -24,6 +24,8 @@ import org.apache.flink.runtime.io.network.partition.consumer.BufferOrEvent;
 import org.apache.flink.runtime.io.network.partition.consumer.InputGate;
 import org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup;
 
+import javax.annotation.Nonnull;
+
 import java.io.IOException;
 import java.util.Optional;
 import java.util.concurrent.CompletableFuture;
@@ -67,12 +69,12 @@ public class InputGateWithMetrics extends InputGate {
 
     @Override
     public Optional<BufferOrEvent> getNext() throws IOException, InterruptedException {
-        return updateMetrics(inputGate.getNext());
+        return inputGate.getNext().map(this::updateMetrics);
     }
 
     @Override
     public Optional<BufferOrEvent> pollNext() throws IOException, InterruptedException {
-        return updateMetrics(inputGate.pollNext());
+        return inputGate.pollNext().map(this::updateMetrics);
     }
 
     @Override
@@ -85,8 +87,8 @@ public class InputGateWithMetrics extends InputGate {
         inputGate.close();
     }
 
-    private Optional<BufferOrEvent> updateMetrics(Optional<BufferOrEvent> bufferOrEvent) {
-        bufferOrEvent.ifPresent(b -> numBytesIn.inc(b.getSize()));
+    private BufferOrEvent updateMetrics(@Nonnull BufferOrEvent bufferOrEvent) {
+        numBytesIn.inc(bufferOrEvent.getSize());
         return bufferOrEvent;
     }
 }
{code}
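The gist of the diff is replacing an Optional-typed parameter with {{Optional#map}} on the caller side. The same shape in a small standalone Python sketch (the real change is Java; the names here are illustrative only, not Flink code):

```python
# Illustrative analog of the refactor above. Before: the metrics helper
# accepts a maybe-missing value and unwraps it itself. After: it accepts a
# plain, never-missing value, and the caller maps over the optional result.

num_bytes_in = 0

def update_metrics_before(buffer_or_event):   # parameter may be None
    global num_bytes_in
    if buffer_or_event is not None:
        num_bytes_in += buffer_or_event["size"]
    return buffer_or_event

def update_metrics_after(buffer_or_event):    # parameter is never None
    global num_bytes_in
    num_bytes_in += buffer_or_event["size"]
    return buffer_or_event

def map_optional(value, fn):                  # stand-in for Java's Optional.map
    return None if value is None else fn(value)

print(map_optional({"size": 10}, update_metrics_after))  # {'size': 10}
print(map_optional(None, update_metrics_after))          # None
print(num_bytes_in)                                      # 10
```

The "after" version keeps the null-handling in one place (the caller's map) instead of spreading it into every helper.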






Re: [DISCUSS] Flink Python User-Defined Function for Table API

2019-09-06 Thread Dian Fu
Hi all,

I have updated the FLIP, removed the content related to UDAF, and changed the 
title of the FLIP to "Flink Python User-Defined Stateless Function for Table". 
Does it make sense to you? 

Regards,
Dian

> 在 2019年9月6日,下午6:09,Dian Fu  写道:
> 
> Hi all,
> 
> Thanks a lot for the discussion here. It makes sense to limit the scope of 
> this FLIP to only ScalarFunction. I'll update the FLIP and remove the content 
> relating to UDAF.
> 
> Thanks,
> Dian
> 
>> 在 2019年9月6日,下午6:02,jincheng sun  写道:
>> 
>> Hi,
>> 
>> Sure, for ensure the 1.10 relesae of flink, let's split the FLIPs, and
>> FLIP-58 only do the stateless part.
>> 
>> Cheers,
>> Jincheng
>> 
>> Aljoscha Krettek  于2019年9月6日周五 下午5:53写道:
>> 
>>> Hi,
>>> 
>>> Regarding stateful functions and MapView/DataView/ListView: I think it’s
>>> best to keep that for a later FLIP and focus on a more basic version.
>>> Supporting stateful functions, especially with MapView can potentially be
>>> very slow so we have to see what we can do there.
>>> 
>>> For the method names, I don’t know. If FLIP-64 passes they have to be
>>> changed. So we could use the final names right away, but I’m also fine with
>>> using the old method names for now.
>>> 
>>> Best,
>>> Aljoscha
>>> 
 On 5. Sep 2019, at 12:40, jincheng sun  wrote:
 
 Hi Aljoscha,
 
 Thanks for your comments!
 
 Regarding to the FLIP scope, it seems that we have agreed on the design
>>> of
 the stateless function support.
 What do you think about starting the development of the stateless
>>> function
 support firstly and continue the discussion of stateful function support?
 Or you think we should split the current FLIP into two FLIPs and discuss
 the stateful function support in another thread?
 
 Currently, the Python DataView/MapView/ListView interfaces design follow
 the Java/Scala naming conversions.
 Of couse, We can continue to discuss whether there are better solutions,
 i.e. using annotations.
 
 Regarding to the magic logic to support DataView/MapView/ListView, it
>>> will
 be done by the framework and is transparent for users.
 Per my understanding, the magic logic is unavoidable no matter what the
 interfaces will be.
 
 Regarding to the catalog support of python function:1) If it's stored in
 memory as temporary object, just as you said, users can call
 TableEnvironment.register_function(will change to
 register_temporary_function in FLIP-64)
 2) If it's persisted in external storage, users can call
 Catalog.create_function. There will be no API change per my
>>> understanding.
 
 What do you think?
 Best,Jincheng
 
 Aljoscha Krettek  于2019年9月5日周四 下午5:32写道:
 
> Hi,
> 
> Another thing to consider is the Scope of the FLIP. Currently, we try to
> support (stateful) AggregateFunctions. I have some concerns about
>>> whether
> or not DataView/MapView/ListView is a good interface because it requires
> quite some magic from the runners to make it work, such as messing with
>>> the
> TypeInformation and injecting objects at runtime. If the FLIP aims for
>>> the
> minimum of ScalarFunctions and the whole execution harness, that should
>>> be
> easier to agree on.
> 
> Another point is the naming of the new methods. I think Timo hinted at
>>> the
> fact that we have to consider catalog support for functions. There is
> ongoing work about differentiating between temporary objects and objects
> that are stored in a catalog (FLIP-64 [1]). With this in mind, the
>>> method
> for registering functions should be called register_temporary_function()
> and so on. Unless we want to already think about mixing Python and Java
> functions in the catalog, which is outside the scope of this FLIP, I
>>> think.
> 
> Best,
> Aljoscha
> 
> [1]
> 
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
> 
> 
>> On 5. Sep 2019, at 05:01, jincheng sun 
>>> wrote:
>> 
>> Hi Aljoscha,
>> 
>> That's a good points, so far, most of the code will live in
>>> flink-python
>> module, and the rules and relNodes will be put into the both blink and
>> flink planner modules, some of the common interface of required by
> planners
>> will be placed in flink-table-common. I think you are right, we should
> try
>> to ensure the changes of this feature is minimal.  For more detail we
> would
>> follow this principle when review the PRs.
>> 
>> Great thanks for your questions and remind!
>> 
>> Best,
>> Jincheng
>> 
>> 
>> Aljoscha Krettek  于2019年9月4日周三 下午8:58写道:
>> 
>>> Hi,
>>> 
>>> Things looks interesting so far!
>>> 
>>> I had one question: Where will most of the support code for this live?
>>> Will this add the required code to flink-

Re: [DISCUSS] Flink Python User-Defined Function for Table API

2019-09-06 Thread Aljoscha Krettek
Hi,

Thanks for the quick response! I think this looks good now and it should be 
something that everyone can agree on as a first step.

Best,
Aljoscha

> On 6. Sep 2019, at 12:22, Dian Fu  wrote:
> 
> Hi all,
> 
> I have updated the FLIP and removed content relate to UDAF and also changed 
> the title of the FLIP to "Flink Python User-Defined Stateless Function for 
> Table". Does it make sense to you? 
> 
> Regards,
> Dian
> 
>> 在 2019年9月6日,下午6:09,Dian Fu  写道:
>> 
>> Hi all,
>> 
>> Thanks a lot for the discussion here. It makes sense to limit the scope of 
>> this FLIP to only ScalarFunction. I'll update the FLIP and remove the 
>> content relating to UDAF.
>> 
>> Thanks,
>> Dian
>> 
>>> 在 2019年9月6日,下午6:02,jincheng sun  写道:
>>> 
>>> Hi,
>>> 
>>> Sure, for ensure the 1.10 relesae of flink, let's split the FLIPs, and
>>> FLIP-58 only do the stateless part.
>>> 
>>> Cheers,
>>> Jincheng
>>> 
>>> Aljoscha Krettek  于2019年9月6日周五 下午5:53写道:
>>> 
 Hi,
 
 Regarding stateful functions and MapView/DataView/ListView: I think it’s
 best to keep that for a later FLIP and focus on a more basic version.
 Supporting stateful functions, especially with MapView can potentially be
 very slow so we have to see what we can do there.
 
 For the method names, I don’t know. If FLIP-64 passes they have to be
 changed. So we could use the final names right away, but I’m also fine with
 using the old method names for now.
 
 Best,
 Aljoscha
 
> On 5. Sep 2019, at 12:40, jincheng sun  wrote:
> 
> Hi Aljoscha,
> 
> Thanks for your comments!
> 
> Regarding to the FLIP scope, it seems that we have agreed on the design
 of
> the stateless function support.
> What do you think about starting the development of the stateless
 function
> support firstly and continue the discussion of stateful function support?
> Or you think we should split the current FLIP into two FLIPs and discuss
> the stateful function support in another thread?
> 
> Currently, the Python DataView/MapView/ListView interfaces design follow
> the Java/Scala naming conversions.
> Of couse, We can continue to discuss whether there are better solutions,
> i.e. using annotations.
> 
> Regarding to the magic logic to support DataView/MapView/ListView, it
 will
> be done by the framework and is transparent for users.
> Per my understanding, the magic logic is unavoidable no matter what the
> interfaces will be.
> 
> Regarding to the catalog support of python function:1) If it's stored in
> memory as temporary object, just as you said, users can call
> TableEnvironment.register_function(will change to
> register_temporary_function in FLIP-64)
> 2) If it's persisted in external storage, users can call
> Catalog.create_function. There will be no API change per my
 understanding.
> 
> What do you think?
> Best,Jincheng
> 
> Aljoscha Krettek  于2019年9月5日周四 下午5:32写道:
> 
>> Hi,
>> 
>> Another thing to consider is the Scope of the FLIP. Currently, we try to
>> support (stateful) AggregateFunctions. I have some concerns about
 whether
>> or not DataView/MapView/ListView is a good interface because it requires
>> quite some magic from the runners to make it work, such as messing with
 the
>> TypeInformation and injecting objects at runtime. If the FLIP aims for
 the
>> minimum of ScalarFunctions and the whole execution harness, that should
 be
>> easier to agree on.
>> 
>> Another point is the naming of the new methods. I think Timo hinted at
 the
>> fact that we have to consider catalog support for functions. There is
>> ongoing work about differentiating between temporary objects and objects
>> that are stored in a catalog (FLIP-64 [1]). With this in mind, the
 method
>> for registering functions should be called register_temporary_function()
>> and so on. Unless we want to already think about mixing Python and Java
>> functions in the catalog, which is outside the scope of this FLIP, I
 think.
>> 
>> Best,
>> Aljoscha
>> 
>> [1]
>> 
 https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
>> 
>> 
>>> On 5. Sep 2019, at 05:01, jincheng sun 
 wrote:
>>> 
>>> Hi Aljoscha,
>>> 
>>> That's a good points, so far, most of the code will live in
 flink-python
>>> module, and the rules and relNodes will be put into the both blink and
>>> flink planner modules, some of the common interface of required by
>> planners
>>> will be placed in flink-table-common. I think you are right, we should
>> try
>>> to ensure the changes of this feature is minimal.  For more detail we
>> would
>>> follow this principle when review the PRs.
>>> 
>>> Great th

Re: [VOTE] FLIP-53: Fine Grained Operator Resource Management

2019-09-06 Thread Zhu Zhu
Thanks Xintong for proposing this better resource management.
This helps a lot for users who want to better manage their job resources, and
it would be even more useful if in the future we can have an auto-tuning
mechanism for jobs.

+1 (non-binding)

Thanks,
Zhu Zhu

Xintong Song  于2019年9月6日周五 上午11:17写道:

> Hi all,
>
> I would like to start the voting process for FLIP-53 [1], which is
> discussed and reached consensus in this thread [2].
>
> This voting will be open for at least 72 hours (excluding weekends). I'll
> try to close it Sep. 11, 04:00 UTC, unless there is an objection or not
> enough votes.
>
> Thank you~
>
> Xintong Song
>
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management
>
> [2]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-53-Fine-Grained-Resource-Management-td31831.html
>


Checkpointing clarification

2019-09-06 Thread Dominik Wosiński
Hello,
I have a slight doubt about checkpointing in Flink and wanted to clarify my
understanding. Flink uses barriers internally to keep track of the records
that were processed. The documentation[1] describes the checkpoint as
happening only when the barriers have been transferred to the sink. So let's
consider a toy example of a `TumblingEventTimeWindow` set to 5 hours and a
`CheckpointInterval` set to 10 minutes. If the documentation is correct, the
checkpoint should occur only when the window is processed and reaches the
sink (which can take several hours), which is not true as far as I know. I am
surely wrong somewhere; could someone explain where the error in my logic is?


[1]
https://ci.apache.org/projects/flink/flink-docs-stable/internals/stream_checkpointing.html


[jira] [Created] (FLINK-13993) Using FlinkUserCodeClassLoaders to load the user class in the perjob mode

2019-09-06 Thread Guowei Ma (Jira)
Guowei Ma created FLINK-13993:
-

 Summary: Using FlinkUserCodeClassLoaders to load the user class in 
the perjob mode
 Key: FLINK-13993
 URL: https://issues.apache.org/jira/browse/FLINK-13993
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Affects Versions: 1.9.0, 1.10.0
Reporter: Guowei Ma


Currently, Flink has the FlinkUserCodeClassLoader, which is used to load the 
user’s classes. However, the user classes and the system classes are all loaded 
by the system classloader in per-job mode, which introduces conflicts.

This document[1] proposes making the FlinkUserCodeClassLoader load the 
user classes in per-job mode (discussed with Till[2]).

 

[1][https://docs.google.com/document/d/1fH2Cwrrmps5RxxvVuUdeprruvDNabEaIHPyYps28WM8/edit#heading=h.815t5dodlxh7]

[2] 
[https://docs.google.com/document/d/1SUhFt1BmsGMLUYVa72SWLbNrrWzunvcjAlEm8iusvq0/edit]
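For readers unfamiliar with the delegation idea, below is a minimal sketch of a child-first user-code classloader in plain Java. The class name, the PARENT_FIRST prefix list, and the delegation details are illustrative assumptions only, not the actual FlinkUserCodeClassLoaders implementation:

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Illustrative sketch (NOT Flink's implementation): resolve classes from the
 * user jars first, falling back to the parent classloader, except for a
 * whitelist of packages that must always come from the parent so that
 * framework classes are shared between user code and system code.
 */
public class ChildFirstSketch extends URLClassLoader {

    // Hypothetical whitelist; the real list would be configurable.
    private static final String[] PARENT_FIRST = {"java.", "javax.", "org.apache.flink."};

    public ChildFirstSketch(URL[] userJars, ClassLoader parent) {
        super(userJars, parent);
    }

    static boolean isParentFirst(String name) {
        for (String prefix : PARENT_FIRST) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (isParentFirst(name)) {
                    c = super.loadClass(name, false);      // delegate to the parent first
                } else {
                    try {
                        c = findClass(name);               // look in the user jars first
                    } catch (ClassNotFoundException e) {
                        c = super.loadClass(name, false);  // fall back to the parent
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    public static void main(String[] args) throws Exception {
        try (ChildFirstSketch cl =
                new ChildFirstSketch(new URL[0], ClassLoader.getSystemClassLoader())) {
            // JDK/framework classes still come from the parent classloader:
            System.out.println(cl.loadClass("java.lang.String") == String.class); // true
        }
    }
}
```

The JIRA's point is that in per-job mode this separation currently does not happen: user classes and system classes all go through the system classloader, so conflicting dependency versions cannot be isolated.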



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: Checkpointing clarification

2019-09-06 Thread Dian Fu
When a WindowOperator receives all the barriers from upstream, it will 
forward the barrier to the downstream operator and perform the checkpoint 
asynchronously. 
It doesn't have to wait for the window to trigger before sending out the barrier.

Regards,
Dian

> 在 2019年9月6日,下午8:02,Dominik Wosiński  写道:
> 
> Hello,
> I have a slight doubt on checkpointing in Flink and wanted to clarify my
> understanding. Flink uses barriers internally to keep track of the records
> that were processed. The documentation[1] describes it as the checkpoint
> was only happening when the barriers are transferred to the sink. So  let's
> consider a toy example of `TumblingEventTimeWindow` set to 5 hours and
> `CheckpointInterval` set to 10 minutes. So, if the documentation is
> correct, the checkpoint should occur only when the window is processed and
> gets to sink (which can take several hours) , which is not true as far as I
> know. I am surely wrong somewhere, could someone explain where is the error
> in my logic ?
> 
> 
> [1]
> https://ci.apache.org/projects/flink/flink-docs-stable/internals/stream_checkpointing.html
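To make the decoupling concrete, here is a tiny plain-Java simulation of the behaviour described above. This is purely illustrative (the class and method names are made up, and this is not Flink's actual operator code): an operator buffers elements for a long-running window, yet snapshots its state and forwards the barrier as soon as the barrier arrives, independent of when the window fires.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model: a "window operator" that buffers elements for a long window
 *  but snapshots and forwards a checkpoint barrier as soon as it arrives. */
public class BarrierSketch {
    static final class WindowOperator {
        final List<Integer> windowBuffer = new ArrayList<>();    // pending (unfired) window state
        final List<List<Integer>> snapshots = new ArrayList<>(); // completed checkpoints
        final Queue<Object> output = new ArrayDeque<>();         // downstream channel

        void processElement(int value) {
            windowBuffer.add(value);
        }

        /** On a barrier: snapshot the current state and forward the barrier
         *  immediately; the window does NOT have to fire first. */
        void processBarrier(String barrier) {
            snapshots.add(new ArrayList<>(windowBuffer)); // done asynchronously in real Flink
            output.add(barrier);
        }
    }

    public static void main(String[] args) {
        WindowOperator op = new WindowOperator();
        op.processElement(1);
        op.processElement(2);
        op.processBarrier("checkpoint-1"); // arrives long before the 5h window fires
        op.processElement(3);
        op.processBarrier("checkpoint-2");

        System.out.println(op.snapshots);  // [[1, 2], [1, 2, 3]]
        System.out.println(op.output);     // [checkpoint-1, checkpoint-2]
    }
}
```

In real Flink the snapshot is handed to an asynchronous part of the checkpointing machinery so record processing resumes immediately; the point here is only that taking a checkpoint never waits for the 5-hour window to fire.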



[jira] [Created] (FLINK-13994) Translate "Getting Started" overview to Chinese

2019-09-06 Thread Fabian Hueske (Jira)
Fabian Hueske created FLINK-13994:
-

 Summary: Translate "Getting Started" overview to Chinese
 Key: FLINK-13994
 URL: https://issues.apache.org/jira/browse/FLINK-13994
 Project: Flink
  Issue Type: Task
  Components: chinese-translation, Documentation
Reporter: Fabian Hueske


The "Getting Started" overview page needs to be translated to Chinese: 

https://github.com/apache/flink/blob/master/docs/getting-started/index.zh.md



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[VOTE] Release 1.8.2, release candidate #1

2019-09-06 Thread Jark Wu
 Hi everyone,

Please review and vote on the release candidate #1 for the version 1.8.2,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to be
deployed to dist.apache.org [2], which are signed with the key with
fingerprint E2C45417BED5C104154F341085BACB5AEFAE3202 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.8.2-rc1" [5],
* website pull request listing the new release and adding announcement blog
post [6].

The vote will be open for at least 72 hours.
Please cast your votes before *Sep. 11th 2019, 13:00 UTC*.

It is adopted by majority approval, with at least 3 PMC affirmative votes.

Thanks,
Jark

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12345670
[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.2-rc1/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1262
[5]
https://github.com/apache/flink/commit/6322618bb0f1b7942d86cb1b2b7bc55290d9e330
[6] https://github.com/apache/flink-web/pull/262


Re: [DISCUSS] Releasing Flink 1.8.2

2019-09-06 Thread Jark Wu
Hi all,

Thanks all of you for fixing issues for 1.8.2 release!
The VOTE mail thread of the first RC of 1.8.2 already brought up.
I would appreciate if you can help to check the release and VOTE the RC1.

Thanks,
Jark

On Wed, 4 Sep 2019 at 16:57, Aljoscha Krettek  wrote:

> Hi,
>
> I’m just running the last tests on FLINK-13586 on Travis and then I’m
> merging.
>
> Best,
> Aljoscha
>
> On 4. Sep 2019, at 07:37, Jark Wu  wrote:
>
> Thanks for the work Jincheng!
>
> I have moved remaining major issues to 1.8.3 except FLINK-13586.
>
> Hi @Aljoscha Krettek  , is that possible to merge
> FLINK-13586 today?
>
> Best,
> Jark
>
> On Wed, 4 Sep 2019 at 10:47, jincheng sun 
> wrote:
>
>> Thanks for the update Jark!
>>
>> I have added the new version 1.8.3 in JIRA; could you please re-mark the
>> JIRAs (such as FLINK-13689) which we do not want to merge into the 1.8.2
>> release :)
>>
>> You are right, I think FLINK-13586 should be included in the 1.8.2
>> release!
>>
>> Thanks,
>> Jincheng
>>
>>
>> Jark Wu  于2019年9月4日周三 上午10:15写道:
>>
>> > Hi all,
>> >
>> > I am very happy to say that all the blockers and critical issues for
>> > release 1.8.2 have been resolved!
>> >
>> > Great thanks to everyone who contribute to the release.
>> >
>> > I hope to create the first RC on Sep 05, at 10:00 UTC+8.
>> > If you find some other blocker issues for 1.8.2, please let me know
>> before
>> > that to account for it for the 1.8.2 release.
>> >
>> > Before cutting the RC1, I think it has chance to merge the
>> > ClosureCleaner.clean fix (FLINK-13586), because the review and travis
>> are
>> > both passed.
>> >
>> > Cheers,
>> > Jark
>> >
>> > On Wed, 4 Sep 2019 at 00:45, Kostas Kloudas  wrote:
>> >
>> > > Yes, I will do that Jark!
>> > >
>> > > Kostas
>> > >
>> > > On Tue, Sep 3, 2019 at 4:19 PM Jark Wu  wrote:
>> > > >
>> > > > Thanks Kostas for the quick fixing.
>> > > >
>> > > > However, I find that FLINK-13940 still target to 1.8.2 as a blocker.
>> > If I
>> > > > understand correctly, FLINK-13940 is aiming for a nicer and better
>> > > solution
>> > > > in the future.
>> > > > So should we update the fixVersion of FLINK-13940?
>> > > >
>> > > > Best,
>> > > > Jark
>> > > >
>> > > > On Tue, 3 Sep 2019 at 21:33, Kostas Kloudas 
>> > wrote:
>> > > >
>> > > > > Thanks for waiting!
>> > > > >
>> > > > > A fix for FLINK-13940 has been merged on 1.8, 1.9 and the master
>> > under
>> > > > > FLINK-13941.
>> > > > >
>> > > > > Cheers,
>> > > > > Kostas
>> > > > >
>> > > > > On Tue, Sep 3, 2019 at 11:25 AM jincheng sun <
>> > sunjincheng...@gmail.com
>> > > >
>> > > > > wrote:
>> > > > > >
>> > > > > > +1 FLINK-13940 <
>> https://issues.apache.org/jira/browse/FLINK-13940>
>> > > is a
>> > > > > > blocker, due to loss data is very important bug, And great
>> thanks
>> > for
>> > > > > > helping fix it  Kostas!
>> > > > > >
>> > > > > > Best, Jincheng
>> > > > > >
>> > > > > > Kostas Kloudas  于2019年9月2日周一 下午7:20写道:
>> > > > > >
>> > > > > > > Hi all,
>> > > > > > >
>> > > > > > > I think this should be also considered a blocker
>> > > > > > > https://issues.apache.org/jira/browse/FLINK-13940.
>> > > > > > > It is not a regression but it can result to data loss.
>> > > > > > >
>> > > > > > > I think I can have a quick fix by tomorrow.
>> > > > > > >
>> > > > > > > Cheers,
>> > > > > > > Kostas
>> > > > > > >
>> > > > > > > On Mon, Sep 2, 2019 at 12:01 PM jincheng sun <
>> > > sunjincheng...@gmail.com
>> > > > > >
>> > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > Thanks for all of your feedback!
>> > > > > > > >
>> > > > > > > > Hi Jark, Glad to see that you are doing what RM should
>> doing.
>> > > > > > > >
>> > > > > > > > Only one tips here is before the RC1 all the blocker should
>> be
>> > > > > fixed, but
>> > > > > > > > othrers is nice to have. So you can decide when to prepare
>> RC1
>> > > after
>> > > > > the
>> > > > > > > > blokcer is resolved.
>> > > > > > > >
>> > > > > > > > Feel free to tell me if you have any questions.
>> > > > > > > >
>> > > > > > > > Best,Jincheng
>> > > > > > > >
>> > > > > > > > Aljoscha Krettek  于2019年9月2日周一
>> 下午5:03写道:
>> > > > > > > >
>> > > > > > > > > I cut a PR for FLINK-13586:
>> > > > > https://github.com/apache/flink/pull/9595
>> > > > > > > <
>> > > > > > > > > https://github.com/apache/flink/pull/9595>
>> > > > > > > > >
>> > > > > > > > > > On 2. Sep 2019, at 05:03, Yu Li 
>> wrote:
>> > > > > > > > > >
>> > > > > > > > > > +1 for a 1.8.2 release, thanks for bringing this up
>> > Jincheng!
>> > > > > > > > > >
>> > > > > > > > > > Best Regards,
>> > > > > > > > > > Yu
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > On Mon, 2 Sep 2019 at 09:19, Thomas Weise <
>> t...@apache.org>
>> > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > >> +1 for the 1.8.2 release
>> > > > > > > > > >>
>> > > > > > > > > >> I marked
>> > https://issues.apache.org/jira/browse/FLINK-13586
>> > > for
>> > > > > this
>> > > > > > > > > >> release. It would be good to compensate for the
>> ba

[ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Fabian Hueske
Hi everyone,

I'm very happy to announce that Kostas Kloudas is joining the Flink PMC.
Kostas is contributing to Flink for many years and puts lots of effort in
helping our users and growing the Flink community.

Please join me in congratulating Kostas!

Cheers,
Fabian


Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Dian Fu
Congratulations Kostas!

Regards,
Dian

> 在 2019年9月6日,下午8:58,Wesley Peng  写道:
> 
> On 2019/9/6 8:55 下午, Fabian Hueske wrote:
>> I'm very happy to announce that Kostas Kloudas is joining the Flink PMC.
>> Kostas is contributing to Flink for many years and puts lots of effort in 
>> helping our users and growing the Flink community.
>> Please join me in congratulating Kostas!
> 
> congratulation Kostas!
> 
> regards.



[jira] [Created] (FLINK-13995) Fix shading of the licence information of netty

2019-09-06 Thread Arvid Heise (Jira)
Arvid Heise created FLINK-13995:
---

 Summary: Fix shading of the licence information of netty
 Key: FLINK-13995
 URL: https://issues.apache.org/jira/browse/FLINK-13995
 Project: Flink
  Issue Type: Bug
  Components: BuildSystem / Shaded
Affects Versions: 1.9.0
Reporter: Arvid Heise


The license filter isn't actually filtering anything. It should be 
META-INF/license/**.

The first filter seems to be outdated, btw.

Multiple modules are affected.

{code:xml}
<filter>
    <artifact>io.netty:netty</artifact>
    <excludes>
        <exclude>META-INF/maven/io.netty/**</exclude>
        <exclude>META-INF/license</exclude>
        <exclude>META-INF/NOTICE.txt</exclude>
    </excludes>
</filter>
{code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[DISCUSS] FLIP-67: Global partitions lifecycle

2019-09-06 Thread Chesnay Schepler

Hello,

FLIP-36 (interactive programming) 
proposes a new programming paradigm where jobs are built incrementally 
by the user.


To support this in an efficient manner I propose to extend partition 
life-cycle to support the notion of /global partitions/, which are 
partitions that can exist beyond the life-time of a job.


These partitions could then be re-used by subsequent jobs in a fairly 
efficient manner, as they don't have to be persisted to external storage 
first, and consuming tasks could be scheduled to exploit data-locality.


The FLIP outlines the required changes to the JobMaster, TaskExecutor 
and ResourceManager to support this from a life-cycle perspective.


This FLIP does /not/ concern itself with the /usage/ of global 
partitions, including client-side APIs, job-submission, scheduling and 
reading said partitions; these are all follow-ups that will either be 
part of FLIP-36 or spliced out into separate FLIPs.




Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Till Rohrmann
Congrats Klou!

Cheers,
Till

On Fri, Sep 6, 2019 at 3:00 PM Dian Fu  wrote:

> Congratulations Kostas!
>
> Regards,
> Dian
>
> > 在 2019年9月6日,下午8:58,Wesley Peng  写道:
> >
> > On 2019/9/6 8:55 下午, Fabian Hueske wrote:
> >> I'm very happy to announce that Kostas Kloudas is joining the Flink PMC.
> >> Kostas is contributing to Flink for many years and puts lots of effort
> in helping our users and growing the Flink community.
> >> Please join me in congratulating Kostas!
> >
> > congratulation Kostas!
> >
> > regards.
>
>


Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Zili Chen
Congrats Klou!

Best,
tison.


Till Rohrmann  于2019年9月6日周五 下午9:23写道:

> Congrats Klou!
>
> Cheers,
> Till
>
> On Fri, Sep 6, 2019 at 3:00 PM Dian Fu  wrote:
>
>> Congratulations Kostas!
>>
>> Regards,
>> Dian
>>
>> > 在 2019年9月6日,下午8:58,Wesley Peng  写道:
>> >
>> > On 2019/9/6 8:55 下午, Fabian Hueske wrote:
>> >> I'm very happy to announce that Kostas Kloudas is joining the Flink
>> PMC.
>> >> Kostas is contributing to Flink for many years and puts lots of effort
>> in helping our users and growing the Flink community.
>> >> Please join me in congratulating Kostas!
>> >
>> > congratulation Kostas!
>> >
>> > regards.
>>
>>


Re: [VOTE] FLIP-53: Fine Grained Operator Resource Management

2019-09-06 Thread Till Rohrmann
Hi Xintong,

thanks for starting this vote. The proposal looks good and, hence, +1 for
it.

One comment I have is concerning the first implementation step. I would
suggest to not add the flag allSourcesInSamePipelinedRegion to the
ExecutionConfig because the ExecutionConfig is public API. Ideally we keep
this flag internal and don't expose it to the user.

Cheers,
Till

On Fri, Sep 6, 2019 at 1:47 PM Zhu Zhu  wrote:

> Thanks Xintong for proposing this better resource management.
> This helps a lot to users who want to better manage the job resources. And
> would be even more useful if in the future we can have auto-tuning
> mechanism for jobs.
>
> +1 (non-binding)
>
> Thanks,
> Zhu Zhu
>
> Xintong Song  于2019年9月6日周五 上午11:17写道:
>
> > Hi all,
> >
> > I would like to start the voting process for FLIP-53 [1], which is
> > discussed and reached consensus in this thread [2].
> >
> > This voting will be open for at least 72 hours (excluding weekends). I'll
> > try to close it Sep. 11, 04:00 UTC, unless there is an objection or not
> > enough votes.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management
> >
> > [2]
> >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-53-Fine-Grained-Resource-Management-td31831.html
> >
>


Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Jeff Zhang
Congrats Klou!

Zili Chen  于2019年9月6日周五 下午9:51写道:

> Congrats Klou!
>
> Best,
> tison.
>
>
> Till Rohrmann  于2019年9月6日周五 下午9:23写道:
>
>> Congrats Klou!
>>
>> Cheers,
>> Till
>>
>> On Fri, Sep 6, 2019 at 3:00 PM Dian Fu  wrote:
>>
>>> Congratulations Kostas!
>>>
>>> Regards,
>>> Dian
>>>
>>> > 在 2019年9月6日,下午8:58,Wesley Peng  写道:
>>> >
>>> > On 2019/9/6 8:55 下午, Fabian Hueske wrote:
>>> >> I'm very happy to announce that Kostas Kloudas is joining the Flink
>>> PMC.
>>> >> Kostas is contributing to Flink for many years and puts lots of
>>> effort in helping our users and growing the Flink community.
>>> >> Please join me in congratulating Kostas!
>>> >
>>> > congratulation Kostas!
>>> >
>>> > regards.
>>>
>>>

-- 
Best Regards

Jeff Zhang


[jira] [Created] (FLINK-13996) Maven instructions for 3.3+ do not cover all shading special cases

2019-09-06 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-13996:


 Summary: Maven instructions for 3.3+ do not cover all shading 
special cases
 Key: FLINK-13996
 URL: https://issues.apache.org/jira/browse/FLINK-13996
 Project: Flink
  Issue Type: Improvement
  Components: Build System, Documentation
Affects Versions: 1.8.0
Reporter: Chesnay Schepler


When building Flink with Maven 3.3+, extra care must be taken to ensure that the 
shading works as expected. Since 3.3 the dependency graph is immutable; as a 
result, downstream modules (like flink-dist) see the unaltered set of 
dependencies of bundled modules, regardless of whether these were bundled or not. 
As a result, dependencies may be bundled multiple times (original and relocated 
versions).

The [instructions for building Flink with Maven 
3.3+|https://ci.apache.org/projects/flink/flink-docs-master/flinkDev/building.html#dependency-shading]
 correctly point out that flink-dist must be built separately; however, (at the 
very least) all filesystems relying on {{flink-fs-hadoop-shaded}} are also 
affected.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [VOTE] FLIP-53: Fine Grained Operator Resource Management

2019-09-06 Thread Andrey Zagrebin
Thanks for starting the vote @Xintong

+1 for the FLIP-53

Best,
Andrey

On Fri, Sep 6, 2019 at 3:53 PM Till Rohrmann  wrote:

> Hi Xintong,
>
> thanks for starting this vote. The proposal looks good and, hence, +1 for
> it.
>
> One comment I have is concerning the first implementation step. I would
> suggest to not add the flag allSourcesInSamePipelinedRegion to the
> ExecutionConfig because the ExecutionConfig is public API. Ideally we keep
> this flag internal and don't expose it to the user.
>
> Cheers,
> Till
>
> On Fri, Sep 6, 2019 at 1:47 PM Zhu Zhu  wrote:
>
> > Thanks Xintong for proposing this better resource management.
> > This helps a lot to users who want to better manage the job resources.
> And
> > would be even more useful if in the future we can have auto-tuning
> > mechanism for jobs.
> >
> > +1 (non-binding)
> >
> > Thanks,
> > Zhu Zhu
> >
> > Xintong Song  于2019年9月6日周五 上午11:17写道:
> >
> > > Hi all,
> > >
> > > I would like to start the voting process for FLIP-53 [1], which is
> > > discussed and reached consensus in this thread [2].
> > >
> > > This voting will be open for at least 72 hours (excluding weekends).
> I'll
> > > try to close it Sep. 11, 04:00 UTC, unless there is an objection or not
> > > enough votes.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > > [1]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management
> > >
> > > [2]
> > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-53-Fine-Grained-Resource-Management-td31831.html
> > >
> >
>


Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Dawid Wysakowicz
Congratulations Klou!

Best,

Dawid

On 06/09/2019 14:55, Fabian Hueske wrote:
> Hi everyone,
>
> I'm very happy to announce that Kostas Kloudas is joining the Flink PMC.
> Kostas is contributing to Flink for many years and puts lots of effort in
> helping our users and growing the Flink community.
>
> Please join me in congratulating Kostas!
>
> Cheers,
> Fabian
>



signature.asc
Description: OpenPGP digital signature


Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Forward Xu
Congratulations Kloudas!


Best,

Forward

Dawid Wysakowicz  于2019年9月6日周五 下午10:36写道:

> Congratulations Klou!
>
> Best,
>
> Dawid
>
> On 06/09/2019 14:55, Fabian Hueske wrote:
> > Hi everyone,
> >
> > I'm very happy to announce that Kostas Kloudas is joining the Flink PMC.
> > Kostas is contributing to Flink for many years and puts lots of effort in
> > helping our users and growing the Flink community.
> >
> > Please join me in congratulating Kostas!
> >
> > Cheers,
> > Fabian
> >
>
>


Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Yu Li
Congratulations Klou!

Best Regards,
Yu


On Fri, 6 Sep 2019 at 22:43, Forward Xu  wrote:

> Congratulations Kloudas!
>
>
> Best,
>
> Forward
>
> Dawid Wysakowicz  于2019年9月6日周五 下午10:36写道:
>
> > Congratulations Klou!
> >
> > Best,
> >
> > Dawid
> >
> > On 06/09/2019 14:55, Fabian Hueske wrote:
> > > Hi everyone,
> > >
> > > I'm very happy to announce that Kostas Kloudas is joining the Flink
> PMC.
> > > Kostas is contributing to Flink for many years and puts lots of effort
> in
> > > helping our users and growing the Flink community.
> > >
> > > Please join me in congratulating Kostas!
> > >
> > > Cheers,
> > > Fabian
> > >
> >
> >
>


Re: [ANNOUNCE] Java 11 cron builds activated on master

2019-09-06 Thread Yu Li
Great to know! Thanks for the efforts Chesnay and will keep an eye on it.

Best Regards,
Yu


On Thu, 5 Sep 2019 at 21:58, Chesnay Schepler  wrote:

> Hello everyone,
>
> I just wanted to inform everyone that we now run Java 11 builds on
> Travis as part of the cron jobs, subsuming the existing Java 9 tests.
> All existing Java 9 build/test infrastructure has been removed.
>
> If you spot any test failures that appear to be specific to Java 11,
> please add a sub-task to FLINK-10725.
>
> I would also encourage everyone to try out Java 11 for local development
> and usage, so that we can find pain points in the dev and user experience.
>
>


[DISCUSS] Features for Apache Flink 1.10

2019-09-06 Thread Gary Yao
Hi community,

Since Apache Flink 1.9.0 was released more than 2 weeks ago, I want to
kick off the discussion about what we want to achieve for the 1.10
release.

Based on discussions with various people as well as observations from
mailing
list threads, Yu Li and I have compiled a list of features that we deem
important to be included in the next release. Note that the features
presented
here are not meant to be exhaustive. As always, I am sure that there will be
other contributions that will make it into the next release. This email
thread
is merely to kick off a discussion, and to give users and contributors an
understanding where the focus of the next release lies. If there is anything
we have missed that somebody is working on, please reply to this thread.


** Proposed features and focus

Following the contribution of Blink to Apache Flink, the community released
a
preview of the Blink SQL Query Processor, which offers better SQL coverage
and
improved performance for batch queries, in Flink 1.9.0. However, the
integration of the Blink query processor is not fully completed yet as there
are still pending tasks, such as implementing full TPC-DS support. With the
next Flink release, we aim at finishing the Blink integration.

Furthermore, there are several ongoing work threads addressing long-standing
issues reported by users, such as improving checkpointing under
backpressure,
and limiting RocksDBs native memory usage, which can be especially
problematic
in containerized Flink deployments.

Notable features surrounding Flink’s ecosystem that are planned for the next
release include active Kubernetes support (i.e., enabling Flink’s
ResourceManager to launch new pods), improved Hive integration, Java 11
support, and new algorithms for the Flink ML library.

Below I have included the list of features that we compiled ordered by
priority – some of which already have ongoing mailing list threads, JIRAs,
or
FLIPs.

- Improving Flink’s build system & CI [1] [2]
- Support Java 11 [3]
- Table API improvements
  - Configuration Evolution [4] [5]
  - Finish type system: Expression Re-design [6] and UDF refactor
  - Streaming DDL: Time attribute (watermark) and Changelog support
  - Full SQL partition support for both batch & streaming [7]
  - New Java Expression DSL [8]
  - SQL CLI with DDL and DML support
- Hive compatibility completion (DDL/UDF) to support full Hive integration
  - Partition/Function/View support
- Remaining Blink planner/runtime merge
  - Support all TPC-DS queries [9]
- Finer grained resource management
  - Unified TaskExecutor Memory Configuration [10]
  - Fine Grained Operator Resource Management [11]
  - Dynamic Slots Allocation [12]
- Finish scheduler re-architecture [13]
  - Allows implementing more sophisticated scheduling strategies, such as a
    better batch scheduler or speculative execution.
- New DataStream Source Interface [14]
  - A new source connector architecture to unify the implementation of
    source connectors and make it simpler to implement custom source connectors.
- Add more source/system metrics
  - For better Flink job monitoring and to facilitate customized solutions
    like auto-scaling.
- Executor Interface / Client API [15]
  - Allow Flink downstream projects to more easily monitor and control
    Flink jobs.
- Interactive Programming [16]
  - Allow users to cache intermediate results in the Table API for later
    usage, to avoid redundant computation when a Flink application contains
    multiple jobs.
- Python User Defined Function [17]
  - Support native user-defined functions in Flink Python, including
    UDF/UDAF/UDTF in the Table API and Python-Java mixed UDFs.
- Spillable heap backend [18]
  - A new state backend supporting automatic data spill and load when
    memory is exhausted/regained.
- RocksDB backend memory control [19]
  - Prevent excessive memory usage by RocksDB, especially in containerized
    environments.
- Unaligned checkpoints [20]
  - Resolve the checkpoint timeout issue under backpressure.
- Separate framework and user class loader in per-job mode
- Active Kubernetes Integration [21]
  - Allow the ResourceManager to talk to Kubernetes to launch new pods,
    similar to Flink's Yarn/Mesos integration
- ML pipeline/library
  - Aims at delivering several core algorithms, including Logistic
    Regression, Naive Bayes, Random Forest, KMeans, etc.
- Add vertex subtask log URL on WebUI [22]


** Suggested release timeline

Based on our usual time-based release schedule [23], and considering that
several events, such as Flink Forward Europe and Asia, are overlapping with
the current release cycle, we should aim at releasing 1.10 around the
beginning of January 2020. To give the community enough testing time, I
propose the feature freeze to be at the end of November. We should announce
an
exact date later in the release cycle.

Lastly, I would like to use the opportunity to propose Yu Li and myself as
release managers for the upcoming release

Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Zhu Zhu
Congratulations Kostas!

Thanks,
Zhu Zhu

Yu Li  于2019年9月6日周五 下午10:49写道:

> Congratulations Klou!
>
> Best Regards,
> Yu
>
>
> On Fri, 6 Sep 2019 at 22:43, Forward Xu  wrote:
>
> > Congratulations Kloudas!
> >
> >
> > Best,
> >
> > Forward
> >
> > Dawid Wysakowicz  于2019年9月6日周五 下午10:36写道:
> >
> > > Congratulations Klou!
> > >
> > > Best,
> > >
> > > Dawid
> > >
> > > On 06/09/2019 14:55, Fabian Hueske wrote:
> > > > Hi everyone,
> > > >
> > > > I'm very happy to announce that Kostas Kloudas is joining the Flink
> > PMC.
> > > > Kostas is contributing to Flink for many years and puts lots of
> effort
> > in
> > > > helping our users and growing the Flink community.
> > > >
> > > > Please join me in congratulating Kostas!
> > > >
> > > > Cheers,
> > > > Fabian
> > > >
> > >
> > >
> >
>


Re: [DISCUSS] Features for Apache Flink 1.10

2019-09-06 Thread Kostas Kloudas
Hi Gary,

Thanks for kicking off the feature discussion.

+1 for Gary and Yu as release managers.

Cheers,
Kostas

On Fri, Sep 6, 2019 at 5:06 PM Gary Yao  wrote:
>
> Hi community,
>
> Since Apache Flink 1.9.0 has been released more than 2 weeks ago, I want to
> start kicking off the discussion about what we want to achieve for the 1.10
> release.
>
> Based on discussions with various people as well as observations from
> mailing
> list threads, Yu Li and I have compiled a list of features that we deem
> important to be included in the next release. Note that the features
> presented
> here are not meant to be exhaustive. As always, I am sure that there will be
> other contributions that will make it into the next release. This email
> thread
> is merely to kick off a discussion, and to give users and contributors an
> understanding where the focus of the next release lies. If there is anything
> we have missed that somebody is working on, please reply to this thread.
>
>
> ** Proposed features and focus
>
> Following the contribution of Blink to Apache Flink, the community released
> a
> preview of the Blink SQL Query Processor, which offers better SQL coverage
> and
> improved performance for batch queries, in Flink 1.9.0. However, the
> integration of the Blink query processor is not fully completed yet as there
> are still pending tasks, such as implementing full TPC-DS support. With the
> next Flink release, we aim at finishing the Blink integration.
>
> Furthermore, there are several ongoing work threads addressing long-standing
> issues reported by users, such as improving checkpointing under
> backpressure,
> and limiting RocksDBs native memory usage, which can be especially
> problematic
> in containerized Flink deployments.
>
> Notable features surrounding Flink’s ecosystem that are planned for the next
> release include active Kubernetes support (i.e., enabling Flink’s
> ResourceManager to launch new pods), improved Hive integration, Java 11
> support, and new algorithms for the Flink ML library.
>
> Below I have included the list of features that we compiled ordered by
> priority – some of which already have ongoing mailing list threads, JIRAs,
> or
> FLIPs.
>
> - Improving Flink’s build system & CI [1] [2]
> - Support Java 11 [3]
> - Table API improvements
>   - Configuration Evolution [4] [5]
>   - Finish type system: Expression Re-design [6] and UDF refactor
>   - Streaming DDL: Time attribute (watermark) and Changelog support
>   - Full SQL partition support for both batch & streaming [7]
>   - New Java Expression DSL [8]
>   - SQL CLI with DDL and DML support
> - Hive compatibility completion (DDL/UDF) to support full Hive integration
>   - Partition/Function/View support
> - Remaining Blink planner/runtime merge
>   - Support all TPC-DS queries [9]
> - Finer grained resource management
>   - Unified TaskExecutor Memory Configuration [10]
>   - Fine Grained Operator Resource Management [11]
>   - Dynamic Slots Allocation [12]
> - Finish scheduler re-architecture [13]
>   - Allows implementing more sophisticated scheduling strategies such as
> better batch scheduler or speculative execution.
> - New DataStream Source Interface [14]
>   - A new source connector architecture to unify the implementation of
> source connectors and make it simpler to implement custom source connectors.
> - Add more source/system metrics
>   - For better flink job monitoring and facilitate customized solutions
> like auto-scaling.
> - Executor Interface / Client API [15]
>   - Allow Flink downstream projects to easier and better monitor and
> control flink jobs.
> - Interactive Programming [16]
>   - Allow users to cache the intermediate results in Table API for later
> usage to avoid redundant computation when a Flink application contains
> multiple jobs.
> - Python User Defined Function [17]
>   - Support native user-defined functions in Flink Python, including
> UDF/UDAF/UDTF in Table API and Python-Java mixed UDF.
> - Spillable heap backend [18]
>   - A new state backend supporting automatic data spill and load when
> memory exhausted/regained.
> - RocksDB backend memory control [19]
>   - Prevent excessive memory usage from RocksDB, especially in container
> environment.
> - Unaligned checkpoints [20]
>   - Resolve the checkpoint timeout issue under backpressure.
> - Separate framework and user class loader in per-job mode
> - Active Kubernetes Integration [21]
>   - Allow ResourceManager talking to Kubernetes to launch new pods
> similar to Flink's Yarn/Mesos integration
> - ML pipeline/library
>   - Aims at delivering several core algorithms, including Logistic
> Regression, Native Bayes, Random Forest, KMeans, etc.
> - Add vertex subtask log url on WebUI [22]
>
>
> ** Suggested release timeline
>
> Based on our usual time-based release schedule [23], and considering that
> several events, such as Flink Forward Europe and Asia, are overlapping with
> the current 

Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread zhijiang
Congratulations Klou!

Best,
Zhijiang
--
From:Zhu Zhu 
Send Time:2019年9月6日(星期五) 17:19
To:dev 
Subject:Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

Congratulations Kostas!

Thanks,
Zhu Zhu

Yu Li  于2019年9月6日周五 下午10:49写道:

> Congratulations Klou!
>
> Best Regards,
> Yu
>
>
> On Fri, 6 Sep 2019 at 22:43, Forward Xu  wrote:
>
> > Congratulations Kloudas!
> >
> >
> > Best,
> >
> > Forward
> >
> > Dawid Wysakowicz  于2019年9月6日周五 下午10:36写道:
> >
> > > Congratulations Klou!
> > >
> > > Best,
> > >
> > > Dawid
> > >
> > > On 06/09/2019 14:55, Fabian Hueske wrote:
> > > > Hi everyone,
> > > >
> > > > I'm very happy to announce that Kostas Kloudas is joining the Flink
> > PMC.
> > > > Kostas is contributing to Flink for many years and puts lots of
> effort
> > in
> > > > helping our users and growing the Flink community.
> > > >
> > > > Please join me in congratulating Kostas!
> > > >
> > > > Cheers,
> > > > Fabian
> > > >
> > >
> > >
> >
>



Re: [DISCUSS] Features for Apache Flink 1.10

2019-09-06 Thread zhijiang
Hi Gary,

Thanks for kicking off the feature discussion for the next release, 1.10. I am very 
supportive of you and Yu Li being the release managers.

Let me just mention another two improvements that we want to have covered in Flink 1.10; 
I already confirmed with Piotr and we reached an agreement on them before.

1. Serialize and copy data only once for broadcast partitions [1]: It would 
improve the throughput greatly in broadcast mode and was actually 
proposed for Flink 1.8. Most of the work was already done before; only the last 
critical JIRA/PR is left. It will not take much effort to make it ready.

2. Let Netty use Flink's buffers directly in credit-based mode [2]: It could 
avoid a memory copy from the Netty stack to Flink's managed network buffers. The obvious 
benefit is greatly decreasing the direct memory overhead in large-scale jobs. I 
have also heard of user cases that encountered direct OOMs caused by Netty memory 
overhead. This improvement was actually proposed by Nico for Flink 1.7, but there was 
never time to focus on it. Yun Gao submitted a PR half a year ago, but it has 
not been reviewed yet. I could help review the design and PR code to make it 
ready. 

You could give these two items the lowest priority if necessary.

[1] https://issues.apache.org/jira/browse/FLINK-10745
[2] https://issues.apache.org/jira/browse/FLINK-10742

Best,
Zhijiang
--
From:Gary Yao 
Send Time:2019年9月6日(星期五) 17:06
To:dev 
Cc:carp84 
Subject:[DISCUSS] Features for Apache Flink 1.10

Hi community,

Since Apache Flink 1.9.0 has been released more than 2 weeks ago, I want to
start kicking off the discussion about what we want to achieve for the 1.10
release.

Based on discussions with various people as well as observations from
mailing
list threads, Yu Li and I have compiled a list of features that we deem
important to be included in the next release. Note that the features
presented
here are not meant to be exhaustive. As always, I am sure that there will be
other contributions that will make it into the next release. This email
thread
is merely to kick off a discussion, and to give users and contributors an
understanding where the focus of the next release lies. If there is anything
we have missed that somebody is working on, please reply to this thread.


** Proposed features and focus

Following the contribution of Blink to Apache Flink, the community released
a
preview of the Blink SQL Query Processor, which offers better SQL coverage
and
improved performance for batch queries, in Flink 1.9.0. However, the
integration of the Blink query processor is not fully completed yet as there
are still pending tasks, such as implementing full TPC-DS support. With the
next Flink release, we aim at finishing the Blink integration.

Furthermore, there are several ongoing work threads addressing long-standing
issues reported by users, such as improving checkpointing under
backpressure,
and limiting RocksDB's native memory usage, which can be especially
problematic
in containerized Flink deployments.

Notable features surrounding Flink’s ecosystem that are planned for the next
release include active Kubernetes support (i.e., enabling Flink’s
ResourceManager to launch new pods), improved Hive integration, Java 11
support, and new algorithms for the Flink ML library.

Below I have included the list of features that we compiled ordered by
priority – some of which already have ongoing mailing list threads, JIRAs,
or
FLIPs.

- Improving Flink’s build system & CI [1] [2]
- Support Java 11 [3]
- Table API improvements
  - Configuration Evolution [4] [5]
  - Finish type system: Expression Re-design [6] and UDF refactor
  - Streaming DDL: Time attribute (watermark) and Changelog support
  - Full SQL partition support for both batch & streaming [7]
  - New Java Expression DSL [8]
  - SQL CLI with DDL and DML support
- Hive compatibility completion (DDL/UDF) to support full Hive integration
  - Partition/Function/View support
- Remaining Blink planner/runtime merge
  - Support all TPC-DS queries [9]
- Finer grained resource management
  - Unified TaskExecutor Memory Configuration [10]
  - Fine Grained Operator Resource Management [11]
  - Dynamic Slots Allocation [12]
- Finish scheduler re-architecture [13]
  - Allows implementing more sophisticated scheduling strategies such as
better batch scheduler or speculative execution.
- New DataStream Source Interface [14]
  - A new source connector architecture to unify the implementation of
source connectors and make it simpler to implement custom source connectors.
- Add more source/system metrics
  - For better flink job monitoring and facilitate customized solutions
like auto-scaling.
- Executor Interface / Client API [15]
  - Allow Flink downstream projects to easier and better monitor and
control flink jobs.
- Interactive Programming [16]
  - Allow users to cache the intermediate results in Table API for later
usage to avoid redundant 

Re: Checkpointing clarification

2019-09-06 Thread Zhu Zhu
Hi Dominik,

A record counts as processed once it enters the window. Thus the
checkpoint barrier is not blocked until the window containing the
preceding records is triggered.
A window is actually part of the state of the WindowOperator, and
processing a data record simply builds up this state.

Thanks,
Zhu Zhu
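
To make this concrete, here is a toy simulation in plain Python (this is NOT Flink's actual operator code, just an illustrative sketch): the operator snapshots its in-flight window state when a barrier arrives and forwards the barrier immediately, without waiting for the window to fire.

```python
class Barrier:
    """Marker flowing through the stream to trigger a checkpoint."""
    def __init__(self, checkpoint_id):
        self.checkpoint_id = checkpoint_id

class ToyWindowOperator:
    def __init__(self):
        self.window_state = []   # records buffered until the window fires
        self.snapshots = {}      # checkpoint_id -> copy of window state
        self.emitted = []        # elements forwarded downstream

    def process(self, element):
        if isinstance(element, Barrier):
            # Snapshot the current (possibly still-open) window state
            # and forward the barrier right away -- no window trigger needed.
            self.snapshots[element.checkpoint_id] = list(self.window_state)
            self.emitted.append(element)
        else:
            # The record is "processed" as soon as it enters the window.
            self.window_state.append(element)

op = ToyWindowOperator()
for e in ["r1", "r2", Barrier(1), "r3"]:
    op.process(e)

assert isinstance(op.emitted[0], Barrier)            # barrier passed through
assert op.snapshots[1] == ["r1", "r2"]               # window contents at checkpoint 1
assert op.window_state == ["r1", "r2", "r3"]         # window keeps accumulating
```

The snapshot captures the half-built window as state, which is exactly why a 5-hour window does not delay a 10-minute checkpoint interval.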

Dian Fu  于2019年9月6日周五 下午8:17写道:

> When a WindowOperator receives all the barriers from upstream, it will
> forward the barrier to the downstream operator and perform the checkpoint
> asynchronously.
> It doesn't have to wait for the window to trigger before sending out the
> barrier.
>
> Regards,
> Dian
>
> > 在 2019年9月6日,下午8:02,Dominik Wosiński  写道:
> >
> > Hello,
> > I have a slight doubt on checkpointing in Flink and wanted to clarify my
> > understanding. Flink uses barriers internally to keep track of the
> records
> > that were processed. The documentation[1] describes it as the checkpoint
> > was only happening when the barriers are transferred to the sink. So
> let's
> > consider a toy example of `TumblingEventTimeWindow` set to 5 hours and
> > `CheckpointInterval` set to 10 minutes. So, if the documentation is
> > correct, the checkpoint should occur only when the window is processed
> and
> > gets to sink (which can take several hours) , which is not true as far
> as I
> > know. I am surely wrong somewhere, could someone explain where is the
> error
> > in my logic ?
> >
> >
> > [1]
> >
> https://ci.apache.org/projects/flink/flink-docs-stable/internals/stream_checkpointing.html
>
>


[jira] [Created] (FLINK-13997) Remove legacy LeaderAddressAndId

2019-09-06 Thread TisonKun (Jira)
TisonKun created FLINK-13997:


 Summary: Remove legacy LeaderAddressAndId
 Key: FLINK-13997
 URL: https://issues.apache.org/jira/browse/FLINK-13997
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: TisonKun
 Fix For: 1.10.0


Also remove {{OneTimeLeaderListenerFuture}}, which uses {{LeaderAddressAndId}} but is 
dead code, too.

I'd like to supersede FLINK-11664 with this one. I can see the requirement to tie the 
{{leader address}} to the {{leader session id}}, but that class is not 
{{LeaderAddressAndId}}. It would be more natural to introduce such a class when 
addressing FLINK-10333, instead of a dedicated JIRA with changes here and there.

WDYT? cc [~till.rohrmann]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13998) Fix ORC test failure with Hive 2.0.x

2019-09-06 Thread Xuefu Zhang (Jira)
Xuefu Zhang created FLINK-13998:
---

 Summary: Fix ORC test failure with Hive 2.0.x
 Key: FLINK-13998
 URL: https://issues.apache.org/jira/browse/FLINK-13998
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 1.10.0


Including 2.0.0 and 2.0.1.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Jark Wu
Congratulations Klou!


> 在 2019年9月7日,00:21,zhijiang  写道:
> 
> Congratulations Klou!
> 
> Best,
> Zhijiang
> --
> From:Zhu Zhu 
> Send Time:2019年9月6日(星期五) 17:19
> To:dev 
> Subject:Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC
> 
> Congratulations Kostas!
> 
> Thanks,
> Zhu Zhu
> 
> Yu Li  于2019年9月6日周五 下午10:49写道:
> 
>> Congratulations Klou!
>> 
>> Best Regards,
>> Yu
>> 
>> 
>> On Fri, 6 Sep 2019 at 22:43, Forward Xu  wrote:
>> 
>>> Congratulations Kloudas!
>>> 
>>> 
>>> Best,
>>> 
>>> Forward
>>> 
>>> Dawid Wysakowicz  于2019年9月6日周五 下午10:36写道:
>>> 
 Congratulations Klou!
 
 Best,
 
 Dawid
 
 On 06/09/2019 14:55, Fabian Hueske wrote:
> Hi everyone,
> 
> I'm very happy to announce that Kostas Kloudas is joining the Flink
>>> PMC.
> Kostas is contributing to Flink for many years and puts lots of
>> effort
>>> in
> helping our users and growing the Flink community.
> 
> Please join me in congratulating Kostas!
> 
> Cheers,
> Fabian
> 
 
 
>>> 
>> 
> 



Re: [DISCUSS] Features for Apache Flink 1.10

2019-09-06 Thread Jark Wu
Thanks Gary for kicking off the discussion for the 1.10 release.

+1 for Gary and Yu as release managers. Thank you for your effort. 

Best,
Jark


> 在 2019年9月7日,00:52,zhijiang  写道:
> 
> Hi Gary,
> 
> Thanks for kicking off the features for next release 1.10.  I am very 
> supportive of you and Yu Li to be the relaese managers.
> 
> Just mention another two improvements which want to be covered in FLINK-1.10 
> and I already confirmed with Piotr to reach an agreement before.
> 
> 1. Data serialize and copy only once for broadcast partition [1]: It would 
> improve the throughput performance greatly in broadcast mode and was actually 
> proposed in Flink-1.8. Most of works already done before and only left the 
> last critical jira/PR. It will not take much efforts to make it ready.
> 
> 2. Let Netty use Flink's buffers directly in credit-based mode [2] : It could 
> avoid memory copy from netty stack to flink managed network buffer. The 
> obvious benefit is decreasing the direct memory overhead greatly in 
> large-scale jobs. I also heard of some user cases encounter direct OOM caused 
> by netty memory overhead. Actually this improvment was proposed by nico in 
> FLINK-1.7 and always no time to focus then. Yun Gao already submitted a PR 
> half an year ago but have not been reviewed yet. I could help review the 
> deign and PR codes to make it ready. 
> 
> And you could make these two items as lowest priority if possible.
> 
> [1] https://issues.apache.org/jira/browse/FLINK-10745
> [2] https://issues.apache.org/jira/browse/FLINK-10742
> 
> Best,
> Zhijiang
> --
> From:Gary Yao 
> Send Time:2019年9月6日(星期五) 17:06
> To:dev 
> Cc:carp84 
> Subject:[DISCUSS] Features for Apache Flink 1.10
> 
> Hi community,
> 
> Since Apache Flink 1.9.0 has been released more than 2 weeks ago, I want to
> start kicking off the discussion about what we want to achieve for the 1.10
> release.
> 
> Based on discussions with various people as well as observations from
> mailing
> list threads, Yu Li and I have compiled a list of features that we deem
> important to be included in the next release. Note that the features
> presented
> here are not meant to be exhaustive. As always, I am sure that there will be
> other contributions that will make it into the next release. This email
> thread
> is merely to kick off a discussion, and to give users and contributors an
> understanding where the focus of the next release lies. If there is anything
> we have missed that somebody is working on, please reply to this thread.
> 
> 
> ** Proposed features and focus
> 
> Following the contribution of Blink to Apache Flink, the community released
> a
> preview of the Blink SQL Query Processor, which offers better SQL coverage
> and
> improved performance for batch queries, in Flink 1.9.0. However, the
> integration of the Blink query processor is not fully completed yet as there
> are still pending tasks, such as implementing full TPC-DS support. With the
> next Flink release, we aim at finishing the Blink integration.
> 
> Furthermore, there are several ongoing work threads addressing long-standing
> issues reported by users, such as improving checkpointing under
> backpressure,
> and limiting RocksDBs native memory usage, which can be especially
> problematic
> in containerized Flink deployments.
> 
> Notable features surrounding Flink’s ecosystem that are planned for the next
> release include active Kubernetes support (i.e., enabling Flink’s
> ResourceManager to launch new pods), improved Hive integration, Java 11
> support, and new algorithms for the Flink ML library.
> 
> Below I have included the list of features that we compiled ordered by
> priority – some of which already have ongoing mailing list threads, JIRAs,
> or
> FLIPs.
> 
> - Improving Flink’s build system & CI [1] [2]
> - Support Java 11 [3]
> - Table API improvements
>- Configuration Evolution [4] [5]
>- Finish type system: Expression Re-design [6] and UDF refactor
>- Streaming DDL: Time attribute (watermark) and Changelog support
>- Full SQL partition support for both batch & streaming [7]
>- New Java Expression DSL [8]
>- SQL CLI with DDL and DML support
> - Hive compatibility completion (DDL/UDF) to support full Hive integration
>- Partition/Function/View support
> - Remaining Blink planner/runtime merge
>- Support all TPC-DS queries [9]
> - Finer grained resource management
>- Unified TaskExecutor Memory Configuration [10]
>- Fine Grained Operator Resource Management [11]
>- Dynamic Slots Allocation [12]
> - Finish scheduler re-architecture [13]
>- Allows implementing more sophisticated scheduling strategies such as
> better batch scheduler or speculative execution.
> - New DataStream Source Interface [14]
>- A new source connector architecture to unify the implementation of
> source connectors and make it simpler to implement custom source connectors.
> - Add 

Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Hequn Cheng
Congratulations Kostas! Well deserved.

Best, Hequn

On Sat, Sep 7, 2019 at 10:48 AM Jark Wu  wrote:

> Congratulations Klou!
>
>
> > 在 2019年9月7日,00:21,zhijiang  写道:
> >
> > Congratulations Klou!
> >
> > Best,
> > Zhijiang
> > --
> > From:Zhu Zhu 
> > Send Time:2019年9月6日(星期五) 17:19
> > To:dev 
> > Subject:Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC
> >
> > Congratulations Kostas!
> >
> > Thanks,
> > Zhu Zhu
> >
> > Yu Li  于2019年9月6日周五 下午10:49写道:
> >
> >> Congratulations Klou!
> >>
> >> Best Regards,
> >> Yu
> >>
> >>
> >> On Fri, 6 Sep 2019 at 22:43, Forward Xu  wrote:
> >>
> >>> Congratulations Kloudas!
> >>>
> >>>
> >>> Best,
> >>>
> >>> Forward
> >>>
> >>> Dawid Wysakowicz  于2019年9月6日周五 下午10:36写道:
> >>>
>  Congratulations Klou!
> 
>  Best,
> 
>  Dawid
> 
>  On 06/09/2019 14:55, Fabian Hueske wrote:
> > Hi everyone,
> >
> > I'm very happy to announce that Kostas Kloudas is joining the Flink
> >>> PMC.
> > Kostas is contributing to Flink for many years and puts lots of
> >> effort
> >>> in
> > helping our users and growing the Flink community.
> >
> > Please join me in congratulating Kostas!
> >
> > Cheers,
> > Fabian
> >
> 
> 
> >>>
> >>
> >
>
>


Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Thomas Weise
Congratulations!


On Fri, Sep 6, 2019 at 9:22 AM zhijiang 
wrote:

> Congratulations Klou!
>
> Best,
> Zhijiang
> --
> From:Zhu Zhu 
> Send Time:2019年9月6日(星期五) 17:19
> To:dev 
> Subject:Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC
>
> Congratulations Kostas!
>
> Thanks,
> Zhu Zhu
>
> Yu Li  于2019年9月6日周五 下午10:49写道:
>
> > Congratulations Klou!
> >
> > Best Regards,
> > Yu
> >
> >
> > On Fri, 6 Sep 2019 at 22:43, Forward Xu  wrote:
> >
> > > Congratulations Kloudas!
> > >
> > >
> > > Best,
> > >
> > > Forward
> > >
> > > Dawid Wysakowicz  于2019年9月6日周五 下午10:36写道:
> > >
> > > > Congratulations Klou!
> > > >
> > > > Best,
> > > >
> > > > Dawid
> > > >
> > > > On 06/09/2019 14:55, Fabian Hueske wrote:
> > > > > Hi everyone,
> > > > >
> > > > > I'm very happy to announce that Kostas Kloudas is joining the Flink
> > > PMC.
> > > > > Kostas is contributing to Flink for many years and puts lots of
> > effort
> > > in
> > > > > helping our users and growing the Flink community.
> > > > >
> > > > > Please join me in congratulating Kostas!
> > > > >
> > > > > Cheers,
> > > > > Fabian
> > > > >
> > > >
> > > >
> > >
> >
>
>


Re: [DISCUSS] Features for Apache Flink 1.10

2019-09-06 Thread Dian Fu
Hi Gary,

Thanks for kicking off the release schedule of 1.10. +1 for you and Yu Li as 
the release managers.

The feature freeze/release time sounds reasonable.

Thanks,
Dian

> 在 2019年9月7日,上午11:30,Jark Wu  写道:
> 
> Thanks Gary for kicking off the discussion for 1.10 release.
> 
> +1 for Gary and Yu as release managers. Thank you for you effort. 
> 
> Best,
> Jark
> 
> 
>> 在 2019年9月7日,00:52,zhijiang  写道:
>> 
>> Hi Gary,
>> 
>> Thanks for kicking off the features for next release 1.10.  I am very 
>> supportive of you and Yu Li to be the relaese managers.
>> 
>> Just mention another two improvements which want to be covered in FLINK-1.10 
>> and I already confirmed with Piotr to reach an agreement before.
>> 
>> 1. Data serialize and copy only once for broadcast partition [1]: It would 
>> improve the throughput performance greatly in broadcast mode and was 
>> actually proposed in Flink-1.8. Most of works already done before and only 
>> left the last critical jira/PR. It will not take much efforts to make it 
>> ready.
>> 
>> 2. Let Netty use Flink's buffers directly in credit-based mode [2] : It 
>> could avoid memory copy from netty stack to flink managed network buffer. 
>> The obvious benefit is decreasing the direct memory overhead greatly in 
>> large-scale jobs. I also heard of some user cases encounter direct OOM 
>> caused by netty memory overhead. Actually this improvment was proposed by 
>> nico in FLINK-1.7 and always no time to focus then. Yun Gao already 
>> submitted a PR half an year ago but have not been reviewed yet. I could help 
>> review the deign and PR codes to make it ready. 
>> 
>> And you could make these two items as lowest priority if possible.
>> 
>> [1] https://issues.apache.org/jira/browse/FLINK-10745
>> [2] https://issues.apache.org/jira/browse/FLINK-10742
>> 
>> Best,
>> Zhijiang
>> --
>> From:Gary Yao 
>> Send Time:2019年9月6日(星期五) 17:06
>> To:dev 
>> Cc:carp84 
>> Subject:[DISCUSS] Features for Apache Flink 1.10
>> 
>> Hi community,
>> 
>> Since Apache Flink 1.9.0 has been released more than 2 weeks ago, I want to
>> start kicking off the discussion about what we want to achieve for the 1.10
>> release.
>> 
>> Based on discussions with various people as well as observations from
>> mailing
>> list threads, Yu Li and I have compiled a list of features that we deem
>> important to be included in the next release. Note that the features
>> presented
>> here are not meant to be exhaustive. As always, I am sure that there will be
>> other contributions that will make it into the next release. This email
>> thread
>> is merely to kick off a discussion, and to give users and contributors an
>> understanding where the focus of the next release lies. If there is anything
>> we have missed that somebody is working on, please reply to this thread.
>> 
>> 
>> ** Proposed features and focus
>> 
>> Following the contribution of Blink to Apache Flink, the community released
>> a
>> preview of the Blink SQL Query Processor, which offers better SQL coverage
>> and
>> improved performance for batch queries, in Flink 1.9.0. However, the
>> integration of the Blink query processor is not fully completed yet as there
>> are still pending tasks, such as implementing full TPC-DS support. With the
>> next Flink release, we aim at finishing the Blink integration.
>> 
>> Furthermore, there are several ongoing work threads addressing long-standing
>> issues reported by users, such as improving checkpointing under
>> backpressure,
>> and limiting RocksDBs native memory usage, which can be especially
>> problematic
>> in containerized Flink deployments.
>> 
>> Notable features surrounding Flink’s ecosystem that are planned for the next
>> release include active Kubernetes support (i.e., enabling Flink’s
>> ResourceManager to launch new pods), improved Hive integration, Java 11
>> support, and new algorithms for the Flink ML library.
>> 
>> Below I have included the list of features that we compiled ordered by
>> priority – some of which already have ongoing mailing list threads, JIRAs,
>> or
>> FLIPs.
>> 
>> - Improving Flink’s build system & CI [1] [2]
>> - Support Java 11 [3]
>> - Table API improvements
>>   - Configuration Evolution [4] [5]
>>   - Finish type system: Expression Re-design [6] and UDF refactor
>>   - Streaming DDL: Time attribute (watermark) and Changelog support
>>   - Full SQL partition support for both batch & streaming [7]
>>   - New Java Expression DSL [8]
>>   - SQL CLI with DDL and DML support
>> - Hive compatibility completion (DDL/UDF) to support full Hive integration
>>   - Partition/Function/View support
>> - Remaining Blink planner/runtime merge
>>   - Support all TPC-DS queries [9]
>> - Finer grained resource management
>>   - Unified TaskExecutor Memory Configuration [10]
>>   - Fine Grained Operator Resource Management [11]
>>   - Dynamic Slots Allocation [12]
>> - Finish scheduler re-architecture [13]
>> 

Re: [DISCUSS] Features for Apache Flink 1.10

2019-09-06 Thread Zhu Zhu
Thanks Gary for kicking off this discussion.
Really appreciate that you and Yu offered to help manage the 1.10 release.

+1 for Gary and Yu as release managers.

Thanks,
Zhu Zhu

Dian Fu  于2019年9月7日周六 下午12:26写道:

> Hi Gary,
>
> Thanks for kicking off the release schedule of 1.10. +1 for you and Yu Li
> as the release manager.
>
> The feature freeze/release time sounds reasonable.
>
> Thanks,
> Dian
>
> > 在 2019年9月7日,上午11:30,Jark Wu  写道:
> >
> > Thanks Gary for kicking off the discussion for 1.10 release.
> >
> > +1 for Gary and Yu as release managers. Thank you for you effort.
> >
> > Best,
> > Jark
> >
> >
> >> 在 2019年9月7日,00:52,zhijiang  写道:
> >>
> >> Hi Gary,
> >>
> >> Thanks for kicking off the features for next release 1.10.  I am very
> supportive of you and Yu Li to be the relaese managers.
> >>
> >> Just mention another two improvements which want to be covered in
> FLINK-1.10 and I already confirmed with Piotr to reach an agreement before.
> >>
> >> 1. Data serialize and copy only once for broadcast partition [1]: It
> would improve the throughput performance greatly in broadcast mode and was
> actually proposed in Flink-1.8. Most of works already done before and only
> left the last critical jira/PR. It will not take much efforts to make it
> ready.
> >>
> >> 2. Let Netty use Flink's buffers directly in credit-based mode [2] : It
> could avoid memory copy from netty stack to flink managed network buffer.
> The obvious benefit is decreasing the direct memory overhead greatly in
> large-scale jobs. I also heard of some user cases encounter direct OOM
> caused by netty memory overhead. Actually this improvment was proposed by
> nico in FLINK-1.7 and always no time to focus then. Yun Gao already
> submitted a PR half an year ago but have not been reviewed yet. I could
> help review the deign and PR codes to make it ready.
> >>
> >> And you could make these two items as lowest priority if possible.
> >>
> >> [1] https://issues.apache.org/jira/browse/FLINK-10745
> >> [2] https://issues.apache.org/jira/browse/FLINK-10742
> >>
> >> Best,
> >> Zhijiang
> >> --
> >> From:Gary Yao 
> >> Send Time:2019年9月6日(星期五) 17:06
> >> To:dev 
> >> Cc:carp84 
> >> Subject:[DISCUSS] Features for Apache Flink 1.10
> >>
> >> Hi community,
> >>
> >> Since Apache Flink 1.9.0 has been released more than 2 weeks ago, I
> want to
> >> start kicking off the discussion about what we want to achieve for the
> 1.10
> >> release.
> >>
> >> Based on discussions with various people as well as observations from
> >> mailing
> >> list threads, Yu Li and I have compiled a list of features that we deem
> >> important to be included in the next release. Note that the features
> >> presented
> >> here are not meant to be exhaustive. As always, I am sure that there
> will be
> >> other contributions that will make it into the next release. This email
> >> thread
> >> is merely to kick off a discussion, and to give users and contributors
> an
> >> understanding where the focus of the next release lies. If there is
> anything
> >> we have missed that somebody is working on, please reply to this thread.
> >>
> >>
> >> ** Proposed features and focus
> >>
> >> Following the contribution of Blink to Apache Flink, the community
> released
> >> a
> >> preview of the Blink SQL Query Processor, which offers better SQL
> coverage
> >> and
> >> improved performance for batch queries, in Flink 1.9.0. However, the
> >> integration of the Blink query processor is not fully completed yet as
> there
> >> are still pending tasks, such as implementing full TPC-DS support. With
> the
> >> next Flink release, we aim at finishing the Blink integration.
> >>
> >> Furthermore, there are several ongoing work threads addressing
> long-standing
> >> issues reported by users, such as improving checkpointing under
> >> backpressure,
> >> and limiting RocksDBs native memory usage, which can be especially
> >> problematic
> >> in containerized Flink deployments.
> >>
> >> Notable features surrounding Flink’s ecosystem that are planned for the
> next
> >> release include active Kubernetes support (i.e., enabling Flink’s
> >> ResourceManager to launch new pods), improved Hive integration, Java 11
> >> support, and new algorithms for the Flink ML library.
> >>
> >> Below I have included the list of features that we compiled ordered by
> >> priority – some of which already have ongoing mailing list threads,
> JIRAs,
> >> or
> >> FLIPs.
> >>
> >> - Improving Flink’s build system & CI [1] [2]
> >> - Support Java 11 [3]
> >> - Table API improvements
> >>   - Configuration Evolution [4] [5]
> >>   - Finish type system: Expression Re-design [6] and UDF refactor
> >>   - Streaming DDL: Time attribute (watermark) and Changelog support
> >>   - Full SQL partition support for both batch & streaming [7]
> >>   - New Java Expression DSL [8]
> >>   - SQL CLI with DDL and DML support
> >> - Hive compatibility completion (DDL/UDF)

[jira] [Created] (FLINK-13999) Correct the documentation of MATCH_RECOGNIZE

2019-09-06 Thread Dian Fu (Jira)
Dian Fu created FLINK-13999:
---

 Summary: Correct the documentation of MATCH_RECOGNIZE
 Key: FLINK-13999
 URL: https://issues.apache.org/jira/browse/FLINK-13999
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: Dian Fu


Regarding the following 
[example|https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/streaming/match_recognize.html#aggregations]
 in the doc:
{code:java}
SELECT *
FROM Ticker
MATCH_RECOGNIZE (
    PARTITION BY symbol
    ORDER BY rowtime
    MEASURES
        FIRST(A.rowtime) AS start_tstamp,
        LAST(A.rowtime) AS end_tstamp,
        AVG(A.price) AS avgPrice
    ONE ROW PER MATCH
    AFTER MATCH SKIP TO FIRST B
    PATTERN (A+ B)
    DEFINE
        A AS AVG(A.price) < 15
) MR;
{code}
Given the inputs shown in the doc, it should be:
{code:java}
 symbol       start_tstamp        end_tstamp          avgPrice
=========  ==================  ==================  ============
ACME       01-APR-11 10:00:00  01-APR-11 10:00:03  14.5
{code}
instead of:
{code:java}
 symbol       start_tstamp        end_tstamp          avgPrice
=========  ==================  ==================  ============
ACME       01-APR-11 10:00:00  01-APR-11 10:00:03  14.5
ACME       01-APR-11 10:00:04  01-APR-11 10:00:09  13.5
{code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Gary Yao
Congratulations Klou!

On Sat, Sep 7, 2019 at 6:21 AM Thomas Weise  wrote:

> Congratulations!
>
>
> On Fri, Sep 6, 2019 at 9:22 AM zhijiang  .invalid>
> wrote:
>
> > Congratulations Klou!
> >
> > Best,
> > Zhijiang
> > --
> > From:Zhu Zhu 
> > Send Time:2019年9月6日(星期五) 17:19
> > To:dev 
> > Subject:Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC
> >
> > Congratulations Kostas!
> >
> > Thanks,
> > Zhu Zhu
> >
> > Yu Li  于2019年9月6日周五 下午10:49写道:
> >
> > > Congratulations Klou!
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Fri, 6 Sep 2019 at 22:43, Forward Xu 
> wrote:
> > >
> > > > Congratulations Kloudas!
> > > >
> > > >
> > > > Best,
> > > >
> > > > Forward
> > > >
> > > > Dawid Wysakowicz  于2019年9月6日周五 下午10:36写道:
> > > >
> > > > > Congratulations Klou!
> > > > >
> > > > > Best,
> > > > >
> > > > > Dawid
> > > > >
> > > > > On 06/09/2019 14:55, Fabian Hueske wrote:
> > > > > > Hi everyone,
> > > > > >
> > > > > > I'm very happy to announce that Kostas Kloudas is joining the
> Flink
> > > > PMC.
> > > > > > Kostas is contributing to Flink for many years and puts lots of
> > > effort
> > > > in
> > > > > > helping our users and growing the Flink community.
> > > > > >
> > > > > > Please join me in congratulating Kostas!
> > > > > >
> > > > > > Cheers,
> > > > > > Fabian
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> >
>


Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread ying
Congratulations Kostas!

On Fri, Sep 6, 2019 at 11:21 PM Gary Yao  wrote:

> Congratulations Klou!
>


Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Kurt Young
Congratulations Klou!

Best,
Kurt


On Sat, Sep 7, 2019 at 2:37 PM ying  wrote:

> Congratulations Kostas!
>


Re: [VOTE] FLIP-53: Fine Grained Operator Resource Management

2019-09-06 Thread Kurt Young
+1 for FLIP-53.

I would like to raise one minor concern regarding the implementation of
requests for an absolute amount of memory. Currently, such a request is
translated into a memory fraction at compile time and translated back
to an absolute value during execution. There is a risk that the user
might get less memory than requested due to floating-point rounding.
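The risk can be illustrated with a minimal, self-contained sketch of the round trip. The helper names (toFraction, toBytes) and the 3 GB total are made up for illustration and do not reflect Flink's actual code; the point is only that an absolute request stored as a double fraction may not translate back to the exact byte count:

```java
public class MemoryFractionRoundTrip {

    // Compile time: an absolute request is stored as a fraction
    // of the slot's total managed memory.
    static double toFraction(long requestedBytes, long totalBytes) {
        return (double) requestedBytes / totalBytes;
    }

    // Execution time: the fraction is translated back into an
    // absolute byte count.
    static long toBytes(double fraction, long totalBytes) {
        return (long) Math.floor(fraction * totalBytes);
    }

    public static void main(String[] args) {
        long total = 3_000_000_000L; // hypothetical managed memory of a slot
        long shortfalls = 0;
        for (long requested = 1; requested <= 1_000_000; requested++) {
            long granted = toBytes(toFraction(requested, total), total);
            // The double fraction cannot represent requested/total exactly,
            // so the round trip may grant slightly less than was asked for.
            if (granted < requested) {
                shortfalls++;
            }
        }
        System.out.println("request sizes with a shortfall: " + shortfalls);
    }
}
```

Running the scan shows that for some request sizes the granted amount ends up a byte short, which is exactly the concern with storing absolute requests as fractions.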

Best,
Kurt


On Fri, Sep 6, 2019 at 10:13 PM Andrey Zagrebin 
wrote:

> Thanks for starting the vote @Xintong
>
> +1 for the FLIP-53
>
> Best,
> Andrey
>
> On Fri, Sep 6, 2019 at 3:53 PM Till Rohrmann  wrote:
>
> > Hi Xintong,
> >
> > thanks for starting this vote. The proposal looks good and, hence, +1 for
> > it.
> >
> > One comment I have concerns the first implementation step. I would
> > suggest not adding the flag allSourcesInSamePipelinedRegion to the
> > ExecutionConfig, because the ExecutionConfig is public API. Ideally we
> > keep this flag internal and don't expose it to the user.
> >
> > Cheers,
> > Till
> >
> > On Fri, Sep 6, 2019 at 1:47 PM Zhu Zhu  wrote:
> >
> > > Thanks Xintong for proposing this improved resource management.
> > > It helps a lot for users who want to manage their job resources more
> > > precisely, and it would be even more useful once we have an
> > > auto-tuning mechanism for jobs.
> > >
> > > +1 (non-binding)
> > >
> > > Thanks,
> > > Zhu Zhu
> > >
> > > Xintong Song  wrote on Fri, Sep 6, 2019 at 11:17 AM:
> > >
> > > > Hi all,
> > > >
> > > > I would like to start the voting process for FLIP-53 [1], which was
> > > > discussed and has reached consensus in this thread [2].
> > > >
> > > > The vote will be open for at least 72 hours (excluding weekends).
> > > > I'll try to close it on Sep. 11, 04:00 UTC, unless there is an
> > > > objection or not enough votes.
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management
> > > >
> > > > [2]
> > > >
> > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-53-Fine-Grained-Resource-Management-td31831.html
> > > >
> > >
> >
>