[jira] [Created] (FLINK-32894) flink-connector-parent should use maven-shade-plugin 3.3.0+ for Java 17

2023-08-18 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-32894:
-

 Summary: flink-connector-parent should use maven-shade-plugin 
3.3.0+ for Java 17
 Key: FLINK-32894
 URL: https://issues.apache.org/jira/browse/FLINK-32894
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Parent
Affects Versions: connector-parent-1.0.0
Reporter: Qingsheng Ren


When I tried to compile {{flink-sql-connector-kafka}} with Java 17 using the
profiles {{-Pjava17 -Pjava17-target}}:

 
{code:java}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-shade-plugin:3.2.4:shade (shade-flink) on 
project flink-sql-connector-kafka: Error creating shaded jar: Problem shading 
JAR 
flink-connectors/flink-connector-kafka/flink-connector-kafka/target/flink-connector-kafka-3.1-SNAPSHOT.jar
 entry 
org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducerBase.class: 
java.lang.IllegalArgumentException: Unsupported class file major version 61 
{code}
{{maven-shade-plugin}} supports Java 17 starting from 3.3.0 (see MSHADE-407). 
We need to set the version of {{maven-shade-plugin}} to at least 3.3.0 for the 
profile {{java17}} in the {{flink-connector-parent}} pom.
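
For illustration, a minimal sketch of what such an override could look like 
(the exact structure of the real parent pom may differ):

{code:xml}
<!-- Hypothetical sketch: pin maven-shade-plugin >= 3.3.0 inside the java17
     profile of flink-connector-parent. -->
<profile>
  <id>java17</id>
  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-shade-plugin</artifactId>
          <!-- 3.3.0 is the first release that understands class file major
               version 61 (Java 17), per MSHADE-407 -->
          <version>3.3.0</version>
        </plugin>
      </plugins>
    </pluginManagement>
  </build>
</profile>
{code}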



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-327: Support stream-batch unified operator to improve job throughput when processing backlog data

2023-08-18 Thread Dong Lin
Hi Piotr,

Thanks for the explanation.

To recap our offline discussion, there is a concern regarding the
capability to dynamically switch between stream and batch modes. This
concern is around unforeseen behaviors such as bugs or performance
regressions, which we might not be aware of yet. The reason for this
concern is that this feature fundamentally impacts the Flink
runtime's behavior.

Due to the above concern, I agree it is reasonable to annotate the related
APIs as experimental. This step would provide us with the flexibility to
modify these APIs if issues arise in the future. The annotation also serves
as a note to users that this functionality might not perform as well as
expected.

Though I believe that we can ensure the reliability of this feature through
good design and code reviews, comprehensive unit tests, and thorough
integration testing, I agree that it is reasonable to be extra cautious in
this case. Also, it should be OK to delay making these APIs
non-experimental by 1-2 releases.

I have updated FLIP-327, FLIP-328, and FLIP-331 to mark APIs in these docs
as experimental. Please let me know if you think any other API should also
be marked as experimental.

Thanks!
Dong
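
As a concrete reference, enabling the capability discussed further down in
this thread might look like the sketch below (the config key is the one
named in the thread; whether Long.MAX is the right literal depends on the
option's value format, so treat this as illustrative only):

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BacklogConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Effectively disables checkpointing while a source reports backlog
        // status; semantics may still change while the APIs are experimental.
        conf.setString(
                "execution.checkpointing.interval-during-backlog",
                String.valueOf(Long.MAX_VALUE));
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        // ... add a source that may implicitly set the backlog status
        //     (e.g. HybridSource) and build the job as usual
    }
}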

On Wed, Aug 16, 2023 at 10:39 PM Piotr Nowojski 
wrote:

> Hi Dong,
>
> The operators API is unfortunately also a public-facing API, and I mean
> that the APIs we will add there should also be marked `@Experimental` IMO.
>
> The config options should also be marked as experimental (both
> annotated @Experimental and noted the same thing in the docs,
> if @Experimental annotation is not automatically mentioned in the docs).
>
> > Alternatively, how about we add a doc for
> checkpointing.interval-during-backlog explaining its impact/concern as
> discussed above?
>
> We should do this independently from marking the APIs/config options as
> `@Experimental`
>
> Best,
> Piotrek
>
> On Fri, 11 Aug 2023 at 14:55, Dong Lin  wrote:
>
> > Hi Piotr,
> >
> > Thanks for the reply!
> >
> > On Fri, Aug 11, 2023 at 4:44 PM Piotr Nowojski  >
> > wrote:
> >
> > > Hi,
> > >
> > > Sorry for the long delay in responding!
> > >
> > > >  Given that it is an optional feature that can be
> > > > turned off by users, it might be OK to just let users try it out and
> we
> > > can
> > > > fix performance issues once we detect any of them. What do you think?
> > >
> > > I think it's fine. It would be best to mark this feature as
> experimental,
> > > and
> > > we say that the config keys or the default values might change in the
> > > future.
> > >
> >
> > In general I agree we can mark APIs that determine "whether to enable
> > dynamic switching between stream/batch mode" as experimental.
> >
> > However, I am not sure we have such an API yet. The APIs added in this
> FLIP
> > are intended to be used by operator developers rather than end users. End
> > users can enable this capability by setting
> > execution.checkpointing.interval-during-backlog = Long.MAX and using a
> > source which might implicitly set the backlog status (e.g. HybridSource).
> > So execution.checkpointing.interval-during-backlog is the only user-facing
> > API that can always control whether this feature can be used.
> >
> > However, execution.checkpointing.interval-during-backlog itself is not
> tied
> > to FLIP-327.
> >
> > Do you mean we should set checkpointing.interval-during-backlog as
> > experimental? Alternatively, how about we add a doc for
> > checkpointing.interval-during-backlog explaining its impact/concern as
> > discussed above?
> >
> > Best,
> > Dong
> >
> >
> > > > Maybe we can revisit the need for such a config when we
> > introduce/discuss
> > > > the capability to switch backlog from false to true in the future.
> What
> > > do
> > > > you think?
> > >
> > > Sure, we can do that.
> > >
> > > Best,
> > > Piotrek
> > >
> > > On Sun, 23 Jul 2023 at 14:32, Dong Lin  wrote:
> > >
> > > > Hi Piotr,
> > > >
> > > > Thanks a lot for the explanation. Please see my reply inline.
> > > >
> > > > On Fri, Jul 21, 2023 at 10:49 PM Piotr Nowojski <
> > > piotr.nowoj...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Dong,
> > > > >
> > > > > Thanks a lot for the answers. I can now only briefly answer your
> last
> > > > > email.
> > > > >
> > > > > > It is possible that spilling to disks might cause larger
> overhead.
> > > IMO
> > > > it
> > > > > > is an orthogonal issue already existing in Flink. This is
> because a
> > > > Flink
> > > > > > job running in batch mode might also be slower than its throughput
> in
> > > > stream
> > > > > > mode due to the same reason.
> > > > >
> > > > > Yes, I know, but the thing that worries me is that previously only
> a
> > > user
> > > > > alone
> > > > > could decide whether to use batch mode or streaming, and in
> practice
> > > one
> > > > > user would rarely (if ever) use both for the same
> problem/job/query.
> > If
> > > > his
> > > > > intention was to eventually process live data, he was using
> streaming
> > > > even
> >

Re: [DISCUSS] FLIP-348: Support System Columns in SQL and Table API

2023-08-18 Thread Timo Walther
Great, I also like my last suggestion as it is even more elegant. I will 
update the FLIP by Monday.


Regards,
Timo

On 17.08.23 13:55, Jark Wu wrote:

Hi Timo,

I'm fine with your latest suggestion of introducing a flag to control the
expansion behavior of metadata virtual columns, while not introducing
any concept of system/pseudo columns for now.

Best,
Jark

On Tue, 15 Aug 2023 at 23:25, Timo Walther  wrote:


Hi everyone,

I would like to bump this thread up again.

Especially, I would like to hear opinions on my latest suggestion to simply
use METADATA VIRTUAL as system columns and only introduce a config
option for the SELECT * behavior. Implementation-wise this means minimal
effort and fewer new concepts.

Looking forward to any kind of feedback.

Thanks,
Timo

On 07.08.23 12:07, Timo Walther wrote:

Hi everyone,

thanks for the various feedback and lively discussion. Sorry for the
late reply; I was on vacation. Let me answer some of the topics:

1) System columns should not be shown with DESCRIBE statements

This sounds fine to me. I will update the FLIP in the next iteration.

2) Do you know why most SQL systems do not need any prefix with their
pseudo column?

Because most systems do not have external catalogs or connectors, and the
number of system columns is limited to a handful. Flink is more generic and
thus more complex, and we already have the concept of metadata columns. We
need to be careful not to overload our language.

3) Implementation details

  > how do you plan to implement the "system columns", do we need to add
it to `RelNode` level? Or do we just need to do it in the
parsing/validating phase?
  > I'm not sure that Calcite's "system column" feature is fully ready

My plan would be to only modify the parsing/validating phase. I would
like to avoid additional complexity in planner rules and
connector/catalog interfaces. Metadata columns already support
projection push down and are passed through the stack (via Schema,
ResolvedSchema, SupportsReadableMetadata). Calcite's "system column"
feature is not fully ready yet, and supporting it would be a large effort
that could potentially introduce bugs. Thus, I'm proposing to
leverage what we already have. The only part that needs to be modified
is the "expand star" method in SqlValidator and the Table API.

Queries such as `SELECT * FROM (SELECT $rowtime, * FROM t);` would show
$rowtime, as the star expansion only has a special case for the scope of
`FROM t`. All outer queries treat it as a regular column.
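
To make the expansion rule concrete, a small hypothetical example (assuming
a table `t` that exposes a `$rowtime` metadata column):

SELECT * FROM t;                            -- star does NOT include $rowtime
SELECT $rowtime, * FROM t;                  -- explicit reference works
SELECT * FROM (SELECT $rowtime, * FROM t);  -- outer star DOES include it,
                                            -- since it is a regular column
                                            -- of the subquery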

4) Built-in defined pseudo-column "$rowtime"

  > Did you consider making it a built-in defined pseudo-column
"$rowtime" which returns the time attribute value (if it exists) or null
(if it does not) for every table/query, and pseudo-column "$proctime"
always returns the PROCTIME() value for each table/query

Built-in pseudo-columns mean that connector or catalog providers need
consensus in Flink on which pseudo-columns should be built-in. We should
keep the concept generic and let platform providers decide which pseudo
columns to expose. $rowtime might be obvious, but others such as
$partition or $offset are tricky to get consensus on, as every external
connector works differently. Also, a connector might want to expose
different time semantics (such as ingestion time).

5) Any operator can introduce system (pseudo) columns.

This is clearly out of scope for this FLIP. The implementation effort
would be huge and could introduce a lot of bugs.

6) "Metadata Key Prefix Constraint" which is still a little complex

Another option could be to drop the naming pattern constraint. We could
make it configurable that METADATA VIRTUAL columns are never selected by
default in SELECT * or visible in DESCRIBE. This would further simplify the
FLIP and especially lower the impact on the planner and all interfaces.

What do you think about this? We could introduce a flag:

table.expand-metadata-columns (better name to be defined)

This way we don't need to introduce the concept of system columns yet,
but can still offer similar functionality with minimal overhead in the
code base.
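
A rough sketch of how this could look for users (the flag name above is
explicitly not final, and the DDL below is only illustrative):

CREATE TABLE t (
  id BIGINT,
  `$rowtime` TIMESTAMP_LTZ(3) METADATA FROM 'timestamp' VIRTUAL
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json'
);

SET 'table.expand-metadata-columns' = 'false';  -- name to be defined

SELECT * FROM t;              -- expands to: id (metadata column hidden)
SELECT `$rowtime`, * FROM t;  -- explicit selection still works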

Regards,
Timo




On 04.08.23 23:06, Alexey Leonov-Vendrovskiy wrote:

Looks like both kinds of system columns can converge.
We can say that any operator can introduce system (pseudo) columns.

cc Eugene who is also interested in the subject.

On Wed, Aug 2, 2023 at 1:03 AM Paul Lam  wrote:


Hi Timo,

Thanks for starting the discussion! System columns are no doubt a
good boost to Flink SQL’s usability, and I see the feedback mainly
concerns the accessibility of system columns.

I think most of the concerns could be solved by clarifying the
ownership of the system columns. Different from databases like
Oracle/BigQuery/PG, which own the data/metadata, Flink uses the
data/metadata from external systems. That means Flink could
have 2 kinds of system columns (take ROWID for example):

1. system columns provided by external systems via catalogs, such
  as ROWID from the

Re: [DISCUSS] [FLINK-32873] Add a config to allow disabling Query hints

2023-08-18 Thread Timo Walther

> lots of the streaming SQL syntax are extensions of SQL standard

That is true. But hints are kind of a special case because they are not 
even "part of Flink SQL"; that's why they are written in a comment syntax.


Anyway, I feel hints can sometimes be confusing for users because most 
of them have no effect for streaming, and long-term we could also set 
some hints via the CompiledPlan. And if you have multiple teams, 
non-skilled users should not play around with hints and should leave the 
decision to the system, which might become smarter over time.


Regards,
Timo


On 17.08.23 18:47, liu ron wrote:

Hi, Bonnie


> Options hints could be a security concern since users can override
> settings.

I think this still doesn't answer my question.

Best,
Ron

Jark Wu  wrote on Thu, Aug 17, 2023 at 19:51:


Sorry, I still don't understand why we need to disable the query hint.
It doesn't have the security problems of the options hint. Bonnie said it
could affect performance, but that depends on users using it explicitly.
If there is any performance problem, users can remove the hint.

If we want to disable the query hint just because it's an extension to the
SQL standard, I'm afraid we would have to introduce a bunch of
configurations, because lots of the streaming SQL syntax is an extension
of the SQL standard.

Best,
Jark

On Thu, 17 Aug 2023 at 15:43, Timo Walther  wrote:


+1 for this proposal.

Not every data team would like to enable hints, also because they are an
extension to the SQL standard. It might also be the case that custom
rules would be overwritten otherwise. Setting hints could also be the
exclusive task of a DevOps team.

Regards,
Timo


On 17.08.23 09:30, Konstantin Knauf wrote:

Hi Bonnie,

this makes sense to me, in particular, given that we already have this
toggle for a different type of hints.

Best,

Konstantin

On Wed, 16 Aug 2023 at 19:38, Bonnie Arogyam Varghese
 wrote:


Hi Liu,
   Options hints could be a security concern since users can override
settings. However, query hints specifically could affect performance.
Since we have a config to disable the OPTIONS hint, I'm suggesting we also
have a config to disable QUERY hints.

On Wed, Aug 16, 2023 at 9:41 AM liu ron  wrote:


Hi,

Thanks for driving this proposal.

Can you explain why you would need to disable query hints because of
security issues? I don't really understand why query hints affect
security.

Best,
Ron

Bonnie Arogyam Varghese  wrote on Wed, Aug 16, 2023 at 23:59:


Platform providers may want to disable hints completely for security
reasons.

Currently, there is a configuration to disable the OPTIONS hint -
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/#table-dynamic-table-options-enabled

However, there is no configuration available to disable QUERY hints -
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/table/sql/queries/hints/#query-hints


The proposal is to add a new configuration:

Name: table.query-options.enabled
Description: Enable or disable QUERY hints; if disabled, an
exception will be thrown if any QUERY hints are specified.
Note: The default value will be set to true.
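
For illustration, a sketch of the kind of statement the proposed flag would
reject (the join hint below is one of the QUERY hints from the linked docs):

SET 'table.query-options.enabled' = 'false';

-- A query hint forcing a broadcast join strategy; with the flag disabled,
-- this statement would throw an exception:
SELECT /*+ BROADCAST(t1) */ *
FROM t1 JOIN t2 ON t1.id = t2.id;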


Re: FLINK-20767 - Support for nested fields filter push down

2023-08-18 Thread Venkatakrishnan Sowrirajan
Gentle ping

On Wed, Aug 16, 2023, 11:56 AM Venkatakrishnan Sowrirajan 
wrote:

> Forgot to share the link -
> https://lists.apache.org/thread/686bhgwrrb4xmbfzlk60szwxos4z64t7 in the
> last email.
>
> Regards
> Venkata krishnan
>
>
> On Wed, Aug 16, 2023 at 11:55 AM Venkatakrishnan Sowrirajan <
> vsowr...@asu.edu> wrote:
>
>> Btw, this is the FLIP proposal discussion thread. Please share your
>> thoughts. Thanks.
>>
>> Regards
>> Venkata krishnan
>>
>>
>> On Sun, Aug 13, 2023 at 6:35 AM liu ron  wrote:
>>
>>> Hi, Venkata krishnan
>>>
>>> Thanks for driving this work, look forward to your FLIP.
>>>
>>> Best,
>>> Ron
>>>
>>> Venkatakrishnan Sowrirajan  wrote on Sun, Aug 13, 2023 at 14:34:
>>>
>>> > Thanks Yunhong. That's correct. I am able to make it work locally.
>>> > Currently, I am in the process of writing a FLIP for the necessary changes
>>> to
>>> > the SupportsFilterPushDown API to support nested fields filter push
>>> down.
>>> >
>>> > Regards
>>> > Venkata krishnan
>>> >
>>> >
>>> > On Mon, Aug 7, 2023 at 8:28 PM yh z  wrote:
>>> >
>>> > > Hi Venkatakrishnan,
>>> > > Sorry for the late reply. I have looked at the code and feel like you
>>> > need
>>> > > to modify the logic of the
>>> > > ExpressionConverter.visit(FieldReferenceExpression expression)
>>> method to
>>> > > support nested types,
>>> > > which are not supported in the current code.
>>> > >
>>> > > Regards,
>>> > > Yunhong Zheng (Swuferhong)
>>> > >
>>> > > Venkatakrishnan Sowrirajan  wrote on Mon, Aug 7, 2023 at 13:30:
>>> > >
>>> > > > (Sorry, I pressed send too early)
>>> > > >
>>> > > > Thanks for the help @zhengyunhon...@gmail.com.
>>> > > >
>>> > > > Agree on not changing the API as much as possible as well as wrt
>>> > > > simplifying Projection pushdown with nested fields eventually as
>>> well.
>>> > > >
>>> > > > In terms of the code itself, currently I am trying to leverage the
>>> > > > FieldReferenceExpression to also handle nested fields for filter
>>> push
>>> > > down.
>>> > > > But where I am currently struggling to make progress is, once the
>>> > filters
>>> > > > are pushed to the table source itself, in
>>> > > >
>>> >
>>> PushFilterIntoSourceScanRuleBase#resolveFiltersAndCreateTableSourceTable
>>> > > > there is a conversion from List<ResolvedExpression> (containing
>>> > > > FieldReferenceExpression) to the List<RexNode> itself.
>>> > > >
>>> > > > If you have some pointers for that, please let me know. Thanks.
>>> > > >
>>> > > > Regards
>>> > > > Venkata krishnan
>>> > > >
>>> > > >
>>> > > > On Sun, Aug 6, 2023 at 10:23 PM Venkatakrishnan Sowrirajan <
>>> > > > vsowr...@asu.edu>
>>> > > > wrote:
>>> > > >
>>> > > > > Thanks @zhengyunhon...@gmail.com
>>> > > > > Regards
>>> > > > > Venkata krishnan
>>> > > > >
>>> > > > >
>>> > > > > On Sun, Aug 6, 2023 at 6:16 PM yh z 
>>> > wrote:
>>> > > > >
>>> > > > >> Hi, Venkatakrishnan,
>>> > > > >> I think this is a very useful feature. I have been focusing on
>>> the
>>> > > > >> development of the flink-table-planner module recently, so if
>>> you
>>> > need
>>> > > > >> some
>>> > > > >> help, I can assist you in completing the development of some
>>> > sub-tasks
>>> > > > or
>>> > > > >> code review.
>>> > > > >>
>>> > > > >> Returning to the design itself, I think it's necessary to modify
>>> > > > >> FieldReferenceExpression or re-implement a
>>> > > > NestedFieldReferenceExpression.
>>> > > > >> As for modifying the interface of SupportsProjectionPushDown, I
>>> > think
>>> > > we
>>> > > > >> need to make some trade-offs. As a connector developer, the
>>> > stability
>>> > > of
>>> > > > >> the interface is very important. If there are no unresolved
>>> bugs, I
>>> > > > >> personally do not recommend modifying the interface. However,
>>> when I
>>> > > > first
>>> > > > >> read the code of SupportsProjectionPushDown, the design of
>>> int[][]
>>> > was
>>> > > > >> very
>>> > > > >> confusing for me, and it took me a long time to understand it by
>>> > > running
>>> > > > >> specific UT tests. Therefore, in terms of the design of this
>>> > interface
>>> > > > and
>>> > > > >> the consistency between different interfaces, there is indeed
>>> room
>>> > for
>>> > > > >> improving it.
>>> > > > >>
>>> > > > >> Thanks,
>>> > > > >> Yunhong Zheng (Swuferhong)
>>> > > > >>
>>> > > > >>
>>> > > > >>
>>> > > > >>
>>> > > > >> Becket Qin  wrote on Thu, Aug 3, 2023 at 07:44:
>>> > > > >>
>>> > > > >> > Hi Jark,
>>> > > > >> >
>>> > > > >> > If the FieldReferenceExpression contains an int[] to support a
>>> > > nested
>>> > > > >> field
>>> > > > >> > reference, List<FieldReferenceExpression> (or
>>> > > > >> FieldReferenceExpression[])
>>> > > > >> > and int[][] are actually equivalent. If we are designing this
>>> from
>>> > > > >> scratch,
>>> > > > >> > personally I prefer using List<FieldReferenceExpression> for
>>> > > > >> consistency,
>>> > > > >> > i.e. always resolving everything to expressions for users.
>>> > > Projection
>>> > > > >> is a
>>> > > > >> > simpler case, but should not be a special case. This avoids
>>> doing
>>> > > the
>>> > > > >> same
>>> > > > >> > thing in different wa

Re: [DISCUSS] FLIP-356: Support Nested Fields Filter Pushdown

2023-08-18 Thread Venkatakrishnan Sowrirajan
Gentle ping for reviews/feedback.

On Tue, Aug 15, 2023, 5:37 PM Venkatakrishnan Sowrirajan 
wrote:

> Hi All,
>
> I am opening this thread to discuss FLIP-356: Support Nested Fields
> Filter Pushdown. The FLIP can be found at
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-356%3A+Support+Nested+Fields+Filter+Pushdown
>
> This FLIP adds support for pushing down nested field filters to the
> underlying TableSource. In our data lake, we find that a lot of datasets
> have nested fields and also user queries with filters defined on those
> nested fields. This would drastically improve the performance for those
> sets of queries.
>
> Appreciate any comments or feedback you may have on this proposal.
>
> Regards
> Venkata krishnan
>


[DISCUSS] FLIP-319: Integrating with Kafka’s proper support for 2PC participation (KIP-939).

2023-08-18 Thread Tzu-Li (Gordon) Tai
Hi Flink devs,

I’d like to officially start a discussion for FLIP-319: Integrating with
Kafka’s proper support for 2PC participation (KIP-939) [1].

This is the “sister” joint FLIP for KIP-939 [2] [3]. It has been a
long-standing issue that Flink’s Kafka connector doesn’t work fully
correctly under exactly-once mode due to the lack of distributed transaction
support in the Kafka transaction protocol. This has led to subpar hacks in
the connector, such as using Java reflection to work around the protocol's
limitations (which causes a bunch of problems on its own, e.g. long
recovery times for the connector), while still leaving corner case scenarios
that can lead to data loss.

This joint effort with the Kafka community attempts to address this so that
the Flink Kafka connector can finally work against public Kafka APIs, which
should result in a much more robust integration between the two systems,
and for Flink developers, easier maintainability of the code.

Obviously, actually implementing this FLIP relies on the joint KIP being
implemented and released first. Nevertheless, I'd like to start the
discussion for the design as early as possible so we can benefit from the
new Kafka changes as soon as it is available.

Looking forward to feedback and comments on the proposal!

Thanks,
Gordon

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255071710
[2]
https://cwiki.apache.org/confluence/display/KAFKA/KIP-939%3A+Support+Participation+in+2PC
[3] https://lists.apache.org/thread/wbs9sqs3z1tdm7ptw5j4o9osmx9s41nf


Re: [DISCUSS] Status of Statefun Project

2023-08-18 Thread Tzu-Li (Gordon) Tai
Hi Galen,

> Gordon, is there a trick to running the sample code in
flink-statefun-playground against yet-unreleased code that I'm missing?

You'd have to locally build an image from the release branch, with a
temporary image version tag. Then, in the flink-statefun-playground, change
the image versions in the docker-compose files to use that locally built
image. IIRC, that's what we have been doing in the past. Admittedly, it's
pretty manual - I don't think the CI manages this workflow.

Thanks,
Gordon

On Mon, Aug 14, 2023 at 10:42 AM Galen Warren 
wrote:

> I created a pull request for this: [FLINK-31619] Upgrade Stateful
> Functions to Flink 1.16.1 by galenwarren · Pull Request #331 ·
> apache/flink-statefun (github.com).
>
> JIRA is here: [FLINK-31619] Upgrade Stateful Functions to Flink 1.16.1 -
> ASF JIRA (apache.org).
>
> Statefun references 1.16.2, despite the title -- that version has come out
> since the issue was created.
>
> I figured out how to run all the playground tests locally, but it took a
> bit of reworking of the playground setup with respect to Docker;
> specifically, the Docker contexts used to build the example functions
> needed to be broadened (i.e. moved up the tree) so that, if needed, local
> artifacts/code can be accessed from within the containers at build time.
> Then I made the Docker compose.yml configurable through environment
> variables to allow them to run in either the original manner -- i.e.
> pulling artifacts from public repos -- or in a "local" mode, where
> artifacts are pulled from local builds.
>
> This process is cleaner if the playground is a subfolder of the
> flink-statefun project rather than be its own repository
> (flink-statefun-playground), because then all the relative paths between
> the playground files and the build artifacts are fixed. So, I'd like to
> propose to move the playground files, modified as described above, to
> flink-statefun/playground and retire flink-statefun-playground. I can
> submit separate PRs for those changes if everyone is on board.
>
> Also, should I plan to do the same upgrade to handle Flink 1.17.x? It
> should be easy to do, especially while the 1.16.x upgrade is fresh on my
> mind.
>
> Thanks.
>
>
> On Fri, Aug 11, 2023 at 6:40 PM Galen Warren 
> wrote:
>
>> I'm done with the code to make Statefun compatible with Flink 1.16, and
>> all the tests (including e2e) succeed. The required changes were pretty
>> minimal.
>>
>> I'm running into a bit of a chicken/egg problem executing the tests in
>> flink-statefun-playground, though. That
>> project seems to assume that all the various Statefun artifacts are built
>> and deployed to the various public repositories already. I've looked into
>> trying to redirect references to local artifacts; however, that's also
>> tricky since all the building occurs in Docker containers.
>>
>> Gordon, is there a trick to running the sample code in
>> flink-statefun-playground against yet-unreleased code that I'm missing?
>>
>> Thanks.
>>
>> On Sat, Jun 24, 2023 at 12:40 PM Galen Warren 
>> wrote:
>>
>>> Great -- thanks!
>>>
>>> I'm going to be out of town for about a week but I'll take a look at
>>> this when I'm back.
>>>
>>> On Tue, Jun 20, 2023 at 8:46 AM Martijn Visser 
>>> wrote:
>>>
 Hi Galen,

 Yes, I'll be more than happy to help with Statefun releases.

 Best regards,

 Martijn

 On Tue, Jun 20, 2023 at 2:21 PM Galen Warren 
 wrote:

> Thanks.
>
> Martijn, to answer your question, I'd need to do a small amount of
> work to get a PR ready, but not much. Happy to do it if we're deciding to
> restart Statefun releases -- are we?
>
> -- Galen
>
> On Sat, Jun 17, 2023 at 9:47 AM Tzu-Li (Gordon) Tai <
> tzuli...@apache.org> wrote:
>
>> > Perhaps he could weigh in on whether the combination of automated
>> tests plus those smoke tests should be sufficient for testing with new
>> Flink versions
>>
>> What we usually did at the bare minimum for new StateFun releases was
>> the following:
>>
>>1. Build tests (including the smoke tests in the e2e module,
>>which covers important tests like exactly-once verification)
>>2. Updating the flink-statefun-playground repo and manually
>>running all language examples there.
>>
>> If upgrading Flink versions was the only change in the release, I'd
>> probably say that this is sufficient.
>>
>> Best,
>> Gordon
>>
>> On Thu, Jun 15, 2023 at 5:25 AM Martijn Visser <
>> martijnvis...@apache.org> wrote:
>>
>>> Let me know if you have a PR for a Flink update :)
>>>
>>> On Thu, Jun 8, 2023 at 5:52 PM Galen Warren via user <
>>> u...@flink.apache.org> wrote:
>>>
 Thanks Ma

Re: [DISCUSS] Status of Statefun Project

2023-08-18 Thread Galen Warren
Thanks.

If you were to build a local image, as you suggest, how do you access that
image when building the playground images? All the compilation of
playground code happens inside containers, so local images on the host
aren't available in those containers. Unless I'm missing something?

I've slightly reworked things such that the playground images can be run in
one of two modes -- the default mode, which works like before, and a
"local" mode where locally built code is copied into the build containers
so that it can be accessed during the build. It works fine, you just have
to define a couple of environment variables when running docker-compose to
specify default vs. local mode and what versions of Flink and Statefun
should be referenced, and then you can build and run the local examples
without any additional steps. Does that sound like a reasonable approach?
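
For example, the invocation might look like this (variable names here are
hypothetical, just to illustrate the idea):

# default mode -- pull released artifacts from public repos
docker-compose up --build

# "local" mode -- copy locally built artifacts into the build containers
STATEFUN_BUILD_MODE=local STATEFUN_VERSION=3.4-SNAPSHOT FLINK_VERSION=1.16.2 \
  docker-compose up --build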


On Fri, Aug 18, 2023 at 2:17 PM Tzu-Li (Gordon) Tai 
wrote:

> Hi Galen,
>
> > Gordon, is there a trick to running the sample code in
> flink-statefun-playground against yet-unreleased code that I'm missing?
>
> You'd have to locally build an image from the release branch, with a
> temporary image version tag. Then, in the flink-statefun-playground, change
> the image versions in the docker-compose files to use that locally built
> image. IIRC, that's what we have been doing in the past. Admittedly, it's
> pretty manual - I don't think the CI manages this workflow.
>
> Thanks,
> Gordon
>
> On Mon, Aug 14, 2023 at 10:42 AM Galen Warren 
> wrote:
>
> > I created a pull request for this: [FLINK-31619] Upgrade Stateful
> > Functions to Flink 1.16.1 by galenwarren · Pull Request #331 ·
> > apache/flink-statefun (github.com).
> >
> > JIRA is here: [FLINK-31619] Upgrade Stateful Functions to Flink 1.16.1 -
> > ASF JIRA (apache.org).
> >
> > Statefun references 1.16.2, despite the title -- that version has come
> out
> > since the issue was created.
> >
> > I figured out how to run all the playground tests locally, but it took a
> > bit of reworking of the playground setup with respect to Docker;
> > specifically, the Docker contexts used to build the example functions
> > needed to be broadened (i.e. moved up the tree) so that, if needed, local
> > artifacts/code can be accessed from within the containers at build time.
> > Then I made the Docker compose.yml configurable through environment
> > variables to allow for them to run in either the original manner -- i.e.
> > pulling artifacts from public repos -- or in a "local" mode, where
> > artifacts are pulled from local builds.
> >
> > This process is cleaner if the playground is a subfolder of the
> > flink-statefun project rather than be its own repository
> > (flink-statefun-playground), because then all the relative paths between
> > the playground files and the build artifacts are fixed. So, I'd like to
> > propose to move the playground files, modified as described above, to
> > flink-statefun/playground and retire flink-statefun-playground. I can
> > submit separate PRs for those changes if everyone is on board.
> >
> > Also, should I plan to do the same upgrade to handle Flink 1.17.x? It
> > should be easy to do, especially while the 1.16.x upgrade is fresh on my
> > mind.
> >
> > Thanks.
> >
> >
> > On Fri, Aug 11, 2023 at 6:40 PM Galen Warren 
> > wrote:
> >
> >> I'm done with the code to make Statefun compatible with Flink 1.16, and
> >> all the tests (including e2e) succeed. The required changes were pretty
> >> minimal.
> >>
> >> I'm running into a bit of a chicken/egg problem executing the tests in
> >> flink-statefun-playground, though. That
> >> project seems to assume that all the various Statefun artifacts are
> built
> >> and deployed to the various public repositories already. I've looked
> into
> >> trying to redirect references to local artifacts; however, that's also
> >> tricky since all the building occurs in Docker containers.
> >>
> >> Gordon, is there a trick to running the sample code in
> >> flink-statefun-playground against yet-unreleased code that I'm missing?
> >>
> >> Thanks.
> >>
> >> On Sat, Jun 24, 2023 at 12:40 PM Galen Warren 
> >> wrote:
> >>
> >>> Great -- thanks!
> >>>
> >>> I'm going to be out of town for about a week but I'll take a look at
> >>> this when I'm back.
> >>>
> >>> On Tue, Jun 20, 2023 at 8:46 AM Martijn Visser 
> >>> wrote:
> >>>
>  Hi Galen,
> 
>  Yes, I'll be more than happy to help with Statefun releases.
> 
>  Best regards,
> 
>  Martijn
> 
>  On Tue, Jun 20, 2023 at 2:21 PM Galen Warren  >
>  wrote:
> 
> > Thanks.
> >
> > Martijn, to answer your question, I'd need to do a small amount of
> > work to get a PR ready, but not much. Happy to do it if we're
> deciding to
> > restart Statefun releases -- are 

Re: [DISCUSS] Status of Statefun Project

2023-08-18 Thread Tzu-Li (Gordon) Tai
Hi Galen,

> locally built code is copied into the build containers
so that it can be accessed during the build.

That's exactly what we had been doing for release testing, yes. Sorry I
missed that detail in my previous response.

And yes, that sounds like a reasonable approach. If I understand you
correctly, the workflow would become this:

   1. Build the StateFun repo locally to install the snapshot artifact
   jars + have a local base StateFun image.
   2. Run the playground in "local" mode, so that it uses the local base
   StateFun image + builds the playground code using copied artifact jars
   (instead of pulling from Maven).

That looks good to me!

Thanks,
Gordon

On Fri, Aug 18, 2023 at 11:33 AM Galen Warren
 wrote:

> Thanks.
>
> If you were to build a local image, as you suggest, how do you access that
> image when building the playground images? All the compilation of
> playground code happens inside containers, so local images on the host
> aren't available in those containers. Unless I'm missing something?
>
> I've slightly reworked things such that the playground images can be run in
> one of two modes -- the default mode, which works like before, and a
> "local" mode where locally built code is copied into the build containers
> so that it can be accessed during the build. It works fine, you just have
> to define a couple of environment variables when running docker-compose to
> specify default vs. local mode and what versions of Flink and Statefun
> > should be referenced, and then you can build and run the local examples
> without any additional steps. Does that sound like a reasonable approach?
>
>
> On Fri, Aug 18, 2023 at 2:17 PM Tzu-Li (Gordon) Tai 
> wrote:
>
> > Hi Galen,
> >
> > > Gordon, is there a trick to running the sample code in
> > flink-statefun-playground against yet-unreleased code that I'm missing?
> >
> > You'd have to locally build an image from the release branch, with a
> > temporary image version tag. Then, in the flink-statefun-playground,
> change
> > the image versions in the docker-compose files to use that locally built
> > image. IIRC, that's what we have been doing in the past. Admittedly, it's
> > pretty manual - I don't think the CI manages this workflow.
> >
> > Thanks,
> > Gordon
> >
> > On Mon, Aug 14, 2023 at 10:42 AM Galen Warren 
> > wrote:
> >
> > > I created a pull request for this: [FLINK-31619] Upgrade Stateful
> > > Functions to Flink 1.16.1 by galenwarren · Pull Request #331 ·
> > > apache/flink-statefun (github.com).
> > >
> > > JIRA is here: [FLINK-31619] Upgrade Stateful Functions to Flink 1.16.1
> -
> > > ASF JIRA (apache.org).
> > >
> > > Statefun references 1.16.2, despite the title -- that version has come
> > out
> > > since the issue was created.
> > >
> > > I figured out how to run all the playground tests locally, but it took
> a
> > > bit of reworking of the playground setup with respect to Docker;
> > > specifically, the Docker contexts used to build the example functions
> > > needed to be broadened (i.e. moved up the tree) so that, if needed,
> local
> > > artifacts/code can be accessed from within the containers at build
> time.
> > > Then I made the Docker compose.yml configurable through environment
> > > variables to allow for them to run in either the original manner --
> i.e.
> > > pulling artifacts from public repos -- or in a "local" mode, where
> > > artifacts are pulled from local builds.
> > >
> > > This process is cleaner if the playground is a subfolder of the
> > > flink-statefun project rather than be its own repository
> > > (flink-statefun-playground), because then all the relative paths
> between
> > > the playground files and the build artifacts are fixed. So, I'd like to
> > > propose to move the playground files, modified as described above, to
> > > flink-statefun/playground and retire flink-statefun-playground. I can
> > > submit separate PRs for those changes if everyone is on board.
> > >
> > > Also, should I plan to do the same upgrade to handle Flink 1.17.x? It
> > > should be easy to do, especially while the 1.16.x upgrade is fresh on
> my
> > > mind.
> > >
> > > Thanks.
> > >
> > >
> > > On Fri, Aug 11, 2023 at 6:40 PM Galen Warren 
> > > wrote:
> > >
> > >> I'm done with the code to make Statefun compatible with Flink 1.16,
> and
> > >> all the tests (including e2e) succeed. The required changes were
> pretty
> > >> minimal.
> > >>
> > >> I'm running into a bit of a chicken/egg problem executing the tests in
> > >> flink-statefun-playground, though. That
> > >> project seems to assume that all the various Statefun artifacts are
> > built
> > >> and deployed to the various public repositories already. I've looked
> > into
> > >> trying to redirect references to local artifacts; however, that's also
> > >> tricky since all

Re: [DISCUSS] Status of Statefun Project

2023-08-18 Thread Galen Warren
Yes, exactly! And in addition to the base Statefun jars and the jar for the
Java SDK, it does an equivalent copy/register operation for each of the
other SDK libraries (Go, Python, Javascript) so that those libraries are
also available when building the playground examples.

One more question: In order to copy the various build artifacts into the
Docker containers, those artifacts need to be part of the Docker context.
With the playground being a separate project, that's slightly tricky to do,
as there is no guarantee (other than convention) about the relative paths
of *flink-statefun* and *flink-statefun-playground* in someone's local
filesystem. The way I've set this up locally is to copy the playground into
the *flink-statefun* project -- i.e. to *flink-statefun*/playground -- such
that I can set the Docker context to the root folder of *flink-statefun*
and then have access to any local code and/or build artifacts.

I'm wondering if there might be any appetite for making that move
permanent, i.e. moving the playground to *flink-statefun*/playground and
deprecating the standalone playground project. In addition to making the
problem of building with unreleased artifacts a bit simpler to solve, it
would also simplify the process of releasing a new Statefun version, since
the entire process could be handled with a single PR and associated
build/deploy tasks. In other words, a single PR could both update and
deploy the Statefun package and the playground code and images.

As it stands, at least two PRs would be required for each Statefun version
update -- one for *flink-statefun* and one for *flink-statefun-playground*.

Anyway, just an idea. Maybe there's an important reason for these projects
to remain separate. If we do want to keep the playground project where it
is, I could solve the copying problem by requiring the two projects to be
siblings in the file system and by pre-copying the local build artifacts
into a location accessible by the existing Docker contexts. This would
still leave us with the need to have two PRs and releases instead of one,
though.

Thanks for your help!


On Fri, Aug 18, 2023 at 2:45 PM Tzu-Li (Gordon) Tai 
wrote:

> Hi Galen,
>
> > locally built code is copied into the build containers
> so that it can be accessed during the build.
>
> That's exactly what we had been doing for release testing, yes. Sorry I
> missed that detail in my previous response.
>
> And yes, that sounds like a reasonable approach. If I understand you
> correctly, the workflow would become this:
>
>1. Build the StateFun repo locally to install the snapshot artifact
>jars + have a local base StateFun image.
>2. Run the playground in "local" mode, so that it uses the local base
>StateFun image + builds the playground code using copied artifact jars
>(instead of pulling from Maven).
>
> That looks good to me!
>
> Thanks,
> Gordon
>
> On Fri, Aug 18, 2023 at 11:33 AM Galen Warren
>  wrote:
>
> > Thanks.
> >
> > If you were to build a local image, as you suggest, how do you access
> that
> > image when building the playground images? All the compilation of
> > playground code happens inside containers, so local images on the host
> > aren't available in those containers. Unless I'm missing something?
> >
> > I've slightly reworked things such that the playground images can be run
> in
> > one of two modes -- the default mode, which works like before, and a
> > "local" mode where locally built code is copied into the build containers
> > so that it can be accessed during the build. It works fine, you just have
> > to define a couple of environment variables when running docker-compose
> to
> > specify default vs. local mode and what versions of Flink and Statefun
> > should be referenced, and then you can build a run the local examples
> > without any additional steps. Does that sound like a reasonable approach?
> >
> >
> > On Fri, Aug 18, 2023 at 2:17 PM Tzu-Li (Gordon) Tai  >
> > wrote:
> >
> > > Hi Galen,
> > >
> > > > Gordon, is there a trick to running the sample code in
> > > flink-statefun-playground against yet-unreleased code that I'm missing?
> > >
> > > You'd have to locally build an image from the release branch, with a
> > > temporary image version tag. Then, in the flink-statefun-playground,
> > change
> > > the image versions in the docker-compose files to use that locally
> built
> > > image. IIRC, that's what we have been doing in the past. Admittedly,
> it's
> > > pretty manual - I don't think the CI manages this workflow.
> > >
> > > Thanks,
> > > Gordon
> > >
> > > On Mon, Aug 14, 2023 at 10:42 AM Galen Warren  >
> > > wrote:
> > >
> > > > I created a pull request for this: [FLINK-31619] Upgrade Stateful
> > > > Functions to Flink 1.16.1 by galenwarren · Pull Request #331 ·
> > > > apache/flink-statefun (github.com)
> > > > .
> > > >
> > > > JIRA is here: [FLINK-31619] Upgrade Stateful Functions to Flink
> 1.16.1
> > -
>

Re: [DISCUSS] Status of Statefun Project

2023-08-18 Thread Tzu-Li (Gordon) Tai
Hi Galen,

The original intent of having a separate repo for the playground was
that StateFun users can just go to it and start running simple examples
without any other distractions from the core code. I personally don't have
a strong preference here and can understand how it would make the workflow
more streamlined, but just FYI on the reasoning why they are separate in
the first place.

re: paths for locating StateFun artifacts.
Can this be solved by simply passing in the path to the artifacts, as well
as the image tag for the locally built base StateFun image? They could
probably be environment variables.

Cheers,
Gordon

On Fri, Aug 18, 2023 at 12:13 PM Galen Warren via user <
u...@flink.apache.org> wrote:

> Yes, exactly! And in addition to the base Statefun jars and the jar for
> the Java SDK, it does an equivalent copy/register operation for each of the
> other SDK libraries (Go, Python, Javascript) so that those libraries are
> also available when building the playground examples.
>
> One more question: In order to copy the various build artifacts into the
> Docker containers, those artifacts need to be part of the Docker context.
> With the playground being a separate project, that's slightly tricky to do,
> as there is no guarantee (other than convention) about the relative paths
> of *flink-statefun* and *flink-statefun-playground* in someone's local
> filesystem. The way I've set this up locally is to copy the playground into
> the *flink-statefun* project -- i.e. to *flink-statefun*/playground --
> such that I can set the Docker context to the root folder of
> *flink-statefun* and then have access to any local code and/or build
> artifacts.
>
> I'm wondering if there might be any appetite for making that move
> permanent, i.e. moving the playground to *flink-statefun*/playground and
> deprecating the standalone playground project. In addition to making the
> problem of building with unreleased artifacts a bit simpler to solve, it
> would also simplify the process of releasing a new Statefun version, since
> the entire process could be handled with a single PR and associated
> build/deploy tasks. In other words, a single PR could both update and
> deploy the Statefun package and the playground code and images.
>
> As it stands, at least two PRs would be required for each Statefun version
> update -- one for *flink-statefun* and one for *flink-statefun-playground*
> .
>
> Anyway, just an idea. Maybe there's an important reason for these projects
> to remain separate. If we do want to keep the playground project where it
> is, I could solve the copying problem by requiring the two projects to be
> siblings in the file system and by pre-copying the local build artifacts
> into a location accessible by the existing Docker contexts. This would
> still leave us with the need to have two PRs and releases instead of one,
> though.
>
> Thanks for your help!
>
>
> On Fri, Aug 18, 2023 at 2:45 PM Tzu-Li (Gordon) Tai 
> wrote:
>
>> Hi Galen,
>>
>> > locally built code is copied into the build containers
>> so that it can be accessed during the build.
>>
>> That's exactly what we had been doing for release testing, yes. Sorry I
>> missed that detail in my previous response.
>>
>> And yes, that sounds like a reasonable approach. If I understand you
>> correctly, the workflow would become this:
>>
>>1. Build the StateFun repo locally to install the snapshot artifact
>>jars + have a local base StateFun image.
>>2. Run the playground in "local" mode, so that it uses the local base
>>StateFun image + builds the playground code using copied artifact jars
>>(instead of pulling from Maven).
>>
>> That looks good to me!
>>
>> Thanks,
>> Gordon
>>
>> On Fri, Aug 18, 2023 at 11:33 AM Galen Warren
>>  wrote:
>>
>> > Thanks.
>> >
>> > If you were to build a local image, as you suggest, how do you access
>> that
>> > image when building the playground images? All the compilation of
>> > playground code happens inside containers, so local images on the host
>> > aren't available in those containers. Unless I'm missing something?
>> >
>> > I've slightly reworked things such that the playground images can be
>> run in
>> > one of two modes -- the default mode, which works like before, and a
>> > "local" mode where locally built code is copied into the build
>> containers
>> > so that it can be accessed during the build. It works fine, you just
>> have
>> > to define a couple of environment variables when running docker-compose
>> to
>> > specify default vs. local mode and what versions of Flink and Statefun
>> > should be referenced, and then you can build and run the local examples
>> > without any additional steps. Does that sound like a reasonable
>> approach?
>> >
>> >
>> > On Fri, Aug 18, 2023 at 2:17 PM Tzu-Li (Gordon) Tai <
>> tzuli...@apache.org>
>> > wrote:
>> >
>> > > Hi Galen,
>> > >
>> > > > Gordon, is there a trick to running the sample code in
>> > > flink-statefun-playground against yet-

Re: [DISCUSS] Status of Statefun Project

2023-08-18 Thread Galen Warren
Gotcha, makes sense as to the original division.

>> Can this be solved by simply passing in the path to the artifacts

This definitely works if we're going to be copying the artifacts on the
host side -- into the build context -- and then from the context into the
image. It only gets tricky to have a potentially varying path to the
artifacts if we're trying to *directly* include the artifacts in the Docker
context -- then we have a situation where the Docker context must contain
both the artifacts and playground files, with (potentially) different root
locations.

Maybe the simplest thing to do here is just to leave the playground as-is
and then copy the artifacts into the Docker context manually, prior to
building the playground images. I'm fine with that. It will mean that each
Statefun release will require two PRs and two sets of build/publish steps
instead of one, but if everyone else is fine with that I am, too. Unless
anyone objects, I'll go ahead and queue up a PR for the playground that
makes these changes.

Also, I should mention -- in case it's not clear -- that I have already
built and run the playground examples with the code from the PR and
everything worked. So that PR is ready to move forward with review, etc.,
at this point.

Thanks.


On Fri, Aug 18, 2023 at 4:16 PM Tzu-Li (Gordon) Tai 
wrote:

> Hi Galen,
>
> The original intent of having a separate repo for the playground repo, was
> that StateFun users can just go to that and start running simple examples
> without any other distractions from the core code. I personally don't have
> a strong preference here and can understand how it would make the workflow
> more streamlined, but just FYI on the reasoning why are separate in the
> first place.
>
> re: paths for locating StateFun artifacts.
> Can this be solved by simply passing in the path to the artifacts? As well
> as the image tag for the locally built base StateFun image. They could
> probably be environment variables.
>
> Cheers,
> Gordon
>
> On Fri, Aug 18, 2023 at 12:13 PM Galen Warren via user <
> u...@flink.apache.org> wrote:
>
>> Yes, exactly! And in addition to the base Statefun jars and the jar for
>> the Java SDK, it does an equivalent copy/register operation for each of the
>> other SDK libraries (Go, Python, Javascript) so that those libraries are
>> also available when building the playground examples.
>>
>> One more question: In order to copy the various build artifacts into the
>> Docker containers, those artifacts need to be part of the Docker context.
>> With the playground being a separate project, that's slightly tricky to do,
>> as there is no guarantee (other than convention) about the relative paths
>> of *flink-statefun* and *flink-statefun-playground* in someone's local
>> filesystem. The way I've set this up locally is to copy the playground into
>> the *flink-statefun* project -- i.e. to *flink-statefun*/playground --
>> such that I can set the Docker context to the root folder of
>> *flink-statefun* and then have access to any local code and/or build
>> artifacts.
>>
>> I'm wondering if there might be any appetite for making that move
>> permanent, i.e. moving the playground to *flink-statefun*/playground and
>> deprecating the standalone playground project. In addition to making the
>> problem of building with unreleased artifacts a bit simpler to solve, it
>> would also simplify the process of releasing a new Statefun version, since
>> the entire process could be handled with a single PR and associated
>> build/deploy tasks. In other words, a single PR could both update and
>> deploy the Statefun package and the playground code and images.
>>
>> As it stands, at least two PRs would be required for each Statefun
>> version update -- one for *flink-statefun* and one for
>> *flink-statefun-playground*.
>>
>> Anyway, just an idea. Maybe there's an important reason for these
>> projects to remain separate. If we do want to keep the playground project
>> where it is, I could solve the copying problem by requiring the two
>> projects to be siblings in the file system and by pre-copying the local
>> build artifacts into a location accessible by the existing Docker contexts.
>> This would still leave us with the need to have two PRs and releases
>> instead of one, though.
>>
>> Thanks for your help!
>>
>>
>> On Fri, Aug 18, 2023 at 2:45 PM Tzu-Li (Gordon) Tai 
>> wrote:
>>
>>> Hi Galen,
>>>
>>> > locally built code is copied into the build containers
>>> so that it can be accessed during the build.
>>>
>>> That's exactly what we had been doing for release testing, yes. Sorry I
>>> missed that detail in my previous response.
>>>
>>> And yes, that sounds like a reasonable approach. If I understand you
>>> correctly, the workflow would become this:
>>>
>>>1. Build the StateFun repo locally to install the snapshot artifact
>>>jars + have a local base StateFun image.
>>>2. Run the playground in "local" mode, so that it uses the local base
>>>State

Re: [DISCUSS] FLIP-356: Support Nested Fields Filter Pushdown

2023-08-18 Thread Becket Qin
Thanks for the proposal, Venkata.

The current proposal works if we do not want to migrate
SupportsFilterPushdown to also use NestedFieldReferenceExpression in the
long term.

Otherwise, the alternative solution briefly mentioned in the rejected
alternatives would be the following:
Phase 1:
1. Introduce a supportsNestedFilters() method to the SupportsFilterPushdown
interface. (same as current proposal).
2. Extend the FieldReferenceExpression to support nested fields.
- Change the index field type from int to int[].
- Add a new method int[] getFieldIndexArray().
- Deprecate the int getFieldIndex() method; the code will be removed in
the next major version bump.
3. In the SupportsProjectionPushDown interface
- add a new method applyProjection(List<FieldReferenceExpression>,
DataType), with a default implementation invoking applyProjection(int[][],
DataType)
- deprecate the current applyProjection(int[][], DataType) method

Phase 2 (in the next major version bump)
1. remove the deprecated methods.

Phase 3 (optional)
1. deprecate and remove the supportsNestedFilters() /
supportsNestedProjection() methods from the SupportsFilterPushDown /
SupportsProjectionPushDown interfaces.

Personally I prefer this alternative. It takes longer to finish the work,
but the API eventually becomes clean and consistent. But I can live with
the current proposal.
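
To visualize the phases above, a rough sketch of the evolved interfaces
(class and method shapes are illustrative only, not the final API):

import java.util.List;
import org.apache.flink.table.types.DataType;

// Phase 1 sketch: the expression carries an index path instead of a single
// index, e.g. {2, 0} = third column, first nested field.
class FieldReferenceExpressionSketch {
    private final int[] fieldIndexArray;

    FieldReferenceExpressionSketch(int[] fieldIndexArray) {
        this.fieldIndexArray = fieldIndexArray;
    }

    int[] getFieldIndexArray() {
        return fieldIndexArray;
    }

    /** @deprecated Only valid for top-level fields; to be removed in the
     * next major version bump. */
    @Deprecated
    int getFieldIndex() {
        return fieldIndexArray[0];
    }
}

interface SupportsProjectionPushDownSketch {
    /** @deprecated Replaced by the expression-based variant below. */
    @Deprecated
    void applyProjection(int[][] projectedFields, DataType producedDataType);

    // New method with a default implementation bridging to the old one.
    default void applyProjection(
            List<FieldReferenceExpressionSketch> projectedFields,
            DataType producedDataType) {
        int[][] indices = projectedFields.stream()
                .map(FieldReferenceExpressionSketch::getFieldIndexArray)
                .toArray(int[][]::new);
        applyProjection(indices, producedDataType);
    }
}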

Thanks,

Jiangjie (Becket) Qin

On Sat, Aug 19, 2023 at 12:09 AM Venkatakrishnan Sowrirajan <
vsowr...@asu.edu> wrote:

> Gentle ping for reviews/feedback.
>
> On Tue, Aug 15, 2023, 5:37 PM Venkatakrishnan Sowrirajan  >
> wrote:
>
> > Hi All,
> >
> > I am opening this thread to discuss FLIP-356: Support Nested Fields
> > Filter Pushdown. The FLIP can be found at
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-356%3A+Support+Nested+Fields+Filter+Pushdown
> >
> > This FLIP adds support for pushing down nested field filters to the
> > underlying TableSource. In our data lake, we find that a lot of datasets
> > have nested fields and also user queries with filters defined on those
> > nested fields. This would drastically improve the performance for those
> > sets of queries.
> >
> > Appreciate any comments or feedback you may have on this proposal.
> >
> > Regards
> > Venkata krishnan
> >
>


[jira] [Created] (FLINK-32895) Introduce the max attempts for Exponential Delay Restart Strategy

2023-08-18 Thread Rui Fan (Jira)
Rui Fan created FLINK-32895:
---

 Summary: Introduce the max attempts for Exponential Delay Restart 
Strategy
 Key: FLINK-32895
 URL: https://issues.apache.org/jira/browse/FLINK-32895
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Reporter: Rui Fan
Assignee: Rui Fan


Currently, Flink has 3 restart strategies: fixed-delay, failure-rate 
and exponential-delay.

The exponential-delay strategy is suitable if a job continues to fail for a 
period of time. The fixed-delay and failure-rate strategies have a max 
attempts mechanism, which means the job won't be restarted and will fail 
once the number of attempts exceeds the max attempts threshold.

The max attempts mechanism is reasonable; Flink should not and need not 
infinitely restart a job that keeps failing. However, exponential-delay 
doesn't have a max attempts mechanism.

I propose introducing 
`restart-strategy.exponential-delay.max-attempts-before-reset` to support the 
max attempts mechanism for exponential-delay. It means Flink won't restart the 
job if the number of job failures before reset exceeds max-attempts-before-reset 
when exponential-delay is enabled.
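
For illustration, a hypothetical configuration combining the proposed option 
with the existing exponential-delay settings (the new key's exact name is 
still up for discussion):

{code}
restart-strategy: exponential-delay
restart-strategy.exponential-delay.initial-backoff: 1 s
restart-strategy.exponential-delay.max-backoff: 5 min
restart-strategy.exponential-delay.backoff-multiplier: 2.0
restart-strategy.exponential-delay.reset-backoff-threshold: 1 h
# Proposed: fail the job after 10 failures within one reset window.
restart-strategy.exponential-delay.max-attempts-before-reset: 10
{code}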




--
This message was sent by Atlassian Jira
(v8.20.10#820010)