Re: [DISCUSS] Add missing visibility annotation for Table APIs

2023-07-29 Thread Jane Chan
Hi all,

Thanks for your valuable feedback. Here are my thoughts.

To Jark
I agree with your suggestions, and I've updated the sheet accordingly.

To Lincoln

> For the `DynamicFilteringEvent`, tend to keep it 'internal' since it's a
> concreate implementation of `SourceEvent`(and other two implementers are
> not public ones) .


I have checked the corresponding FLIP [1], and according to the design doc,
it suggested being marked as `@PublicEvolving`. At the same time, the class
`DynamicFilteringData`, which was introduced with it, was also marked as
`@PublicEvolving`. I think we can cc Gen Luo  to help
clarify.


For the `LookupOptions` and `Trigger`s, because they're all in the public
> interfaces of the flip[2], I'm fine with making them all public ones or
> just excluding the Trigger implemantors, cc @Qingsheng can you also help to
> check this?
>

I'm fine with both. Let's wait for Qingsheng's opinion.


For the `BuiltInFunctionDefinitions$Builder`, I think it should be
> `BuiltInFunctionDefinition$Builder`


Yes, this is a typo, and I've corrected it.


To Jing

do we really need to
> mark some many classes as @Internal? What is the exactly different between
> a public class with no annotation and with the @Internal?


IMO it is still necessary. From a user's perspective, marking a class as
@Internal has a clear directionality, indicating that this is an internal
class, and I should not rely on it. However, facing an unmarked class, I
wonder whether it is safe to depend on it in my code. From a developer's
perspective, marking a class as @Internal also helps us to be more
confident when iterating and updating interfaces. We can be sure that this
proposed change will not have unexpected behavior (because we have informed
users that it is internal and cannot promise the same compatibility
guarantee as public APIs).

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning#:~:text=PublicEvolving%0Apublic%20class-,DynamicFilteringEvent,-implements%20SourceEvent%20%7B

Best,
Jane

On Wed, Jul 26, 2023 at 1:26 AM Jing Ge  wrote:

> Hi Jane,
>
> Thanks for your effort of walking through all classes and compiling the
> sheet. It is quite helpful. Just out of curiosity, do we really need to
> mark some many classes as @Internal? What is the exactly different between
> a public class with no annotation and with the @Internal?
>
> Best regards,
> Jing
>
>
> On Tue, Jul 25, 2023 at 11:06 AM Lincoln Lee 
> wrote:
>
> > Hi Jane,
> >
> > Thanks for driving this! Overall the proposed annotations looks good to
> me.
> > Some comments for the table[1]:
> >
> > For the `DynamicFilteringEvent`, tend to keep it 'internal' since it's a
> > concreate implementation of `SourceEvent`(and other two implementers are
> > not public ones) .
> >
> > For the `LookupOptions` and `Trigger`s, because they're all in the public
> > interfaces of the flip[2], I'm fine with making them all public ones or
> > just excluding the Trigger implemantors, cc @Qingsheng can you also help
> to
> > check this?
> >
> > For the `BuiltInFunctionDefinitions$Builder`, I think it should be
> > `BuiltInFunctionDefinition$Builder`.
> >
> >
> > [1].
> >
> >
> https://docs.google.com/spreadsheets/d/1e8M0tUtKkZXEd8rCZtZ0C6Ty9QkNaPySsrCgz0vEID4/edit?usp=sharing
> > [2].
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-221%3A+Abstraction+for+lookup+source+cache+and+metric#FLIP221:Abstractionforlookupsourcecacheandmetric-PublicInterfaces
> >
> > Best,
> > Lincoln Lee
> >
> > Jark Wu  于2023年7月25日周二 10:47写道:
> >
> > > Hi Jane,
> > >
> > > Thanks for kicking off this work and collecting the detailed list.
> > >
> > > +1 to add the missing annotation.
> > >
> > > This often confuses me whether the class can be modified without
> breaking
> > > the compatibility
> > >  when looking at classes in table-common and table-api. Explicitly mark
> > the
> > > visibility can be
> > > helpful in this case.
> > >
> > > I have some additional suggestions wrt the class annotations:
> > > - classes in org.apache.flink.table.catalog.stats.* can be
> > @PublicEvolving,
> > > because all the classes in there are needed to build the
> @PublicEvolving
> > >  CatalogColumnStatistics.
> > > - PeriodicCacheReloadTrigger and TimedCacheReloadTrigger can be
> > > @PublicEvolving,
> > > they are built-in implementations of cache reload trigger and are
> exposed
> > > to connectors.
> > > - CoreModule can be @PublicEvolving to allow users use it in
> > > TableEnv#loadModule(name, Module).
> > >
> > > Best,
> > > Jark
> > >
> > >
> > > On Fri, 21 Jul 2023 at 00:34, Jane Chan  wrote:
> > >
> > > > Hi, Devs,
> > > >
> > > > I would like to start a discussion on adding missing visibility
> > > annotation
> > > > (PublicEvolving / Internal etc.) for Table APIs.
> > > >
> > > > The motivation for starting this discussion was during the cleanup of
> > > which
> > > > Table API to be deprecated for version 2.0, I noticed that some of
> the

[jira] [Created] (FLINK-32712) Enhance Flink ogg-json format

2023-07-29 Thread yuan (Jira)
yuan created FLINK-32712:


 Summary: Enhance Flink ogg-json format
 Key: FLINK-32712
 URL: https://issues.apache.org/jira/browse/FLINK-32712
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.17.1
Reporter: yuan


The BEFORE field in ogg-json can be configured to contain no table fields, as 
in the ogg-json example below.
{code:java}
{
  "table": "ZBZZZ",
  "op_type": "U",
  "op_ts": "2023-07-20 21:45:34.860817",
  "current_ts": "2023-07-21T05:45:36.615000",
  "pos": "2564940142073691",
  "before": {},
  "after": {
      "ID": 1461242,
      "PROPERTY_01": "tc",
      "PROPERTY_02": null,
      "PROPERTY_03": null,
      "PROPERTY_04": "K",
      "PROPERTY_05": "5",
      "PROPERTY_06": null,
      "PROPERTY_07": null,
      "PROPERTY_08": null,
      "PROPERTY_09": null,
      "PROPERTY_10": null
  }
}{code}
For this case, ogg-json format should not send a Record of type UPDATE_BEFORE 
downstream。 Worse, it can sometimes cause the flink job to crash out.
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-322 Cooldown period for adaptive scheduler. Second vote.

2023-07-29 Thread Rui Fan
+1(binding)

Thanks for driving this proposal, it will be useful for rescale.

I’m preparing the FLIP-334[1], it will decouple the autoscaler and
kubernetes. In the end, we hope all kind of flink jobs work well with
rescale and autoscaler.

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424711

Best
Rui Fan

On Fri, 28 Jul 2023 at 19:21, Martijn Visser 
wrote:

> +1 (binding)
>
> On Fri, Jul 14, 2023 at 11:59 AM Prabhu Joseph  >
> wrote:
>
> > *+1 (non-binding)*
> >
> > Thanks for working on this. We have seen good improvement during the cool
> > down period with this feature.
> > Below are details on the test results from one of our clusters:
> >
> > On a scale-out operation, 8 new nodes were added one by one with a gap of
> > ~30 seconds. There were 8 restarts within 4 minutes with the default
> > behaviour,
> > whereas only one with this feature (cooldown period of 4 minutes).
> >
> > The number of records processed by the job with this feature during the
> > restart window is higher (2909764), whereas it is only 1323960 with the
> > default
> > behaviour due to multiple restarts, where it spends most of the time
> > recovering, and also whatever work progressed by the tasks after the last
> > successful completed checkpoint is lost.
> >
> > Metrics Default Adaptive Scheduler Adaptive Scheduler With Cooldown
> Period
> > Remarks
> > NumRecordsProcessed 1323960 2909764 1. NumRecordsProcessed metric
> indicates
> > the difference the cool down period brings in. When the job is doing
> > multiple restarts, the task spends most of the time recovering, and the
> > progress the task made will be lost during the restart.
> >
> > 2. There is only one restart with Cool Down Period which happened when
> the
> > 8th node got added back.
> >
> > Job Parallelism 13 -> 20 -> 27 -> 34 -> 41 -> 48 -> 55 → 62 → 69 13 → 69
> > NumRestarts 8 1
> >
> >
> >
> >
> >
> >
> >
> >
> > On Wed, Jul 12, 2023 at 8:03 PM Etienne Chauchot 
> > wrote:
> >
> > > Hi all,
> > >
> > > I'm going on vacation tonight for 3 weeks.
> > >
> > > Even if the vote is not finished, as the implementation is rather quick
> > > and the design discussion had settled, I preferred I implementing
> > > FLIP-322 [1] to allow people to take a look while I'm off.
> > >
> > > [1] https://github.com/apache/flink/pull/22985
> > >
> > > Best
> > >
> > > Etienne
> > >
> > > Le 12/07/2023 à 09:56, Etienne Chauchot a écrit :
> > > >
> > > > Hi all,
> > > >
> > > > Would you mind casting your vote to this second vote thread (opened
> > > > after new discussions) so that the subject can move forward ?
> > > >
> > > > @David, @Chesnay, @Robert you took part to the discussions, can you
> > > > please sent your vote ?
> > > >
> > > > Thank you very much
> > > >
> > > > Best
> > > >
> > > > Etienne
> > > >
> > > > Le 06/07/2023 à 13:02, Etienne Chauchot a écrit :
> > > >>
> > > >> Hi all,
> > > >>
> > > >> Thanks for your feedback about the FLIP-322: Cooldown period for
> > > >> adaptive scheduler [1].
> > > >>
> > > >> This FLIP was discussed in [2].
> > > >>
> > > >> I'd like to start a vote for it. The vote will be open for at least
> 72
> > > >> hours (until July 9th 15:00 GMT) unless there is an objection or
> > > >> insufficient votes.
> > > >>
> > > >> [1]
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-322+Cooldown+period+for+adaptive+scheduler
> > > >> [2]
> https://lists.apache.org/thread/qvgxzhbp9rhlsqrybxdy51h05zwxfns6
> > > >>
> > > >> Best,
> > > >>
> > > >> Etienne
> >
>


Re: [DISCUSS] Add missing visibility annotation for Table APIs

2023-07-29 Thread Jing Ge
Hi Jane,

Thanks for the clarification. Commonly speaking, it is a good idea to show
the clear information that a class is only used internally, i.e. please
don't rely on it. However, the rule of using @Internal in the community is
not clearly defined, at least it is not as clear as
@Public/@PublicEvolving/@Experimental
are. According to your proposal, does it mean that there will be no public
class that has no annotation? In the end, every public class must have one
annotation and all classes with @Internal will have the same meaning for
developers as they had no annotation before. Developers will ignore
the annotation and continue depending on them. It looks like a typical
involution: we will come back to the position where we were but with
extra @Internal maintenance effort.

Best regards,
Jing

On Sat, Jul 29, 2023 at 3:01 PM Jane Chan  wrote:

> Hi all,
>
> Thanks for your valuable feedback. Here are my thoughts.
>
> To Jark
> I agree with your suggestions, and I've updated the sheet accordingly.
>
> To Lincoln
>
> > For the `DynamicFilteringEvent`, tend to keep it 'internal' since it's a
> > concreate implementation of `SourceEvent`(and other two implementers are
> > not public ones) .
>
>
> I have checked the corresponding FLIP [1], and according to the design doc,
> it suggested being marked as `@PublicEvolving`. At the same time, the class
> `DynamicFilteringData`, which was introduced with it, was also marked as
> `@PublicEvolving`. I think we can cc Gen Luo  to help
> clarify.
>
>
> For the `LookupOptions` and `Trigger`s, because they're all in the public
> > interfaces of the flip[2], I'm fine with making them all public ones or
> > just excluding the Trigger implemantors, cc @Qingsheng can you also help
> to
> > check this?
> >
>
> I'm fine with both. Let's wait for Qingsheng's opinion.
>
>
> For the `BuiltInFunctionDefinitions$Builder`, I think it should be
> > `BuiltInFunctionDefinition$Builder`
>
>
> Yes, this is a typo, and I've corrected it.
>
>
> To Jing
>
> do we really need to
> > mark some many classes as @Internal? What is the exactly different
> between
> > a public class with no annotation and with the @Internal?
>
>
> IMO it is still necessary. From a user's perspective, marking a class as
> @Internal has a clear directionality, indicating that this is an internal
> class, and I should not rely on it. However, facing an unmarked class, I
> wonder whether it is safe to depend on it in my code. From a developer's
> perspective, marking a class as @Internal also helps us to be more
> confident when iterating and updating interfaces. We can be sure that this
> proposed change will not have unexpected behavior (because we have informed
> users that it is internal and cannot promise the same compatibility
> guarantee as public APIs).
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning#:~:text=PublicEvolving%0Apublic%20class-,DynamicFilteringEvent,-implements%20SourceEvent%20%7B
>
> Best,
> Jane
>
> On Wed, Jul 26, 2023 at 1:26 AM Jing Ge 
> wrote:
>
> > Hi Jane,
> >
> > Thanks for your effort of walking through all classes and compiling the
> > sheet. It is quite helpful. Just out of curiosity, do we really need to
> > mark some many classes as @Internal? What is the exactly different
> between
> > a public class with no annotation and with the @Internal?
> >
> > Best regards,
> > Jing
> >
> >
> > On Tue, Jul 25, 2023 at 11:06 AM Lincoln Lee 
> > wrote:
> >
> > > Hi Jane,
> > >
> > > Thanks for driving this! Overall the proposed annotations looks good to
> > me.
> > > Some comments for the table[1]:
> > >
> > > For the `DynamicFilteringEvent`, tend to keep it 'internal' since it's
> a
> > > concreate implementation of `SourceEvent`(and other two implementers
> are
> > > not public ones) .
> > >
> > > For the `LookupOptions` and `Trigger`s, because they're all in the
> public
> > > interfaces of the flip[2], I'm fine with making them all public ones or
> > > just excluding the Trigger implemantors, cc @Qingsheng can you also
> help
> > to
> > > check this?
> > >
> > > For the `BuiltInFunctionDefinitions$Builder`, I think it should be
> > > `BuiltInFunctionDefinition$Builder`.
> > >
> > >
> > > [1].
> > >
> > >
> >
> https://docs.google.com/spreadsheets/d/1e8M0tUtKkZXEd8rCZtZ0C6Ty9QkNaPySsrCgz0vEID4/edit?usp=sharing
> > > [2].
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-221%3A+Abstraction+for+lookup+source+cache+and+metric#FLIP221:Abstractionforlookupsourcecacheandmetric-PublicInterfaces
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > > Jark Wu  于2023年7月25日周二 10:47写道:
> > >
> > > > Hi Jane,
> > > >
> > > > Thanks for kicking off this work and collecting the detailed list.
> > > >
> > > > +1 to add the missing annotation.
> > > >
> > > > This often confuses me whether the class can be modified without
> > breaking
> > > > the compatibility
> > > >  when looking at classes in table-common and table-api. Exp

Re: [VOTE] Apache Flink Kubernetes Operator Release 1.6.0, release candidate #1

2023-07-29 Thread Rui Fan
+1 (non-binding)

- Compiled and tested the source code via mvn verify
- Verified the signatures
- Downloaded the image : docker pull
ghcr.io/apache/flink-kubernetes-operator:e7045a6
- Deployed helm chart to test cluster
- Ran example job

Best,
Rui Fan

On Thu, Jul 27, 2023 at 10:53 PM Gyula Fóra  wrote:

> Hi Everyone,
>
> Please review and vote on the release candidate #1 for the version 1.6.0 of
> Apache Flink Kubernetes Operator,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> **Release Overview**
>
> As an overview, the release consists of the following:
> a) Kubernetes Operator canonical source distribution (including the
> Dockerfile), to be deployed to the release repository at dist.apache.org
> b) Kubernetes Operator Helm Chart to be deployed to the release repository
> at dist.apache.org
> c) Maven artifacts to be deployed to the Maven Central Repository
> d) Docker image to be pushed to dockerhub
>
> **Staging Areas to Review**
>
> The staging areas containing the above mentioned artifacts are as follows,
> for your review:
> * All artifacts for a,b) can be found in the corresponding dev repository
> at dist.apache.org [1]
> * All artifacts for c) can be found at the Apache Nexus Repository [2]
> * The docker image for d) is staged on github [3]
>
> All artifacts are signed with the key 21F06303B87DAFF1 [4]
>
> Other links for your review:
> * JIRA release notes [5]
> * source code tag "release-1.6.0-rc1" [6]
> * PR to update the website Downloads page to
> include Kubernetes Operator links [7]
>
> **Vote Duration**
>
> The voting time will run for at least 72 hours.
> It is adopted by majority approval, with at least 3 PMC affirmative votes.
>
> **Note on Verification**
>
> You can follow the basic verification guide here[8].
> Note that you don't need to verify everything yourself, but please make
> note of what you have tested together with your +- vote.
>
> Cheers!
> Gyula Fora
>
> [1]
>
> https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.6.0-rc1/
> [2]
> https://repository.apache.org/content/repositories/orgapacheflink-1646/
> [3] ghcr.io/apache/flink-kubernetes-operator:e7045a6
> [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> [5]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353230
> [6]
> https://github.com/apache/flink-kubernetes-operator/tree/release-1.6.0-rc1
> [7] https://github.com/apache/flink-web/pull/666
> [8]
>
> https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Kubernetes+Operator+Release
>