[jira] [Created] (FLINK-37009) Migrate PruneAggregateCallRule

2025-01-05 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-37009:
-

 Summary: Migrate PruneAggregateCallRule
 Key: FLINK-37009
 URL: https://issues.apache.org/jira/browse/FLINK-37009
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 2.0.0
Reporter: Jacky Lau
 Fix For: 2.0.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Is it a bug that the AdaptiveScheduler does not prioritize releasing TaskManagers during downscaling in Application mode?

2025-01-05 Thread Yuepeng Pan
Thank you Matthias and all for the feedback and suggestions.

Please let me make a brief summary based on the historical comments:
- It's agreed to optimize/fix this issue in the 1.x LTS versions.
- The primary goal of this optimization/fix is to minimize the number of 
TaskManagers used in application mode.
- The optimized logic should be controlled via a parameter.

I'd like to introduce the following parameter to control whether the optimized 
logic should be enabled:
- Name: jobmanager.adaptive-scheduler.resource.minimal-taskmanagers-preferred
- Type: boolean
- Default value: false
- Description: This parameter defines whether the adaptive scheduler prioritizes 
using the minimum number of TaskManagers when scheduling tasks.
Note: This parameter currently only applies to cases where 
execution.state-recovery.from-local is disabled.
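
For illustration, a rough sketch of how this could be declared as a typed 
ConfigOption (the holder class name is just a placeholder; the final location 
and wording are of course open to review):

import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

/** Sketch only; the holder class and the final wording are not decided yet. */
public final class AdaptiveSchedulerResourceOptions {

    public static final ConfigOption<Boolean> MINIMAL_TASKMANAGERS_PREFERRED =
            ConfigOptions.key(
                            "jobmanager.adaptive-scheduler.resource.minimal-taskmanagers-preferred")
                    .booleanType()
                    .defaultValue(false)
                    .withDescription(
                            "Whether the adaptive scheduler prioritizes using the minimum number"
                                    + " of TaskManagers when scheduling tasks. Currently only"
                                    + " applies when execution.state-recovery.from-local is"
                                    + " disabled.");

    private AdaptiveSchedulerResourceOptions() {}
}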

BTW, I'm uncertain whether the introduction of a parameter for this specific 
fix necessitates documentation via a FLIP.
If so, I'm willing to initiate a FLIP to aid in subsequent tasks.
If not, I will add a link to this email thread to the corresponding jira 
ticket's comments for tracking and start the work on the MR.

Any suggestion would be appreciated!

Thank you!

Best,
Yuepeng Pan

On 2025/01/05 18:41:11 Matthias Pohl wrote:
> Hi everyone and sorry for the late reply. I was mostly off in November and
> forgot about that topic in December last year.
> 
> Thanks for summarizing and bringing up user feedback. I see the problem and
> agree with your view that it's a topic that we might want to address in the
> 1.x LTS version. I see how this can be labeled as a bug or a feature
> depending on the perspective. I think adding this behavior while being
> guarded by a feature flag/configuration parameter in the 1.x LTS version is
> reasonable.
> 
> Best,
> Matthias
> 
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158873338#FLIP138:DeclarativeResourcemanagement-Howtodistributeslotsacrossdifferentjobs
> 
> On Wed, Nov 6, 2024 at 9:21 AM Rui Fan <1996fan...@gmail.com> wrote:
> 
> > Thanks Yuepeng for the PR and starting this discussion!
> >
> > And thanks Gyula and Yuanfeng for the input!
> >
> > I also agree to fix this behaviour in the 1.x line.
> >
> > The adaptive scheduler and rescaling API provide powerful capabilities to
> > increase or decrease parallelism.
> >
> > The main benefit I understand of decreasing parallelism is saving
> > resources.
> > If decreasing parallelism can't save resources, why do users decrease it?
> > This is why I think releasing TM resources when decreasing parallelism is
> > a basic capability that the Adaptive Scheduler should have.
> >
> > Please correct me if I miss anything, thanks~
> >
> > Also, I believe it does not work as the user expects. Because this
> > behaviour
> > was reported multiple times in the flink community, such as:
> > FLINK-33977[1],
> > FLINK-35594[2], FLINK-35903[3] and Slack channel[4].
> > And 1.20.x is a LTS version, so I agree to fix it in the 1.x line.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-33977
> > [2] https://issues.apache.org/jira/browse/FLINK-35594
> > [3] https://issues.apache.org/jira/browse/FLINK-35903
> > [4] https://apache-flink.slack.com/archives/C03G7LJTS2G/p1729167222445569
> >
> > Best,
> > Rui
> >
> > On Wed, Nov 6, 2024 at 4:15 PM yuanfeng hu  wrote:
> >
> >> > Is it considered an error if the adaptive scheduler fails to release the
> >> task manager during scaling?
> >>
> >> +1 . When we enable adaptive mode and perform scaling operations on tasks,
> >> a significant part of the goal is to reduce resource usage for the tasks.
> >> However, due to some logic in the adaptive scheduler's scheduling process,
> >> the task manager cannot be released, and the ultimate goal cannot be
> >> achieved. Therefore, I consider this to be a mistake.
> >>
> >> Additionally, many tasks are currently running in this mode and will
> >> continue to run for quite a long time (many users are in this situation).
> >> So whether or not it is considered a bug, I believe we need to fix it in
> >> the 1.x version.
> >>
> >> Yuepeng Pan  于2024年11月6日周三 14:32写道:
> >>
> >> > Hi, community.
> >> >
> >> >
> >> >
> >> >
> >> > When working on ticket[1] we have received some lively discussions and
> >> > valuable
> >> > feedback[2](thanks for Matthias, Rui, Gyula, Maximilian, Tison, etc.),
> >> the
> >> > main issues are that:
> >> >
> >> > When the job runs in an application cluster, could the default behavior
> >> of
> >> > AdaptiveScheduler not actively releasing Taskmanagers resources during
> >> > downscaling be considered a bug?
> >> >
> >> > If so,should we fix it in flink 1.x?
> >> >
> >> >
> >> >
> >> > I’d like to start a discussion to hear more comments about it to define
> >> > the next step and I have sorted out some information in the doc[3]
> >> > regarding this discussion for you.
> >> >
> >> >
> >> >
> >> > Looking forward to your comments and attention.
> >> >
> >> > Thank you.

Re: [DISCUSS] FLIP-499: Support Event Time by Generalized Watermark in DataStream V2

2025-01-05 Thread Xu Huang
Hi, Anil

Maybe the watermark alignment mechanism in DataStream V1 can solve your
problem; it can align the watermarks of source splits.
Please refer to the documentation [1][2].
We don't provide this feature in this FLIP; it is part of our future
planning.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-217%3A+Support+watermark+alignment+of+source+splits
[2]
https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment
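
For reference, a minimal sketch of the DataStream V1 alignment API (the event
type, its timestamp accessor, and the concrete drift/interval values are just
placeholders):

import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

// Splits/sources in the same alignment group that run too far ahead of the
// group watermark are paused until the slower ones catch up.
WatermarkStrategy<MyEvent> strategy =
        WatermarkStrategy.<MyEvent>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                .withTimestampAssigner((event, ts) -> event.getEventTime())
                .withWatermarkAlignment(
                        "alignment-group",       // group shared by the sources to align
                        Duration.ofSeconds(20),  // max allowed watermark drift
                        Duration.ofSeconds(1));  // update interval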

Best,
Xu Huang

Anil Dasari  于2025年1月3日周五 22:41写道:

>
> Hi Xu,Thanks for the response.I am currently using Spark Streaming to
> process data from Kafka in microbatches, writing each microbatch's data to
> a dedicated prefix in S3. Since Spark Streaming is lazy, it processes data
> only when a microbatch is created or triggered, leaving resources idle
> until then. This approach is not true streaming. To improve resource
> utilization and process data as it arrives, I am considering switching to
> Flink.
> Both Spark's Kafka source and Flink's Kafka source (with parallelism > 1)
> use multithreaded processing. However, Spark's Kafka source readers share a
> global microbatch epoch, ensuring a consistent view across readers. In
> contrast, Flink's Kafka source split readers do not share a global epoch or
> identifiers to divide the stream into chunks.
> It requires all split readers of a source should emit a special event
> which same epoch at the same time.
> Thanks
>
> On Friday, January 3, 2025 at 06:21:34 AM PST, Xu Huang <
> huangxu.wal...@gmail.com> wrote:
>
>  Hi, Anil
>
> I don't understand what you mean by Global Watermark, are you trying to
> have all Sources emit a special event with the same epoch at the same time?
> Is there a specific user case for this question?
>
> Happy new year!
>
> Best,
> Xu Huang
>
> Anil Dasari  于2025年1月3日周五 14:02写道:
>
> >  Hello XU,Happy new year. Thank you for FLIP-499 and FLIP-467.
> > I tried to split/chunk streams based by fixed timestamp intervals and
> > route them to the appropriate destination. A few months ago, I evaluated
> > the following options and found that Flink currently lacks direct support
> > for a global watermark or timer that can share consistent information
> (such
> > as an epoch or identifier) across task nodes.
> > 1. Windowing: While promising, this approach requires record-level checks
> > for flushing, as window data isn't accessible throughout the pipeline.
> > 2. Window + Trigger: This method buffers events until the trigger
> interval
> > is reached, impacting real-time processing since events are processed
> only
> > when the trigger fires.
> > 3. Processing Time: Processing time is localized to each file writer,
> > causing inconsistencies across task managers.
> > 4. Watermark: Watermarks are specific to each source task. Additionally,
> > the initial watermark (before the first event) is not epoch-based,
> leading
> > to further challenges.
> > Would global watermarks address this use case? If not, could this use
> case
> > align with any of the proposed FLIPs
> > Thanks in advance.
> >
> >On Thursday, January 2, 2025 at 09:06:31 PM PST, Xu Huang <
> > huangxu.wal...@gmail.com> wrote:
> >
> >  Hi Devs,
> >
> > Weijie Guo and I would like to initiate a discussion about FLIP-499:
> > Support Event Time by Generalized Watermark in DataStream V2
> > <
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-499%3A+Support+Event+Time+by+Generalized+Watermark+in+DataStream+V2
> > >
> > [1].
> >
> > Event time is a fundamental feature of Flink that has been widely
> adopted.
> > For instance, the Window operator can determine whether to trigger a
> window
> > based on event time, and users can register timer using the event time.
> > FLIP-467[2] introduces the Generalized Watermark in DataStream V2,
> enabling
> > users to define specific events that can be emitted from a source or
> other
> > operators, propagate along the streams, received by downstream operators,
> > and aligned during propagation. Within this framework, the traditional
> > (event-time) Watermark can be viewed as a special instance of the
> > Generalized Watermark already provided by Flink.
> >
> > To make it easy for users to use event time in DataStream V2, this FLIP
> > will implement event time extension in DataStream V2 based on Generalized
> > Watermark.
> >
> > For more details, please refer to FLIP-499 [1]. We look forward to your
> > feedback.
> >
> > Best,
> >
> > Xu Huang
> >
> > [1] https://cwiki.apache.org/confluence/x/pQz0Ew
> >
> > [2] https://cwiki.apache.org/confluence/x/oA6TEg
> >
>


Re: [DISCUSS] FLIP-499: Support Event Time by Generalized Watermark in DataStream V2

2025-01-05 Thread Junrui Lee
Hi, Xu

Thanks for your work. I noticed that in this FLIP, the event-time watermark is
created and sent through a separate WatermarkGenerator. I would like to know
whether there is support for the Source to send event-time watermarks.

Best,
Junrui

Xu Huang  于2025年1月6日周一 10:31写道:

> Hi, Anil
>
> Maybe the watermark alignment mechanism on DataStream V1 can solve your
> problem, it can align the watermark of SourceSplit.
> please refer to the documentation [1][2].
> And we don't provide this feature on this FLIP, this feature will be in our
> future planning.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-217%3A+Support+watermark+alignment+of+source+splits
> [2]
>
> https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment
>
> Best,
> Xu Huang
>
> Anil Dasari  于2025年1月3日周五 22:41写道:
>
> >
> > Hi Xu,Thanks for the response.I am currently using Spark Streaming to
> > process data from Kafka in microbatches, writing each microbatch's data
> to
> > a dedicated prefix in S3. Since Spark Streaming is lazy, it processes
> data
> > only when a microbatch is created or triggered, leaving resources idle
> > until then. This approach is not true streaming. To improve resource
> > utilization and process data as it arrives, I am considering switching to
> > Flink.
> > Both Spark's Kafka source and Flink's Kafka source (with parallelism > 1)
> > use multithreaded processing. However, Spark's Kafka source readers
> share a
> > global microbatch epoch, ensuring a consistent view across readers. In
> > contrast, Flink's Kafka source split readers do not share a global epoch
> or
> > identifiers to divide the stream into chunks.
> > It requires all split readers of a source should emit a special event
> > which same epoch at the same time.
> > Thanks
> >
> > On Friday, January 3, 2025 at 06:21:34 AM PST, Xu Huang <
> > huangxu.wal...@gmail.com> wrote:
> >
> >  Hi, Anil
> >
> > I don't understand what you mean by Global Watermark, are you trying to
> > have all Sources emit a special event with the same epoch at the same
> time?
> > Is there a specific user case for this question?
> >
> > Happy new year!
> >
> > Best,
> > Xu Huang
> >
> > Anil Dasari  于2025年1月3日周五 14:02写道:
> >
> > >  Hello XU,Happy new year. Thank you for FLIP-499 and FLIP-467.
> > > I tried to split/chunk streams based by fixed timestamp intervals and
> > > route them to the appropriate destination. A few months ago, I
> evaluated
> > > the following options and found that Flink currently lacks direct
> support
> > > for a global watermark or timer that can share consistent information
> > (such
> > > as an epoch or identifier) across task nodes.
> > > 1. Windowing: While promising, this approach requires record-level
> checks
> > > for flushing, as window data isn't accessible throughout the pipeline.
> > > 2. Window + Trigger: This method buffers events until the trigger
> > interval
> > > is reached, impacting real-time processing since events are processed
> > only
> > > when the trigger fires.
> > > 3. Processing Time: Processing time is localized to each file writer,
> > > causing inconsistencies across task managers.
> > > 4. Watermark: Watermarks are specific to each source task.
> Additionally,
> > > the initial watermark (before the first event) is not epoch-based,
> > leading
> > > to further challenges.
> > > Would global watermarks address this use case? If not, could this use
> > case
> > > align with any of the proposed FLIPs
> > > Thanks in advance.
> > >
> > >On Thursday, January 2, 2025 at 09:06:31 PM PST, Xu Huang <
> > > huangxu.wal...@gmail.com> wrote:
> > >
> > >  Hi Devs,
> > >
> > > Weijie Guo and I would like to initiate a discussion about FLIP-499:
> > > Support Event Time by Generalized Watermark in DataStream V2
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-499%3A+Support+Event+Time+by+Generalized+Watermark+in+DataStream+V2
> > > >
> > > [1].
> > >
> > > Event time is a fundamental feature of Flink that has been widely
> > adopted.
> > > For instance, the Window operator can determine whether to trigger a
> > window
> > > based on event time, and users can register timer using the event time.
> > > FLIP-467[2] introduces the Generalized Watermark in DataStream V2,
> > enabling
> > > users to define specific events that can be emitted from a source or
> > other
> > > operators, propagate along the streams, received by downstream
> operators,
> > > and aligned during propagation. Within this framework, the traditional
> > > (event-time) Watermark can be viewed as a special instance of the
> > > Generalized Watermark already provided by Flink.
> > >
> > > To make it easy for users to use event time in DataStream V2, this FLIP
> > > will implement event time extension in DataStream V2 based on
> Generalized
> > > Watermark.
> > >
> > > For more details, please refer to FLIP-499 [1]. We look forward to your
> > > feedba

Re: [DISCUSS] FLIP-499: Support Event Time by Generalized Watermark in DataStream V2

2025-01-05 Thread Xu Huang
Hi, Junrui

Thanks for your question. The Source is able to send event-time watermarks.
They can be sent via the SourceReaderContext as mentioned in FLIP-467, for
example:

sourceReaderContext.emitWatermark(EventTimeExtension.EVENT_TIME_WATERMARK_DECLARATION.newWatermark(eventTime))

We will add it to the JavaDoc or FLIP to clarify this, thanks for the
question!
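
For anyone following the thread, a rough sketch of where such a call could sit
inside a custom SourceReader (EventTimeExtension and
SourceReaderContext#emitWatermark are still FLIP-467/499 proposals, so the
exact names may change; the record type is a placeholder):

import org.apache.flink.api.connector.source.ReaderOutput;
import org.apache.flink.api.connector.source.SourceReaderContext;
import org.apache.flink.core.io.InputStatus;

// Sketch only: the watermark-emission API below is proposed in FLIP-467/499
// and is not part of a released Flink version yet.
class EventTimeEmittingReader {

    private final SourceReaderContext readerContext;

    EventTimeEmittingReader(SourceReaderContext readerContext) {
        this.readerContext = readerContext;
    }

    // Called from the reader's pollNext(); the remaining SourceReader methods are omitted.
    InputStatus emitRecord(MyRecord record, ReaderOutput<MyRecord> output) {
        output.collect(record, record.getEventTime());
        // Emit an event-time watermark derived from the record's timestamp.
        readerContext.emitWatermark(
                EventTimeExtension.EVENT_TIME_WATERMARK_DECLARATION
                        .newWatermark(record.getEventTime()));
        return InputStatus.MORE_AVAILABLE;
    }
}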

Best,
Xu Huang

Junrui Lee  于2025年1月6日周一 11:01写道:

> Hi, Xu
>
> Thanks for your work, I noticed that in this FLIP, event-time watermark is
> created and sent through a separate WatermarkGenerator. I would like to
> know if there is support for Source to send event-time watermark?
>
> Best,
> Junrui
>
> Xu Huang  于2025年1月6日周一 10:31写道:
>
> > Hi, Anil
> >
> > Maybe the watermark alignment mechanism on DataStream V1 can solve your
> > problem, it can align the watermark of SourceSplit.
> > please refer to the documentation [1][2].
> > And we don't provide this feature on this FLIP, this feature will be in
> our
> > future planning.
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-217%3A+Support+watermark+alignment+of+source+splits
> > [2]
> >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment
> >
> > Best,
> > Xu Huang
> >
> > Anil Dasari  于2025年1月3日周五 22:41写道:
> >
> > >
> > > Hi Xu,Thanks for the response.I am currently using Spark Streaming to
> > > process data from Kafka in microbatches, writing each microbatch's data
> > to
> > > a dedicated prefix in S3. Since Spark Streaming is lazy, it processes
> > data
> > > only when a microbatch is created or triggered, leaving resources idle
> > > until then. This approach is not true streaming. To improve resource
> > > utilization and process data as it arrives, I am considering switching
> to
> > > Flink.
> > > Both Spark's Kafka source and Flink's Kafka source (with parallelism >
> 1)
> > > use multithreaded processing. However, Spark's Kafka source readers
> > share a
> > > global microbatch epoch, ensuring a consistent view across readers. In
> > > contrast, Flink's Kafka source split readers do not share a global
> epoch
> > or
> > > identifiers to divide the stream into chunks.
> > > It requires all split readers of a source should emit a special event
> > > which same epoch at the same time.
> > > Thanks
> > >
> > > On Friday, January 3, 2025 at 06:21:34 AM PST, Xu Huang <
> > > huangxu.wal...@gmail.com> wrote:
> > >
> > >  Hi, Anil
> > >
> > > I don't understand what you mean by Global Watermark, are you trying to
> > > have all Sources emit a special event with the same epoch at the same
> > time?
> > > Is there a specific user case for this question?
> > >
> > > Happy new year!
> > >
> > > Best,
> > > Xu Huang
> > >
> > > Anil Dasari  于2025年1月3日周五 14:02写道:
> > >
> > > >  Hello XU,Happy new year. Thank you for FLIP-499 and FLIP-467.
> > > > I tried to split/chunk streams based by fixed timestamp intervals and
> > > > route them to the appropriate destination. A few months ago, I
> > evaluated
> > > > the following options and found that Flink currently lacks direct
> > support
> > > > for a global watermark or timer that can share consistent information
> > > (such
> > > > as an epoch or identifier) across task nodes.
> > > > 1. Windowing: While promising, this approach requires record-level
> > checks
> > > > for flushing, as window data isn't accessible throughout the
> pipeline.
> > > > 2. Window + Trigger: This method buffers events until the trigger
> > > interval
> > > > is reached, impacting real-time processing since events are processed
> > > only
> > > > when the trigger fires.
> > > > 3. Processing Time: Processing time is localized to each file writer,
> > > > causing inconsistencies across task managers.
> > > > 4. Watermark: Watermarks are specific to each source task.
> > Additionally,
> > > > the initial watermark (before the first event) is not epoch-based,
> > > leading
> > > > to further challenges.
> > > > Would global watermarks address this use case? If not, could this use
> > > case
> > > > align with any of the proposed FLIPs
> > > > Thanks in advance.
> > > >
> > > >On Thursday, January 2, 2025 at 09:06:31 PM PST, Xu Huang <
> > > > huangxu.wal...@gmail.com> wrote:
> > > >
> > > >  Hi Devs,
> > > >
> > > > Weijie Guo and I would like to initiate a discussion about FLIP-499:
> > > > Support Event Time by Generalized Watermark in DataStream V2
> > > > <
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-499%3A+Support+Event+Time+by+Generalized+Watermark+in+DataStream+V2
> > > > >
> > > > [1].
> > > >
> > > > Event time is a fundamental feature of Flink that has been widely
> > > adopted.
> > > > For instance, the Window operator can determine whether to trigger a
> > > window
> > > > based on event time, and users can register timer using the event
> time.
> > > > FLIP-467[2] introduces the Generalized Watermark in DataStream 

[jira] [Created] (FLINK-37008) Flink UI should show the type of checkpoint (full vs incremental)

2025-01-05 Thread Ryan van Huuksloot (Jira)
Ryan van Huuksloot created FLINK-37008:
--

 Summary: Flink UI should show the type of checkpoint (full vs 
incremental)
 Key: FLINK-37008
 URL: https://issues.apache.org/jira/browse/FLINK-37008
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing, Runtime / Web Frontend
Affects Versions: 2.0-preview
Reporter: Ryan van Huuksloot


There is no way to tell from the Flink UI if a checkpoint is full or 
incremental. 

 

It would be useful for folks strictly using the UI to have an easy way to see 
if a checkpoint is a full or incremental checkpoint. The information is 
available but not set in the correct places to have it exposed in the UI.

 

At the same time we should make it available in the API so in the future it 
could be used in other contexts like the operator.

 

Community Thread: 
[https://lists.apache.org/thread/0hnn82jrfog18337n3x56wv8n7rrw2rg]

 

I've opened an example PR. It is lacking sophistication but hopefully it starts 
the conversation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-37010) Unify KeyedProcessFunction and the async one

2025-01-05 Thread Zakelly Lan (Jira)
Zakelly Lan created FLINK-37010:
---

 Summary: Unify KeyedProcessFunction and the async one
 Key: FLINK-37010
 URL: https://issues.apache.org/jira/browse/FLINK-37010
 Project: Flink
  Issue Type: Sub-task
Reporter: Zakelly Lan
Assignee: Zakelly Lan


In FLINK-36120, we introduced `AsyncKeyedProcessFunction` and 
`AsyncKeyedProcessOperator` to perform processing with async state. However, in 
the table runtime all functions extend `KeyedProcessFunction`, so it is better 
to unify the base class for these functions.
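
For context, a minimal example of the kind of function the table runtime builds 
on today (the counting logic is purely illustrative, not table-runtime code); 
the idea of the ticket is that the async-state variant should share this base 
class instead of living in a parallel hierarchy:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Illustration only: a plain KeyedProcessFunction with synchronous state access.
public class CountPerKey extends KeyedProcessFunction<String, String, Long> {

    private transient ValueState<Long> count;

    @Override
    public void processElement(String value, Context ctx, Collector<Long> out) throws Exception {
        if (count == null) {
            count = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("count", Long.class));
        }
        Long current = count.value();
        long updated = (current == null ? 0L : current) + 1L;
        count.update(updated);
        out.collect(updated);
    }
}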



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Re: [VOTE] FLIP-494: Add missing createTable/createView methods to TableEnvironment

2025-01-05 Thread Sergey Nuyanzin
also myself
+1 (binding)

On Thu, Jan 2, 2025 at 4:08 AM Xuyang  wrote:

> +1 (non-binding)
>
>
> --
>
> Best!
> Xuyang
>
>
>
>
>
> 在 2024-12-31 16:48:15,"Dawid Wysakowicz"  写道:
> >+1(binding)
> >Best,
> >Dawid
> >
> >On Tue, 24 Dec 2024 at 08:27, Feng Jin  wrote:
> >
> >> +1  (non-binding)
> >>
> >> Best,
> >> Feng Jin
> >>
> >>
> >> On Tue, Dec 24, 2024 at 10:42 AM Yuepeng Pan 
> >> wrote:
> >>
> >> > +1 (non-binding)
> >> >
> >> > Best,
> >> > Yuepeng Pan
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > 在 2024-12-24 10:37:57,"Ron Liu"  写道:
> >> > >+1(binding)
> >> > >
> >> > >Best,
> >> > >Ron
> >> > >
> >> > >Sergey Nuyanzin  于2024年12月20日周五 18:52写道:
> >> > >
> >> > >> Hi everyone,
> >> > >>
> >> > >> I'd like to start a vote on FLIP-494: Add missing
> >> createTable/createView
> >> > >> methods to TableEnvironment [1] which has been discussed in this
> >> thread
> >> > >> [2].
> >> > >>
> >> > >> The vote will be open for at least 72 hours unless there is an
> >> objection
> >> > >> or not enough votes.
> >> > >>
> >> > >> [1]
> >> > >>
> >> >
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=334760466
> >> > >> [2]
> https://lists.apache.org/thread/k1dgx03xmbo3pw3hxtqjr8wcvz0vhdjs
> >> > >>
> >> > >> --
> >> > >> Best regards,
> >> > >> Sergey
> >> > >>
> >> >
> >>
>


-- 
Best regards,
Sergey


[RESULT][VOTE] FLIP-494: Add missing createTable/createView methods to TableEnvironment

2025-01-05 Thread Sergey Nuyanzin
Hi everyone,

I'm delighted to announce that FLIP-494 [1] has been accepted.

There were 6 +1 votes, 3 of which are binding:

- Ron Liu (binding)
- Yuepeng Pan (non-binding)
- Feng Jin (non-binding)
- Dawid Wysakowicz (binding)
- Xuyang (non-binding)
- Sergey Nuyanzin (binding)

There were no -1 votes.

Thanks everyone!

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=334760466

-- 
Best regards,
Sergey


Re: Re: [VOTE] FLIP-494: Add missing createTable/createView methods to TableEnvironment

2025-01-05 Thread Sergey Nuyanzin
The voting time for FLIP-494: Add missing createTable/createView methods to
TableEnvironment [1] has passed.
  I'm closing the vote now.

  [1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=334760466

On Sun, Jan 5, 2025 at 7:21 PM Sergey Nuyanzin  wrote:

> also myself
> +1 (binding)
>
> On Thu, Jan 2, 2025 at 4:08 AM Xuyang  wrote:
>
>> +1 (non-binding)
>>
>>
>> --
>>
>> Best!
>> Xuyang
>>
>>
>>
>>
>>
>> 在 2024-12-31 16:48:15,"Dawid Wysakowicz"  写道:
>> >+1(binding)
>> >Best,
>> >Dawid
>> >
>> >On Tue, 24 Dec 2024 at 08:27, Feng Jin  wrote:
>> >
>> >> +1  (non-binding)
>> >>
>> >> Best,
>> >> Feng Jin
>> >>
>> >>
>> >> On Tue, Dec 24, 2024 at 10:42 AM Yuepeng Pan 
>> >> wrote:
>> >>
>> >> > +1 (non-binding)
>> >> >
>> >> > Best,
>> >> > Yuepeng Pan
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > 在 2024-12-24 10:37:57,"Ron Liu"  写道:
>> >> > >+1(binding)
>> >> > >
>> >> > >Best,
>> >> > >Ron
>> >> > >
>> >> > >Sergey Nuyanzin  于2024年12月20日周五 18:52写道:
>> >> > >
>> >> > >> Hi everyone,
>> >> > >>
>> >> > >> I'd like to start a vote on FLIP-494: Add missing
>> >> createTable/createView
>> >> > >> methods to TableEnvironment [1] which has been discussed in this
>> >> thread
>> >> > >> [2].
>> >> > >>
>> >> > >> The vote will be open for at least 72 hours unless there is an
>> >> objection
>> >> > >> or not enough votes.
>> >> > >>
>> >> > >> [1]
>> >> > >>
>> >> >
>> >>
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=334760466
>> >> > >> [2]
>> https://lists.apache.org/thread/k1dgx03xmbo3pw3hxtqjr8wcvz0vhdjs
>> >> > >>
>> >> > >> --
>> >> > >> Best regards,
>> >> > >> Sergey
>> >> > >>
>> >> >
>> >>
>>
>
>
> --
> Best regards,
> Sergey
>


-- 
Best regards,
Sergey


Re: [DISCUSS] Is it a bug that the AdaptiveScheduler does not prioritize releasing TaskManagers during downscaling in Application mode?

2025-01-05 Thread Matthias Pohl
Hi everyone and sorry for the late reply. I was mostly off in November and
forgot about that topic in December last year.

Thanks for summarizing and bringing up user feedback. I see the problem and
agree with your view that it's a topic that we might want to address in the
1.x LTS version. I see how this can be labeled as a bug or a feature
depending on the perspective. I think adding this behavior while being
guarded by a feature flag/configuration parameter in the 1.x LTS version is
reasonable.

Best,
Matthias

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158873338#FLIP138:DeclarativeResourcemanagement-Howtodistributeslotsacrossdifferentjobs

On Wed, Nov 6, 2024 at 9:21 AM Rui Fan <1996fan...@gmail.com> wrote:

> Thanks Yuepeng for the PR and starting this discussion!
>
> And thanks Gyula and Yuanfeng for the input!
>
> I also agree to fix this behaviour in the 1.x line.
>
> The adaptive scheduler and rescaling API provide powerful capabilities to
> increase or decrease parallelism.
>
> The main benefit I understand of decreasing parallelism is saving
> resources.
> If decreasing parallelism can't save resources, why do users decrease it?
> This is why I think releasing TM resources when decreasing parallelism is
> a basic capability that the Adaptive Scheduler should have.
>
> Please correct me if I miss anything, thanks~
>
> Also, I believe it does not work as the user expects. Because this
> behaviour
> was reported multiple times in the flink community, such as:
> FLINK-33977[1],
> FLINK-35594[2], FLINK-35903[3] and Slack channel[4].
> And 1.20.x is a LTS version, so I agree to fix it in the 1.x line.
>
> [1] https://issues.apache.org/jira/browse/FLINK-33977
> [2] https://issues.apache.org/jira/browse/FLINK-35594
> [3] https://issues.apache.org/jira/browse/FLINK-35903
> [4] https://apache-flink.slack.com/archives/C03G7LJTS2G/p1729167222445569
>
> Best,
> Rui
>
> On Wed, Nov 6, 2024 at 4:15 PM yuanfeng hu  wrote:
>
>> > Is it considered an error if the adaptive scheduler fails to release the
>> task manager during scaling?
>>
>> +1 . When we enable adaptive mode and perform scaling operations on tasks,
>> a significant part of the goal is to reduce resource usage for the tasks.
>> However, due to some logic in the adaptive scheduler's scheduling process,
>> the task manager cannot be released, and the ultimate goal cannot be
>> achieved. Therefore, I consider this to be a mistake.
>>
>> Additionally, many tasks are currently running in this mode and will
>> continue to run for quite a long time (many users are in this situation).
>> So whether or not it is considered a bug, I believe we need to fix it in
>> the 1.x version.
>>
>> Yuepeng Pan  于2024年11月6日周三 14:32写道:
>>
>> > Hi, community.
>> >
>> >
>> >
>> >
>> > When working on ticket[1] we have received some lively discussions and
>> > valuable
>> > feedback[2](thanks for Matthias, Rui, Gyula, Maximilian, Tison, etc.),
>> the
>> > main issues are that:
>> >
>> > When the job runs in an application cluster, could the default behavior
>> of
>> > AdaptiveScheduler not actively releasing Taskmanagers resources during
>> > downscaling be considered a bug?
>> >
>> > If so,should we fix it in flink 1.x?
>> >
>> >
>> >
>> > I’d like to start a discussion to hear more comments about it to define
>> > the next step and I have sorted out some information in the doc[3]
>> > regarding this discussion for you.
>> >
>> >
>> >
>> > Looking forward to your comments and attention.
>> >
>> > Thank you.
>> >
>> > Best,
>> > Yuepeng Pan
>> >
>> >
>> >
>> >
>> > [1] https://issues.apache.org/jira/browse/FLINK-33977
>> >
>> > [2] https://github.com/apache/flink/pull/25218#issuecomment-2401913141
>> >
>> > [3]
>> >
>> https://docs.google.com/document/d/1Rwwl2aGVz9g5kUJFMP5GMlJwzEO_a-eo4gPf7gITpdw/edit?tab=t.0#heading=h.s4i4hehbbli5
>> >
>> >
>> >
>>
>> --
>> Best,
>> Yuanfeng
>>
>


[jira] [Created] (FLINK-36934) Enrich numSubpartitions to TierMasterAgent#addPartitionAndGetShuffleDescriptor

2025-01-05 Thread Weijie Guo (Jira)
Weijie Guo created FLINK-36934:
--

 Summary: Enrich numSubpartitions to 
TierMasterAgent#addPartitionAndGetShuffleDescriptor
 Key: FLINK-36934
 URL: https://issues.apache.org/jira/browse/FLINK-36934
 Project: Flink
  Issue Type: Improvement
Reporter: Weijie Guo






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-37007) FLIP-494: Add missing createTable/createView methods to TableEnvironment

2025-01-05 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-37007:
---

 Summary: FLIP-494: Add missing createTable/createView methods to 
TableEnvironment
 Key: FLINK-37007
 URL: https://issues.apache.org/jira/browse/FLINK-37007
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Reporter: Sergey Nuyanzin
Assignee: Sergey Nuyanzin


https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=334760466



--
This message was sent by Atlassian Jira
(v8.20.10#820010)