Re: [PR] add announcement blog post for Flink 1.18 [flink-web]

via GitHub Fri, 06 Oct 2023 06:59:18 -0700


snuyanzin commented on code in PR #680:
URL: https://github.com/apache/flink-web/pull/680#discussion_r1348759823



##########
docs/content/posts/2023-10-10-release-1.18.0.md:
##########
@@ -0,0 +1,542 @@
+---
+authors:
+- JingGe:
+  name: "Jing Ge"
+  twitter: jingengineer
+- KonstantinKnauf:
+  name: "Konstantin Knauf"
+  twitter: snntrable
+- SergeyNuyanzin:
+  name: "Sergey Nuyanzin"
+  twitter: uckamello
+- QingshengRen:
+  name: "Qingsheng Ren"
+  twitter: renqstuite
+date: "2023-10-10T08:00:00Z"
+subtitle: ""
+title: Announcing the Release of Apache Flink 1.18
+aliases:
+- /news/2023/10/10/release-1.18.0.html
+---
+
+The Apache Flink PMC is pleased to announce the release of Apache Flink 
1.18.0. As usual, we are looking at a packed 
+release with a wide variety of improvements and new features. Overall, 176 
people contributed to this release completing 
+18 FLIPS and 700+ issues. Thank you!
+
+Let's dive into the highlights.
+
+# Towards a Streaming Lakehouse
+
+## Flink SQL Improvements
+
+### Introduce Flink JDBC Driver For Sql Gateway 
+
+Flink 1.18 comes with a JDBC Driver for the Flink SQL Gateway. So, you can now 
use any SQL Client that supports JDBC to 
+interact with your tables via Flink SQL. Here is an example using 
[SQLLine](https://julianhyde.github.io/sqlline/manual.html). 
+
+```shell
+sqlline> !connect jdbc:flink://localhost:8083
+```
+
+```shell
+sqlline version 1.12.0
+sqlline> !connect jdbc:flink://localhost:8083
+Enter username for jdbc:flink://localhost:8083:
+Enter password for jdbc:flink://localhost:8083:
+0: jdbc:flink://localhost:8083> CREATE TABLE T(
+. . . . . . . . . . . . . . .)>      a INT,
+. . . . . . . . . . . . . . .)>      b VARCHAR(10)
+. . . . . . . . . . . . . . .)>  ) WITH (
+. . . . . . . . . . . . . . .)>      'connector' = 'filesystem',
+. . . . . . . . . . . . . . .)>      'path' = 'file:///tmp/T.csv',
+. . . . . . . . . . . . . . .)>      'format' = 'csv'
+. . . . . . . . . . . . . . .)>  );
+No rows affected (0.122 seconds)
+0: jdbc:flink://localhost:8083> INSERT INTO T VALUES (1, 'Hi'), (2, 'Hello');
++----------------------------------+
+|              job id              |
++----------------------------------+
+| fbade1ab4450fc57ebd5269fdf60dcfd |
++----------------------------------+
+1 row selected (1.282 seconds)
+0: jdbc:flink://localhost:8083> SELECT * FROM T;
++---+-------+
+| a |   b   |
++---+-------+
+| 1 | Hi    |
+| 2 | Hello |
++---+-------+
+2 rows selected (1.955 seconds)
+0: jdbc:flink://localhost:8083>
+```
+
+**More Information**
+* 
[Documentation](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/jdbcdriver/)
 
+* [FLIP-293: Introduce Flink Jdbc Driver For Sql 
Gateway](https://cwiki.apache.org/confluence/display/FLINK/FLIP-293%3A+Introduce+Flink+Jdbc+Driver+For+Sql+Gateway)
+
+
+### Stored Procedures
+
+Stored Procedures provide a convenient way to encapsulate complex logic to 
perform data manipulation or administrative 
+tasks in Apache Flink itself. Therefore， Flink introduces the support for 
calling stored procedures. 
+Flink now allows catalog developers to develop their own built-in stored 
procedures and then enables users to call these
+predefined stored procedures.
+
+**More Information**
+* 
[Documentation](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/procedures/)
+* [FLIP-311: Support Call Stored 
Procedure](https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure)
+
+### Extended DDL Support
+
+From this release onwards, Flink supports
+
+- `REPLACE TABLE AS SELECT`
+- `CREATE OR REPLACE TABLE AS SELECT`
+
+and both these commands and previously supported `CREATE TABLE AS` can now 
support atomicity provided the underlying 
+connector supports this.
+
+Moreover, Apache Flink now supports TRUNCATE TABLE in batch execution mode. As 
before, the underlying connector needs 
+to implement and provide this capability
+
+And, finally, we have also added support for adding, dropping and listing 
partitions via
+
+- `ALTER TABLE ADD PARTITION`
+- `ALTER TABLE DROP PARTITION`
+- `SHOW PARTITIONS`
+
+**More Information**
+- [Documentation on 
TRUNCATE](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/truncate/)
+- [Documentation on CREATE OR 
REPLACE](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#create-or-replace-table)
+- [Documentation on ALTER 
TABLE](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/alter/#alter-table)
+- [FLIP-302: Support TRUNCATE TABLE statement in batch 
mode](https://cwiki.apache.org/confluence/display/FLINK/FLIP-302%3A+Support+TRUNCATE+TABLE+statement+in+batch+mode)
+- [FLIP-303: Support REPLACE TABLE AS SELECT 
statement](https://cwiki.apache.org/confluence/display/FLINK/FLIP-303%3A+Support+REPLACE+TABLE+AS+SELECT+statement)
+- [FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) 
statement](https://cwiki.apache.org/confluence/display/FLINK/FLIP-305%3A+Support+atomic+for+CREATE+TABLE+AS+SELECT%28CTAS%29+statement)
+
+### Time Travelling
+
+Flink supports the time travel SQL syntax for querying historical versions of 
data that allows users to specify a point 
+in time and retrieve the data and schema of a table as it appeared at that 
time. With time travel, users can easily 
+analyze and compare historical versions of data.
+
+**More information**
+- 
[Documentation](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/time-travel/)
+- [FLIP-308: Support Time 
Travel](https://cwiki.apache.org/confluence/display/FLINK/FLIP-308%3A+Support+Time+Travel)
+
+## Streaming Execution Improvements
+
+### Support Operator-Level State TTL
+
+Starting from Flink 1.18, Table API and SQL users can set state time-to-live 
(TTL) individually for stateful operators.
+This means that for scenarios like stream regular joins, users can now set 
different TTLs for the left and right 
+streams. In previous versions, state expiration could only be controlled at 
the pipeline level using the configuration 
+`table.exec.state.ttl`. With the introduction of operator-level state 
retention, users can now optimize resource 
+usage according to their specific requirements.
+
+**More Information**
+- 
[Documentation](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/overview/#configure-operator-level-state-ttl)
+- [FLIP-292: Enhance COMPILED PLAN to support operator-level state TTL 
configuration](https://cwiki.apache.org/confluence/display/FLINK/FLIP-292%3A+Enhance+COMPILED+PLAN+to+support+operator-level+state+TTL+configuration)
+
+### Watermark Alignment and Idleness Detection in SQL
+
+You can now configure [watermark 
alignment](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment)
 
+and [source idleness 
timeouts](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/datastream/event-time/generating_watermarks/#dealing-with-idle-sources)
 
+in pure SQL via hints. Previously, these features were only available in the 
DataStream API.
+
+**More Information**
+- 
[Documentation](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/hints/)
+- [FLIP-296: Extend watermark-related features for 
SQL](https://cwiki.apache.org/confluence/display/FLINK/FLIP-296%3A+Extend+watermark-related+features+for+SQL)
+
+## Batch Execution Improvements
+
+### Hybrid Shuffle supports Remote Storage
+
+Hybrid Shuffle supports storing the shuffle data in remote storage. The remote 
storage path can be configured with the 
+option `taskmanager.network.hybrid-shuffle.remote.path`. Hybrid Shuffle uses 
less network memory than before by 
+decoupling the memory usage from the number of parallelisms, improving the 
stability and ease of use. 
+
+For more detailed 
+information, please refer to 
+
+**More Information**
+* 
[Documentation](https://nightlies.apache.org/flink/flink-docs-master/docs/ops/batch/batch\_shuffle/#hybrid-shuffle)
+* [FLIP-301: Hybrid Shuffle supports Remote 
Storage](https://cwiki.apache.org/confluence/display/FLINK/FLIP-301%3A+Hybrid+Shuffle+supports+Remote+Storage)
+
+
+### Performance Improvements & TPC-DS Benchmark
+
+In previous releases, the community has done a lot of work to improve batch 
processing performance, which has led to 
+significant improvements. In this release cycle, community contributors have 
continued to put a lot of effort into 
+improving batch performance.
+
+#### Runtime Filter for Flink SQL
+
+Runtime filter is a common optimization to improve join performance. It is 
designed to dynamically generate filter 
+conditions for certain Join queries at runtime to reduce the amount of scanned 
or shuffled data, avoid unnecessary I/O 
+and network transmission, and speed up the query. We introduced runtime 
filters in Flink 1.18, and verified its 
+effectiveness through the TPC-DS benchmark, and observed up to 3x speedup for 
some queries by enabling this feature.
+
+#### Operator Fusion Codegen for Flink SQL
+
+Operator Fusion Codegen improves the execution performance of a query by 
fusing an operator DAG into a single optimized 
+operator that eliminates virtual function calls, leverages CPU registers for 
intermediate data and reduces the 
+instruction cache miss. As a general technical optimization, we verified its 
effectiveness through TPC-DS, and 
+only some batch operators completed fusion codegen support in version 1.18, 
getting significant performance gains on 
+some query.
+
+Note that both features are considered experimental and disabled by default in 
Flink 1.18. 
+They can be enabled using `table.optimizer.runtime-filter.enabled` and 
`able.exec.operator-fusion-codegen.enabled` 
+respectively.
+
+Since Flink 1.16, the Apache Flink Community has been continuously tracking 
the performance of its batch engine via the 
+TPC-DS benchmarking framework. After significant improvements in Flink 1.17 
(dynamic join-reordering, 
+dynamic local aggregations), the two improvements described in the previous 
sections lead to 12% performance improvement
+compared to Flink 1.17 , a 35% performance improvement compared to Flink 1.16 
on a 10T dataset for partitioned tables.
+
+<div style="text-align: center;">
+<img src="/img/blog/2023-10-10-release-1.18.0/tpc-ds-benchmark.png" 
style="width:90%;margin:15px">
+</div>
+
+**More Information**
+* [FLIP-324: Introduce Runtime Filter for Flink Batch 
Jobs](https://cwiki.apache.org/confluence/display/FLINK/FLIP-324%3A+Introduce+Runtime+Filter+for+Flink+Batch+Jobs)
+* [FLIP-315: Support Operator Fusion Codegen for Flink 
SQL](https://cwiki.apache.org/confluence/display/FLINK/FLIP-315+Support+Operator+Fusion+Codegen+for+Flink+SQL)
+
+# Towards Cloud-Native Elasticity
+
+Elasticity describes the ability of a system to adapt to workload changes in a 
non-disruptive ideally automatic manner.
+It is a defining characteristic of cloud-native systems and for long-running 
streaming workloads it is particularly 
+important. As such elasticity improvements are an area of continuous 
investment in the Apache Flink community. 
+Recent initiatives include the Kubernetes 
+[Autoscaler](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.6/docs/custom-resource/autoscaler/),
 
+numerous improvements to rescaling performance and last but not least 
+the [Adaptive 
Scheduler](https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#adaptive-scheduler).
+
+The Adaptive Scheduler was first introduced in Flink 1.15 and constitutes a 
centerpiece of a fully-elastic 
+Apache Flink deployment. At its core, it allows jobs to change their resource 
requirements and parallelism during 
+runtime. In addition, it also adapts to the available resources in the cluster 
by only rescaling once the cluster can 
+satisfy the minimum required resources of the job. 
+
+Until Flink 1.18, the adaptive scheduler was primarily used in
+[Reactive 
Mode](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/elastic_scaling/#reactive-mode),
 which meant that a single job by design would always use all the available 
resources in the cluster. Please see [this blog 
post](https://flink.apache.org/2021/05/06/scaling-flink-automatically-with-reactive-mode/)
 on how to autoscale Flink Jobs in Reactive Mode using a Horizontal Pod 
Autoscaler on Kubernetes.
+
+With Flink 1.18 the adaptive scheduler becomes much more powerful and more 
widely applicable and is on a trajectory to becoming the default scheduler for 
streaming workloads on Apache Flink.
+
+
+## Dynamic Fine-Grained Rescaling via REST API
+
+Despite the underlying capabilities of the Adaptive Scheduler, the ability to 
change the resource requirements of a 
+Job during runtime had not yet been exposed to the end user directly. This 
changes in Flink 1.18. You can now change 
+the parallelism of any individual task of your job via the Flink Web UI and 
REST API while the job is running.
+
+{{< youtube B1NVDTazsZY >}}
+
+Under the hood, Apache Flink performs a regular rescaling operation as soon as 
the required resources for the new 
+parallelism have been acquired. The rescaling operation is not based on a 
Savepoint, but on an ordinary, periodic 
+checkpoint, which means it doesn’t introduce any additional snapshot. As you 
can see in the video above, the rescaling
+operation happens nearly instantaneously and with a very short downtime for 
jobs with small state size.
+
+In conjunction with the 
+[backpressure 
monitor](https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/monitoring/back_pressure/)
 
+of the Apache Flink Web UI, it is now easier than ever to find and maintain an 
efficient, backpressure-free parallelism 
+for each of the tasks:
+
+- If a task is very busy (red), you increase the parallelism.
+- If a task is mostly idle (blue), you decrease the parallelism.
+
+<div style="text-align: center;">
+<img src="/img/blog/2023-10-10-release-1.18.0/backpressure_monitor.png" 
style="width:90%;margin:15px">
+</div>
+
+**More Information**
+* [Documentation](missing)
+* [FLIP-291: Externalized Declarative Resource 
Management](https://cwiki.apache.org/confluence/display/FLINK/FLIP-291%3A+Externalized+Declarative+Resource+Management)
+
+## Faster Rescaling with RocksDB
+
+The rescaling times when using RocksDB Statebackend with incremental 
checkpoints have been improved about 30% in the 99th quantile.
+
+We increased the potential for parallel download from just downloading state 
handles in parallel to downloading individual files in parallel.
+
+Furthermore, we deactivated write-ahead-logging for batch-inserting into the 
temporary RocksDB instances we use for rescaling.
+
+<div style="text-align: center;">
+<img src="/img/blog/2023-10-10-release-1.18.0/rescaling_performance.png" 
style="width:90%;margin:15px">
+</div>
+
+**More Information**
+* [FLINK-32326](https://issues.apache.org/jira/browse/FLINK-32326)
+* [FLINK-32345](https://issues.apache.org/jira/browse/FLINK-32345)
+
+# Support for Java 17 
+
+Java 17 was released in 2021 and is the latest long term support (LTS) release 
of Java with an end of life in 2029. 
+So, it was about time that Apache Flink added support for it. What does this 
mean concretely? As of Flink 1.18, you can
+now run Apache Flink on Java 17 and the [official Docker 
repository](https://hub.docker.com/_/flink) includes an image 
+based on Java 17.
+
+    docker pull flink:1.18.0-java17
+
+If your cluster runs on Java 17, this of course, also allows you to use Java 
17 features in your user programs and to
+compile it to a target version of Java 17.
+
+**More Information**
+* 
[Documentation](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/java_compatibility/)
+* [FLINK-15736](https://issues.apache.org/jira/browse/FLINK-15736)
+
+# Others Improvements
+
+## Production-Ready Watermark Alignment
+
+Supported as “Beta” since Flink 1.16 and Flink 1.17, 
+[watermark 
alignment](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment)
 
+has by now been thoroughly tested at scale in the real world. Over that time 
the community has collected and addressed 
+bugs and performance issues as they were discovered. With the resolution of 
these issues, we are now happy to recommend 
+watermark alignment for general use.
+
+**More Information**
+* [FLINK-32548](https://issues.apache.org/jira/browse/FLINK-32548)
+
+## Pluggable Failure Handling
+
+Apache Flink serves as the foundation for numerous stream processing platforms 
at companies like Apple, Netflix or uber. It is also the basis for various 
commercial stream processing services. Therefore, its ability to easily 
integrate into the wider ecosystem of these internal as well as vendor 
platforms becomes increasingly important. The catalog modification listener and 
pluggable failure handlers fall into this category of improvements.
+
+**More Information**
+* 
[Documentation](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/advanced/failure_enrichers/)
+* [FLIP-304: Pluggable Failure 
Enrichers](https://cwiki.apache.org/confluence/display/FLINK/FLIP-304%3A+Pluggable+failure+handling+for+Apache+Flink)
+
+## SQL Client Quality of Life Improvements
+
+In 1.18 as a part of 
[FLIP-189](https://cwiki.apache.org/confluence/display/FLINK/FLIP-189%3A+SQL+Client+Usability+Improvements),
 
+SQL client received some major updates. It became more colorful with the 
ability to enable SQL syntax highlighting
+and switching among 7 different color schemes 
[FLINK-24909](https://issues.apache.org/jira/browse/FLINK-24909). 
+In 1.16 there was upgraded JLine which is the underlying Flink SQL client 
engine. With the upgrade it became possible 
+to edit and navigate through very large queries like larger than one screen. 
And in 1.18 another JLine’s feature is 
+coming: now it’s possible to turn line numbers off an on for the query
+[FLINK-24911](https://issues.apache.org/jira/browse/FLINK-24911).
+
+## Apache Pekko instead of Akka
+
+A year ago, [Lightbend 
announced](https://flink.apache.org/2022/09/08/regarding-akkas-licensing-change/)
 changing the 
+license of future versions of Akka (2.7+) from Apache 2.0 to BSL. It was also 
announced that Akka 2.6, the version that 
+Apache Flink uses, would receive security updates and critical bug fixes until 
September of 2023. As September 2023 was 
+approaching, we decided to switch from Akka to [Apache 
Pekko](https://pekko.apache.org/) (incubating). 
+Apache Pekko (incubating) is a fork of Akka 2.6.x, prior to the Akka project’s 
adoption of the Business Source License.
+Pekko recently released Apache Pekko 1.0.0-incubating, which enabled us to 
already use it in Flink 1.18 - just in time.
+While our mid-term plan is to drop the dependency on Akka or Pekko altogether 
+(see [FLINK-29281](https://issues.apache.org/jira/browse/FLINK-29281)), the 
switch to Pekko presents a good short-term 
+solution and ensures that the Apache Pekko and Apache Flink Community can 
address critical bug fixes and security 
+vulnerabilities throughout our software supply chain.
+
+**More Information**
+* [FLINK-32468](https://issues.apache.org/jira/browse/FLINK-32468)
+
+## Calcite Upgrade(s)
+
+In Apache Flink 1.18, Apache Calcite was gradually upgraded from 1.29 to 1.32. 
The immediate benefit of these upgrades 
+are bug fixes, a smarter optimizer and performance improvements. On a parser 
level it now allows to have joins to be 
+grouped into trees using parentheses (mentioned in SQL-92) e.g. `SELECT * FROM 
a JOIN (b JOIN c ON b.x = c.x) ON a.y = c.y` 
+also see [CALCITE-35](https://issues.apache.org/jira/browse/CALCITE-35). In 
addition, the upgrade to Calcite 1.31+ has 
+unblocked the support of Session Windows via Table-Valued Functions (see 
+[CALCITE-4865](https://issues.apache.org/jira/browse/CALCITE-4865), 
+[FLINK-24024](https://issues.apache.org/jira/browse/FLINK-24024)) and as a 
corollary the depreciation of the legacy 
+group window aggregations. As a drawback of these upgrades due to 
+[CALCITE-4861](https://issues.apache.org/jira/browse/CALCITE-4861) 
(Optimization of chained CAST calls can lead to 
+unexpected behavior), also Flink's casting behavior has slightly changed. Some 
corner cases might behave differently
+now: For example, casting from FLOAT/DOUBLE 9234567891.12 to INT/BIGINT has 
now Java behavior for overflows.

Review Comment:
   isn't mentioning of overflow behavior not enough?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Re: [PR] add announcement blog post for Flink 1.18 [flink-web]

Reply via email to