Re: [PR] add announcement blog post for Flink 1.18 [flink-web]

via GitHub Tue, 17 Oct 2023 21:16:01 -0700


zhuzhurk commented on code in PR #680:
URL: https://github.com/apache/flink-web/pull/680#discussion_r1363147992



##########
docs/content/posts/2023-10-10-release-1.18.0.md:
##########
@@ -0,0 +1,554 @@
+---
+authors:
+- JingGe:
+  name: "Jing Ge"
+  twitter: jingengineer
+- KonstantinKnauf:
+  name: "Konstantin Knauf"
+  twitter: snntrable
+- SergeyNuyanzin:
+  name: "Sergey Nuyanzin"
+  twitter: uckamello
+- QingshengRen:
+  name: "Qingsheng Ren"
+  twitter: renqstuite
+date: "2023-10-10T08:00:00Z"
+subtitle: ""
+title: Announcing the Release of Apache Flink 1.18
+aliases:
+- /news/2023/10/10/release-1.18.0.html
+---
+
+The Apache Flink PMC is pleased to announce the release of Apache Flink 
1.18.0. As usual, we are looking at a packed 
+release with a wide variety of improvements and new features. Overall, 174 
people contributed to this release completing 
+18 FLIPs and 700+ issues. Thank you!
+
+Let's dive into the highlights.
+
+# Towards a Streaming Lakehouse
+
+## Flink SQL Improvements
+
+### Introduce Flink JDBC Driver For SQL Gateway 
+
+Flink 1.18 comes with a JDBC Driver for the Flink SQL Gateway. So, you can now 
use any SQL Client that supports JDBC to 
+interact with your tables via Flink SQL. Here is an example using 
[SQLLine](https://julianhyde.github.io/sqlline/manual.html). 
+
+```shell
+sqlline> !connect jdbc:flink://localhost:8083
+```
+
+```shell
+sqlline version 1.12.0
+sqlline> !connect jdbc:flink://localhost:8083
+Enter username for jdbc:flink://localhost:8083:
+Enter password for jdbc:flink://localhost:8083:
+0: jdbc:flink://localhost:8083> CREATE TABLE T(
+. . . . . . . . . . . . . . .)>      a INT,
+. . . . . . . . . . . . . . .)>      b VARCHAR(10)
+. . . . . . . . . . . . . . .)>  ) WITH (
+. . . . . . . . . . . . . . .)>      'connector' = 'filesystem',
+. . . . . . . . . . . . . . .)>      'path' = 'file:///tmp/T.csv',
+. . . . . . . . . . . . . . .)>      'format' = 'csv'
+. . . . . . . . . . . . . . .)>  );
+No rows affected (0.122 seconds)
+0: jdbc:flink://localhost:8083> INSERT INTO T VALUES (1, 'Hi'), (2, 'Hello');
++----------------------------------+
+|              job id              |
++----------------------------------+
+| fbade1ab4450fc57ebd5269fdf60dcfd |
++----------------------------------+
+1 row selected (1.282 seconds)
+0: jdbc:flink://localhost:8083> SELECT * FROM T;
++---+-------+
+| a |   b   |
++---+-------+
+| 1 | Hi    |
+| 2 | Hello |
++---+-------+
+2 rows selected (1.955 seconds)
+0: jdbc:flink://localhost:8083>
+```
+
+**More Information**
+* 
[Documentation](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/jdbcdriver/)
 
+* [FLIP-293: Introduce Flink Jdbc Driver For Sql 
Gateway](https://cwiki.apache.org/confluence/display/FLINK/FLIP-293%3A+Introduce+Flink+Jdbc+Driver+For+Sql+Gateway)
+
+
+### Stored Procedure Support for Flink Connectors
+
+Stored procedures have been an indispensable tool in traditional databases,
+offering a convenient way to encapsulate complex logic for data manipulation
+and administrative tasks. They also offer the potential for enhanced
+performance, since they can trigger the handling of data operations directly
+within an external database. Other popular data systems like Trino and Iceberg
+automate and simplify common maintenance tasks into small sets of procedures,
+which greatly reduces users' administrative burden.
+
+This new update primarily targets developers of Flink connectors, who can now
+predefine custom stored procedures into connectors via the Catalog interface.
+The primary benefit to users is that connector-specific tasks that previously
+may have required writing custom Flink code can now be replaced with simple
+calls that encapsulate, standardize, and potentially optimize the underlying
+operations. Users can execute procedures using the familiar `CALL` syntax, and
+discover a connector's available procedures with `SHOW PROCEDURES`. Stored
+procedures within connectors improve the extensibility of Flink's SQL and
+Table APIs, and should unlock smoother data access and management for users.
+
+**More Information**
+* 
[Documentation](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/procedures/)
+* [FLIP-311: Support Call Stored 
Procedure](https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure)
+
+### Extended DDL Support
+
+From this release onwards, Flink supports
+
+- `REPLACE TABLE AS SELECT`
+- `CREATE OR REPLACE TABLE AS SELECT`
+
+and both these commands and previously supported `CREATE TABLE AS` can now 
support atomicity provided the underlying 
+connector also supports this.
+
+Moreover, Apache Flink now supports `TRUNCATE TABLE` in batch execution mode. 
As before, the underlying connector needs 
+to implement and provide this capability.
+
+And, finally, we have also implemented support for adding, dropping and 
listing partitions via
+
+- `ALTER TABLE ADD PARTITION`
+- `ALTER TABLE DROP PARTITION`
+- `SHOW PARTITIONS`
+
+**More Information**
+- [Documentation on 
TRUNCATE](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/sql/truncate/)
+- [Documentation on CREATE OR 
REPLACE](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/sql/create/#create-or-replace-table)
+- [Documentation on ALTER 
TABLE](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/sql/alter/#alter-table)
+- [FLIP-302: Support TRUNCATE TABLE statement in batch 
mode](https://cwiki.apache.org/confluence/display/FLINK/FLIP-302%3A+Support+TRUNCATE+TABLE+statement+in+batch+mode)
+- [FLIP-303: Support REPLACE TABLE AS SELECT 
statement](https://cwiki.apache.org/confluence/display/FLINK/FLIP-303%3A+Support+REPLACE+TABLE+AS+SELECT+statement)
+- [FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) 
statement](https://cwiki.apache.org/confluence/display/FLINK/FLIP-305%3A+Support+atomic+for+CREATE+TABLE+AS+SELECT%28CTAS%29+statement)
+
+### Time Traveling
+
+Flink now supports time travel SQL syntax for querying historical versions of 
data. Users can specify a point 
+in time and retrieve the data and schema of a table as it appeared at that 
time. With time travel, users can easily 
+analyze and compare historical versions of data.
+
+**More information**
+- 
[Documentation](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/sql/queries/time-travel/)
+- [FLIP-308: Support Time 
Travel](https://cwiki.apache.org/confluence/display/FLINK/FLIP-308%3A+Support+Time+Travel)
+
+## Streaming Execution Improvements
+
+### Support Operator-Level State TTL in Table API & SQL
+
+Starting from Flink 1.18, Table API and SQL users can set state time-to-live 
(TTL) individually for stateful operators.
+This means that for scenarios like stream regular joins, users can now set 
different TTLs for the left and right 
+streams. In previous versions, state expiration could only be controlled at 
the pipeline level using the configuration 
+`table.exec.state.ttl`. With the introduction of operator-level state 
retention, users can now optimize resource 
+usage according to their specific requirements.
+
+**More Information**
+- 
[Documentation](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/concepts/overview/#configure-operator-level-state-ttl)
+- [FLIP-292: Enhance COMPILED PLAN to support operator-level state TTL 
configuration](https://cwiki.apache.org/confluence/display/FLINK/FLIP-292%3A+Enhance+COMPILED+PLAN+to+support+operator-level+state+TTL+configuration)
+
+### Watermark Alignment and Idleness Detection in SQL
+
+You can now configure [watermark 
alignment](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment)
 
+and [source idleness 
timeouts](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/datastream/event-time/generating_watermarks/#dealing-with-idle-sources)
 
+in [pure SQL via 
hints](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/time_attributes/#advanced-watermark-features).
 Previously, these features were only available in the DataStream API.
+
+**More Information**
+- 
[Documentation](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/sql/queries/hints/)
+- [FLIP-296: Extend watermark-related features for 
SQL](https://cwiki.apache.org/confluence/display/FLINK/FLIP-296%3A+Extend+watermark-related+features+for+SQL)
+
+## Batch Execution Improvements
+
+### Hybrid Shuffle supports Remote Storage
+
+Hybrid Shuffle supports storing the shuffle data in remote storage. The remote 
storage path can be configured with the 
+option `taskmanager.network.hybrid-shuffle.remote.path`. Hybrid Shuffle uses 
less network memory than before by 
+decoupling the memory usage from the parallelism, improving the 
stability and ease of use. 
+
+**More Information**
+* 
[Documentation](https://nightlies.apache.org/flink/flink-docs-stable/docs/ops/batch/batch_shuffle/#hybrid-shuffle)
+* [FLIP-301: Hybrid Shuffle supports Remote 
Storage](https://cwiki.apache.org/confluence/display/FLINK/FLIP-301%3A+Hybrid+Shuffle+supports+Remote+Storage)
+
+### Performance Improvements & TPC-DS Benchmark
+
+In previous releases, the community worked extensively to improve Flink's 
batch processing performance, which has led to 
+significant improvements. In this release cycle, community contributors 
continued to put significant effort into 
+further improving Flink's batch performance.
+
+#### Runtime Filter for Flink SQL
+
+Runtime filter is a common method for optimizing Join performance. It is 
designed to dynamically generate filter 
+conditions for certain Join queries at runtime to reduce the amount of scanned 
or shuffled data, avoid unnecessary I/O 
+and network transmission, and speed up the query. We introduced runtime 
filters in Flink 1.18, verified their 
+effectiveness through the TPC-DS benchmark, and observed up to a 3x speedup for 
some queries with this feature enabled.
+
+#### Operator Fusion Codegen for Flink SQL
+
+Operator Fusion Codegen improves the execution performance of a query by 
fusing an operator DAG into a single optimized 
+operator that eliminates virtual function calls, leverages CPU registers for 
intermediate data, and reduces 
+instruction cache misses. It is a general-purpose optimization whose 
effectiveness we verified through TPC-DS. In version 1.18, 
+only some batch operators (Calc, HashAgg, and HashJoin) support fusion 
codegen, which already yields 
+significant performance gains on some queries.
+
+Note that both features are considered experimental and disabled by default in 
Flink 1.18. 
+They can be enabled by using `table.optimizer.runtime-filter.enabled` and 
`table.exec.operator-fusion-codegen.enabled` 
+respectively.
+
+Since Flink 1.16, the Apache Flink Community has been continuously tracking 
the performance of its batch engine via the 
+TPC-DS benchmarking framework. After significant improvements in Flink 1.17 
(dynamic join-reordering, 
+dynamic local aggregations), the two improvements described in the previous 
sections lead to 12% performance improvement
+compared to Flink 1.17 , a 35% performance improvement compared to Flink 1.16 
on a 10T dataset for partitioned tables.

Review Comment:
   I think the numbers 12% and 35% are the reductions in TPC-DS execution time. 
So the performance improvement numbers should actually be 14% and 54%.
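
   The arithmetic behind these numbers can be sketched as follows. This assumes the 12% and 35% figures are fractional reductions in total execution time, and that "performance improvement" means the gain in work done per unit time, i.e. `old_time / new_time - 1`. The function name is illustrative only and is not part of the PR.

```python
def improvement_from_time_reduction(reduction: float) -> float:
    """Convert a fractional reduction in execution time into a fractional speedup.

    If the new run takes (1 - reduction) of the old time, throughput rises
    by a factor of 1 / (1 - reduction), so the improvement is that minus 1.
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction) - 1.0


# Assumed inputs: the time reductions quoted in the blog post draft.
for reduction in (0.12, 0.35):
    improvement = improvement_from_time_reduction(reduction)
    print(f"{reduction:.0%} less time -> {improvement:.0%} faster")
# prints:
# 12% less time -> 14% faster
# 35% less time -> 54% faster
```

   Under that definition, a 12% shorter runtime corresponds to roughly a 13.6% improvement and a 35% shorter runtime to roughly 53.8%, which round to the 14% and 54% suggested above.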



