superdiaodiao commented on code in PR #25091:
URL: https://github.com/apache/flink/pull/25091#discussion_r1691112012
########## docs/content.zh/release-notes/flink-1.20.md ##########
@@ -0,0 +1,434 @@

---
title: "Release Notes - Flink 1.20"
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Release notes - Flink 1.20

These release notes discuss important aspects, such as configuration, behavior, or dependencies,
that changed between Flink 1.19 and Flink 1.20. Please read these notes carefully if you are
planning to upgrade your Flink version to 1.20.

### Checkpoints

#### Unified File Merging Mechanism for Checkpoints

##### [FLINK-32070](https://issues.apache.org/jira/browse/FLINK-32070)

The unified file merging mechanism for checkpointing is introduced in Flink 1.20 as an MVP
("minimum viable product") feature. It allows scattered small checkpoint files to be written
into larger files, reducing the number of file creations and deletions and alleviating the
file system metadata pressure caused by the file flooding problem during checkpoints. The
mechanism can be enabled by setting `state.checkpoints.file-merging.enabled` to `true`. For
more advanced options and the principles behind this feature, please refer to the
`Checkpointing` documentation.

#### Reorganize State & Checkpointing & Recovery Configuration

##### [FLINK-34255](https://issues.apache.org/jira/browse/FLINK-34255)

All options related to state and checkpointing have been reorganized and categorized by
prefix, as listed below:

1. `execution.checkpointing`: all configurations associated with checkpointing and savepoints.
2. `execution.state-recovery`: all configurations pertinent to state recovery.
3. `state.*`: all configurations related to state access.
   1. `state.backend.*`: options specific to individual state backends, such as RocksDB.
   2. `state.changelog`: configurations for the changelog, as outlined in FLIP-158, including options for the "Durable Short-term Log" (DSTL).
   3. `state.latency-track`: configurations related to latency tracking of state access.

Meanwhile, all the original options, previously scattered across different places, are
annotated as `@Deprecated`.

#### Use common thread pools when transferring RocksDB state files

##### [FLINK-35501](https://issues.apache.org/jira/browse/FLINK-35501)

The semantics of `state.backend.rocksdb.checkpoint.transfer.thread.num` changed slightly:
if negative, the common (TM) IO thread pool (see `cluster.io-pool.size`) is used for
uploading and downloading RocksDB files.

#### Expose RocksDB bloom filter metrics

##### [FLINK-34386](https://issues.apache.org/jira/browse/FLINK-34386)

We expose some RocksDB bloom filter metrics to monitor the effectiveness of the bloom filter
optimization:

- `BLOOM_FILTER_USEFUL`: times the bloom filter has avoided file reads.
- `BLOOM_FILTER_FULL_POSITIVE`: times the bloom FullFilter has not avoided the reads.
- `BLOOM_FILTER_FULL_TRUE_POSITIVE`: times the bloom FullFilter has not avoided the reads and the data actually exists.
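As an illustrative, non-normative sketch, the options discussed so far in this section could
be set programmatically through Flink's `Configuration` as shown below (a `flink-conf.yaml`
entry per key works equally well). The first two keys are quoted from the text above; the
last key is an assumption that the new bloom filter metrics follow the existing
`state.backend.rocksdb.metrics.*` switch pattern.

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointOptionsSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // FLINK-32070: enable unified checkpoint file merging (off by default).
        conf.setString("state.checkpoints.file-merging.enabled", "true");
        // FLINK-35501: a negative value delegates RocksDB file up/downloads
        // to the common TM IO pool (sized by cluster.io-pool.size).
        conf.setString("state.backend.rocksdb.checkpoint.transfer.thread.num", "-1");
        // FLINK-34386: assumed key, following the state.backend.rocksdb.metrics.*
        // naming pattern, to surface BLOOM_FILTER_USEFUL and related metrics.
        conf.setString("state.backend.rocksdb.metrics.bloom-filter-useful", "true");

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        // ... define and execute the job as usual ...
    }
}
```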
#### Manually Compact Small SST Files

##### [FLINK-26050](https://issues.apache.org/jira/browse/FLINK-26050)

In some cases, the number of files produced by the RocksDB state backend grows indefinitely.
Besides leaving lots of small files around, this might cause the task state info (TDD and
checkpoint ACK) to exceed the RPC message size and fail recovery/checkpointing.

In Flink 1.20, you can manually merge such files in the background using the RocksDB API.

### Runtime & Coordination

#### Support Job Recovery from JobMaster Failures for Batch Jobs

##### [FLINK-33892](https://issues.apache.org/jira/browse/FLINK-33892)

In 1.20, we introduced a batch job recovery mechanism that enables batch jobs to recover as
much progress as possible after a JobMaster failover, avoiding the need to rerun tasks that
have already finished.

More information about this feature and how to enable it can be found at:
https://nightlies.apache.org/flink/flink-docs-master/docs/ops/batch/recovery_from_job_master_failure/

#### Extend Curator config option for Zookeeper configuration

##### [FLINK-33376](https://issues.apache.org/jira/browse/FLINK-33376)

Adds support for the following Curator parameters:
`high-availability.zookeeper.client.authorization` (corresponding Curator parameter: `authorization`),
`high-availability.zookeeper.client.max-close-wait` (corresponding Curator parameter: `maxCloseWaitMs`), and
`high-availability.zookeeper.client.simulated-session-expiration-percent` (corresponding Curator parameter: `simulatedSessionExpirationPercent`).

#### More fine-grained timer processing

##### [FLINK-20217](https://issues.apache.org/jira/browse/FLINK-20217)

Firing timers can now be interrupted to speed up checkpointing. Timers that were interrupted
by a checkpoint will be fired shortly after the checkpoint completes.

By default, this feature is disabled. To enable it, set
`execution.checkpointing.unaligned.interruptible-timers.enabled` to `true`. It is currently
supported only by `TableStreamOperators` and `CepOperator`.

#### Add numFiredTimers and numFiredTimersPerSecond metrics

##### [FLINK-35065](https://issues.apache.org/jira/browse/FLINK-35065)

Previously, there was no way of knowing how many timers were being fired by Flink, so it was
impossible to distinguish, even with code profiling, whether an operator was firing only a
couple of heavy timers per second that consumed ~100% of the CPU time, or firing thousands of
timers per second.

We added the following metrics to address this issue:

- `numFiredTimers`: total number of fired timers per operator
- `numFiredTimersPerSecond`: per-second rate of firing timers per operator

#### Support EndOfStreamTrigger and isOutputOnlyAfterEndOfStream Operator Attribute to Optimize Task Deployment

##### [FLINK-34371](https://issues.apache.org/jira/browse/FLINK-34371)

Operators that only generate outputs after all inputs have been consumed are now optimized to
run in blocking mode, and the other operators in the same job will wait to start until these
operators have finished. Such operators include windowing with
`GlobalWindows#createWithEndOfStreamTrigger` (see the sketch below), sorting, etc.
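For illustration, here is a minimal job sketch whose window only fires at end of stream and
therefore qualifies for this optimization. The bounded `fromSequence` source and the key
selector are arbitrary placeholders, not part of this PR.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.GlobalWindows;

public class EndOfStreamWindowSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder bounded input; any bounded source works here.
        DataStream<Long> numbers = env.fromSequence(1, 1_000_000);

        numbers
                .keyBy(n -> n % 10) // arbitrary key selector for illustration
                // This window fires only once all input is consumed, so the
                // operator produces output only after end of stream and can
                // be scheduled in blocking mode.
                .window(GlobalWindows.createWithEndOfStreamTrigger())
                .reduce(Long::sum)
                .print();

        env.execute("end-of-stream-window-sketch");
    }
}
```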
### SDK

#### Support Full Partition Processing On Non-keyed DataStream

##### [FLINK-34543](https://issues.apache.org/jira/browse/FLINK-34543)

We have introduced some full window processing APIs, allowing collection and processing of
all records in each subtask:

- `mapPartition`: Processes all records using the `MapPartitionFunction`.
- `sortPartition`: Sorts all records by field or key in the full partition window.
- `aggregate`: Aggregates all records in the full partition window.
- `reduce`: Reduces all records in the full partition window.

### Table SQL / API

#### Introduce a New Materialized Table for Simplifying Data Pipelines

##### [FLINK-35187](https://issues.apache.org/jira/browse/FLINK-35187)

We introduced the Materialized Table in Flink SQL, a new table type designed to simplify both
batch and stream data pipelines while providing a consistent development experience.

By specifying the data freshness and the query at creation time, the engine automatically
derives the schema and creates a data refresh pipeline to maintain the specified freshness.

More information about this feature can be found here: [Materialized Table Overview](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/materialized-table/overview/)

#### Introduce Catalog-related Syntax

##### [FLINK-34914](https://issues.apache.org/jira/browse/FLINK-34914)

As the application scenarios of `Catalog` expand, and it is widely applied in services such
as JDBC/Hive/Paimon, `Catalog` plays an increasingly crucial role in Flink.

FLIP-436 introduces DQL syntax to obtain detailed metadata from existing catalogs, and DDL
syntax to modify metadata such as properties or comment in the specified catalog.

Review Comment:
   comment->comments

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
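For readers trying out the FLIP-436 syntax from the reviewed section, a minimal, hypothetical
sketch through the Java `TableEnvironment` follows. The catalog name and property key are
placeholders, and the exact statement forms should be verified against the Flink 1.20 SQL
documentation.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class CatalogSyntaxSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // DQL: inspect metadata of an existing catalog ("my_catalog" is a
        // placeholder and must have been registered beforehand).
        tEnv.executeSql("DESCRIBE CATALOG my_catalog").print();

        // DDL: modify catalog properties in place; the property key shown
        // here is illustrative, not prescribed by the release notes.
        tEnv.executeSql("ALTER CATALOG my_catalog SET ('default-database' = 'analytics')");
    }
}
```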