cecemei commented on code in PR #19287: URL: https://github.com/apache/druid/pull/19287#discussion_r3114270304
########## docs/release-info/release-notes.md: ##########
@@ -231,6 +231,10 @@ This section contains detailed release notes separated by areas.
 ### Ingestion
+- Added the `maxStringLength` configuration for string dimensions that truncates values exceeding the specified length during ingestion. You can set the length globally using `druid.indexing.formats.maxStringLength` or per-dimension in the ingestion spec [#19146](https://github.com/apache/druid/pull/19146)
+- Added `StringColumnFormatSpec` for string dimension configs [#19258](https://github.com/apache/druid/pull/19258)
+- Sped up task scheduling on the Overlord [#19199](https://github.com/apache/druid/pull/19199)

Review Comment:
```suggestion
- Speed up task scheduling on the Overlord [#19199](https://github.com/apache/druid/pull/19199)
```

########## docs/release-info/release-notes.md: ##########
@@ -525,13 +523,16 @@ You can't perform a rolling upgrade from versions earlier than Druid 0.23.
 #### Metadata storage for auto-compaction with compaction supervisors
-Automatic compaction using compaction supervisors now requires incremental segment metadata caching to be enabled on the Overlord and Coordinator in the runtime properties. Specifically, the `druid.manager.segments.useIncrementalCache` config must be set to `always` or `ifSynced`. For more information about the config, see [Segment metadata cache](https://druid.apache.org/docs/latest/configuration/#segment-metadata-cache-experimental).
+Automatic compaction using compaction supervisors requires incremental segment metadata caching to be enabled on the Overlord and Coordinator in the runtime properties.
+
+As part of this update, Druid requires the incremental cache to be enabled and a new table in the metadata store.
No action is required if you are using the default settings for the following configs:

Review Comment:
```suggestion
Automatic compaction with supervisors requires incremental segment metadata caching on Overlord and a new metadata store table; no action is required if you are using the default settings for the following configs:
```

########## docs/release-info/release-notes.md: ##########

Review Comment:
```suggestion
Auto-compaction using compaction supervisors has been improved, now generally available, and the recommended default. Automatic compaction tasks are now prefixed with `auto` instead of `coordinator-issued`.
```

########## docs/release-info/release-notes.md: ##########
@@ -57,50 +57,600 @@ For tips about how to write a good release note, see [Release notes](https://git
 This section contains important information about new and existing features.
+### Hadoop-based ingestion
+
+Support for Hadoop-based ingestion has been removed. The feature was deprecated in Druid 34.
+
+Use one of Druid's other supported ingestion methods, such as SQL-based ingestion or MiddleManager-less ingestion using Kubernetes.
+
+[#19109](https://github.com/apache/druid/pull/19109)
+
+### Query blocklist
+
+You can now use the using the `/druid/coordinator/v1/config/broker` API to create a query blocklist to dynamically block queries by datasource, query type, or query context. The blocklist takes effect without a restarting Druid. Block rules use `AND` logic, which means all criteria must match.
+
+The following example blocks all groupBy queries on the `wikipedia` datasource with a query context parameter of `priority` equal to `0`:
+
+```
+POST /druid/coordinator/v1/config/broker
+  {
+    "queryBlocklist": [
+      {
+        "ruleName": "block-wikipedia-groupbys",
+        "dataSources": ["wikipedia"],
+        "queryTypes": ["groupBy"],
+        "contextMatches": {"priority": "0"}
+      }
+    ]
+  }
+```
+
+[#19011](https://github.com/apache/druid/pull/19011)
+
+### Minor compaction for Overlord-based compaction (experimental)
+
+You can now configure minor compaction to compact only newly ingested segments while upgrading existing compacted segments. When Druid upgrades segments, it updates the metadata instead of using resources to compact it again. You can use the native compaction engine or the MSQ task engine.
+
+Use the `mostFragmentedFirst` compaction policy and set either a percentage of rows-based or byte-based threshold for minor compaction.
+
+[#19059](https://github.com/apache/druid/pull/19059) [#19205](https://github.com/apache/druid/pull/19205) [#19016](https://github.com/apache/druid/pull/19016)
+
+### Cascading reindexing (experimental)
+
+Using cascading reindexing, you can now define age-based rules to automatically apply different compaction configurations based on the age of your data. While standard auto-compaction applies a single flat configuration across an entire datasource, cascading reindexing lets you tailor your compaction settings to the characteristics of your data.
+
+For example, you can keep recent data in hourly segments while automatically rolling up to daily segments after 90 days to reduce segment count. You can also layer on age-based row deletion (such as dropping bot traffic from older data), change compression settings, or shift to rollup with coarser query granularity as data ages. Rules are defined inline in the supervisor spec.
+
+You must use compaction supervisors with the MSQ task engine to use cascading reindexing.
+
+[#18939](https://github.com/apache/druid/pull/18939) [#19213](https://github.com/apache/druid/pull/19213) [#19106](https://github.com/apache/druid/pull/19106) [#19078](https://github.com/apache/druid/pull/19078)
+
+### Thrift input format
+
+As part of the Thrift contributor extension, Druid now supports Thrift-encoded data for Kafka and Kinesis streaming ingestion.
+
+[#19111](https://github.com/apache/druid/pull/19111)
+
+To use this feature, you must add `druid-thrift-extensions` to your extension load list.
+
+### Incremental cache
+
+Incremental segment metadata cache (`useIncrementalCache`) is now generally available and defaults to `ifSynced`. Druid blocks reads from the cache until it has synced with the metadata store at least once after becoming leader.
+
+[#19252](https://github.com/apache/druid/pull/19252)
+
+### Kubernetes-based task management
+
+This extension is now generally available.
+
+[#19128](https://github.com/apache/druid/pull/19128)
+
+### Tombstones
+
+Tombstones for JSON-based native batch ingestion (the `dropExisting` flag for `ioConfig`) are now generally available.
+
+[#19128](https://github.com/apache/druid/pull/19128)
+
+### Dynamic default query context
+
+You can now add default query context parameters as a dynamic configuration to the Broker. This allows you to override static defaults set in your runtime properties without restarting your deployment or having to update multiple queries individually. Druid applies query context parameters based on the following priority:
+
+1. The query context included with the query
+1. The query context set as a dynamic configuration on the Broker
+1. The query context parameters set in the runtime properties
+1. The defaults that ship with Druid
+
+Note that like other Broker dynamic configuration, this is best-effort.
Settings may not be applied in certain
+cases, such as when a Broker has recently started and hasn't received the configuration yet, or if the
+Broker can't contact the Coordinator. If a query context parameter is critical for all your queries, set it in the runtime properties.
+
+[#19144](https://github.com/apache/druid/pull/19144)
+
+### `sys.queries` table (experimental)
+
+The new system queries table provides information about currently running and recently completed queries that use the Dart engine. This table is off by default. To enable the table, set the following:
+
+```
+druid.sql.planner.enableSysQueriesTable = true
+```
+
+As part of this change, the `/druid/v2/sql/queries` API now supports an `includeComplete` parameter that shows recently completed queries.
+
+[#18923](https://github.com/apache/druid/pull/18923)
+
+### Auto-compaction with compaction supervisors
+
+Auto-compaction using compaction supervisors has been improved and is now generally available.
+
+As part of the improvement compaction states are now stored in a central location, a new `indexingStates` table. Individual segments only need to store a unique reference (`indexing_state_fingerprint`) to their full compaction state.
+
+Since many segments in a single datasource share the same underlying compaction state, this greatly reduces metadata storage requirements for automatic compaction.
+
+For backwards compatibility, Druid continues to persist the detailed compaction state in each segment. This functionality will be removed in a future release.
+
+You can stop storing detailed compaction state by setting `storeCompactionStatePerSegment` to `false` in the cluster compaction config. If you turn it off and need to downgrade, Druid needs to re-compact any segments that have been compacted since you changed the config.
+
+This change has upgrade impacts for metadata storage and metadata caching.
For more information, see the [Metadata storage for auto-compaction with compaction supervisors](#metadata-storage-for-auto-compaction-with-compaction-supervisors) upgrade note.
+
+[#19113](https://github.com/apache/druid/pull/19113) [#18844](https://github.com/apache/druid/pull/18844) [#19252](https://github.com/apache/druid/pull/19252)
+
+### Broker tier selection for realtime servers
+
+Added `druid.broker.realtime.select.tier` and `druid.broker.realtime.balancer.type` on the Brokers to optionally override the Broker’s tier selection and balancer strategies for realtime servers. If these properties are not set (the default), realtime servers continue to use the existing `druid.broker.select` and `druid.broker.balancer` configurations that apply to both historical and realtime servers.
+
+[#19062](https://github.com/apache/druid/pull/19062)
+
+### Manual Broker routing in the web console
+
+You can now configure which Broker the Router uses for queries issued from the web console. You may want to do this if there are Brokers that don't have visibility into certain data tiers, and you know you're querying data available only on a certain tier.
+
+To specify a Broker, add the following config to `web-console/console-config.js`:
+
+```js
+consoleBrokerService: 'druid/BROKER_NAME'
+```
+
+[#19069](https://github.com/apache/druid/pull/19069)
+
+### Consul extension
+
+The contributor extension `druid-consul-extensions` lets Druid clusters use Consul for service discovery and
+Coordinator/Overlord leader election instead of ZooKeeper. The extension supports ACLs, TLS/mTLS, and metrics.
+
+Before you switch to Consul, you need to set
+`druid.serverview.type=http` and `druid.indexer.runner.type=httpRemote` cluster wide.
+
+[#18843](https://github.com/apache/druid/pull/18843)
+
 ## Functional area and related changes
 This section contains detailed release notes separated by areas.
 ### Web console
+#### Changed storage column displays
+
+- Improved the compaction config view to
+- Renamed **Current size** to **Assigned size**.
+- Renamed **Max size** to **Effective size**. It now displays the smaller value between `max_size` and `storage_size`. The max size is still shown as a tooltip.
+- Changed usage calculation to use `effective_size`
+
+[#19007](https://github.com/apache/druid/pull/19007)
+

Review Comment:
This part doesn't read smoothly to me. I believe it is part of showing storage metrics for data nodes, and the change is also related to vsf. Do you mind taking a look at this, @clintropolis?
