capistrant commented on code in PR #19213:
URL: https://github.com/apache/druid/pull/19213#discussion_r2997750333

##########
docs/data-management/cascading-reindexing.md:
##########
@@ -0,0 +1,412 @@
+---
+id: cascading-reindexing
+title: "Cascading reindexing"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+:::info
+Cascading reindexing is an experimental feature introduced in Druid 37. Its API may change in future releases. This feature is only for automatic compaction using [compaction supervisors](automatic-compaction.md#auto-compaction-using-compaction-supervisors) with the [MSQ compaction engine](automatic-compaction.md#use-msq-for-auto-compaction).
+:::
+
+Cascading reindexing is a compaction supervisor template that lets you define age-based rules to automatically apply different compaction configurations as data ages. Instead of a single flat compaction configuration for an entire datasource, you define rules that say "for data older than X, apply configuration Y." Reindexing is a more general term than compaction. Reindexing not only can merge segments with the same schema and partitioning, but also can change the segment schema, partitioning, and encoding. Cascading reindexing gives you fine-grained control over how your data evolves over time.
+
+For example, you might want to:
+- Keep recent data in hourly segments, but coarsen to daily segments after 90 days to help reduce segment count and storage footprint.
+- Delete some unwanted rows from data older than 30 days.
+- Change compression settings for older data.
+- Roll up older data to a coarser query granularity for data .

Review Comment:
```suggestion
- Roll up older data to a coarser query granularity
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
