This is an automated email from the ASF dual-hosted git repository. luzhijing pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push: new 016f5d2276d [blog](workload group) Update Workload Group Blog (#650) 016f5d2276d is described below commit 016f5d2276d931a2620b059a7a5737ee66291d7f Author: KassieZ <139741991+kass...@users.noreply.github.com> AuthorDate: Wed May 15 13:14:35 2024 +0700 [blog](workload group) Update Workload Group Blog (#650) --- ...in-apache-doris-for-10x-faster-data-transfer.md | 2 +- blog/cross-cluster-replication-for-read-write.md | 2 +- ...pache-doris-sql-convertor-for-easy-migration.md | 2 +- ...ti-tenant-workload-isolation-in-apache-doris.md | 225 +++++++++++++++++++++ blog/release-note-2.0.9.md | 2 - src/components/recent-blogs/recent-blogs.data.ts | 10 +- src/constant/newsletter.data.ts | 18 +- static/images/CPU-hard-limit-test.png | Bin 0 -> 383207 bytes static/images/CPU-hard-limit.png | Bin 0 -> 199322 bytes static/images/CPU-soft-limit-test.png | Bin 0 -> 111364 bytes static/images/CPU-soft-limit.png | Bin 0 -> 166890 bytes static/images/memory-resource-limit.png | Bin 0 -> 46040 bytes static/images/multi-tenant-workload-group.jpg | Bin 0 -> 268817 bytes static/images/query-queue.png | Bin 0 -> 113375 bytes .../resource-isolation-based-on-resource-tag-2.PNG | Bin 0 -> 169601 bytes .../resource-isolation-based-on-resource-tag.PNG | Bin 0 -> 144188 bytes .../test-in-simulated-production-environment.png | Bin 0 -> 194079 bytes .../workload-isolation-based-on-workload-group.png | Bin 0 -> 268368 bytes 18 files changed, 241 insertions(+), 20 deletions(-) diff --git a/blog/arrow-flight-sql-in-apache-doris-for-10x-faster-data-transfer.md b/blog/arrow-flight-sql-in-apache-doris-for-10x-faster-data-transfer.md index ebdcff63962..26834d64a6c 100644 --- a/blog/arrow-flight-sql-in-apache-doris-for-10x-faster-data-transfer.md +++ b/blog/arrow-flight-sql-in-apache-doris-for-10x-faster-data-transfer.md @@ -1,6 +1,6 @@ --- { - 'title': "Arrow Flight SQL in Apache Doris for 10X faster data transfer", + 'title': 
"Arrow Flight SQL for 10X faster data transfer", 'summary': "Apache Doris 2.1 supports Arrow Flight SQL protocol for reading data from Doris. It delivers tens-fold speedups compared to PyMySQL and Pandas.", 'date': '2024-04-16', 'author': 'Apache Doris', diff --git a/blog/cross-cluster-replication-for-read-write.md b/blog/cross-cluster-replication-for-read-write.md index 7a0899ae59f..fa2286b0513 100644 --- a/blog/cross-cluster-replication-for-read-write.md +++ b/blog/cross-cluster-replication-for-read-write.md @@ -5,7 +5,7 @@ 'date': '2024-04-25', 'author': 'Apache Doris', 'picked': "true", - 'order': "2", + 'order': "3", 'tags': ['Best Practice'], "image": '/images/ccr-for-read-write-separation.jpg' } diff --git a/blog/from-presto-trino-clickhouse-and-hive-to-apache-doris-sql-convertor-for-easy-migration.md b/blog/from-presto-trino-clickhouse-and-hive-to-apache-doris-sql-convertor-for-easy-migration.md index 5e521adbc32..7dc397b8a38 100644 --- a/blog/from-presto-trino-clickhouse-and-hive-to-apache-doris-sql-convertor-for-easy-migration.md +++ b/blog/from-presto-trino-clickhouse-and-hive-to-apache-doris-sql-convertor-for-easy-migration.md @@ -5,7 +5,7 @@ 'date': '2024-05-06', 'author': 'Apache Doris', 'picked': "true", - 'order': "1", + 'order': "2", 'tags': ['Tech Sharing'], "image": '/images/sql-convertor-feature.jpeg' } diff --git a/blog/multi-tenant-workload-isolation-in-apache-doris.md b/blog/multi-tenant-workload-isolation-in-apache-doris.md new file mode 100644 index 00000000000..d0b7ff4db72 --- /dev/null +++ b/blog/multi-tenant-workload-isolation-in-apache-doris.md @@ -0,0 +1,225 @@ +--- +{ + 'title': "Multi-tenant workload isolation: a better balance between isolation and utilization", + 'summary': "Apache Doris supports workload isolation based on Resource Tag and Workload Group. 
It provides solutions for different tradeoffs among the level of isolation, resource utilization, and stable performance.", + 'date': '2024-05-14', + 'author': 'Apache Doris', + 'picked': "true", + 'order': "1", + 'tags': ['Tech Sharing'], + "image": '/images/multi-tenant-workload-group.jpg' +} + +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + + +This is an in-depth introduction to the workload isolation capabilities of [Apache Doris](https://doris.apache.org). But first of all, why and when do you need workload isolation? If you relate to any of the following situations, read on and you will end up with a solution: + +- You have different business departments or tenants sharing the same cluster and you want to prevent the interference of workloads among them. + +- You have query tasks of varying priority levels and you want to give priority to your critical tasks (such as real-time data analytics and online transactions) in terms of resources and execution. + +- You need workload isolation but also want high cost-effectiveness and resource utilization rates. + +Apache Doris supports workload isolation based on Resource Tag and Workload Group. 
Resource Tag isolates the CPU and memory resources for different workloads at the level of backend nodes, while the Workload Group mechanism can further divide the resources within a backend node for higher resource utilization. + +:::tip + +[Demo](https://www.youtube.com/watch?v=Wd3l5C4k8Ok&t=1s) of using the Workload Manager in Apache Doris to set a CPU soft/hard limit for Workload Groups. + +::: + +## Resource isolation based on Resource Tag + +Let's begin with the architecture of Apache Doris. Doris has two [types of nodes](https://doris.apache.org/docs/get-starting/what-is-apache-doris#technical-overview): frontends (FEs) and backends (BEs). FE nodes store metadata, manage clusters, process user requests, and parse query plans, while BE nodes are responsible for computation and data storage. Thus, BE nodes are the major resource consumers. + +The main idea of a Resource Tag-based isolation solution is to divide computing resources into groups by assigning tags to BE nodes in a cluster, where BE nodes of the same tag constitute a Resource Group. A Resource Group can be regarded as a unit for data storage and computation. For data ingested into Doris, the system will write data replicas into different Resource Groups according to the configurations. Queries will also be assigned to their corresponding [Resource Groups](https://do [...] + +For example, if you want to separate read and write workloads in a 3-BE cluster, you can follow these steps: + +1. **Assign Resource Tags to BE nodes**: Bind 2 BEs to the "Read" tag and 1 BE to the "Write" tag. + +2. **Assign Resource Tags to data replicas**: Assuming that Table 1 has 3 replicas, bind 2 of them to the "Read" tag and 1 to the "Write" tag. Data written into Replica 3 will be synchronized to Replica 1 and Replica 2, and the data synchronization process consumes few resources of BE 1 and BE 2. + +3. 
**Assign workload groups to Resource Tags**: Queries that include the "Read" tag in their SQL statements will be automatically routed to the nodes tagged with "Read" (in this case, BE 1 and BE 2). For data writing tasks, you also need to assign them the "Write" tag, so they can be routed to the corresponding node (BE 3). In this way, there will be no resource contention between read and write workloads except the data synchronization overhead from Replica 3 to Replicas 1 and 2. + +![Resource isolation based on Resource Tag](/images/resource-isolation-based-on-resource-tag.PNG) + +Resource Tag also enables **multi-tenancy** in Apache Doris. For example, computing and storage resources tagged with "User A" are for User A only, while those tagged with "User B" are exclusive to User B. This is how Doris implements multi-tenant resource isolation with Resource Tags at the BE side. + +![Resource isolation based on Resource Tag](/images/resource-isolation-based-on-resource-tag-2.PNG) + +Dividing the BE nodes into groups ensures **a high level of isolation**: + +- CPU, memory, and I/O of different tenants are physically isolated. + +- One tenant will never be affected by the failures (such as process crashes) of another tenant. + +But it has a few downsides: + +- In read-write separation, when the data writing stops, the BE nodes tagged with "Write" become idle. This reduces overall cluster utilization. + +- Under multi-tenancy, if you want to further isolate different workloads of the same tenant by assigning separate BE nodes to each of them, you will need to endure significant costs and low resource utilization. + +- The number of tenants is tied to the number of data replicas. So if you have 5 tenants, you will need 5 data replicas. That's a huge amount of storage redundancy. 
+ +**To improve on this, we introduced a workload isolation solution based on Workload Group in Apache Doris 2.0.0 and enhanced it in [Apache Doris 2.1.0](https://doris.apache.org/blog/release-note-2.1.0).** + +## Workload isolation based on Workload Group + +The [Workload Group](https://doris.apache.org/docs/admin-manual/resource-admin/workload-group)-based solution realizes a more granular division of resources. It further divides CPU and memory resources within processes on BE nodes, meaning that the queries in one BE node can be isolated from each other to some extent. This avoids resource competition within BE processes and optimizes resource utilization. + +Users can associate queries with Workload Groups, and thus limit the percentage of CPU and memory resources that a query can use. Under high cluster loads, Doris can automatically kill the most resource-consuming queries in a Workload Group. Under low cluster loads, Doris can allow multiple Workload Groups to share idle resources. + +Doris supports both CPU soft limit and CPU hard limit. The soft limit allows Workload Groups to exceed their limits and utilize idle resources, enabling more efficient utilization. The hard limit is a hard guarantee of stable performance because it prevents Workload Groups from affecting each other. + +*(CPU soft limit and CPU hard limit are mutually exclusive. You can choose between them based on your own use case.)* + +![Workload isolation based on Workload Group](/images/workload-isolation-based-on-workload-group.png) + +Its differences from the Resource Tag-based solution include: + +- Workload Groups are formed within processes. Multiple Workload Groups compete for resources within the same BE node. + +- The consideration of data replica distribution is out of the picture because Workload Group is only a way of resource management. + +### CPU soft limit + +The CPU soft limit is implemented by the `cpu_share` parameter, which is conceptually similar to a weight. 
Workload Groups with higher `cpu_share` will be allocated more CPU time during a time slot. + +For example, suppose Group A is configured with a `cpu_share` of 1 and Group B with a `cpu_share` of 9. In a time slot of 10 seconds, when both Group A and Group B are fully loaded, Group A and Group B will be able to consume 1s and 9s of CPU time, respectively. + +In real-world cases, however, not all workloads in the cluster run at full capacity. Under the soft limit, if Group B has low or zero workload, then Group A can use the CPU time that Group B leaves idle, up to all 10s, thus increasing the overall CPU utilization in the cluster. + +![CPU soft limit](/images/CPU-soft-limit.png) + +A soft limit brings flexibility and a higher resource utilization rate. On the flip side, it might cause performance fluctuations. + +### CPU hard limit + +CPU hard limit in Apache Doris 2.1.0 is designed for users who require stable performance. In simple terms, the CPU hard limit means that a Workload Group cannot use more CPU resources than its limit, whether or not there are idle CPU resources. + +This is how it works: + +Suppose that Group A is set with `cpu_hard_limit=10%` and Group B with `cpu_hard_limit=90%`. If both Group A and Group B run at full load, Group A and Group B will respectively use 10% and 90% of the overall CPU time. The difference from the soft limit shows when the workload of Group B decreases. In such cases, regardless of how high the query load of Group A is, it cannot use more than the 10% of CPU resources allocated to it. + +![CPU hard limit](/images/CPU-hard-limit.png) + +As opposed to a soft limit, a hard limit guarantees stable system performance at the cost of flexibility and the possibility of a higher resource utilization rate. + +### Memory resource limit + +> The memory of a BE node comprises the following parts: +> +> - Reserved memory for the operating system. +> +> - Memory consumed by non-queries, which is not considered in the Workload Group's memory statistics. 
+> +> - Memory consumed by queries, including data writing. This can be tracked and controlled by Workload Group. + +The `memory_limit` parameter defines the maximum percentage of memory available to a Workload Group within the BE process. It also affects the priority of Resource Groups. + +Initially, a high-priority Resource Group will be allocated more memory. By setting `enable_memory_overcommit`, you can allow Resource Groups to occupy more memory than their limits when there is idle space. When memory is tight, Doris will cancel tasks to reclaim the memory they occupy. In this case, the system will retain memory resources for high-priority Resource Groups as much as possible. + + +<div style={{textAlign:'center'}}><img src="/images/memory-resource-limit.png" alt="Memory resource limit" style={{display: 'inline-block', width:300}}/></div > + + +### Query queue + +Sometimes the cluster takes on more load than it can handle. In this case, submitting new query requests will not only be fruitless but also disruptive to the queries in progress. + +To improve on this, Apache Doris provides the [query queue](https://doris.apache.org/docs/admin-manual/resource-admin/workload-group#query-queue) mechanism. Users can put a limit on the number of queries that can run concurrently in the cluster. A query will be rejected when the query queue is full or after a waiting timeout, thus ensuring system stability under high loads. + +![Query queue](/images/query-queue.png) + +The query queue mechanism involves three parameters: `max_concurrency`, `max_queue_size`, and `queue_timeout`. + +## Tests + +To demonstrate the effectiveness of the CPU soft limit and hard limit, we ran a few tests. 
+ +- Environment: single machine, 16 cores, 64GB + +- Deployment: 1 FE + 1 BE + +- Dataset: ClickBench, TPC-H + +- Load testing tool: Apache JMeter + +### CPU soft limit test + +Start two clients and continuously submit queries (ClickBench Q23) with and without using Workload Groups, respectively. Note that Page Cache should be disabled to prevent it from affecting the test results. + +![CPU soft limit test](/images/CPU-soft-limit-test.png) + +Comparing the throughputs of the two clients in both tests, it can be concluded that: + +- **Without configuring Workload Groups**, the two clients consume the CPU resources on an equal basis. + +- **Configuring Workload Groups** and setting the `cpu_share` to 2:1, the throughput ratio of the two clients is 2:1. With a higher `cpu_share`, Client 1 is provided with a higher portion of CPU resources, and it delivers a higher throughput. + +### CPU hard limit test + +Start a client, set `cpu_hard_limit=50%` for the Workload Group, and execute ClickBench Q23 for 5 minutes under a concurrency level of 1, 2, and 4, respectively. + +![CPU hard limit test](/images/CPU-hard-limit-test.png) + +As the query concurrency increases, the CPU utilization rate remains at around 800%, meaning that 8 cores are used. On a 16-core machine, that's **50% utilization**, which is as expected. In addition, since CPU hard limits are imposed, the increase in TP99 latency as concurrency rises is also an expected outcome. + +## Test in simulated production environment + +In real-world usage, users are particularly concerned about query latency rather than just query throughput, since latency is more easily perceptible in user experience. That's why we decided to validate the effectiveness of Workload Group in a simulated production environment. + +We picked out a SQL set consisting of queries that should be finished within 1s (ClickBench Q15, Q17, Q23 and TPC-H Q3, Q7, Q19), including single-table aggregations and join queries. 
The size of the TPC-H dataset is 100GB. + +Similarly, we conduct tests with and without configuring Workload Groups. + +![Test in simulated production environment](/images/test-in-simulated-production-environment.png) + +As the results show: + +- **Without Workload Group** (comparing Test 1 & 2): When dialing up the concurrency of Client 2, both clients experience a 2x to 3x increase in query latency. + +- **Configuring Workload Group** (comparing Test 3 & 4): As the concurrency of Client 2 goes up, the performance fluctuation in Client 1 is much smaller, which shows that it is effectively protected by workload isolation. + +## Recommendations & plans + +The Resource Tag-based solution is a thorough workload isolation plan. The Workload Group-based solution realizes a better balance between resource isolation and utilization, and it is complemented by the query queue mechanism for stability guarantee. + +So which one should you choose for your use case? Here is our recommendation: + +- **Resource Tag**: for use cases where different business lines or departments share the same cluster, so the resources and data are physically isolated for different tenants. + +- **Workload Group**: for use cases where one cluster serves various query workloads and needs flexible resource allocation. + +In future releases, we will keep improving the user experience of the Workload Group and query queue features: + +- Freeing up memory space by canceling queries is a blunt method. We plan to free memory through disk spilling instead, which will bring higher stability in query performance. + +- Since memory consumed by non-queries in the BE is not included in Workload Group's memory statistics, users might observe a disparity between the BE process memory usage and Workload Group memory usage. We will address this issue to avoid confusion. + +- In the query queue mechanism, cluster load is controlled by setting the maximum query concurrency. 
We plan to enable dynamic maximum query concurrency based on resource availability at the BE. This is to create backpressure on the client side and thus improve the availability of Doris when clients keep submitting high loads. + +- The main idea of Resource Tag is to group the BE nodes, while that of Workload Group is to further divide the resources of a single BE node. For users to grasp these ideas, they need to learn about the concept of BE nodes in Doris first. However, from an operational perspective, users only need to understand the resource consumption percentage of each of their workloads and what priority they should have when cluster load is saturated. Thus, we will try and figure out a way to flatten [...] + +For further assistance on workload isolation in Apache Doris, join the [Apache Doris community](https://join.slack.com/t/apachedoriscommunity/shared_invite/zt-2gmq5o30h-455W226d79zP3L96ZhXIoQ). \ No newline at end of file diff --git a/blog/release-note-2.0.9.md b/blog/release-note-2.0.9.md index 9f023ac9c24..a93baa06c88 100644 --- a/blog/release-note-2.0.9.md +++ b/blog/release-note-2.0.9.md @@ -5,8 +5,6 @@ 'date': '2024-04-23', 'author': 'Apache Doris', 'tags': ['Release Notes'], - 'picked': "true", - 'order': "3", "image": '/images/2.0.9.png' } --- diff --git a/src/components/recent-blogs/recent-blogs.data.ts b/src/components/recent-blogs/recent-blogs.data.ts index 3a104eb81e3..bb4a47955b1 100644 --- a/src/components/recent-blogs/recent-blogs.data.ts +++ b/src/components/recent-blogs/recent-blogs.data.ts @@ -1,14 +1,14 @@ export const RECENT_BLOGS_POSTS = [ { - label: `Cross-cluster replication for read-write separation: story of a grocery store brand`, - link: 'https://doris.apache.org/blog/cross-cluster-replication-for-read-write', + label: `From Presto, Trino, ClickHouse, and Hive to Apache Doris: SQL convertor for easy migration`, + link: 
'https://doris.apache.org/blog/from-presto-trino-clickhouse-and-hive-to-apache-doris-sql-convertor-for-easy-migration', }, { - label: `Apache Doris version 2.0.9 has been released`, - link: 'https://doris.apache.org/blog/release-note-2.0.9', + label: `Cross-cluster replication for read-write separation: story of a grocery store brand`, + link: 'https://doris.apache.org/blog/cross-cluster-replication-for-read-write', }, { - label: 'Arrow Flight SQL in Apache Doris for 10X faster data transfer', + label: 'Arrow Flight SQL for 10X faster data transfer', link: 'https://doris.apache.org/blog/arrow-flight-sql-in-apache-doris-for-10x-faster-data-transfer', }, { diff --git a/src/constant/newsletter.data.ts b/src/constant/newsletter.data.ts index fcf9c5648f3..1b9a280ab8b 100644 --- a/src/constant/newsletter.data.ts +++ b/src/constant/newsletter.data.ts @@ -1,4 +1,11 @@ export const NEWSLETTER_DATA = [ + { + tags: ['Tech Sharing'], + title: "Multi-tenant workload isolation: a better balance between isolation and utilization", + content: `Apache Doris supports workload isolation based on Resource Tag and Workload Group. 
It provides solutions for different tradeoffs among the level of isolation, resource utilization, and stable performance.`, + to: '/blog/multi-tenant-workload-isolation-in-apache-doris', + image: 'multi-tenant-workload-group.jpg', + }, { tags: ['Tech Sharing'], title: "From Presto, Trino, ClickHouse, and Hive to Apache Doris: SQL convertor for easy migration", @@ -13,20 +20,11 @@ export const NEWSLETTER_DATA = [ to: '/blog/cross-cluster-replication-for-read-write', image: 'ccr-for-read-write-separation.jpg', }, - { - tags: ['Release Notes'], - title: "Apache Doris version 2.0.9 has been released", - content: `Thanks to our community users and developers, about 68 improvements and bug fixes have been made in Doris 2.0.9 version.`, - to: '/blog/release-note-2.0.9', - image: '2.0.9.png', - }, { tags: ['Tech Sharing'], - title: "Arrow Flight SQL in Apache Doris for 10X faster data transfer", + title: "Arrow Flight SQL for 10X faster data transfer", content: `Apache Doris 2.1 supports Arrow Flight SQL protocol for reading data from Doris. 
It delivers tens-fold speedups compared to PyMySQL and Pandas.`, to: '/blog/arrow-flight-sql-in-apache-doris-for-10x-faster-data-transfer', image: 'arrow-flight-sql-in-apache-doris-for-10x-faster-data-transfer.png', }, - - ]; diff --git a/static/images/CPU-hard-limit-test.png b/static/images/CPU-hard-limit-test.png new file mode 100644 index 00000000000..6e46021e97d Binary files /dev/null and b/static/images/CPU-hard-limit-test.png differ diff --git a/static/images/CPU-hard-limit.png b/static/images/CPU-hard-limit.png new file mode 100644 index 00000000000..056ae839db2 Binary files /dev/null and b/static/images/CPU-hard-limit.png differ diff --git a/static/images/CPU-soft-limit-test.png b/static/images/CPU-soft-limit-test.png new file mode 100644 index 00000000000..a9a4a1da26a Binary files /dev/null and b/static/images/CPU-soft-limit-test.png differ diff --git a/static/images/CPU-soft-limit.png b/static/images/CPU-soft-limit.png new file mode 100644 index 00000000000..84a87c45cdb Binary files /dev/null and b/static/images/CPU-soft-limit.png differ diff --git a/static/images/memory-resource-limit.png b/static/images/memory-resource-limit.png new file mode 100644 index 00000000000..63b4b824d7a Binary files /dev/null and b/static/images/memory-resource-limit.png differ diff --git a/static/images/multi-tenant-workload-group.jpg b/static/images/multi-tenant-workload-group.jpg new file mode 100644 index 00000000000..d5a449af367 Binary files /dev/null and b/static/images/multi-tenant-workload-group.jpg differ diff --git a/static/images/query-queue.png b/static/images/query-queue.png new file mode 100644 index 00000000000..3e17a1a15db Binary files /dev/null and b/static/images/query-queue.png differ diff --git a/static/images/resource-isolation-based-on-resource-tag-2.PNG b/static/images/resource-isolation-based-on-resource-tag-2.PNG new file mode 100644 index 00000000000..413ca74ef01 Binary files /dev/null and b/static/images/resource-isolation-based-on-resource-tag-2.PNG 
differ diff --git a/static/images/resource-isolation-based-on-resource-tag.PNG b/static/images/resource-isolation-based-on-resource-tag.PNG new file mode 100644 index 00000000000..78c2271cb65 Binary files /dev/null and b/static/images/resource-isolation-based-on-resource-tag.PNG differ diff --git a/static/images/test-in-simulated-production-environment.png b/static/images/test-in-simulated-production-environment.png new file mode 100644 index 00000000000..e82045a7f95 Binary files /dev/null and b/static/images/test-in-simulated-production-environment.png differ diff --git a/static/images/workload-isolation-based-on-workload-group.png b/static/images/workload-isolation-based-on-workload-group.png new file mode 100644 index 00000000000..1e43fd59792 Binary files /dev/null and b/static/images/workload-isolation-based-on-workload-group.png differ