tanvipenumudy commented on code in PR #7550: URL: https://github.com/apache/ozone/pull/7550#discussion_r2136404264
##########
hadoop-hdds/docs/content/design/preallocate-blocks/preallocate-blocks.md:
##########
@@ -0,0 +1,124 @@
+---
+title: PreAllocate and Cache Blocks in OM
+summary: Prefetch blocks from SCM asynchronously and cache them within OM to reduce dependency on SCM for every write.
+date: 2025-04-30
+jira: HDDS-11894
+status: draft
+---
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+## Introduction
+
+Whenever a write operation occurs, OM calls into SCM for allocateBlock. This adds overhead to each create request: extra latency from the allocateBlock call on top of the create request, and resource starvation, since the OM thread is blocked until SCM responds to the call.
+
+Frequent RPC calls to SCM can become a performance bottleneck under heavy workloads. To address this, OM could prefetch (preallocate) blocks from SCM asynchronously and cache them, reducing its dependency on SCM.
+
+This design proposes a mechanism to preallocate blocks in OM and cache them for future use. This will reduce the number of network RPC calls to SCM and improve the overall performance of writes.
+
+#### Current Write Flow
+
+
+#### New Write Flow
+
+
+## Proposed Approach
+
+#### 1. Empty Cache on Initialization:
+
+- The cache starts with no pre-allocated blocks (cache size 0).
+- Blocks are fetched and cached only upon the completion of a client write request (ref: Figure 1).
+
+#### Figure 1
+
+
+#### 2. Block Pre-Fetching:
+
+- On every client write request, OM checks whether the cache size has fallen below the minimum threshold (configurable property: ozone.om.prefetch.min.blocks). If it has, the background thread is triggered to asynchronously prefetch blocks from SCM while adhering to a maximum cache size limit (configurable property: ozone.om.prefetch.max.blocks).
+- Example: A client write request for x blocks with an empty cache results in min_threshold blocks being cached asynchronously in the background once the actual client request is through (ref: Figure 2).
+
+#### Figure 2
+
+
+#### 3. Block Usage and Refilling:
+
+- Cached blocks are used for the request if available.
+- If the cache has fewer blocks than requested, all cached blocks are used, and additional blocks are fetched via a synchronous RPC call to SCM.
+  - Example (ref: Figure 3): If the client requests x blocks (where x < min_threshold), x of the min_threshold cached blocks are used. After this step, Step 2 is repeated to refill the cache asynchronously (if needed).
+
+  #### Figure 3
+
+
+  - Example (ref: Figure 4): If the client requests x blocks (where x > min_threshold), and min_threshold blocks remain cached, x - min_threshold additional blocks are fetched from SCM synchronously. In total, x blocks are returned to the client (cache + SCM). After this step, Step 2 is repeated to refill the cache asynchronously (if needed).

Review Comment:
   Thank you for the review, this is a great idea - will be making the required changes here!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
---------------------------------------------------------------------
For additional commands, e-mail: issues-h...@ozone.apache.org